LATEST POST
Topics: quick reference and extensions to write in LaTeX.
Topics: responsive. True interactivity: zoom in, zoom out, markers, pop-ups, move around, etc.
Topics: classification and clustering methods. Unsupervised techniques. Clustering: k-means, hierarchical clustering. Supervised techniques: k-nearest neighbours, regression, trees, random forests. Training, testing, predicting. Performance measures: Dunn’s index, ROC, AUC, confusion matrix.
Topics: survival analysis. Event history analysis. Failure and churn analysis. Parametric, semiparametric, and nonparametric models: proportional hazards, accelerated failure time, exponential, piecewise exponential, Weibull, lognormal and Cox regression. Customer churn analysis. Censored and truncated data. Limited dependent variable and Tobit models.
Topics: analyze texts (emails) with algorithms. Differentiate spam from nonspam. Custom methods, tree-based methods, and support vector machines. Train, test, and evaluate the methods.
Topics: data mining. Market basket analysis. Understanding consumer behaviour. Association rules or what is behind recommendation systems. Dimension reduction. Multidimensional scaling. Factorial analysis, component analysis (principal, simple, multiple). Linear discriminant analysis. Feature selection.
Topics: experimenting with Tableau. Infographic examples.
Topics: introduction to geospatial models. Visualization with maps. Analyze the Australian Football League audience. Spatial autocorrelation. Autoregressive, lag and error models. Spatial logit and probit models. More advanced models.
Topics: data visualization and map mashups. Introduction to spatial analysis. How to add intelligence to maps.
Topics: web scraping (tweets) with an API. Natural language processing. Select topics and keywords to capture tweets. Get up-to-the-minute data and measure delays between tweets (tweeting speed). Text mining and word clouds. Compare two topics: assess popularity with the Poisson distribution. Analyze and manipulate text strings.
Topics: basic to advanced statistical methods. Analyze census data (US state population). Infer the population with sampling and bootstrapping. Simulations and Monte Carlo methods.
Topics: mathematical optimization. The cooling effect of cream in the coffee. Extrapolation and interpolation.
Topics: a series of projects. A website using a simple web framework. Documentation websites using static site generators. A command-line game and an application to be downloaded and installed.
Topics: interactive data visualization and graphics.
Topics: show graphics and maps instead of explanation or simple data tables. Static visualization. Bring opaque data into general understanding. Storytelling with numbers. Present surveys and polling data.
Topics: model consumer demand (units sold). Predict trends. Poisson and negative binomial distributions for counting discrete events.
Topics: present to technical and nontechnical audiences. Storytelling. Bring arcane subjects into general use. Use econometric techniques. Pose hypotheses, set goals, perform analyses, and draw conclusions.
Topics: natural language processing. Build a corpus of texts. Explore the statistics. Visualize words, frequencies, common words, distinct words, bigrams. Use word clouds, bar charts, and dendrograms.
Topics: natural language processing, sentiment analysis, and topic modeling. Build a corpus of texts (documents or any tweet, email, comment, publication, status, etc.). Download data using APIs. Populate a database. Explore the statistics. Filter and extract regular expressions. Visualize words, frequencies, ngrams. Assess sentiment, draw conclusions, and provide advice.
Topics: multivariate analysis and visual exploration. Clean and format datasets. Pitching velocity, mix, patterns, location in the ball-strike zone. Change by month, by game, by inning. Ball-strike count, early- and late-game situations. Velocity, impact, and contact rate.
Topics: decision trees, k-NN, and random forests. Storytelling and narrative. Data exploration: tables vs Venn diagrams vs visualization. Train and test sets. Confusion matrix. Folds and cross-validation. Pruning and avoiding overfitting.
Topics: logit, probit, log-log, and decision trees. Descriptive statistics. Train and test sets. Predictions. Confusion matrix and ROC. Bank loan portfolio acceptance rate, bad rate, and risk tolerance.