A-Z Machine Learning Resources
11 min read · Jan 9, 2020
Table Of Contents
- General Stuff
- Interview Resources
- Artificial Intelligence
- Genetic Algorithms
- Statistics
- Useful Blogs
- Resources on Quora
- Resources on Kaggle
- Cheat Sheets
- Classification
- Linear Regression
- Logistic Regression
- Model Validation using Resampling
- Cross Validation
- Bootstrapping
- Deep Learning
- Frameworks
- Feed Forward Networks
- Recurrent Neural Nets, LSTM, GRU
- Restricted Boltzmann Machine, DBNs
- Autoencoders
- Convolution Neural Nets
- Natural Language Processing
- Topic Modeling, LDA
- Word2Vec
- Computer Vision
- Support Vector Machine
- Reinforcement Learning
- Decision Trees
- Random Forest / Bagging
- Boosting
- Ensembles
- Stacking Models
- VC Dimension
- Bayesian Machine Learning
- Semi Supervised Learning
- Optimization
General Stuff
- A curated list of awesome Machine Learning frameworks, libraries and software
- A curated list of awesome data visualization libraries and resources.
- An awesome Data Science repository to learn and apply for real world problems
- The Open Source Data Science Masters
- Machine Learning FAQs on Cross Validated
- List of Machine Learning University Courses
- Machine Learning algorithms that you should always have a strong understanding of
- Difference between Linearly Independent, Orthogonal, and Uncorrelated Variables
- List of Machine Learning Concepts
- Slides on Several Machine Learning Topics
- MIT Machine Learning Lecture Slides
- Comparison Supervised Learning Algorithms
- Learning Data Science Fundamentals
- Machine Learning mistakes to avoid
- Statistical Machine Learning Course
- TheAnalyticsEdge edX Notes and Codes
Interview Resources
- How can a computer science graduate student prepare himself for data scientist interviews?
- How do I learn Machine Learning?
- FAQs about Data Science Interviews
- What are the key skills of a data scientist?
Artificial Intelligence
- Awesome Artificial Intelligence (GitHub Repo)
- edX course | Klein & Abbeel
- Udacity Course | Norvig & Thrun
- TED talks on AI
Genetic Algorithms
- Genetic Algorithms Wikipedia Page
- Simple Implementation of Genetic Algorithms in Python (Part 1), Part 2
- Genetic Algorithms vs Artificial Neural Networks
- Genetic Algorithms Explained in Plain English
- Genetic Programming
- Genetic Programming in Python (GitHub)
- Genetic Algorithms vs Genetic Programming (Quora), StackOverflow
Statistics
- Stat Trek Website — A dedicated website to teach yourselves Statistics
- Learn Statistics Using Python — Learn Statistics using an application-centric programming approach
- Statistics for Hackers | Slides | @jakevdp — Slides by Jake VanderPlas
- Online Statistics Book — An Interactive Multimedia Course for Studying Statistics
- What is a Sampling Distribution?
- Tutorials
- AP Statistics Tutorial
- Statistics and Probability Tutorial
- Matrix Algebra Tutorial
- What is an Unbiased Estimator?
- Goodness of Fit Explained
- What are QQ Plots?
Useful Blogs
- Edwin Chen’s Blog — A blog about Math, stats, ML, crowdsourcing, data science
- The Data School Blog — Data science for beginners!
- ML Wave — A blog for Learning Machine Learning
- Andrej Karpathy — A blog about Deep Learning and Data Science in general
- Colah’s Blog — Awesome Neural Networks Blog
- Alex Minnaar’s Blog — A blog about Machine Learning and Software Engineering
- Statistically Significant — Andrew Landgraf’s Data Science Blog
- Simply Statistics — A blog by three biostatistics professors
- Yanir Seroussi’s Blog — A blog about Data Science and beyond
- fastML — Machine learning made easy
- Trevor Stephens Blog — Trevor Stephens Personal Page
- no free hunch | kaggle — The Kaggle Blog about all things Data Science
- A Quantitative Journey | outlace — learning quantitative applications
- r4stats — analyze the world of data science, and to help people learn to use R
- Variance Explained — David Robinson’s Blog
- AI Junkie — a blog about Artificial Intelligence
Resources on Quora
- Most Viewed Machine Learning writers
- Data Science Topic on Quora
- William Chen’s Answers
- Michael Hochster’s Answers
- Ricardo Vladimiro’s Answers
- Storytelling with Statistics
- Data Science FAQs on Quora
- Machine Learning FAQs on Quora
Kaggle Competitions WriteUp
- How to almost win Kaggle Competitions
- Convolution Neural Networks for EEG detection
- Facebook Recruiting III Explained
- Predicting CTR with Online ML
Cheat Sheets
Classification
- Does Balancing Classes Improve Classifier Performance?
- What is Deviance?
- When to choose which machine learning classifier?
- What are the advantages of different classification algorithms?
- ROC and AUC Explained
- An introduction to ROC analysis
- Simple guide to confusion matrix terminology
Linear Regression
- General
- Assumptions of Linear Regression, Stack Exchange
- Linear Regression Comprehensive Resource
- Applying and Interpreting Linear Regression
- What does having constant variance in a linear regression model mean?
- Difference between linear regression on y with x and x with y
- Is linear regression valid when the dependent variable is not normally distributed?
- Multicollinearity and VIF
- Dummy Variable Trap | Multicollinearity
- Dealing with multicollinearity using VIFs
- Residual Analysis
- Interpreting plot.lm() in R
- How to interpret a QQ plot?
- Interpreting Residuals vs Fitted Plot
- Outliers
- How should outliers be dealt with?
- Elastic Net
- Regularization and Variable Selection via the Elastic Net
Follow MLAIT
Logistic Regression
- Logistic Regression Wiki
- Geometric Intuition of Logistic Regression
- Obtaining predicted categories (choosing threshold)
- Residuals in logistic regression
- Difference between logit and probit models, Logistic Regression Wiki, Probit Model Wiki
- Pseudo R2 for Logistic Regression, How to calculate, Other Details
Model Validation using Resampling
- Resampling Explained
- Partitioning data set in R
- Implementing hold-out Validation in R, 2
- Cross Validation
- Training with Full dataset after CV?
- Which CV method is best?
- Variance Estimates in k-fold CV
- Is CV a substitute for Validation Set?
- Choice of k in k-fold CV
- CV for ensemble learning
- k-fold CV in R
- Good Resources
- Overfitting and Cross Validation
- Preventing Overfitting the Cross Validation Data | Andrew Ng
- Over-fitting in Model Selection and Subsequent Selection Bias in Performance Evaluation
- CV for detecting and preventing Overfitting
- How does CV overcome the Overfitting Problem
- Bootstrapping
- Why Bootstrapping Works?
- Good Animation
- Example of Bootstrapping
- Understanding Bootstrapping for Validation and Model Selection
- Cross Validation vs Bootstrap to estimate prediction error, Cross-validation vs .632 bootstrapping to evaluate classification performance
Deep Learning
- A curated list of awesome Deep Learning tutorials, projects and communities
- Lots of Deep Learning Resources
- Interesting Deep Learning and NLP Projects (Stanford), Website
- Core Concepts of Deep Learning
- Understanding Natural Language with Deep Neural Networks Using Torch
- Stanford Deep Learning Tutorial
- Deep Learning FAQs on Quora
- Google+ Deep Learning Page
- Recent Reddit AMAs related to Deep Learning, Another AMA
- Where to Learn Deep Learning?
- Deep Learning nvidia concepts
- Introduction to Deep Learning Using Python (GitHub), Good Introduction Slides
- Video Lectures Oxford 2015, Video Lectures Summer School Montreal
- Deep Learning Software List
- Hacker’s guide to Neural Nets
- Top arxiv Deep Learning Papers explained
- Geoff Hinton YouTube Videos on Deep Learning
- Awesome Deep Learning Reading List
- Deep Learning Comprehensive Website, Software
- deeplearning Tutorials
- AWESOME! Deep Learning Tutorial
- Deep Learning Basics
- Stanford Tutorials
- Train, Validation & Test in Artificial Neural Networks
- Artificial Neural Networks Tutorials
- Neural Networks FAQs on Stack Overflow
- Deep Learning Tutorials on deeplearning.net
- Neural Machine Translation
- Introduction to Neural Machine Translation with GPUs (part 1), Part 2, Part 3
- Deep Speech: Accurate Speech Recognition with GPU-Accelerated Deep Learning
- Deep Learning Frameworks
- Torch vs. Theano
- dl4j vs. torch7 vs. theano
- Deep Learning Libraries by Language
- Theano
- Website
- Theano Introduction
- Theano Tutorial
- Good Theano Tutorial
- Logistic Regression using Theano for classifying digits
- MLP using Theano
- CNN using Theano
- RNNs using Theano
- LSTM for Sentiment Analysis in Theano
- RBM using Theano
- DBNs using Theano
- All Codes
- Torch
- Torch ML Tutorial, Code
- Intro to Torch
- Learning Torch GitHub Repo
- Awesome-Torch (Repository on GitHub)
- Machine Learning using Torch Oxford Univ, Code
- Torch Internals Overview
- Torch Cheatsheet
- Understanding Natural Language with Deep Neural Networks Using Torch
- Caffe
- Deep Learning for Computer Vision with Caffe and cuDNN
- TensorFlow
- Website
- Learning TensorFlow GitHub Repo
- Benchmark TensorFlow GitHub
- Feed Forward Networks
- Implementing a Neural Network from scratch, Code
- Speeding up your Neural Network with Theano and the gpu, Code
- Basic ANN Theory
- Role of Bias in Neural Networks
- Choosing number of hidden layers and nodes, 2, 3
- Backpropagation Explained
- ANN implemented in C++ | AI Junkie
- Simple Implementation
- NN for Beginners
- Regression and Classification with NNs (Slides)
- Another Intro
- Recurrent and LSTM Networks
- awesome-rnn: list of resources (GitHub Repo)
- Recurrent Neural Net Tutorial Part 1, Part 2, Part 3, Code
- NLP RNN Representations
- The Unreasonable effectiveness of RNNs, Torch Code, Python Code
- Intro to RNN, LSTM
- An application of RNN
- Optimizing RNN Performance
- Simple RNN
- Auto-Generating Clickbait with RNN
- Sequence Learning using RNN (Slides)
- Machine Translation using RNN (Paper)
- Music generation using RNNs (Keras)
- Using RNN to create on-the-fly dialogue (Keras)
- Long Short Term Memory (LSTM)
- Understanding LSTM Networks
- LSTM explained
- LSTM
- Implementing LSTM from scratch, Python/Theano code
- Torch Code, Torch
- LSTM for Sentiment Analysis in Theano
- Deep Learning for Visual Q&A | LSTM | CNN, Code
- Computer Responds to email | Google
- LSTM dramatically improves Google Voice Search, 2
- Understanding Natural Language with Deep Neural Networks Using Torch
- Gated Recurrent Units (GRU)
- LSTM vs GRU
- Recursive Neural Network (not Recurrent)
- Recursive Neural Tensor Network (RNTN)
- word2vec, DBN, RNTN for Sentiment Analysis
- Restricted Boltzmann Machine
- Beginner’s Guide about RBMs
- Another Good Tutorial
- Introduction to RBMs
- Hinton’s Guide to Training RBMs
- RBMs in R
- Deep Belief Networks Tutorial
- word2vec, DBN, RNTN for Sentiment Analysis
- Autoencoders: Unsupervised (applies BackProp after setting target = input)
- Andrew Ng Sparse Autoencoders pdf
- Deep Autoencoders Tutorial
- Denoising Autoencoders, Theano Code
- Stacked Denoising Autoencoders
- Convolution Networks
- Awesome Deep Vision: List of Resources (GitHub)
- Intro to CNNs
- Understanding CNN for NLP
- Stanford Notes, Codes, GitHub
- JavaScript Library (Browser Based) for CNNs
- Using CNNs to detect facial keypoints
- Deep learning to classify business photos at Yelp
- Interview with Yann LeCun | Kaggle
- Visualising and Understanding CNNs
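The Autoencoders entry above summarizes the idea as applying backpropagation after setting target = input. A minimal sketch of that idea follows, in pure Python with no external libraries. The 2 -> 1 -> 2 layer sizes, the tanh hidden unit, the toy data, the starting weights, and the function names `init_params`, `reconstruct`, `mean_loss`, and `train` are all assumptions made for illustration; none of them come from the linked resources.

```python
import math


def init_params(n):
    """Small hand-picked starting weights (arbitrary, for this sketch only)."""
    w_enc = [0.1 * (i + 1) for i in range(n)]
    w_dec = [0.1 * (n - i) for i in range(n)]
    return w_enc, 0.0, w_dec, [0.0] * n


def reconstruct(x, w_enc, b_enc, w_dec, b_dec):
    """Encode x into one hidden value, then decode it back to len(x) values."""
    h = math.tanh(sum(w * xi for w, xi in zip(w_enc, x)) + b_enc)
    y = [w * h + b for w, b in zip(w_dec, b_dec)]
    return h, y


def mean_loss(data, params):
    """Mean squared reconstruction error; the target is the input itself."""
    total = 0.0
    for x in data:
        _, y = reconstruct(x, *params)
        total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return total / len(data)


def train(data, params, epochs=2000, lr=0.1):
    """Full-batch gradient descent on the reconstruction error."""
    w_enc, b_enc, w_dec, b_dec = params
    n = len(data[0])
    for _ in range(epochs):
        g_w_enc, g_b_enc = [0.0] * n, 0.0
        g_w_dec, g_b_dec = [0.0] * n, [0.0] * n
        for x in data:
            h, y = reconstruct(x, w_enc, b_enc, w_dec, b_dec)
            # Output error: the "label" is x, so no labelled data is needed.
            err = [2.0 * (yi - xi) for yi, xi in zip(y, x)]
            # Backpropagate through the decoder, then through tanh.
            d_h = sum(e * w for e, w in zip(err, w_dec))
            d_pre = d_h * (1.0 - h * h)
            for k in range(n):
                g_w_dec[k] += err[k] * h
                g_b_dec[k] += err[k]
                g_w_enc[k] += d_pre * x[k]
            g_b_enc += d_pre
        scale = lr / len(data)
        w_enc = [w - scale * g for w, g in zip(w_enc, g_w_enc)]
        b_enc -= scale * g_b_enc
        w_dec = [w - scale * g for w, g in zip(w_dec, g_w_dec)]
        b_dec = [b - scale * g for b, g in zip(b_dec, g_b_dec)]
    return w_enc, b_enc, w_dec, b_dec


# Toy data: 2-D points that lie on a line, so one hidden unit can compress them.
data = [[t, 0.5 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
start = init_params(2)
trained = train(data, start)
print("loss before:", mean_loss(data, start))
print("loss after: ", mean_loss(data, trained))
```

The only difference from supervised training is the error line: the network output is compared with `x` rather than with a separate label. Real autoencoders use wider layers and a deep learning framework; this sketch only shows the mechanics.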
Natural Language Processing
- A curated list of speech and natural language processing resources
- Understanding Natural Language with Deep Neural Networks Using Torch
- tf-idf explained
- Interesting Deep Learning NLP Projects Stanford, Website
- NLP from Scratch | Google Paper
- Graph Based Semi Supervised Learning for NLP
- Bag of Words
- Classification text with Bag of Words
- Topic Modeling
- LDA, LSA, Probabilistic LSA
- Awesome LDA Explanation!, Another good explanation
- The LDA Buffet: Intuitive Explanation
- Difference between LSI and LDA
- Original LDA Paper
- alpha and beta in LDA
- Intuitive explanation of the Dirichlet distribution
- Topic modeling made just simple enough
- Online LDA, Online LDA with Spark
- LDA in Scala, Part 2
- Segmentation of Twitter Timelines via Topic Modeling
- Topic Modeling of Twitter Followers
- word2vec
- Google word2vec
- Bag of Words Model Wiki
- A closer look at Skip Gram Modeling
- Skip Gram Model Tutorial, CBoW Model
- Word Vectors Kaggle Tutorial Python, Part 2
- Making sense of word2vec
- word2vec explained on deeplearning4j
- Quora word2vec
- Other Quora Resources, 2, 3
- word2vec, DBN, RNTN for Sentiment Analysis
- Text Clustering
- How string clustering works
- Levenshtein distance for measuring the difference between two sequences
- Text clustering with Levenshtein distances
- Text Classification
- Classification Text with Bag of Words
- Language learning with NLP and reinforcement learning
- Kaggle Tutorial Bag of Words and Word vectors, Part 2, Part 3
- What would Shakespeare say (NLP Tutorial)
- A closer look at Skip Gram Modeling
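The Text Clustering entries above mention Levenshtein distance for measuring the difference between two sequences. As a rough illustration, the standard dynamic-programming formulation can be sketched as below. The function name `levenshtein` is an assumption for this example, and the sketch is not taken from any of the linked resources.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the inner row short
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


distance = levenshtein("kitten", "sitting")  # 3
print(distance)
```

For clustering, a pairwise distance like this is typically computed between all strings and then passed to a clustering algorithm. Only two rows of the table are kept at a time, so memory grows with the shorter string rather than with the product of both lengths.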
Computer Vision
Support Vector Machine
- Highest Voted Questions about SVMs on Cross Validated
- Help me Understand SVMs!
- SVM in Layman’s terms
- How does SVM Work | Comparisons
- A tutorial on SVMs
- Practical Guide to SVC, Slides
- Introductory Overview of SVMs
- Comparisons
- SVMs > ANNs, ANNs > SVMs, Another Comparison
- Trees > SVMs
- Kernel Logistic Regression vs SVM
- Logistic Regression vs SVM, 2, 3
- Optimization Algorithms in Support Vector Machines
- Variable Importance from SVM
- Software
- LIBSVM
- Intro to SVM in R
- Kernels
- What are Kernels in ML and SVM?
- Intuition Behind Gaussian Kernel in SVMs?
- Probabilities post SVM
- Platt’s Probabilistic Outputs for SVM
- Platt Calibration Wiki
- Why use Platt’s Scaling
- Classifier Calibration with Platt’s Scaling
Reinforcement Learning
Decision Trees
- Wikipedia Page — Lots of Good Info
- FAQs about Decision Trees
- Brief Tour of Trees and Forests
- Tree Based Models in R
- How Decision Trees work?
- Weak side of Decision Trees
- Thorough Explanation and different algorithms
- What is entropy and information gain in the context of building decision trees?
- Slides Related to Decision Trees
- How do decision tree learning algorithms deal with missing values?
- Using Surrogates to Improve Datasets with Missing Values
- Good Article
- Are decision trees almost always binary trees?
- Pruning Decision Trees, Grafting of Decision Trees
- What is Deviance in context of Decision Trees?
- Comparison of Different Algorithms
- CART vs CTREE
- Comparison of complexity or performance
- CHAID vs CART, CART vs CHAID
- Good Article on comparison
- CART
- Recursive Partitioning Wikipedia
- CART Explained
- How to measure/rank “variable importance” when using CART?
- Pruning a Tree in R
- Does rpart use multivariate splits by default?
- FAQs about Recursive Partitioning
- CTREE
- party package in R
- Show volume in each node using ctree in R
- How to extract tree structure from ctree function?
- CHAID
- Wikipedia Article on CHAID
- Basic Introduction to CHAID
- Good Tutorial on CHAID
- MARS
- Wikipedia Article on MARS
- Probabilistic Decision Trees
- Bayesian Learning in Probabilistic Decision Trees
- Probabilistic Trees Research Paper
Random Forest / Bagging
- Awesome Random Forest (GitHub)
- How to tune RF parameters in practice?
- Measures of variable importance in random forests
- Compare R-squared from two different Random Forest models
- OOB Estimate Explained | RF vs LDA
- Evaluating Random Forests for Survival Analysis Using Prediction Error Curve
- Why doesn’t Random Forest handle missing values in predictors?
- How to build random forests in R with missing (NA) values?
- FAQs about Random Forest, More FAQs
- Obtaining knowledge from a random forest
- Some Questions for R implementation, 2, 3
Boosting
- Boosting for Better Predictions
- Boosting Wikipedia Page
- Introduction to Boosted Trees | Tianqi Chen
- Gradient Boosting Machine
- Gradient Boosting Wiki
- Guidelines for GBM parameters in R, Strategy to set parameters
- Meaning of Interaction Depth, 2
- Role of n.minobsinnode parameter of GBM in R
- GBM in R
- FAQs about GBM
- GBM vs xgboost
- xgboost
- xgboost tuning kaggle
- xgboost vs gbm
- xgboost survey
- AdaBoost
- AdaBoost Wiki, Python Code
- AdaBoost Sparse Input Support
- adaBag R package
- Tutorial
Ensembles
- Wikipedia Article on Ensemble Learning
- Kaggle Ensembling Guide
- The Power of Simple Ensembles
- Ensemble Learning Intro
- Ensemble Learning Paper
- Ensembling models with R, Ensembling Regression Models in R, Intro to Ensembles in R
- Ensembling Models with caret
- Bagging vs Boosting vs Stacking
- Good Resources | Kaggle Africa Soil Property Prediction
- Boosting vs Bagging
- Resources for learning how to implement ensemble methods
- How are classifications merged in an ensemble classifier?
Stacking Models
- Stacking, Blending and Stacked Generalization
- Stacked Generalization (Stacking)
- Stacked Generalization: when does it work?
- Stacked Generalization Paper
Vapnik–Chervonenkis Dimension
- Wikipedia article on VC Dimension
- Intuitive Explanation of VC Dimension
- Video explaining VC Dimension
- Introduction to VC Dimension
- FAQs about VC Dimension
- Do ensemble techniques increase VC-dimension?
Bayesian Machine Learning
- Bayesian Methods for Hackers (using pyMC)
- Should all Machine Learning be Bayesian?
- Tutorial on Bayesian Optimisation for Machine Learning
- Bayesian Reasoning and Deep Learning, Slides
- Bayesian Statistics Made Simple
- Kalman & Bayesian Filters in Python
- Markov Chain Wikipedia Page
Semi Supervised Learning
- Wikipedia article on Semi Supervised Learning
- Tutorial on Semi Supervised Learning
- Graph Based Semi Supervised Learning for NLP
- Taxonomy
- Video Tutorial Weka
- Unsupervised, Supervised and Semi Supervised learning
- Research Papers 1, 2, 3
Optimization
- Mean Variance Portfolio Optimization with R and Quadratic Programming
- Algorithms for Sparse Optimization and Machine Learning
- Optimization Algorithms in Machine Learning, Video Lecture
- Optimization Algorithms for Data Analysis
- Video Lectures on Optimization
- Optimization Algorithms in Support Vector Machines
- The Interplay of Optimization and Machine Learning Research