What’s the best way to start learning machine learning in Python?

For getting started with machine learning as a subject, please take a look at:

Aman Dalmia’s answer to What should I do if I want to work in artificial intelligence when I am older?

For starting on projects in ML, this might be useful:

Aman Dalmia’s answer to How can I begin with projects of machine learning and artificial intelligence?

To start implementing machine learning algorithms in Python:

To get started, I recommend Udacity’s Intro to Machine Learning course, taught by Sebastian Thrun, the founder of Udacity, himself. The programming exercises use scikit-learn and show you how to apply the package to each algorithm (regression, Naive Bayes, decision trees, SVM, etc.), along with practical considerations for solving real-world problems. I prefer this over Andrew Ng’s course, which you might have heard of, purely from an implementation point of view: the programming exercises there are in Octave / MATLAB, which is rarely used nowadays.

The course will help you get started with machine learning in Python. If you are still motivated after that, check out scikit-learn’s User Guide. It presents all the algorithms the package includes, along with code samples that go a long way toward helping you visualize them. Also, if you are looking to get into the field, visualizing your data is going to be an important skill, which is one more reason to spend time on it.

Important algorithms to understand (from the user guide above) would be:

(P = widely used in practice, VI = very important, VIP = both)

Supervised Learning

  • Ordinary Least Squares, Ridge regression, Lasso, Logistic regression (VI), Stochastic Gradient Descent (SGD) (VIP), Perceptron, SVM (VIP), Nearest neighbors (NN), Naive Bayes (VI to understand the math behind ML), Decision Trees (VIP), Ensemble methods such as Random Forests (VIP), AdaBoost and Boosting, Neural Networks (VIP) (this is where the spark in ML currently is).
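
To give a feel for how several of the supervised algorithms above look in scikit-learn, here is a minimal sketch. It assumes scikit-learn is installed, and the dataset (the built-in iris data), the 70/30 split and the hyperparameters are my own illustrative choices, not something taken from the course or the User Guide.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Small built-in dataset: 150 flowers, 4 numeric features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing (an arbitrary illustrative split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# A handful of the classifiers from the list, mostly with default settings.
models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Every estimator exposes the same fit / score interface.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on the test set
    print(f"{name}: {scores[name]:.2f}")
```

The thing to notice is that swapping one algorithm for another is a one-line change, which makes it easy to experiment while you are still learning what each one does.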

Unsupervised Learning

  • Gaussian Mixture Models (GMM), K-means clustering, Dimensionality Reduction — PCA (VIP), LDA, Neural Networks (VIP) (Yes, here too ;))
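
For the unsupervised side, a similar hedged sketch: K-means clustering and PCA on the same built-in iris data, with the labels ignored. It assumes scikit-learn is installed. The choices of 3 clusters and 2 principal components are assumptions I made for illustration; on real data you would have to pick them yourself.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load the features only; unsupervised methods do not use the labels.
X, _ = load_iris(return_X_y=True)

# Standardize so that every feature contributes on the same scale.
X_scaled = StandardScaler().fit_transform(X)

# Dimensionality reduction: project the 4 features onto 2 principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

# Clustering: group the samples into 3 clusters (an assumed number).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_scaled)

print("Reduced shape:", X_2d.shape)
print("Variance kept by 2 components:", pca.explained_variance_ratio_.sum())
print("Cluster ids found:", sorted(set(cluster_ids.tolist())))
```

The 2D output of PCA is also handy for plotting, since you can scatter-plot `X_2d` and color each point by its entry in `cluster_ids`.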

For Neural Networks specifically, there are a lot of resources out there, but Andrew Ng recently released a series of 5 courses as part of deeplearning.ai on Coursera. I have done those courses and would definitely recommend them as by far the most useful for anyone who wants to work in the field of AI.

Hope this helped :)


Originally published at www.quora.com.