How to learn Machine Learning?

Roman Kierzkowski
Machine Intelligence Report
4 min read · Mar 13, 2016

Some time ago I started a journey into one of the most exciting fields in Computer Science: Machine Learning. This is my subjective guide for anyone who would like to explore this topic but doesn’t know how to start.

Before You Go

It’s time to pack our bag:

  • math is a must: it will be hard to start if you don’t know at least a bit of algebra and calculus. You don’t have to be an expert, but you must know what the minimum of a function is and understand that math can be done on symbols.
  • programming: some programming experience is necessary. Machine Learning is not the place to take your first steps in programming. If you cannot code, take one of the many Programming 101 courses. Python is a good language to choose.
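To check whether the "minimum of a function" idea from the list above feels familiar, here is a minimal sketch in plain Python. The function f, its hand-derived derivative, the learning rate and the number of steps are all made up for illustration; they do not come from any particular course.

```python
def f(x):
    """A simple bowl-shaped function with its minimum at x = 3."""
    return (x - 3.0) ** 2


def f_prime(x):
    """Derivative of f, worked out by hand with basic calculus."""
    return 2.0 * (x - 3.0)


def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the slope until we settle near a minimum."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# Hypothetical starting point; any value converges for this function.
x_min = gradient_descent(f_prime, start=0.0)
print(f"minimum found near x = {x_min:.4f}, f(x) = {f(x_min):.6f}")
```

If you can follow why stepping against the derivative moves x toward 3, you probably have enough math to begin.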

First Steps

Your first steps should lead to the Stanford Machine Learning class on Coursera by Andrew Ng. This course is simply brilliant! Along the way, you will be given everything you need to know, including an algebra review. You will not only get an overview of supervised and unsupervised techniques, but you will also gain hands-on experience through coding assignments in Octave (or Matlab). Pull your weight and you will gain a lot of self-confidence. You will probably reach the peak of the Dunning-Kruger effect.

As pointed out by Piotr Migdal, this is the real Dunning-Kruger effect.

Try Yourself Out

Try yourself out in one of the ML competitions on Kaggle. You probably will not win, but you will get an idea of what you still need to learn.

Walking Shoes

It’s time to buy better shoes. There are plenty of options, but two stand out: Python and R.

These two have strong and opinionated believers. Among the people I have met, those with a math background prefer R and those with roots in CS get along better with Python. I would recommend starting with Python. It is a beautiful language and you will get to the exciting stuff faster. The most popular ML library for Python is scikit-learn. To make your work convenient, try out interactive environments: IPython is a better console for Python, and Jupyter is a programming notebook that also works with R. Data processing can be done with pandas.

With R, you get the caret package, RStudio, knitr and Shiny as counterparts.
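To give a feel for the Python route, here is a minimal sketch that loads a small dataset into a pandas DataFrame and fits a scikit-learn classifier. The choice of the bundled iris dataset, logistic regression, the 25% test split and the random seed are my own assumptions for illustration; it assumes reasonably recent versions of scikit-learn and pandas are installed.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small dataset bundled with scikit-learn as a pandas DataFrame.
iris = load_iris(as_frame=True)
df = iris.frame                      # four feature columns plus "target"
print(df.describe())                 # quick look at the data with pandas

X = df.drop(columns="target")
y = df["target"]

# Hold out a quarter of the rows to check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

The same few lines work nicely in a Jupyter notebook, where you can inspect df and rerun individual steps as you go.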

Deep Learning

Admit it. You still seek fame. You have heard about deep learning and you want to jump on the bandwagon. You are lucky! Google has prepared a great course on Deep Learning on Udacity. You will learn everything you need to know about neural networks, with hands-on experience using the TensorFlow library.

It Is All About Data

Is it getting to you that the whole Machine Learning fuss is all about data? It is time to learn R and become a Data Scientist with the Johns Hopkins specialization on Coursera. It consists of 9 classes that will teach you the data scientist’s toolkit, data exploration and visualization, statistical inference and regression. Everything is wrapped up with a capstone project.

It Is Just Statistics

Practical Machine Learning from the Data Science specialization probably opened your eyes to the fact that there is more to ML than Andrew Ng taught you. Dive deep into the field of Statistical Learning with the Stanford course by Rob Tibshirani and Trevor Hastie. These two and Jerome Friedman are the authors of the ML Bible, The Elements of Statistical Learning, which is available online for free.

Rob Tibshirani and Trevor Hastie. (Take a class, you will get it.)

You may feel that you need to supplement your knowledge of probability. There is a great class from MIT on edX, Introduction to Probability: The Science of Uncertainty. I would also recommend a great book: Introduction to Probability, co-written by course instructor John N. Tsitsiklis.

Big Data

It is often said that more data beats a better algorithm. But how do you handle huge datasets?

Graphics cards can do matrix calculations efficiently, and that’s what we need in ML. To speed up calculations you can use libraries like OpenCL or Theano, but usually you can use a framework like Keras that will do it for you.

You can also parallelize computations using the MapReduce paradigm. You can use Hadoop for that, but nowadays it is being overshadowed by Spark, which comes with its own ML library, MLlib. If you want to learn more, take two great classes from Berkeley via edX: Introduction to Big Data with Apache Spark and Scalable Machine Learning.
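The idea behind map and reduce can be sketched on a single machine in plain Python. This is only an illustration of the paradigm, not of the Hadoop or Spark APIs; the helper names map_phase and reduce_phase and the two text chunks are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_phase(chunk):
    """Map step: count the words in one chunk, independently of the others."""
    return Counter(chunk.lower().split())


def reduce_phase(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


# Pretend each string lives on a different machine in a cluster.
chunks = [
    "more data beats better algorithms",
    "better data beats more data",
]

partial_counts = map(map_phase, chunks)          # could run in parallel
total = reduce(reduce_phase, partial_counts, Counter())
print(total.most_common(3))
```

Because each map call touches only its own chunk, and merging counts gives the same result in any order, a framework is free to spread the work over many machines and combine the partial results afterwards.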

Search Your Own Path

It’s time for you to search your own path. For fresh news in the field, pay regular visits to ML Reddit and DataTau. It is also a good idea to join Facebook groups like R-Cran Fan Club! Sign up for different MOOC platforms. Interesting courses keep popping up.

Good luck with your journey!

Want to share your thoughts with me? Find me on Twitter.

Looking for a machine learning specialist? Contact me via DataCentric.
