The Blunt Guide to Mathematically Rigorous Machine Learning

Harsh Sikka
Aug 18, 2018 · 4 min read

I recently wrote a brief guide on the People liked it, and asked me to write one on how to master ML at a mathematically rigorous, conceptual level. That is the focus of this guide: no bullshit, no easy routes, just real, fundamental understanding. I’ll be going through the later part of the curriculum myself.

A quick question to ask yourself: Why do I want to learn ML? The following material can be very difficult at times, and staying disciplined is often a matter of keeping your core motivation in mind. For example, I’m trying to validate a new brain-inspired theoretical neural network architecture, and to be able to reason about it effectively, I need to have a deep intuition about current architectures and their underlying mathematics.

I won’t be going through the math portions again; you can check out my other article or this on the topic. My advice: learn enough Linear Algebra, Stats, Probability, and Multivariate Calculus to feel good about yourself, and learn everything else as you have to.

1.

Prioritize Chapters 1–4 and Chapters 7–8. These cover supervised learning, linear regression, classification, and model assessment and inference. It’s okay if you don’t understand it at first; absolutely nobody does. Keep reading it and learning whatever math you need to until you get it. If you want, knock the whole book out; you won’t regret it.
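To make two of those topics concrete, here is a minimal sketch in plain Python of least-squares linear regression with one predictor, plus k-fold cross-validation as a simple form of model assessment. The function names (`fit_simple_ols`, `k_fold_mse`) and the toy data are made up for illustration and do not come from the book.

```python
import random


def fit_simple_ols(xs, ys):
    """Closed-form least squares for the line y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def k_fold_mse(xs, ys, k=5, seed=0):
    """Estimate out-of-sample mean squared error with k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    total_sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_simple_ols([xs[i] for i in train], [ys[i] for i in train])
        # Score the fitted line only on points it never saw during fitting.
        total_sq_err += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return total_sq_err / len(xs)


# Toy data (made up): y = 2 + 3x plus Gaussian noise with standard deviation 0.5.
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 + 3 * x + rng.gauss(0, 0.5) for x in xs]

intercept, slope = fit_simple_ols(xs, ys)
cv_mse = k_fold_mse(xs, ys, k=5)
print(f"intercept={intercept:.2f} slope={slope:.2f} cv_mse={cv_mse:.3f}")
```

The cross-validated error should land near the noise variance (0.25 here), which is the kind of sanity check worth getting used to before trusting any model.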

If Elements is really just too hard, you can start with . The book sacrifices some mathematical explanation and focuses on a subset of the problems in Elements, but is a good ramping up point to understanding the material.

Both books focus on R, which is worth learning.

2.

Once you’ve finished Elements, you’re in a great position to take Stanford’s ML course, taught by Andrew Ng. You can think about this like the mathematically rigorous version of his popular Coursera course. Going into this course, make sure to refresh your Multivariate Calculus and Linear Algebra skills, as well as some probability. They provide some handy refresher guides on the site page.

Do all the exercises and problem sets, and try doing the programming assignments in both R and Python. You’ll thank me later.

You can again opt to go for a slightly easier route in , which is focused more on implementation and less on the underlying theory and math. I would really just do all the programming assignments from there as well. You don’t have to do them in Octave/MATLAB; you can do R and Python versions. There are plenty of repos to compare against on GitHub.

3.

At this point, you’re starting to get formidable. You have a fundamental mathematical understanding of many popular, historic techniques in Machine Learning, and can choose to dive into any vertical you want. Of course, most people want to go into Deep Learning because of its significance in industry.

Go through the . It will refresh you on a lot of math and also fundamentally explain much of modern Deep Learning well. You can start messing around with implementations by spinning up a Linux box and doing cool shit with CNNs, RNNs and regular old feedforward neural networks. Use TensorFlow and PyTorch, and start to get a sense of how awesome some of these libraries are for abstracting away a lot of the complexity you learned.
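To get a feel for what those libraries abstract away, here is a rough sketch in plain Python (deliberately not TensorFlow or PyTorch) of a tiny feedforward network with one tanh hidden layer. It computes gradients by hand with the chain rule and checks them against finite differences. The weights, input, and function names are arbitrary values chosen for illustration.

```python
import math


def forward(w1, w2, x):
    """One hidden layer of tanh units followed by a single linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    output = sum(w * h for w, h in zip(w2, hidden))
    return hidden, output


def loss(w1, w2, x, target):
    """Squared-error loss for one example."""
    _, output = forward(w1, w2, x)
    return 0.5 * (output - target) ** 2


def backprop(w1, w2, x, target):
    """Gradients of the loss with respect to w1 and w2 via the chain rule."""
    hidden, output = forward(w1, w2, x)
    d_output = output - target
    grad_w2 = [d_output * h for h in hidden]
    # The derivative of tanh(z) is 1 - tanh(z)^2.
    d_hidden = [d_output * w * (1 - h * h) for w, h in zip(w2, hidden)]
    grad_w1 = [[dh * xi for xi in x] for dh in d_hidden]
    return grad_w1, grad_w2


def numerical_grad_w1(w1, w2, x, target, eps=1e-6):
    """Central finite differences on w1, used only to check backprop."""
    grads = []
    for i, row in enumerate(w1):
        grad_row = []
        for j in range(len(row)):
            original = w1[i][j]
            w1[i][j] = original + eps
            plus = loss(w1, w2, x, target)
            w1[i][j] = original - eps
            minus = loss(w1, w2, x, target)
            w1[i][j] = original  # restore the weight before moving on
            grad_row.append((plus - minus) / (2 * eps))
        grads.append(grad_row)
    return grads


# Arbitrary illustrative weights and a single training example.
w1 = [[0.1, -0.2], [0.4, 0.3]]
w2 = [0.5, -0.6]
x, target = [1.0, 2.0], 0.25

analytic_w1, analytic_w2 = backprop(w1, w2, x, target)
numeric_w1 = numerical_grad_w1(w1, w2, x, target)
max_gap = max(
    abs(a - n)
    for row_a, row_n in zip(analytic_w1, numeric_w1)
    for a, n in zip(row_a, row_n)
)
print(f"largest gap between backprop and finite differences: {max_gap:.2e}")
```

If the gap is tiny, the hand-derived gradients agree with the numerical ones. Automatic differentiation in the big libraries does this bookkeeping for you, at scale.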

I’ve also heard the and co are worth it. They are not nearly as comprehensive as the textbook by Goodfellow et al., but seem to be a useful companion.

4. and

If you’ve made it this far, congratulations, you’re probably in an excellent place to make sense of the latest papers in the field. Just go onto arXiv and Google Scholar and look at both seminal papers and recent papers that are popular. Remember that ML is a fast-moving field and the literature changes, so keep checking back in every few months.

If you’re feeling particularly bold or find something cool, try implementing it yourself. The learning process will be invaluable.

5. Padding your resume and getting hired.

Excellent work. You’ve probably reached the point by now that you can get hired at most places and/or get into grad school. If you want to fill out your resume, you can continue to implement new architectures, or even do

If you want to do the latter, but feel that your actual implementation skills aren’t totally up to par, take . They focus on cohesively applying all the shit you’ve learned over the past few months using popular libraries and tooling.

There are a lot of AI residency programs popping up at OpenAI, Google, Facebook, Uber, and a few other places. You are probably a pretty good candidate, so give them a shot.

If you get this far, holy shit. Well done. The journey is never over, but you’re in an excellent place and you understand ML as well as many experts. I think.

Oh and those of you just starting, I’m right there with you. Race you to the end ;)


If you enjoyed it, please let me know by clapping or commenting! I’m working on some interesting stuff, including brain-inspired neural networks that have adaptive topology. I’ll be updating as I go along.

Technomancy

AI + Biology

Written by Harsh Sikka

Grad Student at Harvard and Georgia Tech. Artificial Intelligence, Theoretical Neuroscience, Synthetic Biology, and generally cool stuff.
