The Math Required for Machine Learning

Harsh Sikka
Aug 17, 2018 · 2 min read

For the past year, I’ve been working on implementing well-known model architectures and building web applications, so I have a fair amount of refreshing to do when coming back to theoretical machine learning. A lot of it has to do with understanding machine learning’s underlying mathematics rigorously, to be able to reason about the field and validate radically new architectures. To that end, I’ve put together a short syllabus that I’ll be personally going through to review some math.

Keep in mind there are a lot of excellent resources out there. I’ll no doubt be updating this post with a better guide as I work through the material over the next few weeks.

Resources to Study Math

A fundamental understanding of mathematics is necessary for reasoning about ML productively.

That being said, I’m of the stance that you can learn what you need as you go along, so I’d recommend getting a basic familiarity through the Mathematics for Machine Learning Specialization on Coursera. It’s pleasantly tough, and gets you where you need to go fast.

If you’re starting completely from scratch, the topics you should certainly review or cover, in any order, are as follows:

  1. Linear Algebra — Professor Strang’s textbook and MIT Open Courseware course are recommended for good reason. Khan Academy also has some great resources, and there is a helpful set of review notes from Stanford.
  2. Multivariate Calculus — Again, MIT Open Courseware has good courses, and so does Khan Academy.
  3. Probability — Stanford’s CS 229, a course I mention below, has an awesome probability review worth checking out.

Once you’ve finished the resources above, I’d say you’re in a great place to tackle the Andrew Ng Coursera Course or its more mature, mathematically rigorous older brother, CS 229.


Recently, I’ve been working on figuring out how to answer a nebulous question that formed in my mind as I work on my thesis: Are modular neural networks with adaptive topology capable of representing large, complex, hierarchical problem spaces effectively?

I’d be hard-pressed to call it a research question, since it’s such a broad topic, but my intuition keeps serving up the answer yes. I’ll be writing more about this in a later post, and explaining why I’m particularly excited about it.
