Why You Need Math for Machine Learning
And how much you need to do well in Machine Learning
To help me understand you, fill out this survey (anonymous)
A while back, I saw the following exceptional take on Twitter. This is a very popular idea online, and I had meant to write about it sooner. But thanks to all the insanity happening in the ML research domain, I got side-tracked.
However, now that I have the time, I can finally cover this in depth. In this post, I will cover “Why you absolutely need Math for Machine Learning,” even if you never get into the more mathematical AI research work like I do.
Not only do you need Math for Machine Learning, but not learning at least some Math BEFORE you get into ML will actively hurt you. We will approach this from various perspectives and also cover how much Math you should have before jumping into ML.
Understanding the argument
To understand why this take is bad, let’s start by understanding what the claim is. The variants of this claim range from “You can start Machine Learning without Math” all the way to “Math is useless, we don’t need it for Machine Learning”. Both are wrong, but the former is much more forgivable than the latter. For the sake of steel-manning the argument, I’ll address the former claim in this article. Once we get through that, you will see why the other variants of this claim are also wrong.
The claim rests on the following supporting arguments. If you think I missed any, let me know and I can make a follow up-
- You can train, test, evaluate, deploy, and even cross-validate models without any Mathematical/Theoretical Foundations. Just copy Jax, Keras, TF, PyTorch, or any other framework. With AutoML solutions (including the ones I’ve built), this process becomes even easier.
- There are enough tutorials on the internet for you to copy and paste. Just search your issue and you will find it.
- Almost no Deep Learning engineer uses Fourier Series, Number Transformations, Calculus, or anything fancy regularly. AI researchers are the only ones that do. If you’re not one of them, you don’t need to worry.
- You can always just start first and pick up the Math later. You don’t need to be an expert when you start off. You can learn Math as you become experienced. This way you will also know what’s more important.
There is some merit to these ideas. Many people struggle with Math, so it makes sense to avoid it if possible. Especially if you can keep making that sweet ML money and building your career.
I will now address these points to show you why thinking along these lines will harm your career and cause you a lot of pain.
Model Training + Deployment doesn’t need Math
There are a lot of exceptional frameworks online. They have made the implementation of ML models much easier. We no longer need to worry about annoying little details like backpropagation, activation functions, and batch sizes. Especially if we copy the tutorials and blogs already mentioned. Just throw everything at the wall and see what sticks. Unfortunately, a lot of people I have worked with have thought this is a valid approach.
The reality, however, is not this pretty. Most of your Machine Learning will be working on your model. You will first need to evaluate your datasets and figure out the distributions to see how your features are related to each other. Before applying Machine Learning, you will first need to evaluate if Machine Learning is even valid for your problem.
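As a rough illustration of what "seeing how your features are related" can mean in practice, here is a minimal sketch that computes the Pearson correlation between one feature and a few others. It uses only the Python standard library. The feature names (`height_cm`, `shoe_size`, and so on) and all the numbers are made up for illustration; they do not come from any real dataset.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy dataset (assumed values, for illustration only).
rng = random.Random(0)
height_cm = [rng.gauss(170, 10) for _ in range(1000)]
height_in = [h / 2.54 for h in height_cm]                     # same information, different unit
shoe_size = [0.25 * h + rng.gauss(0, 2) for h in height_cm]   # related to height, but noisy
noise = [rng.gauss(0, 1) for _ in range(1000)]                # unrelated to height

features = {"height_in": height_in, "shoe_size": shoe_size, "noise": noise}
correlations = {name: pearson(height_cm, values) for name, values in features.items()}

for name, r in correlations.items():
    print(f"corr(height_cm, {name}) = {r:+.3f}")
```

A correlation near 1 (like the duplicated height column) usually means one of the two features is redundant. A correlation near 0 only rules out a linear relationship, not every relationship. Knowing that difference is exactly the kind of thing the Math gives you.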
Does your data violate the IID (independent and identically distributed) assumption? If you’ve never studied this concept, you won’t even know to ask this question. You will go ahead and apply ML because all the blogs, tutorials, and papers implicitly make this assumption, so you’ve never seen it questioned. I once interviewed someone very highly ranked on Kaggle. They were very good with clean datasets and tasks that had a clear path forward. But because they didn’t know much about Math (and the assumptions behind different frameworks), they couldn’t answer this basic question-
Why is it that developers can create confidence intervals with ARIMA as we predict 100 time-steps into the future, but not with more ‘advanced’ ideas like LSTMs?
The answer to this question rests on a fundamental difference between ARIMA and Neural Networks. If you want to take a stab at this problem, go ahead. I’d love to know your answer (hint: it’s not lazy developers).
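Coming back to the IID question for a moment: one rough way to probe the "independent" half of it is to check whether each sample tracks the one before it. The sketch below compares the lag-1 autocorrelation of independent draws against a random walk. Both series are synthetic, and `lag1_autocorrelation` is a helper written for this example, not a library function. This is only one simple check. It catches linear dependence between neighbouring samples and nothing more.

```python
import random


def lag1_autocorrelation(series):
    """Sample autocorrelation at lag 1: how strongly each value tracks the previous one."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((x - mean) ** 2 for x in series)
    return num / den


rng = random.Random(42)
n = 2000

# Independent draws: each sample ignores the ones before it.
iid_samples = [rng.gauss(0, 1) for _ in range(n)]

# Random walk: each value is the previous value plus noise, so samples are dependent.
random_walk = [0.0]
for _ in range(n - 1):
    random_walk.append(random_walk[-1] + rng.gauss(0, 1))

iid_r1 = lag1_autocorrelation(iid_samples)
walk_r1 = lag1_autocorrelation(random_walk)

print(f"lag-1 autocorrelation, IID draws:   {iid_r1:+.3f}")
print(f"lag-1 autocorrelation, random walk: {walk_r1:+.3f}")
```

The independent draws should land near 0 and the random walk near 1. If your data looks like the second case and you shuffle it into random train/test splits anyway, your test score is likely to look better than it should.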
The truth is that in a field like ML, you will never be an expert. New results will come along that leave your ideas outdated. It happened to Newtonian Physics. It happened to Euclidean Geometry. ML Papers are always changing the status quo and challenging what we took as truth (such as this paper showing us that many ML tests done on benchmarks had a large source of randomness). To make sure you can keep learning, you have to know the foundations. That requires Math.
Okay but what about Learning on the job? Surely you can learn Machine Learning as you get started? Take personal projects and learn through them. Surely that is an approach that allows you to get into ML first without the Math. Math can be learned as needed later.
Learning on the Job
Learning on the job is a phenomenal thing. I’ve done it. I even recommend it to others. However, to learn on the job, you have to meet the basic qualifications for the job.
Let’s assume you pick up a nice clean beginner dataset from Kaggle. No need to worry about feature correlations, data dependencies, and other pesky deets. Just a nice and clean analysis. Sklearn go brrrr!!!
In such a case, what have you actually learned? What tangible accomplishments have you gotten? What learning goals have you achieved? Running a few lines of code isn’t really meaningful. How long will you have to do this, before you actually do something that would be marginally useful? I’m not doing this to belittle anyone. I’m simply pointing out that this approach is inefficient.
If you really want to dive in and learn on the job, it is much better to get into a project that has a lot of possible avenues for change and try to test things one at a time. This will help you understand a lot more. Much quicker.
However, as shown earlier, messier projects will come with many more moving parts. Even identifying those moving parts will require a basic knowledge of Math. You can burn time running trivial code till you understand the little things enough to move on. Or you can learn some foundational ideas, and go for bigger things. The choice is yours.
Now to address the last big argument. “Nobody really uses Math day-to-day in ML (except for a few researchers). Therefore you can start with ML without knowing Math.” Let’s now cover why this is untrue.
Math is everywhere
If you work as an ML Engineer (not a researcher), you will largely have the following responsibilities-
- Evaluating + Cleaning Datasets, Coming up with Cleaning Policies, Feature Engineering, etc- Most of this is Math. Creating new features from existing ones requires an analysis of the statistical distributions of the features. Evaluating the datasets for purity and outliers also requires a baseline understanding of Math. Knowing what metrics to evaluate your performance with (very important and always overlooked) will need you to understand the metrics. This also needs (cue the drums)…Math.
- Data Pipeline Maintenance, Checking for leaks, etc- Many smaller organizations will have you handle this part as well. I’ve personally never worked in this area but based on my conversations with people that have worked in this domain, this area requires a lot more software engineering than Machine Learning. You can get away with not knowing much math here. However, to make any transitions into ML, you will need Math.
- Model Testing, Creating training policies etc- This is what people typically think of when they think of Machine Learning. Contrary to what is peddled online, the creation and training of models is the easy part. You need to be able to read the training logs and outputs (which you will have to create FYI) to identify different protocols that might work. Often, you will have to evaluate performance on custom performance metrics (based on what the client/employer values). Metrics that you will need to create, modify and understand. How do you plan to do that?
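To make the point about metrics concrete, here is a minimal sketch comparing plain accuracy against a custom, cost-weighted metric on an imbalanced binary problem. Everything in it is an assumption made for illustration: the 1% positive rate, the two hypothetical models, and the cost values in `expected_cost`. A real client would hand you different numbers.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def recall(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def expected_cost(y_true, y_pred, fp_cost=1.0, fn_cost=50.0):
    """Hypothetical business metric: average cost per example.

    The default costs are assumed values: a missed positive is taken to be
    50 times as expensive as a false alarm.
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (fp * fp_cost + fn * fn_cost) / len(y_true)


# 1000 examples, only 10 of them positive.
y_true = [1] * 10 + [0] * 990

# Model A: never predicts the positive class.
always_negative = [0] * 1000

# Model B: catches 8 of the 10 positives, at the price of 40 false alarms.
cautious = [1] * 8 + [0] * 2 + [1] * 40 + [0] * 950

for name, y_pred in [("always_negative", always_negative), ("cautious", cautious)]:
    print(
        f"{name:16s} accuracy={accuracy(y_true, y_pred):.3f} "
        f"recall={recall(y_true, y_pred):.2f} "
        f"cost={expected_cost(y_true, y_pred):.3f}"
    )
```

On accuracy alone, the model that never finds a single positive wins (0.990 vs 0.958). Under the assumed costs, it loses badly (0.500 vs 0.140 per example). Which metric is "right" depends on what the client values, and you can only reason about that if you understand what each metric is actually measuring.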
Look at it from an employer’s perspective. If all you know is how to copy-paste and take other people’s models, why would they hire you? Sure you can try to transition from Data Engineering into ML. Many people do that. But this process is slow. And you will have to compete against people with actual ML experience. You’re just making life harder for yourself. Tech Interviews are hard enough as is.
It should be clear that learning Math before diving into ML will save you a lot more time in the long run. So what Math should you learn before Machine Learning? And how much? Let’s cover that now.
Math Pre-Requisites for Machine Learning
To get right into it, learning these concepts will help you a ton-
- Pre-Calculus + Calculus- You want to get through Integration and Taylor Series at least. This will allow you to understand the Probability stuff, backpropagation, and other foundational ideas.
- Probs and Stats- Get through covariance, know the different distributions, and get a baseline understanding of Bayesian thinking. Beat the Central Limit Theorem into your soul.
- Linear Algebra- Understand Matrix Multiplication and change of basis. That’s enough as a base.
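To show how the Calculus item connects to backpropagation, here is a minimal sketch of a gradient check on a single sigmoid neuron with a squared-error loss. The gradient is computed once by hand with the chain rule (which is all backpropagation is, applied layer by layer) and once numerically with central finite differences. The setup and the values of `w`, `b`, `x`, and `y` are arbitrary choices for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, y):
    """Squared error of a single sigmoid neuron: L = (sigmoid(w*x + b) - y)^2."""
    return (sigmoid(w * x + b) - y) ** 2


def analytic_gradient(w, b, x, y):
    """Chain rule: dL/dw = dL/da * da/dz * dz/dw, and likewise for b."""
    z = w * x + b
    a = sigmoid(z)
    dL_da = 2.0 * (a - y)      # derivative of (a - y)^2 with respect to a
    da_dz = a * (1.0 - a)      # derivative of the sigmoid
    return dL_da * da_dz * x, dL_da * da_dz * 1.0


def numeric_gradient(w, b, x, y, eps=1e-6):
    """Central finite differences: (f(p + eps) - f(p - eps)) / (2 * eps)."""
    dw = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
    db = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
    return dw, db


# Arbitrary example values (assumed for illustration).
w, b, x, y = 0.5, -0.2, 1.5, 1.0

grad_analytic = analytic_gradient(w, b, x, y)
grad_numeric = numeric_gradient(w, b, x, y)

print("analytic (dL/dw, dL/db):", grad_analytic)
print("numeric  (dL/dw, dL/db):", grad_numeric)
```

The two results should agree to many decimal places. The Taylor Series is what tells you why: expanding the loss around a point shows that the central difference is off from the true derivative only by a term proportional to eps squared. A check like this is a handy way to catch mistakes when you write a custom loss or layer.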
For a more thorough, step-by-step guide to learning Machine Learning, check out my article How to learn Machine Learning in 2022. That will share more details about these topics (and more) and point to the free resources you can use to learn these topics.
To end, I would suggest watching the following video by 3Blue1Brown. It doesn’t have much to do with Machine Learning. However, it is a perfect example of what makes Math so beautiful (and hard). I won’t spoil the video for you. Don’t miss out on it.
I’m going to end the article here. Let me know your thoughts on the matter. What is your relationship with Math and Machine Learning? Do you struggle to self-study higher level Math ideas? If so, make sure you follow me here because I will cover my way of teaching myself Math very soon.
Feel free to reach out if you have any interesting jobs/projects/ideas for me as well. Always happy to hear you out.
Reach out to me
Use the links below to check out my other content, learn more about tutoring, or just to say hi. Also, check out the free Robinhood referral link. We both get a free stock (you don’t have to put any money), and there is no risk to you. So not using it is just losing free money.
Check out my other articles on Medium. : https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819
If you’re preparing for coding/technical interviews: https://codinginterviewsmadesimple.substack.com/
Get a free stock on Robinhood: https://join.robinhood.com/fnud75