
Paper shows why you will struggle at Machine Learning

This is easily the most technically complex paper I’ve ever read.

To help me understand you, please fill out this survey (anonymous).

I’ve been going through Multiplying Matrices Without Multiplying, and it’s a paper I have spent a lot of time on. How could I not? The abstract claims, “Experiments using hundreds of matrices from diverse domains show that it often runs 100× faster than exact matrix products and 10× faster than current approximate methods. In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds.” If you understand machine learning, you can see that this has huge implications for the learning process.

This sentiment is not uncommon. If you feel this way, you are not alone.

At the same time, I came across the above tweet on my timeline. And I can definitely see where this is coming from. Meaningful ML is by its nature multi-disciplinary. While the code for an LSTM or a Random Forest stays the same, the context around the problem changes. Depending on what you’re working on, the way you get, prepare, clean, and evaluate your data changes. Thus you will end up needing to become proficient at multiple things. This process involves a lot of Googling and can be very frustrating and disheartening.

The paper is a rather extreme example of that. I double major in Math and Computer Science, and I selected my courses to get good at coding, and at AI/ML in particular. So I’m well suited to understanding the details. But even after a month, a lot of this paper is still very challenging.

Me trying to understand the paper.

In this article, I will use the paper as an example of why good Machine Learning is difficult. I will explain why that’s a good thing for you, and what you can do to benefit from this. If nothing else, I hope that by the end of this article you understand what it takes to get to a high level at ML.

Understanding the Implications of this paper

A quick word on why this paper is great. In machine learning, data is represented as multi-dimensional arrays (vectors and matrices). Multiplying matrices is at the core of a lot of what models do. It is also notoriously expensive to compute. For those interested, this article by Quanta is a pretty good explainer.
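To make the cost concrete, here is a minimal Python sketch of the textbook algorithm. It is only an illustration, and the function name matmul_with_count is my own, not something from the paper. Multiplying an N×D matrix by a D×M matrix this way takes N·D·M multiply-adds, which is the work the paper tries to avoid.

```python
def matmul_with_count(A, B):
    """Exact product of A (N x D) and B (D x M), counting multiply-adds."""
    n, d, m = len(A), len(B), len(B[0])
    out = [[0] * m for _ in range(n)]
    multiply_adds = 0
    for i in range(n):
        for j in range(m):
            for k in range(d):
                # One multiply and one add per innermost step.
                out[i][j] += A[i][k] * B[k][j]
                multiply_adds += 1
    return out, multiply_adds


product, ops = matmul_with_count([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)  # [[19, 22], [43, 50]]
print(ops)      # 8, i.e. N * D * M = 2 * 2 * 2
```

Even this tiny 2×2 case needs 8 multiply-adds, and the count grows with the product of all three dimensions.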

Don’t underestimate pre-processing.

This is where the paper gets insane. “In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds.” When might we see such cases? Imagine our model already has its weights and just needs to compute predictions from the input. The weights are a matrix we already know, and it gets multiplied with the input matrix. Given how often this happens, the savings will really add up.

This is one example of a great application of matrix multiplication.

Why this paper is a nightmare to understand.

So now that we have some idea of why this concept is important, let’s talk about why this paper is challenging. Simply put, it traverses a lot of technical fields. Here’s a depiction of the Product Quantization they use:

Not only does it use vectors, but it also relies on prototype learning, hashing, and aggregation. Following it requires very good coding and mathematical skills. Even their hashing is far from basic. The authors rely on hashing trees, which can be terrifying. Check out section 4.1 for more details. The complexity and wide-ranging nature of the paper were best articulated by the authors themselves: “our work draws on a number of different fields but does not fit cleanly into any of them”. Developing your understanding of the basics will help you at least understand the assumptions and experiment setups.
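To get a feel for how prototypes, encoding, and aggregation fit together, here is a rough Python sketch of the general product quantization idea. It is not the authors’ method. I am assuming the prototypes are picked by hand (the paper learns them), and I use a plain nearest-prototype search for the encoding step, which still multiplies (the paper’s hashing trees are what remove that cost). All function names are made up for this illustration.

```python
def split(vec, num_subspaces):
    """Cut a vector into equal-length contiguous chunks (subspaces)."""
    size = len(vec) // num_subspaces
    return [vec[i * size:(i + 1) * size] for i in range(num_subspaces)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def build_tables(B, prototypes):
    """Offline step, possible because B (D x M) is known ahead of time.
    tables[c][k][j] = dot(prototype k of subspace c, matching chunk of column j)."""
    num_subspaces = len(prototypes)
    columns = [list(col) for col in zip(*B)]
    col_chunks = [split(col, num_subspaces) for col in columns]
    return [
        [[dot(proto, col_chunks[j][c]) for j in range(len(columns))]
         for proto in prototypes[c]]
        for c in range(num_subspaces)
    ]


def encode(row, prototypes):
    """Replace each chunk of a row with the index of its nearest prototype.
    Assumption: brute-force search stands in for the paper's hashing trees."""
    codes = []
    for chunk, protos in zip(split(row, len(prototypes)), prototypes):
        dists = [sum((x - p) ** 2 for x, p in zip(chunk, proto)) for proto in protos]
        codes.append(dists.index(min(dists)))
    return codes


def approx_matmul(A, prototypes, tables):
    """Online step: aggregate precomputed partial dot products by lookup and add."""
    num_cols = len(tables[0][0])
    out = []
    for row in A:
        codes = encode(row, prototypes)
        out.append([sum(tables[c][k][j] for c, k in enumerate(codes))
                    for j in range(num_cols)])
    return out


# Hand-picked prototypes: 2 subspaces, 2 prototypes each, 2 dimensions per chunk.
prototypes = [[[0, 0], [1, 1]], [[2, 0], [0, 2]]]
B = [[1, 0], [0, 1], [1, 1], [2, 0]]          # the matrix known ahead of time
A = [[1, 1, 2, 0], [0.1, -0.1, 0, 2.2]]       # the incoming data

tables = build_tables(B, prototypes)
print(approx_matmul(A, prototypes, tables))   # [[3, 3], [4, 0]]
```

The first row of A is built exactly from prototypes, so its result matches the exact product. The second row is only close to its prototypes, so the result is an approximation (the exact answer is roughly [4.5, -0.1]). Once the tables exist, the aggregation step only looks up and adds numbers.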

For a detailed look at some of the assumptions in the paper, check out this video. I go over the assumptions and a concrete example of the matrix multiplication approximation. Make sure to pause the video and read the snippets I’ve taken from the paper. I found them particularly insightful.

Why this complexity is a Good Thing for you

Obviously not every Machine Learning/AI venture is as complex as this paper. However, real-life ML will be complex. Following is an exchange I had with someone who read and enjoyed my article, 5 Unsexy Truths About Working in Machine Learning.

The complexity of Machine Learning opens a lot of doors. It means that there are always new ways to try things, new knowledge to discover, and new protocols/ensembles to invent. It will allow you to specialize in the fields you’re most interested in. If you’re willing to put in the work and struggle, you will soon be able to develop your own value-adds. And that’s when it gets fun. How to become a Machine Learning Expert is an article to help you speed up the process. As long as you’re willing to find areas you’re interested in and dive into them, you will be able to get great results in your Machine Learning journey.

If you liked this article, check out my other content. I post regularly on Medium, YouTube, Twitter, and Substack (all linked below). I focus on Artificial Intelligence, Machine Learning, Technology, and Software Development. If you’re preparing for coding interviews, check out Coding Interviews Made Simple.




Reach out to me

If this article got you interested in reaching out to me, then this section is for you. You can reach out to me on any of the platforms, or check out any of my other content. If you’d like to discuss tutoring, message me on LinkedIn, IG, or Twitter. If you’d like to support my work, use my free Robinhood referral link. We both get a free stock, and there is no risk to you. So not using it is just losing free money.





Devansh- Machine Learning Made Simple
