Speed up Cosine Similarity computations in Python using Numba

Pranay Chandekar
Published in Analytics Vidhya
2 min read · Dec 23, 2019


Interested in Machine Learning topics or need some help with them?

Get in touch — https://linktr.ee/pranaychandekar

Please find the editable jupyter notebook here.

According to IEEE Spectrum, Python has been the top programming language for the past three years. It is also widely used to build Machine Learning applications. Because Python is an interpreted language, development is fast. The same property makes it slower at run time: the interpreter has to decode and execute each operation every time the program runs, instead of running machine code compiled ahead of time. This becomes a problem at scale.

This raises two questions: "Can we compile Python code, or at least parts of it, once, as compiled languages do?" and "Will that make it faster?"

The answer: yes, we can!

In this article, we will see, with the help of an experiment, how we can speed up our numerical computations in Python using Numba.

Solution — Numba

According to its website, Numba is an open-source JIT (just-in-time) compiler that translates a subset of Python and NumPy code into fast machine code. It is designed for use with NumPy arrays and functions, and it optimizes array-oriented and math-heavy Python code.

To verify this claim, I tried Numba on one of the most commonly used operations in Machine Learning, cosine similarity computation, to see the difference.

Experiment

In this experiment, I computed the cosine similarity between two 50-dimensional NumPy arrays, with and without Numba.
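For reference, the quantity being computed is the standard cosine similarity. The symbols u, v and n are notation chosen here for illustration (n = 50 in this experiment):

```latex
\cos\theta
  = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
  = \frac{\sum_{i=1}^{n} u_i v_i}
         {\sqrt{\sum_{i=1}^{n} u_i^{2}} \; \sqrt{\sum_{i=1}^{n} v_i^{2}}}
```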

The cosine similarity function in plain Python.

Cosine Similarity Function
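The original listing is not reproduced in this text, so the following is only a minimal sketch of what a loop-based cosine similarity function might look like. The function name, the variable names and the choice to return 0.0 for an all-zero vector are assumptions made for illustration.

```python
import numpy as np


def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity of two 1-D arrays, computed with an explicit loop."""
    if u.shape[0] != v.shape[0]:
        raise ValueError("vectors must have the same length")
    uv = 0.0  # running dot product u . v
    uu = 0.0  # running squared norm of u
    vv = 0.0  # running squared norm of v
    for i in range(u.shape[0]):
        uv += u[i] * v[i]
        uu += u[i] * u[i]
        vv += v[i] * v[i]
    # Assumption: treat the similarity as 0.0 when either vector is all zeros,
    # since the angle is undefined in that case.
    if uu == 0.0 or vv == 0.0:
        return 0.0
    return uv / np.sqrt(uu * vv)


# Two random 50-dimensional vectors, matching the size used in the experiment.
rng = np.random.default_rng(0)
u = rng.random(50)
v = rng.random(50)
print(cosine_similarity(u, v))
```

The explicit loop is deliberate: it is the kind of element-by-element Python code that runs slowly in the interpreter and that a JIT compiler can speed up.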

The same function with Numba.

Cosine Similarity Function with Numba Decorator

I ran both functions for several different numbers of computations to compare their computation times.

Results

Figure: Results

The difference is evident: using Numba made our computations multiple times faster.

Takeaways

In this write-up, we have only scratched the surface. There is a lot more we can do with Numba, but I will leave that exploration to you. You can find the complete experiment, as a Jupyter notebook, in the following repository.

Interested in such topics or need some help with them?

Get in touch — https://linktr.ee/pranaychandekar

Citations

  1. https://numba.pydata.org/
  2. https://spectrum.ieee.org/computing/software/the-top-programming-languages-2019
