Advantages of using NumPy over Python Lists

Features and performance gains of using NumPy for numerical operations

In this article, I will show a few neat tricks that come with NumPy and are much faster than vanilla Python code.

Memory usage

The most important gain is the memory usage. This comes in handy when we implement complex algorithms and in research work.

import numpy as np

array = list(range(10**7))
np_array = np.array(array)

I found a get_size helper on a blog, and I will be using it to compute the size of the objects in this article.
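The helper itself is not reproduced in this article. As a stand-in, here is a minimal sketch of what such a recursive size helper might look like. It is an assumption, not the blog's code, so the figures it reports may differ from the ones quoted below.

```python
import sys

def get_size(obj, seen=None):
    """Recursively estimate the memory footprint of obj in bytes (sketch)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Count every object only once, even if it is referenced many times.
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(get_size(k, seen) + get_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(get_size(item, seen) for item in obj)
    # Any other object, including a NumPy array, is measured by sys.getsizeof alone.
    return size

data = [1000, 2000, 3000]
print(get_size(data))
```

For a list, the result is the size of the list object itself plus the size of every element it refers to.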

get_size(array) ====> 370000108 bytes ~ 352.85MB
get_size(np_array) => 80000160 bytes ~ 76.29MB

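This is because a NumPy array stores its values as raw, fixed-size numbers (8 bytes per 64-bit integer) in one contiguous, fixed-length buffer. A vanilla Python list is extensible and holds only pointers, each referring to a separate Python int object that takes about 28 bytes on 64-bit CPython.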
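To see where the difference comes from, a small sketch can compare the per-element cost directly. The variable names here are illustrative, and the exact size of a Python int depends on the interpreter, so no output is quoted for it.

```python
import sys
import numpy as np

values = list(range(1000))
np_values = np.array(values, dtype=np.int64)

# Each NumPy element occupies exactly itemsize bytes in one buffer.
print(np_values.itemsize)   # 8
print(np_values.nbytes)     # 8000, i.e. itemsize * number of elements

# Each list slot is a pointer to a full Python int object.
print(sys.getsizeof(values[-1]))
```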

Speed

Speed is, in fact, a very important property in data structures. Why does it take much less time to use NumPy operations over vanilla python? Let’s have a look at a few examples.

Matrix Multiplication

In this example, we will look at a scenario where we multiply two square matrices.

from time import time
import numpy as np

def matmul(A, B):
    # Naive triple-loop product of two square N x N matrices.
    N = len(A)
    product = [[0 for x in range(N)] for y in range(N)]
    for i in range(N):
        for j in range(N):
            for k in range(N):
                product[i][j] += A[i][k] * B[k][j]
    return product

matrix1 = np.random.rand(1000, 1000)
matrix2 = np.random.rand(1000, 1000)

# The loop performs N**3 multiply-adds, so reduce the size for a quick run.
t = time()
prod = matmul(matrix1, matrix2)
print("Normal", time() - t)

t = time()
np_prod = np.matmul(matrix1, matrix2)
print("Numpy", time() - t)

The observed times are as follows:

Normal 7.604596138000488
Numpy 0.0007512569427490234

We can see that the NumPy implementation is roughly 10,000 times faster. Why? NumPy hands the multiplication to compiled routines that use optimizations such as transposing and chunked (blocked) multiplication. Furthermore, the operations are vectorized, so the loops run in compiled code rather than in the Python interpreter. NumPy uses a BLAS (Basic Linear Algebra Subprograms) library as its backend. Hence, it is important to install a NumPy build whose BLAS binaries suit your hardware architecture.
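Before comparing speed, it is worth confirming that both approaches compute the same thing. The following is a minimal sketch on small matrices, assuming a naive triple-loop product like the one above. The sizes and the seed are arbitrary choices.

```python
import numpy as np

def matmul(A, B):
    # Naive product of an (n x m) and an (m x p) matrix.
    n, m, p = len(A), len(B), len(B[0])
    product = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                product[i][j] += A[i][k] * B[k][j]
    return product

rng = np.random.default_rng(0)
A = rng.random((20, 20))
B = rng.random((20, 20))

# Floating-point sums may differ in the last bits, so compare with a tolerance.
agree = np.allclose(matmul(A, B), A @ B)
print(agree)
```

To check which BLAS library your NumPy build is linked against, you can call np.show_config(), which prints the build information.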

More Vectorized Operations

Vectorized operations apply an operation, such as a dot product, a transpose or another matrix operation, to the entire array at once instead of looping over the elements in Python. Let’s have a look at the following example, in which we compute the element-wise product.

vec_1 = np.random.rand(5000000)
vec_2 = np.random.rand(5000000)

# Element-wise product with a Python loop
t = time()
product = [float(x*y) for x, y in zip(vec_1, vec_2)]
print("Normal", time() - t)

# Element-wise product with NumPy
t = time()
np_product = vec_1 * vec_2
print("Numpy", time() - t)

The timings for each operation are:

Normal 2.0582966804504395
Numpy 0.02198004722595215

We can see that the NumPy implementation is more than 90 times faster.
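Note that the element-wise product is not the same as the dot product. A small sketch with illustrative values shows how the two relate: the dot product is the sum of the element-wise products.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

elementwise = a * b      # array([ 4., 10., 18.])
dot = np.dot(a, b)       # 32.0, the sum of the element-wise products

print(elementwise, dot)
```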

Broadcast Operations

NumPy can also apply an operation between arrays of different shapes, for example between a scalar and a whole array. These are called broadcast operations. They are fast because the work runs in compiled loops that can use vectorized CPU instructions (such as Intel AVX) where the hardware and the NumPy build support them.

vec = np.random.rand(5000000)

t = time()
mul = [float(x) * 5 for x in vec]
print("Normal", time() - t)

t = time()
np_mul = 5 * vec
print("Numpy", time() - t)

Let’s see how the running times look:

Normal 1.3156049251556396
Numpy 0.01950979232788086

Almost 70 times faster!
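Broadcasting is not limited to scalars. As a minimal sketch with made-up values, a row or a column can be combined with a whole matrix, and NumPy stretches the smaller operand across the missing dimension.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,)
column = np.array([[100], [200]])     # shape (2, 1)

scaled = 5 * matrix          # scalar applied to every element
plus_row = matrix + row      # row added to each row of the matrix
plus_col = matrix + column   # column added to each column of the matrix

print(scaled)     # [[ 5 10 15] [20 25 30]]
print(plus_row)   # [[11 22 33] [14 25 36]]
print(plus_col)   # [[101 102 103] [204 205 206]]
```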

Filtering

Filtering means picking only the items of an array that satisfy a condition. NumPy builds this into its indexing through boolean masks. Let me show you a simple practical example.

X = np.array(DATA)
Y = np.array(LABELS)
Y_red = Y[Y=='red'] # obtain all Y values with RED
X_red = X[Y=='red'] # feed Y=='red' indices and filter X
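The snippet above leaves DATA and LABELS undefined. A runnable sketch with small, made-up values for both (they are placeholders, not real data) shows what the boolean mask looks like.

```python
import numpy as np

# Hypothetical sample data, only for illustration.
DATA = [0.5, 1.5, 2.5, 3.5]
LABELS = ["red", "blue", "red", "green"]

X = np.array(DATA)
Y = np.array(LABELS)

mask = Y == "red"    # array([ True, False,  True, False])
Y_red = Y[mask]      # labels that are 'red'
X_red = X[mask]      # data points at the same positions

print(mask)
print(X_red)         # [0.5 2.5]
```

Storing the mask in a variable also avoids evaluating the condition twice.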

Let’s compare this against the vanilla python implementation.

X = np.random.rand(5000000)
Y = np.int64(10 * np.random.rand(5000000))
t = time()
Y_even = [int(y) for y in Y if y%2==0]
X_even = [float(X[i]) for i, y in enumerate(Y) if y%2==0]
print("Normal", time() - t)
t = time()
np_Y_even = Y[Y%2==0]
np_X_even = X[Y%2==0]
print("Numpy", time() - t)

The running times are as follows:

Normal 6.341982841491699
Numpy 0.2538008689880371

This is a pretty handy trick when you want to separate data based on some condition or the label. It is very useful in data analytics and machine learning.

Finally, let’s have a look at np.where, which lets you transform a NumPy array based on a condition.

X = np.int64(10 * np.random.rand(5000000))
X_even_or_zeros = np.where(X%2==0, 1, 0)

This returns an array of the same shape in which every position holding an even value is set to one and every other position is set to zero.
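A small sketch with illustrative values makes the behaviour concrete. It also shows two common variants: keeping the original value where the condition holds, and calling np.where with the condition alone to get the matching indices.

```python
import numpy as np

X = np.array([3, 4, 7, 10, 11])

flags = np.where(X % 2 == 0, 1, 0)     # 1 where the value is even, else 0
kept = np.where(X % 2 == 0, X, -1)     # keep even values, replace others with -1
indices = np.where(X % 2 == 0)[0]      # positions of the even values

print(flags)     # [0 1 0 1 0]
print(kept)      # [-1  4 -1 10 -1]
print(indices)   # [1 3]
```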

These are a few vital operations and I hope the read was worth the time. I always use NumPy with huge numeric datasets and find the performance very satisfying. NumPy has really helped the research community to stick with Python without dropping down to C/C++ to gain numeric computation speed. Room for improvement still exists!

Cheers!

The Startup

Anuradha Wickramarachchi

Written by

Blogger | Traveler | Programmer PhD Scholar
