How to Write Memory Efficient Loops in Python

Siavash Yasini
Published in Analytics Vidhya
5 min read · Jun 30, 2020

A visual guide to generators and three ways to implement them

Photo by Tine Ivanič on Unsplash

In Python, when you build a list of numbers, images, files, or any other objects that you want to iterate through, you’re essentially piling up memory as you put new items on the list: every time you call your_list.append(new_item), the item stays alive in memory, so the total footprint grows by roughly sys.getsizeof(new_item), plus a small pointer stored in the list itself. The problem is that if your list is too long or your items are too large (or a combination of the two), you might end up using too much memory or even run out of it.
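As a rough illustration of that growth, the sketch below adds up the size of a list container and of every item it references. The helper name total_size is hypothetical, and the figures it reports are CPython estimates rather than exact measurements.

```python
import sys

def total_size(items):
    """Approximate footprint: the list container plus each item it references."""
    return sys.getsizeof(items) + sum(sys.getsizeof(item) for item in items)

# Each item is a distinct 1000-byte object, standing in for something larger.
small = [bytes(1000) for _ in range(10)]
large = [bytes(1000) for _ in range(1000)]

print(f"10 items:   ~{total_size(small):,} bytes")
print(f"1000 items: ~{total_size(large):,} bytes")
```

The footprint scales with the number of items, because every appended object has to stay in memory for as long as the list holds it.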

However, in most situations when you iterate through a list, you only need access to the items one at a time, not all at once. An example that I run into a lot in my line of work is stacking images: let’s say I have about 500 2D images, each 1000 pixels on a side, and I want to stack all of them together and take an average (replace this with your favorite memory-consuming example, e.g. loading these images from file to train your awesome CNN model).
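To make the "one at a time" idea concrete, here is a minimal sketch that compares a list with a generator expression producing the same values. The variable names are illustrative, and sys.getsizeof counts only the container here, not the items inside it.

```python
import sys

n = 1_000_000
squares_list = [i * i for i in range(n)]  # every item exists at once
squares_gen = (i * i for i in range(n))   # items are produced on demand

list_bytes = sys.getsizeof(squares_list)  # container only: several MB of pointers
gen_bytes = sys.getsizeof(squares_gen)    # a small, fixed-size generator object

# sum() pulls one value at a time, so the generator never holds all of them.
total_from_gen = sum(squares_gen)
total_from_list = sum(squares_list)

print(list_bytes, gen_bytes, total_from_gen == total_from_list)
```

Both versions give the same sum, but only the list keeps a million items alive while it is being computed.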

One way to approach the stacking problem would be to build a list of all the images and then take the average! Very simple, but not memory efficient. Storing 500 images of 1000 x 1000 np.float64 pixels in a list takes about 4 GB of memory (500 x 1000 x 1000 pixels at 8 bytes each), which at first sight is not that big a deal. But what if you had 5000 images instead…
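The 4 GB figure can be checked with a line of arithmetic, and the average itself does not actually need the whole list. The sketch below is one possible approach under stated assumptions, not necessarily the implementation this article has in mind: make_images and streaming_mean are hypothetical names, and random arrays stand in for real images.

```python
import numpy as np

# Memory needed to hold every image in a list at once.
n_images, side = 500, 1000
bytes_per_image = side * side * np.dtype(np.float64).itemsize  # 8 bytes per pixel
list_bytes = n_images * bytes_per_image
print(f"{list_bytes / 1e9:.1f} GB")

def make_images(n, side, seed=0):
    """Hypothetical image source: yields one random image at a time."""
    rng = np.random.default_rng(seed)
    for _ in range(n):
        yield rng.random((side, side))

def streaming_mean(images):
    """Average equally shaped arrays while keeping a single accumulator."""
    total = None
    count = 0
    for image in images:
        if total is None:
            total = np.zeros_like(image, dtype=np.float64)
        total += image  # the previous image can be freed after this step
        count += 1
    if count == 0:
        raise ValueError("no images to average")
    return total / count

# Small demo sizes so the sketch runs quickly.
mean_image = streaming_mean(make_images(n=20, side=50))
print(mean_image.shape)
```

With this pattern the peak memory is on the order of a couple of images (the accumulator plus the image currently being added), no matter how many images are averaged.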

Siavash Yasini
Ungifted Amateur, Python Enthusiast, Latte Artist, Ex-Cosmologist, Sr Data Scientist @ Fanatics