Hunting for Memory Leaks in Python applications

Wai Chee Yau
Feb 13 · 4 min read

We use Python a fair bit at Zendesk for building machine learning (ML) products. Among the common performance issues we have encountered with machine learning applications are memory leaks and spikes. The Python code is usually executed within containers via distributed processing frameworks such as Hadoop, Spark and AWS Batch. Each container is allocated a fixed amount of memory. Once the code execution exceeds the specified memory limit, the container is terminated with an out-of-memory error.

A quick fix is to increase the memory allocation. However, this can waste resources and affect the stability of the products due to unpredictable memory spikes. The causes of memory leaks can include:

  • lingering large objects which are not released
  • reference cycles within the code
  • underlying libraries/C extensions leaking memory
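To make the second cause concrete, here is a minimal sketch of a reference cycle. The `Node` class and the `cycle_survives_del` helper are hypothetical names used only for illustration, and the behaviour described assumes CPython, where reference counting alone cannot free objects that point at each other.

```python
import gc
import weakref


class Node:
    """Hypothetical tree node that holds a reference to its parent."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self          # child -> parent
        self.children.append(child)  # parent -> child: together, a cycle


def cycle_survives_del():
    """Return (alive_before_gc, alive_after_gc) for a cyclic pair of nodes."""
    gc_was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running mid-demo
    try:
        root = Node("root")
        root.add_child(Node("leaf"))
        probe = weakref.ref(root)  # observes root without keeping it alive
        del root  # refcount never reaches zero: the leaf still points at it
        alive_before = probe() is not None
        gc.collect()  # the cycle collector finds and frees the pair
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if gc_was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(cycle_survives_del())
```

Until the cycle collector runs, the pair stays in memory even though nothing in the program can reach it, which is why cycles involving large objects can look like a leak between collections.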

A useful exercise is to profile the memory usage of the applications to gain a better understanding of the space efficiency of the code and the underlying packages used. This post covers:

  • profiling memory usage of the application across time
  • how to inspect memory usage at a specific part of the program
  • tips for debugging memory issues

Profiling Memory Across Time

You can look at how memory usage varies over time during the execution of the Python code using the memory_profiler package.

# install the required packages
pip install memory_profiler
pip install matplotlib
# run the profiler to record the memory usage
# samples every 0.1s by default
mprof run --include-children python fantastic_model_building_code.py
# plot the recorded memory usage
mprof plot --output memory-profile.png
A. Memory profile as a function of time

The --include-children option includes the memory usage of any child processes spawned by the parent process. Graph A shows an iterative model training process in which memory increases in cycles as batches of training data are processed. The objects are released once garbage collection kicks in.

If the memory usage is constantly growing, there is a potential issue of memory leaks. Here’s a dummy sample script to illustrate this.

B. Memory footprints increasing across time

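A debugger breakpoint can be triggered once memory usage exceeds a certain threshold using memory_profiler's --pdb-mmem option, which is handy for troubleshooting. For example, python -m memory_profiler --pdb-mmem=100 script.py steps into pdb once usage passes 100 MB.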

Memory Dump at a Point in Time

It is important to understand the expected number of large objects in the program and whether they should be duplicated and/or transformed into different formats.

To further analyse the objects in memory, a heap dump can be created at specific lines of the program with muppy, which is part of the Pympler package.

# install muppy (part of the Pympler package)
pip install pympler
# Add to leaky code within python_script_being_profiled.py
import pandas as pd
from pympler import muppy, summary
# Collect the objects currently alive in the Python heap
all_objects = muppy.get_objects()
sum1 = summary.summarize(all_objects)
# Prints out a summary of the large objects
summary.print_(sum1)
# Get references to certain types of objects such as dataframe
dataframes = [ao for ao in all_objects if isinstance(ao, pd.DataFrame)]
for d in dataframes:
    print(d.columns.values)
    print(len(d))
Example of summary of memory heap dump

Another useful memory profiling library is objgraph, which can generate object graphs to inspect the lineage of objects.
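When installing an extra package is not an option, the standard library's gc module offers a rough analogue of both ideas: counting live objects by type and finding what still refers to an object. The sketch below is not the muppy or objgraph API; `count_objects_by_type`, `referrers_of_type` and `Payload` are hypothetical names, and only objects tracked by the garbage collector are visible to it.

```python
import gc
from collections import Counter


def count_objects_by_type(limit=None):
    """Count gc-tracked objects grouped by type name, most common first."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(limit)


def referrers_of_type(target, kind):
    """Return gc-tracked objects of type `kind` that directly refer to `target`."""
    return [ref for ref in gc.get_referrers(target) if isinstance(ref, kind)]


class Payload:
    """Hypothetical stand-in for a large object such as a DataFrame."""


if __name__ == "__main__":
    payload = Payload()
    registry = [payload]  # a forgotten container keeping the payload alive
    print(count_objects_by_type(limit=5))
    holders = referrers_of_type(payload, list)
    print(any(holder is registry for holder in holders))
```

Following referrers one step at a time like this is a manual version of the back-reference graphs that objgraph draws, and it is often enough to find the container that is keeping a large object alive.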

Useful Pointers

A useful approach is to create a small “test case” which runs only the leaking code in question. Consider using a randomly sampled subset of the data if the complete input takes a long time to process.

Python does not necessarily release memory back to the operating system immediately. To ensure memory is released after a piece of code has executed, that code needs to run in a separate process, since the operating system reclaims all of a process's memory when it exits. This page provides more details on Python garbage collection.
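One way to isolate a memory-hungry step, sketched here with the standard library's subprocess module, is to run it in a fresh interpreter and pass back only the small result. The helper `run_in_fresh_interpreter` and the `HEAVY_STEP` snippet are hypothetical and only illustrate the pattern.

```python
import subprocess
import sys
import textwrap


def run_in_fresh_interpreter(source, timeout=60):
    """Run `source` in a child Python process and return its standard output.

    Whatever the child allocates is reclaimed by the operating system
    when the child exits, independently of the parent process.
    """
    completed = subprocess.run(
        [sys.executable, "-c", textwrap.dedent(source)],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the child fails
    )
    return completed.stdout.strip()


# Hypothetical memory-hungry step: only the printed result reaches the parent.
HEAVY_STEP = """
    data = [float(i) for i in range(1_000_000)]
    print(sum(data) / len(data))
"""

if __name__ == "__main__":
    print(run_in_fresh_interpreter(HEAVY_STEP))
```

When the work is a function rather than a standalone snippet, multiprocessing or concurrent.futures.ProcessPoolExecutor can serve the same purpose, at the cost of pickling the arguments and results.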

If a breakpoint debugger such as pdb is used, any objects created and referenced manually from the debugger will remain in the memory profile. This can create a false sense of memory leaks where objects are not released in a timely manner.

Some Python libraries could potentially have memory leaks. For example, pandas has quite a few known memory leak issues.

Happy hunting!

Zendesk Engineering

Engineering @ Zendesk

Thanks to Ryan Seddon and Dana Ma

Wai Chee Yau

Written by

Software engineer at Zendesk. Enjoy data wrangling, continuous learning of machine learning, exploring street food and travelling.
