We use Python a fair bit at Zendesk for building machine learning (ML) products. Among the common performance issues we encounter with machine learning applications are memory leaks and spikes. The Python code is usually executed within containers via distributed processing frameworks such as Hadoop, Spark and AWS Batch. Each container is allocated a fixed amount of memory. If the code execution exceeds the specified memory limit, the container is terminated with an out-of-memory error.
A quick fix is to increase the memory allocation. However, this can waste resources and affect the stability of the products due to unpredictable memory spikes. Common causes of memory leaks include:
- lingering large objects which are not released
- reference cycles within the code
- underlying libraries/C extensions leaking memory
A useful exercise is to profile the memory usage of the applications to gain a better understanding of the space efficiency of the code and the underlying packages used. This post covers:
- profiling memory usage of the application across time
- how to inspect memory usage at a specific point in the program
- tips for debugging memory issues
Profiling Memory Across Time
You can look at how memory usage varies over time during the execution of the Python code using the memory_profiler package.
# install the required packages
pip install memory_profiler
pip install matplotlib

# run the profiler to record the memory usage
# samples every 0.1s by default
mprof run --include-children python fantastic_model_building_code.py

# plot the recorded memory usage
mprof plot --output memory-profile.png
The option include-children will include the memory usage of any child processes spawned by the parent process. Graph A shows an iterative model training process in which memory increases in cycles as batches of training data are processed. The objects are released once garbage collection kicks in.
If memory usage is constantly growing, there is a potential memory leak. Here’s a dummy sample script to illustrate this.
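The original script isn’t reproduced here, but a minimal stand-in (names are illustrative) behaves the same way: a module-level cache that is never cleared keeps every batch alive, so the memory profile climbs steadily across iterations.

```python
leaky_cache = []  # module-level reference that is never released

def process_batch(batch_id):
    batch = [batch_id] * 100_000   # stand-in for a batch of training data
    leaky_cache.append(batch)      # lingering reference: the batch can never be collected
    return sum(batch)

if __name__ == "__main__":
    for i in range(50):
        process_batch(i)           # memory grows by one batch per iteration
```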
A debugger breakpoint can be set once memory usage exceeds a certain threshold using the option --pdb-mmem (e.g. mprof run --pdb-mmem 1024 for a 1024 MB threshold), which is handy for troubleshooting.
Memory Dump at a Point in Time
It is important to understand the expected number of large objects in the program and whether they should be duplicated and/or transformed into different formats.
To further analyse the objects in memory, a heap dump can be created at specific points in the program with muppy.
# install muppy
pip install pympler

# Add to leaky code within python_script_being_profiled.py
import pandas as pd
from pympler import muppy, summary

all_objects = muppy.get_objects()
# print out a summary of the large objects
sum1 = summary.summarize(all_objects)
summary.print_(sum1)
# get references to certain types of objects, such as dataframes
dataframes = [ao for ao in all_objects if isinstance(ao, pd.DataFrame)]
for d in dataframes:
    print(d.columns.values)
    print(len(d))
Another useful memory profiling library is objgraph which can generate object graphs to inspect the lineage of objects.
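objgraph leans on the standard gc module under the hood; a stdlib-only sketch of the back-reference idea it visualises (the names here are illustrative, not from objgraph’s API):

```python
import gc

leaked = object()          # an object we suspect is being kept alive
container = [leaked]       # the lingering reference holding on to it

# gc.get_referrers lists every tracked object that still refers to `leaked`;
# objgraph renders the same chain of back-references as a graph image
referrers = gc.get_referrers(leaked)
print(any(r is container for r in referrers))
```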
Strive for a quick feedback loop
A useful approach is to create a small “test case” which runs only the leaking code in question. If the complete input data takes a long time to run, consider using a randomly sampled subset of it.
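For instance, a hypothetical way to cut a long-running input down to a reproducible 1% sample using the standard library:

```python
import random

records = list(range(100_000))  # stand-in for the full training input

random.seed(42)                 # fixed seed keeps the test case reproducible
subset = random.sample(records, k=len(records) // 100)  # 1% sample
print(len(subset))              # 1,000 rows instead of 100,000
```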
Run memory-intensive tasks in a separate process
Python does not necessarily release memory immediately back to the operating system. To ensure memory is released after a piece of code has executed, it needs to run in a separate process. This page provides more details on Python garbage collection.
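As a sketch (the task and allocation size are hypothetical): running the allocation in a multiprocessing.Process means the memory is returned to the operating system when the child exits, regardless of what CPython’s allocator keeps cached in the parent.

```python
import multiprocessing as mp

def memory_intensive_task(queue):
    # hypothetical stand-in for model training: allocate a large list
    data = list(range(1_000_000))
    queue.put(len(data))
    # when this child process exits, all of its memory goes back to the OS

if __name__ == "__main__":
    queue = mp.Queue()
    worker = mp.Process(target=memory_intensive_task, args=(queue,))
    worker.start()
    result = queue.get()  # fetch the result before joining
    worker.join()
    print(result)
```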
Debugger can add references to objects
If a breakpoint debugger such as pdb is used, any objects created and referenced manually from the debugger will remain in the memory profile. This can create a false impression of a memory leak, where objects appear not to be released in a timely manner.
Watch out for packages that can be leaky
Some Python libraries could potentially have memory leaks. For example, pandas has quite a few known memory leak issues.
Further reading: Memory Management — Python 3.7.2 documentation