Python Memory and Multiprocessing
Here’s how you can speed up CPU-bound programs

Python is powerful. Python is used widely in the data science and machine learning community. Python is versatile. It’s widely used by large technology companies such as DropBox, Google, Instagram, and Spotify. Python enables developers to be productive quickly.
Need I say more? Python is a great programming language.
However, experienced Python developers will tell you to watch that memory usage. You’ve got to program mindfully to get the most out of Python. In data science and machine learning, there’s a need for speeding up CPU-bound programs. It’s often done by leveraging the art of multiprocessing.
Use the Latest and Greatest
Before we dive in, are you using Python 3 instead of Python 2? Do you know the newest tricks in Python 3? Starting with Python 3.7, there are significant performance improvements.
Here’s the list of optimizations from RealPython:
- Less overhead in calling many methods in the standard library.
- Method calls are up to 20% faster.
- The startup time of Python itself is reduced by 10–30%.
- Importing
typing
is seven times faster.
Understanding Python Memory Management
First, it’s good to get a solid understanding of how Python management works. You can start by reading this blog post. Then, to get an idea of how the Global Interpreter Lock works, and CPython’s Memory Management, read this blog post on RealPython.
The Art of Multiprocessing
Now you have a basic understanding of how Python memory management works, you know that you have to sidestep the Global Interpreter Lock. In Python, instead of using threads, you can use multiprocessing.
Multiprocessing is a part of concurrent programming. You can read about the big picture and the difference between speeding up I/O-bound programs and CPU-bound programs here.
In machine learning and data science, your code is often CPU bound, meaning that the program’s bottleneck is at the CPU level. This means that multiprocessing is often the better method to use for performance enhancement than other methods. By leveraging the power of multiple CPUs and multiple cores on a single machine, you can improve the performance of your programs.
Multiprocessing in Python will allow you to:
- run processes in separate memory spaces.
- take advantage of multiple CPUs and cores.
- communication model.
- give each process its own Python Interpreter and its own GIL.
The real advantage of multiprocessing is that you don’t get a lot of the usual multithreading problems such as data corruption and deadlocks. Although you will have a larger memory footprint than with a multithreading model, it’s a good tradeoff.
It is possible to share data between processes. You can read about this here.
Getting started with multiprocessing is easy, here are a few resources:
- Python 201: A multiprocessing tutorial
- Python Multiprocess Tutorial: Run Code in Parallel Using the Multiprocessing Module — Corey Shafer’s youtube video that contains an example of image processing.
- Dead Simple Example of Using Multiprocessing Queue, Pool, and Locking — Stackoverflow thread.
The one thing you have to remember with multiprocessing is that the entire memory is copied into a subprocess. That’s why memory profiling is important.
Memory Profiling
If you haven’t used the Python memory-profile package, now’s the time to start using it.
Here are a few resources that will help you understand memory leaks (particularly in reference to data science/machine learning work):
When Multiprocessing is Not Enough
Experienced developers who use Python multiprocessing library often use it for different types of machine learning and data science projects.
However, bench-marking actual examples to see whether they can handle numerical data and stateful computing is a good exercise.
If you haven’t used charm4py, you should try it while reading Juan Galvez’s arguments in the above article to get a complete picture of this debate.
With Python 3’s performance enhancements, Python is faster than ever. Understanding Python memory management, and taking full advantage of multiprocessing, will allow you to speed up your Python CPU bound programs using multiple CPUs or multiple cores. At the same time, you can keep track of memory usage with memory profiling.
With careful consideration, your Python programs will run faster than ever.