Make your parallel NumPy code fast: the secret sauce
Using NumPy efficiently between processes
When dealing with parallel processing of large NumPy arrays such as image or video data, you should be aware of this simple approach to speeding up your code.
Multiprocessing versus Concurrency in Python
First, a quick primer on some terminology.
In Python, if we want to take full advantage of the processing power of your CPU, you need need to use multiprocessing (typically achieved via the multiprocessing
library). This library is therefore well suited for CPU intensive tasks. If we wish to efficiently do many things at once using a single processor, i.e. achieve concurrency, we can use Python’s libraries for asynchronous work — namely threading
or asyncio
. In both cases, a common programming practice for sharing information safely between the processes/threads is via the use of a Queue.
There are two types of Queue you should be familiar with:
queue.Queue
The queue module implements multi-producer, multi-consumer queues. It is especially useful in threaded programming when information must be…