Make your parallel NumPy code fast: the secret sauce

Using NumPy efficiently between processes

When dealing with parallel processing of large NumPy arrays such as image or video data, you should be aware of this simple approach to speeding up your code.

Benjamin Lowe
Analytics Vidhya
Published in
7 min readDec 28, 2020

--

Image by the author on Canva

Multiprocessing versus Concurrency in Python

First, a quick primer on some terminology.

In Python, if we want to take full advantage of the processing power of your CPU, you need need to use multiprocessing (typically achieved via the multiprocessing library). This library is therefore well suited for CPU intensive tasks. If we wish to efficiently do many things at once using a single processor, i.e. achieve concurrency, we can use Python’s libraries for asynchronous work — namely threading or asyncio. In both cases, a common programming practice for sharing information safely between the processes/threads is via the use of a Queue.

There are two types of Queue you should be familiar with:

  • queue.Queue

The queue module implements multi-producer, multi-consumer queues. It is especially useful in threaded programming when information must be…

--

--