Python’s GIL: A Hurdle to Multithreaded Programs

The impact of Python’s GIL on multithreaded programs, explained!

Published Jul 24, 2019

--

Threads are a common programming construct. A thread is a separate flow of execution, so a program with more than one thread can have two or more things in progress at once. By default, a Python program runs as a single process with a single thread of execution, which uses just one CPU core.
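As a minimal sketch of what “a separate flow of execution” means, the snippet below starts one extra thread alongside the main one and records which thread ran which line. The names `worker` and `seen` are illustrative and do not come from the original article.

```python
import threading

seen = []  # names of the threads that appended here, in order

def worker():
    # This function body runs in the second thread.
    seen.append(threading.current_thread().name)

thread = threading.Thread(target=worker, name="worker-1")
thread.start()  # begin the second flow of execution
thread.join()   # wait for it to finish

# This line runs in the thread that started the worker.
seen.append(threading.current_thread().name)

print(seen)
```

The first entry is the worker thread's name and the second is the name of the thread that launched it.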

It’s tempting to think of threading as having two (or more) different processors working on our program, each one doing an independent task at the same time. That’s almost right. In CPython, the threads may be running on different processors, but only one of them runs at a time.

Python’s Global Interpreter Lock (GIL)

CPython (the standard Python implementation) has something called the GIL (Global Interpreter Lock). The GIL prevents two threads in the same process from executing Python code simultaneously.

In simple words, the GIL is a mutex (a lock) that allows only one thread at a time to hold control of the Python interpreter.

The GIL limits parallel programming in Python out of the box. Because it allows only one thread to execute at a time, even on a machine with more than one CPU core, the GIL has gained a reputation as an “infamous” feature of Python.

In this article we’ll learn how the GIL affects the performance of our multithreaded Python programs.

Because of the way the CPython implementation works, threading may not speed up all tasks. Again, this is due to the GIL, which essentially limits Python to running one thread at a time. Problems that require heavy CPU computation might not run faster at all. This means that when we reach for threads to do parallel computation and speed up our Python programs, we will be sorely disappointed.

Let’s use a naive number factorization algorithm as a computation-intensive task.

def factorize(number):
    for i in range(1, number + 1):
        if number % i == 0:
            yield i

>>> list(factorize(45))
[1, 3, 5, 9, 15, 45]

The function above yields every factor of the given number. Factorizing a set of numbers serially takes a long time.

from time import time

numbers = [8402868, 2295738, 5938342, 7925426]
start = time()
for number in numbers:
    list(factorize(number))
end = time()
print('Took %.3f seconds' % (end - start))
>>>
Took 1.605 seconds

As shown above, running serially took about 1.6 seconds. In other languages, using multiple threads for this computation would make sense because they could take advantage of all the CPU cores. Let’s try the same in Python and time it again.

from threading import Thread

class FactorizeThread(Thread):
    def __init__(self, number):
        super().__init__()
        self.number = number

    def run(self):
        # Executed in the new thread once start() is called
        self.factors = list(factorize(self.number))

The code above defines a thread that factorizes a single number. Let’s start one thread per number and time the computation.

start = time()
threads = []
for number in numbers:
    thread = FactorizeThread(number)
    thread.start()
    threads.append(thread)
# wait for all threads to finish
for thread in threads:
    thread.join()
end = time()
print('Took %.3f seconds' % (end - start))
>>>
Took 1.646 seconds

Surprisingly, it took slightly longer than running factorize serially. This demonstrates the effect of the GIL on programs running in the standard CPython interpreter: the threads take turns holding the lock, and managing them adds overhead. Therefore, multithreading is not recommended for CPU-intensive tasks in Python; the multiprocessing module is the usual alternative.
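As a hedged sketch of the multiprocessing route, the same factorization can be spread across worker processes, each with its own interpreter and its own GIL. This version uses `concurrent.futures.ProcessPoolExecutor` from the standard library; the helper names `factor_list` and `factorize_all` are illustrative and not part of the original article, and no timing is claimed here because the result depends on the number of cores available.

```python
from concurrent.futures import ProcessPoolExecutor
from time import time

def factorize(number):
    for i in range(1, number + 1):
        if number % i == 0:
            yield i

def factor_list(number):
    # Generators cannot be pickled, so hand a plain list back to the parent.
    return list(factorize(number))

def factorize_all(numbers, max_workers=None):
    # One task per number; results come back in the same order as the input.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(factor_list, numbers))

if __name__ == "__main__":
    numbers = [8402868, 2295738, 5938342, 7925426]
    start = time()
    results = factorize_all(numbers)
    end = time()
    print('Took %.3f seconds' % (end - start))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing the script, it keeps the workers from launching pools of their own.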

Having said that, we should not treat the GIL as some looming evil. It’s a design choice. The GIL is simple to implement and was easily added to Python. It provides a performance increase to single-threaded programs, as only one lock needs to be managed. Removing the GIL would complicate the interpreter’s code and greatly increase the difficulty of maintaining the system across every platform.

Although they don’t help with CPU-intensive tasks, Python threads are useful for blocking I/O operations, including reading and writing files, interacting with networks, and communicating with devices such as displays. These tasks happen when Python makes certain types of system calls, and CPython releases the GIL while a thread waits on such a call, so other threads can run in the meantime. Tasks that spend much of their time waiting for external events are generally good candidates for threading.
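To illustrate the I/O case, here is a small sketch in which `time.sleep` stands in for a blocking call such as a network read. The function name `fake_io` and the 0.2 second delays are assumptions made for the example, not measurements from the original article.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(delay):
    # time.sleep stands in for a blocking system call;
    # CPython releases the GIL while the thread sleeps.
    time.sleep(delay)
    return delay

delays = [0.2, 0.2, 0.2, 0.2]  # run serially, these would need about 0.8 seconds

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(delays)) as pool:
    results = list(pool.map(fake_io, delays))
elapsed = time.perf_counter() - start

print('Took %.3f seconds' % elapsed)
```

Because the waits overlap, the total time should be close to the longest single delay rather than the sum of all of them.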

Key Highlights:

  • In CPython, threads can’t run Python code in parallel on multiple CPU cores because of the global interpreter lock (GIL).
  • Use Python threads to make multiple system calls in parallel. This allows us to do blocking I/O at the same time as computation.

--

Sports Enthusiast | Senior Deep Learning Engineer. Python Blogger @ medium. Background in Machine Learning & Python. Linux and Vim Fan