Python Guru: Global Interpreter Lock (GIL)

Lin Dane


The Python Global Interpreter Lock, or GIL, is, simply put, a mutex (or lock) that protects access to Python objects by allowing only one thread at a time to hold control of the Python interpreter. This means that only one thread can execute Python bytecode at any given moment, even when running on a multi-core processor.

Impact of GIL on Multithreaded Programs

The impact of the GIL is not evident in single-threaded programs. Its biggest effect is that it can become a significant bottleneck for multithreaded programs performing CPU-bound work. Let’s consider the two examples below.

Example 1: A CPU-bound program that performs a simple countdown.
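A minimal sketch of such a program is shown below; the countdown function and the 50-million iteration count are illustrative assumptions, not the exact code used to produce the timing that follows.

    import time

    COUNT = 50_000_000  # illustrative workload size

    def countdown(n):
        # Pure CPU-bound work: no I/O, just decrementing a counter.
        while n > 0:
            n -= 1

    start = time.time()
    countdown(COUNT)
    print('Completed in', time.time() - start)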

Output:
Completed in 3.0810108184814453

Example 2: Two threads running in parallel performing a simple countdown.
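A sketch of the multithreaded variant, under the same assumptions: the workload is split in half between two threads that run “in parallel.”

    import time
    from threading import Thread

    COUNT = 50_000_000  # same illustrative workload, split across two threads

    def countdown(n):
        while n > 0:
            n -= 1

    # Each thread handles half of the work.
    t1 = Thread(target=countdown, args=(COUNT // 2,))
    t2 = Thread(target=countdown, args=(COUNT // 2,))

    start = time.time()
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print('Completed in', time.time() - start)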

Output:
Completed in 3.1420132384019721

The programs were run on a PC with an Intel Core i3-10100 processor (4 cores, 8 threads) and 32GB of RAM. As you can see from the two examples above, the single-threaded program runs slightly faster than the multithreaded one. Multithreading does not improve the execution time because the GIL forces the two threads to take turns executing bytecode rather than running in parallel. On the contrary, it slightly increases the execution time due to the overhead of creating the threads at the start, joining them at the end, and handing the GIL back and forth between them.

Why Does GIL Still Exist?

The GIL has been part of CPython since its early development: Guido van Rossum introduced it to guarantee thread safety while keeping the interpreter simple and flexible for application development.

#1. Compatibility

Many Python packages (and the main Python interpreter, CPython) rely on C extensions, much of whose code is not thread-safe. Without protection, multiple threads could access the same resource at the same time and corrupt it. The GIL makes it far simpler to write and use C extensions safely, which helped developers in the 1990s adopt Python for real software and cemented this architecture.

#2. Garbage Collection and Reference Counting

Another important reason relates to how Python handles garbage collection. Garbage collection is an automatic memory management process in which the interpreter tracks and reclaims memory occupied by objects that are no longer referenced or accessible in the program. CPython uses two mechanisms for this: a cyclic garbage collector and, most prominently, reference counting.

Reference counting is an efficient way to manage memory and release resources as soon as they are no longer used, which helps prevent memory leaks in Python programs. It works as follows:

  • Each object tracks the number of references pointing to it.
  • When an object’s reference count drops to zero, there are no more references to it anywhere in the program, which means the object is no longer needed.
  • Python’s memory management system automatically reclaims the memory occupied by the object, effectively deleting it.
  • Without the GIL, multiple threads could update an object’s reference count at the same time. Such a race condition could leave the count too high (a memory leak) or too low (memory freed while still in use). The GIL acts as a safeguard, allowing only one thread to execute Python bytecode at a time and preventing these issues.

An Example of Reference Counting in Python
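Below is a minimal sketch of how reference counts can be observed with sys.getrefcount; the variable names are illustrative, and note that getrefcount itself adds one temporary reference when it is called.

    import sys

    a = []                     # create a list; 'a' is one reference to it
    b = a                      # 'b' is a second reference to the same object

    # sys.getrefcount reports one extra reference, because passing the
    # object as an argument temporarily creates another reference.
    print(sys.getrefcount(a))  # typically 3: a, b, and the function argument

    del b                      # drop one reference
    print(sys.getrefcount(a))  # typically 2: a and the function argument

    # Once 'a' is deleted as well, the count reaches zero and CPython
    # immediately reclaims the list's memory.
    del a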

At this point, you might wonder why the GIL still exists despite these limitations.

The truth is, there have been many efforts over the years to remove the GIL from CPython. However, the task has proven very challenging: it requires significant effort and could force a restructuring of large parts of the Python ecosystem, since so many C extensions depend on the GIL’s guarantees.

Some interpreters, like PyPy (itself written in RPython, a restricted subset of Python), still have a GIL but focus on speeding up Python code through just-in-time (JIT) compilation at runtime. Others, such as Jython and IronPython, have no GIL because they build on the runtimes of the Java and .NET environments, respectively. The downside of these interpreters is that many Python libraries written in C are not supported as they are in CPython, and they typically track only specific Python versions and target narrower use cases.

Efforts to improve Python’s performance are ongoing. Since Python 3.4, asyncio has made it possible to run I/O-bound tasks concurrently on a single thread, without needing multiple threads at all. Many other initiatives continue to enhance the language’s capabilities.
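As a quick illustration (a sketch, not code from this article), three simulated I/O waits run concurrently on one thread; the fetch coroutine and its one-second sleeps are assumptions chosen for demonstration.

    import asyncio
    import time

    async def fetch(name: str) -> str:
        # Simulate an I/O-bound operation (e.g. a network request) with a sleep.
        await asyncio.sleep(1)
        return f"{name} done"

    async def main() -> None:
        start = time.time()
        # The three coroutines run concurrently on a single thread: while one
        # is waiting, the event loop switches to the others, so the total time
        # is roughly 1 second instead of 3.
        results = await asyncio.gather(fetch("a"), fetch("b"), fetch("c"))
        print(results, "in", round(time.time() - start, 2), "seconds")

    asyncio.run(main())  # asyncio.run is available in Python 3.7+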

Conclusion

Everything has two sides, and the GIL is no exception. On one hand, it simplifies memory management and makes it easier for developers to write safe code. On the other, it limits multithreaded performance on modern multi-core machines.

The existence of the GIL will always be a hot topic whenever Python’s shortcomings are discussed.

However, despite its inherent problems, Python continues to shine in areas such as Big Data, AI, and web development, with many successful applications, and recent Python releases suggest a promising future for the language.

In the next article, I will discuss ways to work around the GIL’s limitations and improve the performance of your Python programs. Stay tuned!
