How Does the Python Interpreter Execute Your Code?

CodeInSeoul
5 min read · Jun 28, 2023

Introduction

Python is a beloved programming language in many fields. Many AI developers and researchers write Python code on top of the TensorFlow and PyTorch frameworks. Python is also widely used in data analytics, thanks to a rich set of data visualization tools like matplotlib. And because Python binds easily to C/C++, programmers can write code in Python while using C++ frameworks like Qt or ROS.

But do you know how the Python interpreter executes your code under the hood? If you are curious about the implementation of the Python interpreter, this series of articles explains and discusses the implementation of CPython, the implementation of Python written in C.

Python was created by a Dutch programmer named Guido van Rossum and first released in 1991. Its original implementation was written in C and is called CPython. CPython is still the official release that you can download from the official Python website. There are many other Python implementations written in different programming languages, like Jython, written in Java, or PyPy, written in a restricted subset of Python itself. Here, we will explain the Python interpreter based on the official CPython implementation.

Build CPython from Source

In this series of articles, our explanation is based on CPython 3.12 and an Ubuntu environment. You can build CPython from source with the commands below.

Build CPython from source

The Beginning of the Python Interpreter

Everyone has their own way of understanding code written by someone else. We will start from the entry point, where the code begins execution. In CPython, Programs/python.c is the entry point that is executed first when you type python in the terminal. Programs/python.c is shown below, and it’s pretty simple, right? Here, the main function in Line 13 wraps the Py_BytesMain function in Line 15, which is defined in Modules/main.c, the top-level module of the Python interpreter.

Programs/python.c: main

If you follow the definition of the Py_BytesMain function, you will see that it in turn wraps the pymain_main function in Line 9, defined in the same file (Modules/main.c).

Modules/main.c: Py_BytesMain

The pymain_main function first initializes the configurations by calling the pymain_init function in Line 4 and checks the returned status. If everything is fine, it calls the Py_RunMain function in Line 13 to actually execute the Python code that the programmer provides.

Modules/main.c: pymain_main
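
The "initialize, check the status, then run" pattern described above can be sketched in a few lines of Python. This is only a toy model of the control flow, not CPython code: Status, toy_init, toy_run_main, and toy_main are hypothetical names made up for illustration, and the failure condition is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Status:
    """Hypothetical stand-in for the status that the init step returns."""
    ok: bool
    exit_code: int = 0


def toy_init(argv):
    # Hypothetical initialization: fail when no arguments are given at all.
    if not argv:
        return Status(ok=False, exit_code=2)
    return Status(ok=True)


def toy_run_main(argv):
    # Hypothetical "run" step: pretend that the program always succeeds.
    return 0


def toy_main(argv):
    # Initialize first, bail out early on failure, and only then run.
    status = toy_init(argv)
    if not status.ok:
        return status.exit_code
    return toy_run_main(argv)
```

Calling toy_main(["python", "script.py"]) goes through both steps, while toy_main([]) stops right after the failed initialization and returns its exit code.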

Let’s first look at the part that initializes the configurations, which is done by the pymain_init function. The CPython configurations consist of three parts, which are defined in Include/cpython/initconfig.h:

  1. PyPreConfig preinitialization configurations
  2. PyConfig runtime configurations
  3. Configurations that are used when compiling the Python interpreter

PyPreConfig configurations are related to the user environment or the operating system. One of the most important things that PyPreConfig does is set the Python memory allocator. PyConfig defines runtime configurations such as the execution mode, which specifies the source of the Python code (for example, a file or stdin).

Modules/main.c: pymain_init
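
Some of these configurations can be observed from Python itself through sys.flags. The sketch below starts a fresh interpreter with different command-line options and reads back two flags. It assumes that sys.flags.utf8_mode and sys.flags.isolated reflect the values chosen during initialization; child_flags is a hypothetical helper written for this article, not part of CPython.

```python
import subprocess
import sys


def child_flags(*options):
    """Start a fresh interpreter with the given options and return the
    (utf8_mode, isolated) pair that it reports through sys.flags."""
    probe = "import sys; print(sys.flags.utf8_mode, sys.flags.isolated)"
    result = subprocess.run(
        [sys.executable, *options, "-c", probe],
        capture_output=True, text=True, check=True,
    )
    utf8_mode, isolated = result.stdout.split()
    return int(utf8_mode), int(isolated)


if __name__ == "__main__":
    print(child_flags("-X", "utf8"))  # UTF-8 mode requested explicitly
    print(child_flags("-I"))          # isolated mode requested explicitly
```

The options have to be given when the child interpreter starts, because these settings are decided during initialization, before any Python code runs.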

Now, let’s look at the part that executes the code, which is done by the Py_RunMain function. Recall that pymain_main calls Py_RunMain after the configurations have been initialized. Py_RunMain executes the Python code by calling pymain_run_python in Line 6 and then finalizes the interpreter, releasing the resources it allocated.

Modules/main.c: Py_RunMain

pymain_run_python loads the initialized PyConfig configuration and figures out which execution mode the Python interpreter should run in. There are three different ways of feeding Python code to the interpreter: 1) a file, 2) an I/O stream, and 3) a string. For example, if config->run_filename is set (that is, it is not NULL), the Python interpreter calls pymain_run_file in Line 62 with the PyConfig as its argument, executing the code written in the file. We will follow this execution mode, so let’s look at the definition of the pymain_run_file function.

Modules/main.c: pymain_run_python
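
The three ways of feeding code can be tried from the outside with the standard python command line. The sketch below runs the same one-line program as a string (-c), through standard input (-), and from a file. The helper names and the SNIPPET constant are hypothetical and exist only for this illustration.

```python
import os
import subprocess
import sys
import tempfile

SNIPPET = "print(6 * 7)\n"


def run_from_string():
    # python -c "<code>": the code is passed as a command-line string.
    return subprocess.run([sys.executable, "-c", SNIPPET],
                          capture_output=True, text=True, check=True).stdout


def run_from_stdin():
    # python - : the code is read from the standard input stream.
    return subprocess.run([sys.executable, "-"], input=SNIPPET,
                          capture_output=True, text=True, check=True).stdout


def run_from_file():
    # python script.py: the code is read from a file on disk.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "script.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write(SNIPPET)
        return subprocess.run([sys.executable, path],
                              capture_output=True, text=True, check=True).stdout
```

All three helpers return the same output, since only the source of the code differs, not the way it is executed afterward.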

The pymain_run_file function is a wrapper around the pymain_run_file_obj function in Line 16.

Modules/main.c: pymain_run_file_obj

pymain_run_file_obj opens the file and passes it to _PyRun_AnyFileObject, which handles two different modes: interactive loop mode and simple file mode. Since we are assuming a scenario where the programmer provides the code in a file, control goes to _PyRun_SimpleFileObject in Line 23.

Python/pythonrun.c: _PyRun_AnyFileObject

_PyRun_SimpleFileObject checks whether the file already contains bytecode. If so, it calls run_pyc_file in Line 50 to execute the bytecode; if not, it calls pyrun_file in Line 59 to execute the Python source code in the file.

Python/pythonrun.c: _PyRun_SimpleFileObject
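
One way to tell bytecode from source is to look at the first bytes of the file: a .pyc file starts with a magic number that identifies the interpreter version. The sketch below shows this idea using importlib.util.MAGIC_NUMBER. It is a simplified illustration, and the check in the C code may use additional criteria; looks_like_bytecode and demo are hypothetical helpers.

```python
import importlib.util
import os
import py_compile
import tempfile


def looks_like_bytecode(path):
    """Return True if the file starts with this interpreter's .pyc magic number."""
    with open(path, "rb") as f:
        return f.read(4) == importlib.util.MAGIC_NUMBER


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "hello.py")
        compiled = os.path.join(tmp, "hello.pyc")
        with open(source, "w", encoding="utf-8") as f:
            f.write("print('hello')\n")
        # Write the bytecode next to the source instead of into __pycache__.
        py_compile.compile(source, cfile=compiled, doraise=True)
        return looks_like_bytecode(source), looks_like_bytecode(compiled)
```

Here demo() reports the source file as not bytecode and the compiled file as bytecode. Note that the magic number changes between Python versions, so a .pyc file produced by a different version would not match.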

pyrun_file creates a PyArena in Line 5, which allocates and manages the memory used during parsing and compilation, such as the memory for AST nodes. It then constructs an Abstract Syntax Tree (AST) in Line 11 from the code in the input file. Finally, it calls the run_mod function in Line 20 with the AST and the PyArena to compile and execute the code.

Python/pythonrun.c: pyrun_file
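
The same pipeline can be imitated at the Python level with the public ast module and the built-in compile and exec functions. This is only an analogy under the assumption that these public tools mirror the internal steps; pyrun_file itself uses internal C functions and a PyArena, which have no direct Python equivalent. run_source is a hypothetical helper.

```python
import ast


def run_source(source, filename="<example>"):
    """Parse, compile, and execute source; return the resulting namespace."""
    # Step 1: source text -> AST.
    tree = ast.parse(source, filename=filename, mode="exec")
    # Step 2: AST -> code object (bytecode).
    code = compile(tree, filename, "exec")
    # Step 3: execute the code object in a fresh namespace.
    namespace = {"__name__": "__main__"}
    exec(code, namespace)
    return namespace


if __name__ == "__main__":
    result = run_source("x = 6 * 7\n")
    print(result["x"])
```

Keeping the three steps separate makes it possible to inspect the intermediate results, for example by printing ast.dump(tree) before compiling.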

Now that we have followed the flow from typing python in the terminal to the beginning of the compilation phase of the input Python code, let’s take a look at how CPython performs lexical and syntax analysis with its lexer and parser in our next article!

CodeInSeoul

Articles on system programming and how computers work • Led by Computer Architecture PhD Student :: www.codeinseoul.com