From 0 to Senior in Python: Python under the hood — the interpreter and memory managing

Olof Baage
6 min read · Jan 4, 2024


In order to write efficient and fast code, it is crucial to understand what is actually going on under the hood and how Python interprets your code. That understanding will help you write better algorithms.

Photo by Shamin Haky on Unsplash

Hi fellow coder. 👋 Awesome that you are reading an article from my series “From 0 to Senior in Python”. 🤩 Here is what I’m going to cover:

  • How Python gets interpreted
  • How Python uses memory
  • Immutable vs. Mutable Data Types

How Python gets interpreted

Python is a hybrid language. That means that it is compiled as well as interpreted. Good to know. 😏 But how does all of that work? Let’s look at the simplest possible Python file:

print('hello, world')

To execute this file you use the command:

python3 hello-world.py 

# Expected Output: hello, world

But what happens here? With the command python3 you start the Python interpreter. The Python interpreter is very nice. It basically says to us: “Don’t you worry about compiling and interpreting Python. Just give me the file name. I’ll take care of that for you.” 😏 It does the job quietly and quickly in the background and eventually shows us the result in the console. So what actually happens under the hood? Let’s take a closer look at the interpreter:

The Python Interpreter, © code-by-olof

The Python interpreter consists of two parts: the compiler and the PVM (Python Virtual Machine).

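When you start the interpreter, the compiler first grabs your code and converts it into bytecode. Then, in a second step, the Python Virtual Machine takes that bytecode and executes it instruction by instruction. In CPython, the standard implementation, the PVM does not translate the bytecode into a separate machine-code file; it is itself a native program that carries out each instruction. Which PVM you have depends on your specific system.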
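The two steps can also be triggered by hand from inside Python. The following is a minimal sketch that uses the built-in functions compile() and exec(); the file name "<hello-world>" is only a label chosen for this illustration, and the exact bytes of the bytecode differ between Python versions.

```python
import contextlib
import io

source = "print('hello, world')"

# Step 1: the compiler turns the source text into a code object
# that carries the bytecode.
code_object = compile(source, "<hello-world>", "exec")
raw_bytecode = code_object.co_code  # bytes; contents vary by Python version

# Step 2: the virtual machine executes the bytecode. The output is
# captured here so that it can be inspected afterwards.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(code_object)
output = buffer.getvalue()

print(type(raw_bytecode).__name__, len(raw_bytecode) > 0)
print(output, end="")
```

When you run python3 hello-world.py, the interpreter does both steps for you in one go.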

If you use the command python3 myfile.py to run the file, you never get to see the bytecode. But we can also create it manually:

python3 -m py_compile hello-world.py

With that command we compile the Python file hello-world.py to bytecode. The compiler creates a new folder called __pycache__. In there you will find a file called something like hello-world.cpython-312.pyc (the number reflects your Python version).

We can print that file (from inside the __pycache__ folder) with the command:

cat hello-world.cpython-312.pyc

If you’re using Windows, use type instead of cat.

We get an output like this:


�ەe���ed�y)z
���n�r%

Yes, that is bytecode. Not very human friendly. But the PVM can read and interpret it.
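The same compilation step is available from Python through the standard py_compile module. Here is a small sketch, assuming a temporary folder so that nothing is left behind on your disk; the file name hello-world.py is taken from the example above.

```python
import py_compile
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    source_path = Path(folder) / "hello-world.py"
    source_path.write_text("print('hello, world')\n")

    # py_compile.compile returns the path of the .pyc file it wrote.
    pyc_path = Path(py_compile.compile(str(source_path)))

    folder_name = pyc_path.parent.name   # the cache folder
    pyc_name = pyc_path.name             # includes the Python version tag
    pyc_size = len(pyc_path.read_bytes())

print(folder_name)
print(pyc_name)
print(pyc_size, "bytes")
```

The version tag in the file name (for example cpython-312) depends on the interpreter you run this with.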

There is one more thing you can do if you want to learn more about bytecode and what happens under the hood. We can disassemble the file into readable bytecode instructions:

python3 -m dis hello-world.py 

That gives us the following output:

  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_NAME                0 (print)
              6 LOAD_CONST               0 ('hello, world')
              8 CALL                     1
             16 POP_TOP
             18 RETURN_CONST             1 (None)

Isn’t that exciting? Sadly, explaining the individual bytecode instructions would go beyond the scope of this article. But I’m planning to write another article about that. It is just super interesting to dive deeper and deeper into it and understand what actually happens. And the better you understand those things, the more you grow as a developer. So follow me here or on LinkedIn if you’re interested in learning more about Python, JavaScript, PHP and Web Development in general.
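If you would rather explore the instructions from within a script, the standard dis module offers the same view as python3 -m dis. This is a short sketch; the instruction names and offsets you see depend on your Python version, so no fixed output is shown.

```python
import dis

source = "print('hello, world')"
code_object = compile(source, "hello-world.py", "exec")

# dis.get_instructions yields one Instruction per bytecode operation.
instructions = list(dis.get_instructions(code_object))
for instruction in instructions:
    print(instruction.offset, instruction.opname, instruction.argrepr)

# Collect the names and arguments for further inspection.
operation_names = [instruction.opname for instruction in instructions]
argument_values = [instruction.argval for instruction in instructions]
```

Each printed row corresponds to one line of the disassembly: the offset, the operation, and its argument.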

How Python uses memory

When you run a Python program, it allocates memory for the variables. Well, actually that is not quite right. It allocates memory for values (objects). The variables only hold a reference to such a value, which you can think of as its memory address. To make that somewhat clearer, let’s jump right into an example.

x = 42
y = 19
print(hex(id(x)))
print(hex(id(y)))
print(x)
print(y, end='\n\n')

y = x
print(hex(id(x)))
print(hex(id(y)))
print(x)
print(y, end='\n\n')

x = 23
print(hex(id(x)))
print(hex(id(y)))
print(x)
print(y)

# Expected Output:
# 0x10f86f400
# 0x10f86f120
# 42
# 19

# 0x10f86f400
# 0x10f86f400
# 42
# 42

# 0x10f86f1a0
# 0x10f86f400
# 23
# 42

# In your case, of course, the memory address will differ.

In the example above I created two variables: x and y.
Then I print their memory addresses and their values. Both variables have different values and hence different memory addresses. That is what you expected. Hopefully.

Then I assign the value of x to y and print the memory addresses and the values of the variables again. Now x and y have the same value and the same memory address. Yeah, that is what we wanted.

But after that I changed the value of x. Does the value of y also change? It should, shouldn’t it? They both have the same memory address. So if I change the value that is stored at that memory address, the value of both variables should change. But look, it doesn’t. 🤨 The variable y still holds the same memory address and its value did not change. The variable x, on the other hand, has the new value and a new memory address. Why does that happen? 🤔

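When you assign a value to a variable, Python does not overwrite whatever is stored at the old memory address. It only makes the variable name point to the object that holds the new value. Whether that object already exists depends on the value: CPython, the standard implementation, keeps the small integers from -5 to 256 (and some short strings) around for reuse, so variables with such a value can end up pointing to the same memory address. For most other values Python allocates memory for a new object. It doesn’t matter if it is a completely new variable or if the variable already existed and a new value is assigned to it: the new value lives in another place in memory than the old one, and other variables that still point to the old value are not affected.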
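You can check this behaviour without comparing hex addresses by eye. The sketch below uses the is operator, which tells you whether two names point to the very same object; the helper variable names are made up for this illustration.

```python
x = 42
y = x
shared_before = x is y      # both names refer to the same object
address_of_y = id(y)

x = 23                      # rebinds x only; the object 42 is untouched
shared_after = x is y
y_unchanged = (y == 42) and (id(y) == address_of_y)

print(shared_before, shared_after, y_unchanged)  # True False True
```

Assigning to x never touches y: it only changes which object the name x points to.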

That said, another question might pop up in your head now: does that mean we have several “copies” of strings, integers, floats etc. in memory once the value changes? Yes, and that is because strings, integers and floats are actually immutable objects! Wait, what? 😳 Yes, you heard me correctly. They are immutable!

Immutable vs. Mutable Data Types

Let’s have a quick look at data types:

print(type(42))
print(type(3.14159))
print(type("hello, world"))
print(type(True))
print(type(['eins', 'zwei', 'drei']))

# Expected Output:
# <class 'int'>
# <class 'float'>
# <class 'str'>
# <class 'bool'>
# <class 'list'>

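All of these values are objects, or more precisely instances of a class. Every time you create a string, integer, float, or tuple, you create an instance of the corresponding class. These types are immutable: once such an object exists in memory, its value can no longer be changed, and every “change” produces a new object instead. Lists, on the other hand, are mutable. We can take a look at an example with strings: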

x = "hello, world"
y = x
print(hex(id(x)))
print(hex(id(y)))
print(x)
print(y, end='\n\n')
x = "hello, olof"
print(hex(id(x)))
print(hex(id(y)))
print(x)
print(y, end='\n\n')

# Expected Output:
# 0x103b29fb0
# 0x103b29fb0
# hello, world
# hello, world

# 0x103b2aef0
# 0x103b29fb0
# hello, olof
# hello, world

# In your case, of course, the memory address will differ.

We have the same result as we had in the example with numbers. At first, x and y have the same memory address. Once you assign a different string to x, x points to a new memory address, while y still points to the original string.
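To see the contrast with a mutable type, here is a small sketch that puts a list next to a string. The variable names are chosen for this illustration only.

```python
# A list is mutable: it can change in place and keeps its address.
numbers = ['eins', 'zwei']
alias = numbers
address_before = id(numbers)
numbers.append('drei')
list_kept_address = id(numbers) == address_before
alias_sees_change = alias == ['eins', 'zwei', 'drei']

# A string is immutable: a "change" builds a new object.
greeting = "hello, world"
other = greeting
greeting = greeting.replace("world", "olof")
string_is_new_object = greeting is not other

# Changing a string in place is not allowed at all.
try:
    other[0] = "H"
    item_assignment_failed = False
except TypeError:
    item_assignment_failed = True

print(list_kept_address, alias_sees_change)          # True True
print(string_is_new_object, item_assignment_failed)  # True True
print(other)                                         # hello, world
```

With the list, both names see the appended element because there is only one object. With the string, the second name keeps pointing to the original, unchanged text.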

Let’s sum it up

  • Python is a hybrid language: it is compiled and interpreted.
  • Python is a cross-platform language. Hence every platform has its own PVM (Python Virtual Machine).
  • A variable in Python does not hold the value, it only holds the memory address of a value.
  • Strings, numbers, and tuples are instances of immutable types.
  • Lists, dictionaries, and sets are instances of mutable types.

If you learned something from this article, leave me a clap or a comment, or share this article to support me. I welcome you to follow me here or on LinkedIn.

Happy coding.


Olof Baage

Aspiring Full Stack Developer / Technical Writer / Passionate Learner / HTML, CSS, JavaScript, Node.js, Python, React, Vue.js, SQL, PHP…