Python memory management

Dheeraj Thodupunuri
Published in Analytics Vidhya
7 min read · Jul 23, 2021

Writing efficient code means writing memory-efficient code.

This article describes memory management and garbage collection in Python 3.6.

It is very important for any software developer to understand how memory allocation happens and how that memory is managed.

In this article we will go through:

  • What is memory management?
  • Why memory management is required?
  • How memory is allocated in Python?
  • Garbage collection in Python

What is memory management?

In simple terms, memory management is the process of providing the memory your program needs to store data and freeing memory that is no longer in use.

Providing memory is called memory allocation. Freeing up memory is called memory de-allocation.

In Python, the memory manager is responsible for allocating and de-allocating memory.

Why memory management is required?

Generally, programming languages use objects to operate on the data your program needs. These objects are created in memory for faster access. Once an object is created, it is allocated some space in memory; once it is no longer in use, it has to be cleaned up (deleted from memory) so that the space can be reused by other objects or programs.

If these unused objects are not cleaned up, memory may fill up, leaving too little space for other programs, and your application might crash. So, memory management is very important in any programming language.

In early programming languages (like C), it was the developer's responsibility to allocate memory and to de-allocate it after use, which led to the problems below:

Forgetting to free up memory: if a developer forgets to free unused memory, the program leaks memory and its memory usage keeps growing.

Freeing memory which is still in use: if a developer frees memory by mistake while it is still in use, the program shows unexpected behaviour when it later tries to access that memory.

So, these problems led newer programming languages to implement automatic memory management and garbage collection, which in Python is taken care of by the Python memory manager.

In Python, memory allocation and de-allocation are automatic.

How memory is allocated in Python?

As mentioned, in Python the memory manager is responsible for allocating and de-allocating memory.

Memory has two parts: stack memory and heap memory (which has nothing to do with the heap data structure).

Stack memory: method calls and references are stored in stack memory.

Heap memory: all objects are stored on the heap.

Everything in Python is an object. So, it is very important to understand objects in Python.

Python is a dynamically typed language, which means the type of a variable is determined by the value it refers to, unlike statically typed programming languages (such as Java and C#).

For example, in languages like Java and C#, you cannot create a variable without specifying its type.

In C# we create a variable as: public int <variable_name>

Dynamically typed language
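A minimal sketch of the reassignments discussed here (the exact code is an assumption; only the variable name "x" and the three values come from the text). It records the type of "x" after each assignment:

```python
# The same name can point to objects of different types over time.
x = None
type_when_none = type(x).__name__   # type of the None object

x = 10
type_when_int = type(x).__name__    # x now refers to an int object

x = "10"
type_when_str = type(x).__name__    # x now refers to a str object

print(type_when_none, type_when_int, type_when_str)
```

The type belongs to the object, not to the name, which is why no type declaration is needed.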

In the above example, I created a variable named “x”, initially assigned None (None in Python is the equivalent of null in other programming languages). When “x” is assigned None, the type of “x” is “NoneType”.

When “x” is reassigned to 10, the type of “x” is “int”.

When “x” is reassigned again, this time to the string “10”, the type of “x” is “str”.

Unlike in languages where a variable is a fixed memory location, in Python a variable is a name that points to an object on the heap. For some immutable values, such as small integers (CPython caches the integers from -5 to 256) and certain strings, the Python memory manager reuses an object that is already in memory, so the variable points to that existing object instead of a new object with the same value.

If no such object is available in memory (heap), the Python memory manager creates a new object with the specified value, and the variable points to this newly created object on the heap.

Also, when a variable is re-assigned a new value, Python does not overwrite the value in memory. It follows the same process as above: the variable points to an existing object with the new value if one can be reused, or else to a newly created object on the heap.

For example,

x = 100  # x points to an int object with value 100 on the heap
y = 100  # no new object is created: CPython reuses the cached object for 100
print(id(x) == id(y))  # prints True because x and y point to the same object on the heap
x = 101  # x now points to a different object holding 101; the object holding 100 is not overwritten

This is not the case in many other programming languages (C, for example), where updating a variable overwrites the value stored at its memory address.

For a clearer understanding of the points discussed above, see the link below, which demonstrates Python object memory allocation.
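As a small illustration of rebinding versus overwriting, here is a minimal sketch (all variable names are illustrative). It also shows that equal values do not always share one object: two equal lists are separate objects on the heap.

```python
# Rebinding a name does not overwrite the object the name used to point to.
x = 100
y = x                        # y points to the same object as x
same_before = x is y         # both names refer to one object
x = 101                      # x is rebound to a different object
y_unchanged = (y == 100)     # the object holding 100 was not overwritten

# Each list literal builds a new list, even when the contents are equal.
first = [1, 2]
second = [1, 2]
equal_values = (first == second)   # same contents
same_object = (first is second)    # but two separate objects

print(same_before, y_unchanged, equal_values, same_object)
```

The `is` operator compares object identity (the same check as comparing `id()` values), while `==` compares values.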

Object creation

Whenever a new object is created in Python, the Python memory manager ensures that there is enough memory on the heap to allocate space for that object.

In CPython, every object starts with a PyObject struct, which holds two fields: a reference count and a pointer to the object's type.

For more information on PyObject, refer to the documentation below.

Garbage Collection in Python

Now, it is time to clean up the objects which are not in use. The process of de-allocating memory by deleting unused objects, so that the memory can be made available to other objects, is called garbage collection.

So, the job of the garbage collector is to keep track of the objects which can be de-allocated.

Python uses the following 2 algorithms for garbage collection:

  • Reference counting
  • Generational garbage collection

Reference Counting

Reference counting is a simple technique: whenever the reference count of an object reaches 0, the object is eligible for garbage collection and the memory allocated for it is automatically de-allocated.

Whenever a new reference to an object is created (for example, by assigning it to a variable or adding it to a container), the reference count of that object is incremented by 1; similarly, when a reference to that object is removed, its reference count is decremented by 1. Finally, when the reference count becomes 0, the memory allocated for that object is de-allocated.

The Python “sys” module provides a function called “getrefcount(object)”, which returns the number of references to a given object. The returned value is generally one higher than you might expect, because the argument passed to getrefcount() is itself a temporary reference.

Go through the code example below, which demonstrates the scenarios in which the reference count of an object increases and decreases.

Object reference count
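A minimal sketch of such a demonstration, assuming CPython (the variable names are illustrative). Because the absolute number reported by `sys.getrefcount()` varies between interpreter versions, the sketch compares counts against a baseline instead of relying on exact values:

```python
import sys

data = [1, 2, 3]                         # a fresh list object
base = sys.getrefcount(data)             # baseline (includes the temporary argument reference)

alias = data                             # a second name for the same object
after_alias = sys.getrefcount(data)      # one more than the baseline

container = {"key": data}                # storing the object in a dict adds a reference
after_container = sys.getrefcount(data)  # two more than the baseline

del alias                                # removing a name drops one reference
container.clear()                        # emptying the dict drops the other
after_cleanup = sys.getrefcount(data)    # back to the baseline

print(after_alias - base, after_container - base, after_cleanup - base)
```

Assigning to a new name or placing the object in a container raises the count; `del` and removing it from the container lower it again.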

By default, Python uses reference counting for garbage collection, and it cannot be disabled (the developer has no control over it).

The issue with this technique is that it has some overhead: every object has to keep track of its reference count, and memory is de-allocated as soon as an object's reference count becomes 0.

Reference counting also cannot detect cyclic references (objects that refer to each other), so the objects in a cycle never reach a reference count of 0 and are never reclaimed by reference counting alone.

Because of the above problems, Python also uses another technique called generational garbage collection.
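The cyclic-reference problem can be sketched as follows. This is an illustrative example, not taken from the original article: the `Node` class and variable names are hypothetical, and the standard `gc` and `weakref` modules are used only to observe what happens. Two objects that refer to each other survive `del`, and are reclaimed only when the cycle collector runs:

```python
import gc
import weakref


class Node:
    """Hypothetical helper class used only to build a reference cycle."""

    def __init__(self):
        self.other = None


was_enabled = gc.isenabled()
gc.disable()                      # keep the automatic collector out of the way

a = Node()
b = Node()
a.other = b                       # a refers to b
b.other = a                       # b refers to a, closing the cycle

probe = weakref.ref(a)            # a weak reference does not keep the object alive
del a, b                          # the names are gone, but the cycle remains

alive_after_del = probe() is not None      # still alive: refcounts never reached 0
collected = gc.collect()                   # run the cycle collector manually
alive_after_collect = probe() is not None  # the cycle has now been reclaimed

if was_enabled:
    gc.enable()                   # restore the previous collector state

print(alive_after_del, collected, alive_after_collect)
```

`gc.collect()` returns the number of unreachable objects it found, which here is at least the two `Node` instances.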

Generational Garbage Collection

This is also an automatic process in Python, but unlike reference counting, which cannot be disabled, generational garbage collection is optional: it can be disabled, and it can also be triggered manually.

The gc module in Python provides the interface to the generational garbage collector.

In this technique, the objects tracked by the garbage collector are classified into 3 generations:
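As a minimal sketch of what "optional" means in practice, the `gc` module lets you switch the automatic collector off and on (the variable names are illustrative; reference counting keeps working either way):

```python
import gc

was_enabled = gc.isenabled()     # remember the current state

gc.disable()                     # turn automatic generational collection off
state_when_disabled = gc.isenabled()

gc.enable()                      # turn it back on
state_when_enabled = gc.isenabled()

if not was_enabled:
    gc.disable()                 # restore the state we started with

print(state_when_disabled, state_when_enabled)
```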

  • Generation 0
  • Generation 1
  • Generation 2

Each generation has a predefined threshold. The threshold tells the garbage collector when to run a collection for that generation.

You can check the default threshold by importing the gc module as below:
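A minimal sketch using `gc.get_threshold()`. On CPython 3.6 the defaults are (700, 10, 10); other versions may report different values, so none are assumed here:

```python
import gc

# Returns a tuple (threshold0, threshold1, threshold2), one value per generation.
thresholds = gc.get_threshold()
print(thresholds)
```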

You can also check the current collection counts for each generation as below:
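A minimal sketch using `gc.get_count()`. Note that the values are the collector's internal counters for each generation rather than an exact tally of live objects, and they change constantly as the program runs:

```python
import gc

# Returns a tuple (count0, count1, count2) with the current collection counts.
counts = gc.get_count()
print(counts)
```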

You can also manually trigger garbage collection as below:
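A minimal sketch using `gc.collect()`, which runs a full collection when called without arguments and accepts a generation number (0 to 2) to limit the run. The return value is the number of unreachable objects found, so it depends on what the program did beforehand:

```python
import gc

unreachable_full = gc.collect()     # collect all generations
unreachable_young = gc.collect(0)   # collect only the youngest generation

print(unreachable_full, unreachable_young)
```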

You can also set the threshold as below:
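A minimal sketch using `gc.set_threshold()`. The value 1000 is arbitrary and chosen only for illustration; the original thresholds are saved first and restored afterwards:

```python
import gc

original = gc.get_threshold()        # remember the current thresholds

gc.set_threshold(1000, 15, 15)       # arbitrary illustrative values
updated = gc.get_threshold()

gc.set_threshold(*original)          # restore the previous settings
restored = gc.get_threshold()

print(original, updated, restored)
```

Raising the first threshold makes generation 0 collections less frequent; setting it to 0 disables automatic collection.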

When a new object is created, it is placed in “generation 0”.

Garbage collection is triggered automatically when a generation reaches its threshold, and whatever objects remain in that generation after garbage collection are promoted to the next older generation.

If more than one generation has reached its threshold, the garbage collector always chooses the oldest of them; collecting a generation also collects every younger generation.
