Circular references without memory leaks and destruction of objects in Python

Sergei
Feb 5, 2018

Hello everyone!

In this article you will see how to control the destruction of objects in Python and how to build an architecture with cross-referencing objects in such a way that it does not lead to memory leaks.

First, I recommend setting up an environment for experiments. Open a terminal (for example, in Ubuntu) and run the following commands:

$ sudo apt-get install python-dev python3-dev
$ virtualenv .v2 -p python2.7
$ . .v2/bin/activate
$ pip install psutil memory-profiler

and for Python 3:

$ virtualenv .v3 -p python3
$ . .v3/bin/activate
$ pip install psutil memory-profiler

Python frees the memory of an object once nothing references it any more. In CPython this is done by reference counting, backed by a garbage collector for reference cycles. For example, take the following code:

class A(object):
    def __del__(self):
        print("delete")


def main():
    a = A()
    print("end")


if __name__ == "__main__":
    main()

Saved as mem.py and executed, it gives the following result:

$ python mem.py
end
delete

The __del__ method is called on an object when it is about to be destroyed. Here this happened on exit from the function, when the variable a ceased to exist and nothing else referred to the created object.

After modification:

class A(object):
    def __del__(self):
        print("delete")


def main():
    a = A()
    del a
    print("end")


if __name__ == "__main__":
    main()

The result is different:

$ python mem.py
delete
end

The del statement removes the variable, which drops the object's reference count to zero, so the object is destroyed immediately.

If two objects refer to each other through their attributes, their reference counts never reach zero, even when no variable in the running code refers to them any more. This is what is usually called a memory “leak”. In Python 2 the cyclic garbage collector does not free such objects at all if they define __del__. Since Python 3.4 it can free them, but only when it eventually runs, so memory still piles up in the meantime.
Such circular references are best avoided when designing an application architecture. In general, it is good practice to split code into one-way abstraction levels, where the lower level knows nothing about the higher level and the higher level uses the lower one to perform its actions.
However, sometimes, as with lists or trees, it is useful to have a bi-directional link, so that not only does the parent know about its children, but a child can also reach its parent. This matters for dynamic calculations, where the result of a child's method depends on a value obtained from the parent. For example, after modifying the previous code:

class A(object):
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = set()

    def __del__(self):
        print("delete", self.name)


def main():
    a = A(name=1)
    a.children.add(A(name=2, parent=a))
    print("end")


if __name__ == "__main__":
    main()

The result will be:

$ python mem.py
end

The objects were not destroyed at the end because they referred to each other. As the number of circular references in the code grows, the amount of memory consumed increases and may eventually run out. For example:

from memory_profiler import profile


class A(object):
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = set()
        self.workload = ' ' * 128 * 1024 * 1024

    def __del__(self):
        print("delete", self.name)


@profile
def main():
    for _ in range(10):
        a = A(name=1)
        a.children.add(A(name=2, parent=a))
    print("end")


if __name__ == "__main__":
    main()

It requires nearly 2.5 GiB of memory:

$ python -m memory_profiler mem.py
end
Filename: mem.py
Line #    Mem usage    Increment   Line Contents
================================================
    16     13.2 MiB      0.0 MiB   @profile
    17                             def main():
    18   2569.4 MiB   2556.2 MiB       for _ in range(10):
    19   2441.4 MiB   -128.0 MiB           a = A(name=1)
    20   2569.4 MiB    128.0 MiB           a.children.add(A(name=2, parent=a))
    21   2569.6 MiB      0.2 MiB       print("end")

The objects were not destroyed, and the memory was released by the operating system only when the Python process finished. As mentioned above, Python 3 is able to detect objects that point to each other and remove them, but during execution memory still “leaks”: in CPython the cyclic collector is triggered by the number of allocated objects, not by their size, and here it does not get a chance to run before the end. The output of the same script under Python 3:

$ python -m memory_profiler mem.py
end
Filename: mem.py
Line #    Mem usage    Increment   Line Contents
================================================
    16     13.2 MiB      0.0 MiB   @profile
    17                             def main():
    18   2573.1 MiB   2559.9 MiB       for _ in range(10):
    19   2445.3 MiB   -127.7 MiB           a = A(name=1)
    20   2573.1 MiB    127.7 MiB           a.children.add(A(name=2, parent=a))
    21   2573.1 MiB      0.0 MiB       print("end")
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2

As you can see, the objects were destroyed by the garbage collector, which called their __del__ methods, but only after main() had finished.
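
The role of the cyclic collector can be sketched as follows. This is a minimal illustration, assuming CPython 3.4 or newer. The Node class, the make_cycle() function and the events list are hypothetical names introduced here and are not part of the script above. Automatic collection is switched off so that the moment of destruction is deterministic.

```python
import gc

events = []  # hypothetical helper: records the order of what happens


class Node(object):
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # strong reference to the parent
        self.children = set()

    def __del__(self):
        events.append("delete %s" % self.name)


def make_cycle():
    root = Node(name=1)
    root.children.add(Node(name=2, parent=root))
    # On return the local name "root" disappears, but the two objects
    # still reference each other, so their reference counts stay above zero.


gc.disable()                 # keep the automatic collector out of the way
gc.collect()                 # start from a clean state
make_cycle()
events.append("after make_cycle")
unreachable = gc.collect()   # explicitly run the cyclic collector
events.append("after gc.collect")
gc.enable()

print(events)
print("unreachable objects found:", unreachable)
```

The two "delete" entries appear only between "after make_cycle" and "after gc.collect": nothing is freed when the function returns, and the pair is destroyed only once the collector runs. Their relative order inside the cycle is not something to rely on.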

To avoid such memory leaks, use weak references (the weakref module). A weak reference does not increase the reference count of the object it points to. The right place for a weak reference is the link from a child to its parent. Then, as soon as the last reference to the root disappears, the root is destroyed and all its descendants are destroyed with it in cascade, provided that no external variable refers to them. A similar cascade model works in Qt.
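
Before the full example, here is a minimal sketch of how weakref.ref behaves on its own, assuming CPython, where an object is destroyed as soon as its last strong reference is gone. The Parent class and the variable names are hypothetical and serve only this illustration.

```python
import weakref


class Parent(object):
    pass


parent = Parent()
ref = weakref.ref(parent)          # does not increase the reference count

same_object = ref() is parent      # calling the weak reference returns the object
del parent                         # the last strong reference is gone
gone = ref() is None               # the weak reference is now dead

print(same_object, gone)
```

A dead weak reference returns None when called, which is why code that follows it has to check the result before using it.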

import weakref

from memory_profiler import profile


class A(object):
    def __init__(self, name, parent=None):
        self.name = name
        # Weak reference: the child does not keep its parent alive.
        self._parent = weakref.ref(parent) if parent is not None else None
        self.children = set()
        self.workload = ' ' * 128 * 1024 * 1024

    @property
    def parent(self):
        if self._parent is None:
            return None
        parent = self._parent()
        if parent is None:
            raise LookupError("Parent was destroyed")
        return parent

    def __del__(self):
        print("delete", self.name)


@profile
def main():
    for _ in range(10):
        a = A(name=1)
        a.children.add(A(name=2, parent=a))
    print("end")


if __name__ == "__main__":
    main()

Memory consumption has dropped dramatically and will remain constant even if the number of loop iterations increases:

$ python -m memory_profiler mem.py
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
delete 1
delete 2
end
delete 1
delete 2
Filename: mem.py
Line #    Mem usage    Increment   Line Contents
================================================
    28     13.0 MiB      0.0 MiB   @profile
    29                             def main():
    30    269.2 MiB    256.1 MiB       for _ in range(10):
    31    141.3 MiB   -127.9 MiB           a = A(name=1)
    32    269.2 MiB    127.9 MiB           a.children.add(A(name=2, parent=a))
    33    269.1 MiB     -0.1 MiB       print("end")

As you can see, the objects are now destroyed in each iteration, at the moment the variable a is re-assigned. The last pair is destroyed after “end” is printed because a keeps referring to it until main() returns.
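
The other side of this design is that a child can outlive its parent. The sketch below reuses the class from the example without the workload and the profiler, and assumes CPython. The names root, child and message are introduced only for this illustration.

```python
import weakref


class A(object):
    def __init__(self, name, parent=None):
        self.name = name
        self._parent = weakref.ref(parent) if parent is not None else None
        self.children = set()

    @property
    def parent(self):
        if self._parent is None:
            return None
        parent = self._parent()
        if parent is None:
            raise LookupError("Parent was destroyed")
        return parent


root = A(name=1)
child = A(name=2, parent=root)
root.children.add(child)

print(child.parent.name)   # the parent is still alive here

del root                   # "child" is still referenced and outlives its parent
message = None
try:
    child.parent
except LookupError as error:
    message = str(error)
print(message)
```

Raising LookupError makes the situation explicit. Returning None instead would make a destroyed parent indistinguishable from a root object that never had one.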

Looking at the __del__ method, you might be tempted to use it for various finishing operations, e.g. closing a channel, socket or file, or stopping a server or proxy. However, do not rely on __del__ for such cleanup, and never call functions in it that may raise exceptions. Moreover, when any method of an object raises an exception, the traceback keeps references to the frames the exception passed through, and those frames keep the object alive through their local variables (self, for example). The object can therefore exist much longer than expected, in the worst case until the process exits. For example, this code:

class A(object):
    def method(self):
        pass

    def __del__(self):
        print("delete")


def main():
    a = A()
    try:
        a.method()
    finally:
        del a
    print("end")


if __name__ == "__main__":
    main()

This gives:

$ python mem.py
delete
end

But if the method raises an exception:

class A(object):
    def method(self):
        raise Exception()

    def __del__(self):
        print("delete")


def main():
    a = A()
    try:
        a.method()
    finally:
        del a
    print("end")


if __name__ == "__main__":
    main()

The object is destroyed only at the very end, after the traceback has been printed:

$ python mem.py
end
Traceback (most recent call last):
  File "mem.py", line 20, in <module>
    main()
  File "mem.py", line 13, in main
    a.method()
  File "mem.py", line 4, in method
    raise Exception()
Exception
delete
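
That it is the traceback that keeps the object alive can be checked with a small sketch. It assumes CPython 3 and differs from the script above in one respect: the exception is caught and stored in a hypothetical local variable, saved, so that it can be dropped at a chosen moment. The events list is also introduced only for this illustration.

```python
events = []  # hypothetical helper: records the order of what happens


class A(object):
    def method(self):
        raise Exception("boom")

    def __del__(self):
        events.append("delete")


def main():
    a = A()
    saved = None
    try:
        a.method()
    except Exception as error:
        # exception -> traceback -> frame of method() -> local "self"
        saved = error
    del a
    events.append("after del a")
    saved = None             # drop the exception together with its traceback
    events.append("after dropping the exception")


main()
print(events)
```

Here "delete" is recorded after "after del a" and before "after dropping the exception": removing the variable is not enough while the stored exception still holds the traceback.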

Keep in mind that del obj does not mean that obj.__del__() will be called immediately. The del statement only removes one reference to the object, decreasing its reference count by 1. The __del__() method is called only when the count reaches 0.
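
A minimal sketch of this rule, assuming CPython. The names first and second and the events list are hypothetical and introduced only for this illustration.

```python
events = []  # hypothetical helper: records the order of what happens


class A(object):
    def __del__(self):
        events.append("delete")


def main():
    first = A()
    second = first            # a second reference to the same object
    del first                 # one reference is left, __del__ is not called
    events.append("after del first")
    del second                # the last reference is gone, __del__ runs now
    events.append("after del second")


main()
print(events)
```

The first del only removes a name. The object is destroyed by the second one, when no references remain.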
