The goal of this article is to raise the awareness for the cooperative multiple-inheritance paradigm in python. Not knowing about it could lead to bugs that are quite hard to investigate, and opposed to the usual pythonic zen, it hides some implicit behavior.
We’ll begin with an example for the bug that may raise, with it you’ll understand how this paradigm can meet your development. Afterwards, we’ll dive in and understand how, and when, should you embrace it into your code.
I assume the reader’s knowledge of python programming language. All snippets are written in with python3.5 syntax.
Let’s start with a simplified bug:
>> python3 cooperative.py
Traceback (most recent call last):
File "cooperative.py", line 20, in <module>
File "cooperative.py", line 17, in __init__
File "cooperative.py", line 8, in __init__
TypeError: __init__() missing 1 required positional argument: 'color'
One should wonder why is a “color” argument expected when we’re constructing ConstructionMachine that has nothing to do with Vehicle (or color). The answer lies in line 8: the call for super().__init__() from a ConstructionMachine class which has no parent.
What really happened?
Python determines “method resolution order” (MRO) for every class by C3 linearization algorithm. Basically, the algorithm takes the inheritance graph and applies topological sorting on it while maintaining local precedence order (by the inheritance declaration order from left to right) and direct parenthood.
This is used when you call a method of some instance — at first the method is looked in the instance’s class, but if not found then python looks up the inheritance graph (by that determined order). This order is determined during runtime, and you can access it. The C3 linearization algorithm promises a monotonic MRO.
The MRO for Tractor is:
Out: [__main__.Tractor, __main__.ConstructionMachine, __main__.Vehicle, object]
And the MRO for ConstructionMachine is:
Out: [__main__.ConstructionMachine, object]
The key issue here is that super() finds the parent according to the current MRO. So if we were looking for method rumble() in class Tractor (which doesn’t have it directly but it inherits it from ConstructionMachine), python will go over the MRO and find it. But then notice line 11, inside rumble() : a call for super() is used, and although the MRO for ConstructionMachine has no farther parent, the MRO that will be used is the current one (from Tractor !), which has Vehicle class afterwards. So ConstructionMachine’s rumble() will in fact call Vehicle.__str__ when called from a Tractor instance.
In our bug, __init__ is the method which is called with Tractor’s MRO and accordingly when we reach the ConstructionMachine.__init__ the super() call then triggers Vehicle.__init__ , that expects a “color” argument.
But why call super from __init__ if you have no parent?
Let’s assume no one uses super at all. Then in all of the __init__ functions every class should explicitly call ParentClass.__init__(self) to make sure the parent is constructed, as did our Tractor in lines 16–17. Now, python supports multiple-inheritance and as such it is also allows the “diamond-problem”. This is partially solved with the MRO mechanism because there is a deterministic way to decide which function to use, but if every class is calling __init__ explicitly, the class at the top of the diamond will be constructed twice!
For example, if Vehicle and ConstructionMachine will inherit Machine (as illustrated in the UML graph), then when Tractor is constructed it will explicitly call for Vehicle.__init__ and also for ConstructionMachine.__init__ , where both of these functions will explicitly call for Machine.__init__ since they should explicitly construct their own parent (which is Machine for both) hence Machine is constructed twice.
Clearly this is only one example out of many under this domain, and that’s why super() had been introduced. Since super() goes over the MRO and the MRO doesn’t contain duplicates, it solves the problem. However, for this solution to happen all classes must support the cooperative multi-inheritance paradigm. In a nutshell, all classes must call super().__init__() (once) from their __init__s instead of explicitly calling the __init__s of their parents, and additionally forward the constructor’s params by **kwargs. Not only for init, but for every function that is used via multiple-inheritance. This is not the whole idea but we’ll deal with that later.
It can be thought of as taking a parcel of all the arguments for all the classes-constructors along the inheritance graph, where each class takes what it needs from that parcel and forwards it to the following class according to the MRO. Ugly, but necessary — and thus not very adequate to python’s zen.
To summarize, in our bug example ConstructionMachine is aiming to be a cooperative class and that’s why it has the super() call in it’s __init__ although it doesn’t have a parent, which finally causes the bug.
Could this happen in real life?
Yes! I encountered it while working with PyQt. PyQt is a famous binding for QT framework, that enables GUI capabilities in python. Apparently, PyQt package complies with the cooperative multiple inheritance paradigm (see statement), but if one is not aware to this paradigm (e.g. doesn’t have super() calls in every __init__ of its own classes) and still uses PyQt — such bugs could raise.
When having both cooperative classes and non-cooperative classes work together collisions as this are occurring and such bugs are hard to identify if you’re not aware to these issues, as the program crashes and the trace-back is surprisingly pointing to an almost non-related code section.
Not only PyQt uses this. Every python package that follows this paradigm must declare it in its documentation.
How to follow the paradigm?
In this section I’d like to be very practical with the simplest guiding thumb-rules. There are other ways to accomplish the same results, but I wanted to provide the most straight-forward advice.
All classes that has some kind of inheritance relationship with one another should follow the cooperative paradigm. Let’s focus on the constructors of cooperative classes:
- Every __init__ should accept **kwargs in addition to its arguments. E.g. def __init__(self, arg_a: int, arg_b: int, **kwargs).
- Every call to a cooperative class constructor must be done with non-positional keyword-ed arguments. E.g. a=A(arg_a=14, arg_b=0) .
- Every __init__ should call to the super’s init, and exactly once. When doing so, it must pass the **kwargs onward, but also add the variables it used back into the **kwargs parcel. E.g. self._a=arg_a; kwargs[arg_a] = arg_a; super().__init__(**kwargs).
Now constructors are covered, and if there is a common argument name in multiple constructors it isn’t wrapped away. The keyworded call assures no positional conflicts and the single call for super guarantees that all classes along the MRO will be initialized.
The same rules can be used for all other overridden methods, even custom ones, but for that one might need to inherit from a custom “Root” class that wraps it around.
I strongly urge the reader to learn from the following article: super considered super — by Raymond Hettinger, which covers more edge cases such as how to wrap 3rd party libraries that are not cooperative to align with your cooperative classes, and more. Note that this article does not deal with common argument names, but the solution for this problem was described above.
Should this paradigm be embraced?
It may surprise you but the answer, in my opinion, is no. First, although this paradigm is important for multiple inheritance and is probably the only viable solution, it is not widely used. I gave an example of PyQt but I was having a hard time to find more famous python packages that follow it. Second, this makes the code a bit cumbersome in readability terms as every overriding function (especially every constructor) must have the **kwargs peeled and composed back, and every call to it cannot be done with positional arguments. Third, the whole idea of having to transfer all of the arguments for all the constructors that might be called later on the MRO (some of them without any real connection to the original call) is problematic in encapsulation terms and error prone.
With that, I find it utterly important to be aware of this paradigm. One can still use packages that follow the cooperative multiple inheritance paradigm with others that don’t, while avoiding the diamond-problem by designing the code that way, and just solve this potential problem as it’s being encountered. Or, at least suspect it when something weird happens, and then play with the inheritance declaration order (the tuple of a class’s __bases__) to see if a different MRO will solve such issues.