Python Objects Part IV: First-Class Everything
We’ve covered much information over the course of this series. We’ve learned how Python uses objects to represent values, how these objects can be compared, and how they are instantiated. We’ve learned how CPython saves memory and time by storing certain immutable objects as shared. And we’ve learned how certain string objects are interned as identifier strings.
In this article, the fourth and last of my Python Objects series, I want to take a step back. To close the circle on our examination of objects in Python, today I will focus on one final topic: classes, the implementation through which objects are represented and the fundamental building block of object-oriented programming.
FIRST THINGS FIRST, FOR THE LAST TIME — A REFRESHER
Let’s review the (long!) list of what we’ve learned up to this point:
- Everything in Python is an object, something that a variable can refer to.
- Objects are classified by their value, type, and identity (aka. memory address).
- The value of an immutable (unchangeable) object is tied to its identity — if the value changes, the object changes.
- The value of a mutable (changeable) object is not tied to its identity — identity is retained across changes made to the object.
- The CPython implementation of Python pre-allocates shared values for certain ranges of commonly used immutable types.
- To reduce the memory that strings can quickly consume, CPython also implements string interning (a.k.a. string storage) for identifier strings.
- When Python is instructed to instantiate a new immutable object, it first checks to see if an identical object already exists as a shared or interned object.
Also, note that the behavior discussed in this article is specific to CPython versions 3.0 and above. You are not guaranteed the same behavior on different implementations or versions of Python.
OK, WHAT ARE OBJECTS ACTUALLY?
Throughout the first three parts of this series we’ve used numerous built-in functions, methods, and operators to examine object behavior in Python. For instance, we’ve used type() to get the type of an object, the is operator to compare the identities of two objects, and append() to alter the values (but not identities 😉) of lists.
Why do these operations work? Where is all this information about a Python object stored? In other words — how does Python know what an object is in the first place?
It turns out that the answer was lying in plain sight from the beginning. Remember way back in part one of this series, when I demonstrated the type() function?
>>> x = 1
>>> type(x)
<class 'int'>
At the time, I asked you to ignore the class qualifier on the int type output. Yet, I also promised that I’d return to the topic of classes later. Well, I’m a man of my word.
EVERYTHING IS AN OBJECT IS A CLASS
All objects in Python are represented by classes. Classes are objects themselves, but callable ones used to create instances of the objects we’re familiar with. In this way, all objects are instances of classes.
The concept of a class is analogous to the implementation of structures in low-level languages such as C. In fact, in CPython, this is more than just an analogy: structures are directly used to implement classes. Taking a look at CPython’s source code, you’ll see that every Python object is accessed through a pointer to the struct PyObject, which holds the object’s reference count and a pointer to its type, ob_type.
All the object types we’ve examined over the course of this series are instances of structures, or classes, built on this PyObject struct; the pointer ob_type identifies the class of each one. Back in Python, we can use type() to show that even object, the most basic, general representation of objects, is nothing more than a class, that is, an instance of type:
>>> type(object)
<class 'type'>
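As a small sketch of the same idea, using only built-ins: classes are themselves objects, and asking for the type of a class gives back type. The variable names below are just for illustration.

```python
# Classes are objects too: each one is an instance of the built-in class `type`.
x = 1

print(type(x))        # the class of the integer 1
print(type(int))      # the class of the class int
print(type(object))   # the class of the base class object
print(type(type))     # type is its own class

# Because classes are objects, they can be compared and inspected like any other.
same_class = type(x) is int
int_is_a_class = isinstance(int, type)
int_is_an_object = isinstance(int, object)
```

Running this should print the class of each value in turn; the three booleans at the end capture the relationships as plain values.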
This is the idea of first-class everything in Python, and the core implementation behind the language’s dynamic typing. To quote the creator of Python himself, Guido van Rossum:
One of my goals for Python was to make it so that all objects were “first class.” By this, I meant that I wanted all objects that could be named in the language (e.g., integers, strings, functions, classes, modules, methods, etc.) to have equal status. That is, they can be assigned to variables, placed in lists, stored in dictionaries, passed as arguments, and so forth. — Guido Van Rossum, “The History of Python,” February 27, 2009.
So, the next time you initially assign a Python variable a value of one type, and then rebind it to a value of another type within the same program, thank Guido, and classes!
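To make "first class" concrete, here is a minimal sketch in which a function, a class, and a module are all treated as ordinary values. The names shout, Greeter, and apply are hypothetical and exist only for this illustration.

```python
import math

def shout(text):
    """Return the text in upper case (hypothetical helper for illustration)."""
    return text.upper()

class Greeter:
    """A tiny hypothetical class, used here only as a value to pass around."""
    def greet(self):
        return "hello"

# Functions, classes, and modules can all be assigned to variables...
f = shout
cls = Greeter
mod = math

# ...placed in lists and dictionaries...
things = [f, cls, mod]
registry = {"function": f, "class": cls, "module": mod}

# ...and passed as arguments to other functions.
def apply(func, value):
    return func(value)

result = apply(f, "first class")      # calls shout("first class")
instance = registry["class"]()        # calls Greeter()
root = registry["module"].sqrt(16)    # calls math.sqrt(16)

# A variable can also be rebound to a value of a different type.
v = 1
v = "now a string"
```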
CLASS ATTRIBUTES
All the information about an object, including its value and methods, is referred to as the attributes of a class. Attributes are the bread and butter of classes: when interacting with a given object, we are really interacting with its attributes.
You can get a list of all the attributes of an object using the built-in function dir()
:
>>> x = 1
>>> dir(x)
['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes']
I know, that’s a whole lot of attributes I just threw at you. Don’t be overwhelmed, however; let’s break the above down, step-by-step.
Attributes can be classified into three categories, all of which are naming conventions rather than strictly enforced rules:
- Private: Any attribute beginning with a double underscore (__), but not also ending with one, is private. Private attributes are not meant to be used beyond the definition of a class; Python discourages outside access by mangling their names (an attribute __x of a class C is stored as _C__x).
- Protected: Attributes beginning with just a single underscore (_) are protected. They remain fully accessible, but by convention should only be used within a class and its inherited subclasses (I’ll return to inheritance and subclasses later).
- Public: If an attribute does not begin with any underscores, it is public, freely viewable and usable by the user.
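In practice, Python treats these categories as conventions rather than access controls. The following is a minimal sketch of how each kind of name behaves, assuming a hypothetical Account class with made-up attribute names.

```python
class Account:
    """Hypothetical class illustrating the three naming conventions."""

    def __init__(self):
        self.owner = "Ben"        # public: no leading underscore
        self._balance = 100       # "protected": single leading underscore
        self.__pin = 1234         # "private": double leading underscore

acct = Account()

# Public and single-underscore attributes are both reachable from outside;
# the underscore is only a signal to other programmers.
public_value = acct.owner
protected_value = acct._balance

# The double-underscore name is mangled to _Account__pin, so the plain
# name is not found from outside the class...
try:
    acct.__pin
    pin_visible = True
except AttributeError:
    pin_visible = False

# ...but the mangled name is still accessible.
mangled_value = acct._Account__pin
```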
In the int
object example above, the last eight attributes, those without any leading underscores, are public. They all represent unique methods and values provided with every instance of an int
. For example, did you know that all integers in Python feature corresponding numerators and denominators?
>>> x = 5
>>> x.numerator
5
>>> x.denominator
1
Each built-in type in Python features its own unique public attributes.
DUNDER ATTRIBUTES
So, if the last eight attributes, which have no leading underscores, are public, then all the preceding ones, which have double leading underscores, must be private, right? Well, not exactly. The long list of attributes with both leading and trailing double underscores are not exactly equivalent to private attributes. Instead, they are their own kind: dunder attributes.
All classes, and, in turn, all objects, feature dunder attributes, special attributes represented by double underscores on both sides. I’ll get the Dunder Mifflin reference out of the way now.
Dunder attributes are pre-defined, accessible methods and values used to instantiate and describe different information for a particular object. Dunder methods correspond to different built-in methods and are called implicitly by the Python interpreter.
Since everything is an object is a class in Python, every built-in operation we take for granted is actually a call to a given class’s dunder method. For example, built-in mathematical operations used on Python integers call those integers’ dunder methods. Multiplication uses the method __mul__, which is why the following…
>>> x = 1
>>> x * 2
2
… is equivalent to calling x's __mul__ method on the integer 2.
>>> x.__mul__(2)
2
The same goes for operations that report information about an object. Remember the type() function? By calling it on an object, we’re really accessing that object’s __class__ dunder.
>>> x = 1
>>> type(x)
<class 'int'>
>>> x.__class__
<class 'int'>
Our good ol’ reliable value comparison operator ==
? Same.
>>> x = 1
>>> y = 2
>>> x == y
False
>>> x.__eq__(y)
False
Why, even the dir() function we’re using to view dunder methods relies on a dunder method itself: it calls the object’s __dir__ method and returns the result as a sorted list.
>>> dir(x) == sorted(x.__dir__())
True
Dunder methods are the tools behind the abstraction that is Python object operations.
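The pairing between built-in operations and dunder methods can be sketched for a handful of common cases. This is a simplified picture (the interpreter looks these methods up on the object's type and has fallbacks for some operators), but for the built-in values below each pair gives the same result.

```python
# Each built-in operation is paired with the dunder call that Python
# makes on the object's behalf.
numbers = [3, 1, 2]

pairs = {
    "1 + 2":        (1 + 2,        (1).__add__(2)),
    "len(numbers)": (len(numbers), numbers.__len__()),
    "numbers[0]":   (numbers[0],   numbers.__getitem__(0)),
    "2 in numbers": (2 in numbers, numbers.__contains__(2)),
    "str(5)":       (str(5),       (5).__str__()),
    "abs(-7)":      (abs(-7),      (-7).__abs__()),
    "'a' < 'b'":    ("a" < "b",    "a".__lt__("b")),
}

for expression, (builtin_result, dunder_result) in pairs.items():
    print(expression, "->", builtin_result, dunder_result)

# True only if every built-in operation matched its dunder counterpart.
all_equal = all(a == b for a, b in pairs.values())
```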
CUSTOM CLASSES — A SMALL INTRODUCTION
Up to this point in this series we’ve exclusively examined built-in object types. Yet, with our newfound understanding of classes, we can begin to create our own types. Custom classes can be defined using the keyword class. Let’s start building our own, one-of-a-kind object, CustomClass:
>>> class CustomClass:
...     pass
>>>
Right now, CustomClass
is empty (note that pass
is the bare minimum that must be included to define a new class.) Nevertheless, we’ve successfully created a brand new object type:
>>> cc = CustomClass()
>>> type(cc)
<class '__main__.CustomClass'>
>>> hex(id(cc))
'0x7fbe617212e8'
Did you notice that to create an instance of our class, I called CustomClass like a function, with parentheses? This works because classes are callable objects: calling one instantiates a new object.
Believe it or not, we’ve actually been doing this the whole time. When we create a new object, whether built-in or custom, Python implicitly calls that object’s __init__ dunder method, commonly called the constructor, every time an object is instantiated. You know how Python permits “casting” of values with parentheses?
>>> x = str(1)
>>> x
'1'
The truth is, “casting” isn’t the most accurate term to describe this behavior. More accurately, when we “cast” a value as a particular type, we instantiate a completely new object of that type from the given value.
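Strictly speaking, Python splits instantiation into two steps: __new__ creates the object and __init__ then initializes it. Here is a small sketch that records the order in which they run, assuming a hypothetical Tracked class written only for this demonstration.

```python
class Tracked:
    """Hypothetical class that records which special methods run on creation."""

    calls = []  # class-level log, shared by all instances

    def __new__(cls, *args):
        Tracked.calls.append("__new__")
        return super().__new__(cls)   # actually allocates the new object

    def __init__(self, value):
        Tracked.calls.append("__init__")
        self.value = value

t = Tracked(10)
print(Tracked.calls)

# "Casting" with a built-in type is the same kind of call: it builds a new
# object of that type from the given value and leaves the original untouched.
n = 1
s = str(n)
```

In everyday code you rarely need to define __new__; overriding __init__ is usually enough.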
To begin fleshing out our new CustomClass, we need to override our class’s __init__ method. We can do so by defining an __init__ function directly.
>>> class CustomClass:
...
...     def __init__(self):
...         self.string = "This is a custom class!"
>>>
A few things here. First, we defined our __init__
constructor with one argument — self
. All instance methods of a class must be defined with this first argument, a variable which refers to the particular working instance of a class. This argument is not passed when we call the method, but must explicitly be defined so that a method call knows which instance of a class it is working on.
The name self is actually arbitrary (we could name that argument dog and it would still work), but self is a widely-used standard. Don’t break from it.
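One way to see what self really is: calling a method on an instance is equivalent to calling the function on the class and passing the instance by hand. A minimal sketch, with a hypothetical Greeter class and method names chosen for illustration:

```python
class Greeter:
    """Hypothetical class used to show how self is supplied."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name

    # The first parameter can be named anything; `dog` works, but it is
    # confusing and goes against convention.
    def greet_again(dog):
        return "Hello again, " + dog.name

g = Greeter("Ben")

via_instance = g.greet()        # Python passes g as self automatically
via_class = Greeter.greet(g)    # the same call with the instance passed by hand
odd_name = g.greet_again()
```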
Second, the __init__
method set a variable, self.string
. This represents the creation of an attribute. Dot notation is used to access attributes of an instance — by specifying self.string
, we specifically assign a public attribute string
that is exclusive to the particular working instance of CustomClass
, self
.
Now, every instance of CustomClass
will include an attribute, string
, which we can access using that above-mentioned dot notation.
>>> cc = CustomClass()
>>> cc.string
'This is a custom class!'
__dict__
Now that we have a basic working new object type, let’s see what it looks like under the hood.
>>> dir(cc)
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'string']
Using the dir()
method, we can see our public attribute, string
, listed at the end. We also see that our CustomClass
comes predefined quite similarly to built-in types, with all the same dunder attributes. This makes sense — all objects derive from the same class format, after all.
Yet, the above list of dunder attributes is not exactly the same. Allow me to draw your attention to a new one, __dict__. Instances of built-in types such as int do not have a __dict__ attribute, but instances of custom classes do by default. It is a dictionary used to store all arbitrary attributes of an instance of a user-defined class. For instance, if we print the __dict__ of cc, our instance of CustomClass, we’ll find our attribute string:
>>> cc.__dict__
{'string': 'This is a custom class!'}
As attributes are added to or removed from an instance of a custom class, they are added to and removed from that instance’s __dict__. To prove this, let’s add a new attribute, count, to our instance cc. Once we have an instance of a class, we can easily do this with dot notation:
>>> cc.count = 1
>>> cc.__dict__
{'string': 'This is a custom class!', 'count': 1}
And just like that, we’ve added a new attribute! The __dict__ attribute enables the malleability of custom classes. Note that we cannot use the same method of attribute-adding for instances of built-in types such as int, since they do not feature __dict__ attributes.
>>> x = 5
>>> x.count = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'count'
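The relationship between attribute operations and __dict__ can be sketched as a short script. It re-declares the CustomClass from above so that it runs on its own; the label attribute is made up for the example.

```python
class CustomClass:
    def __init__(self):
        self.string = "This is a custom class!"

cc = CustomClass()

# vars(obj) returns obj.__dict__, so the two are the same dictionary.
same_dict = vars(cc) is cc.__dict__

# Adding and deleting attributes both show up in __dict__.
cc.count = 1                        # add with dot notation
setattr(cc, "label", "demo")        # add with the built-in setattr()
after_adding = dict(cc.__dict__)    # snapshot copy

del cc.count                        # remove with del
after_deleting = dict(cc.__dict__)

# An int instance has no __dict__, which is why x.count = 1 fails.
int_has_dict = hasattr(5, "__dict__")
```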
NAMESPACES
In our current working instance of CustomClass we’ve defined two public attributes. The first, string, was set when the object cc was instantiated. The second, count, was added to the instance cc after the fact. In both cases, the attribute was added to the __dict__ of cc.
Now, let’s introduce a third method of defining attributes. We’re going to add a new variable, count
, without referencing any instance of CustomClass
.
>>> class CustomClass:
...
...     count = 0
...
...     def __init__(self):
...         self.string = "This is a custom class!"
>>>
Let’s instantiate a new cc with our updated class and check its __dict__.
>>> cc = CustomClass()
>>> cc.__dict__
{'string': 'This is a custom class!'}
Wait, what happened to our new attribute count
? Did it not register?
I promise it did, just not in the same way as our original attribute string
. It turns out that the two actually represent different kinds of attributes of two different namespaces— class and instance.
Instance attributes are specific to a particular instance of a class. Remember how we initialized string
as an attribute of self
? In doing so, we made it an instance attribute. While every initial instantiation of a CustomClass
will feature equivalent definitions of the attribute string
, any changes made to string
will not register beyond the scope of that particular instance.
>>> cc1 = CustomClass()
>>> cc2 = CustomClass()
>>> cc1.string
'This is a custom class!'
>>> cc2.string
'This is a custom class!'
>>> cc1.string = "This is the first instance of a custom class!"
>>> cc1.string
'This is the first instance of a custom class!'
>>> cc2.string
'This is a custom class!'
In contrast, class attributes are not exclusive to any particular instance of a class. Rather, they are global and shared across all instances. By defining the attribute count without referring to an instance self, we designate it as a class attribute, which we can access using dot notation directly on the class type itself.
>>> cc1 = CustomClass()
>>> cc2 = CustomClass()
>>> cc1.count
0
>>> cc2.count
0
>>> CustomClass.count = 1
>>> cc1.count
1
>>> cc2.count
1
Instance and class attributes represent, and are represented in, two separate namespaces. The former is stored in a particular instance’s personal namespace. The second resides in a class-level namespace shared by all instances of that class type.
There are actually multiple __dict__
special attributes used to keep track of these separate namespaces. The __dict__
exemplified in the previous section was an example of an instance namespace. The dictionary cc.__dict__
represented a namespace of attributes exclusively owned by cc
, hence why it did not include the class attribute count
:
>>> cc = CustomClass()
>>> cc.__dict__
{'string': 'This is a custom class!'}
Rather, count was stored in the class-level namespace, a separate dictionary attribute represented by CustomClass.__dict__:
>>> CustomClass.__dict__
mappingproxy({'__dict__': <attribute '__dict__' of 'CustomClass' objects>, '__weakref__': <attribute '__weakref__' of 'CustomClass' objects>, 'count': 1, '__doc__': None, '__init__': <function CustomClass.__init__ at 0x7fbe61de2730>, '__module__': '__main__'})
Here we see that the class-level __dict__
namespace of a class additionally contains references to the __init__
constructor, documentation __doc__
, and more.
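One consequence of the two namespaces is worth sketching: reading an attribute through an instance falls back to the class, but assigning through an instance creates a new instance attribute that shadows the class attribute. The script below re-declares CustomClass so that it runs on its own.

```python
class CustomClass:
    count = 0  # class attribute

    def __init__(self):
        self.string = "This is a custom class!"  # instance attribute

cc1 = CustomClass()
cc2 = CustomClass()

# Reading cc1.count falls through to the class namespace.
found_on_class = "count" not in cc1.__dict__ and "count" in CustomClass.__dict__

# Assigning through the class changes the value every instance sees.
CustomClass.count = 1
seen_by_both = (cc1.count, cc2.count)

# Assigning through an instance creates a new instance attribute that
# shadows the class attribute for that instance only.
cc1.count = 99
after_shadowing = (cc1.count, cc2.count, CustomClass.count)

# Deleting the instance attribute reveals the class attribute again.
del cc1.count
after_delete = cc1.count
```

This is why a shared counter is best updated through the class name rather than through self.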
INSTANCE VS CLASS ATTRIBUTES — WHEN AND WHICH?
Alright, so instance attributes are specific to instances of a class, while class attributes are globally shared across all instances. How would we use these? When would we use one over the other?
Whether you define an attribute in the class or instance namespace may seem arbitrary, but the two actually serve distinct purposes. Instance attributes are crucial to the generalization of a class. For example, say you were defining a class Student that represented a student with their first name.
Each instance of the class represents a separate student; accordingly, each instance of Student features a different name. Thus, the name variable must be an instance attribute.
>>> class Student:
...
...     def __init__(self, name):
...         self.name = name
>>>
Here, we define an instantiation of a Student
to take a parameter, name
, which is assigned as an instance attribute. This way, we can instantiate different Students
, each with unique names.
>>> student_1 = Student("Ben")
>>> student_2 = Student("Chelsea")
>>> student_1.name
'Ben'
>>> student_2.name
'Chelsea'
If we were to implement name as a class attribute, every time we instantiated a new student, that student’s name would overwrite the single name shared by all existing instances of Student.
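To see why, here is a deliberately flawed sketch in which the name lives in the class namespace. BadStudent is a hypothetical variant written only to show the problem.

```python
class BadStudent:
    """Hypothetical, deliberately flawed variant of Student."""

    name = None  # class attribute shared by every instance

    def __init__(self, name):
        BadStudent.name = name  # overwrites the single shared value

student_1 = BadStudent("Ben")
name_before = student_1.name

student_2 = BadStudent("Chelsea")
name_after = student_1.name  # student_1 now reports the newest name
```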
Now, say that in addition to storing a student’s first name, you also wanted to track how many students exist in total. To achieve this, you’d need a single variable shared by every instance of a Student. In this case, you’d need a class attribute.
>>> class Student:
...
...     num_students = 0
...
...     def __init__(self, name):
...         self.name = name
...         Student.num_students += 1
>>>
Here, we define a class attribute num_students
that is incremented each time a new Student
is initialized. This way, we create a global variable that can easily keep count of the total number of students.
>>> student_1 = Student("Ben")
>>> Student.num_students
1
>>> student_2 = Student("Chelsea")
>>> Student.num_students
2
>>> student_3 = Student("Alexis")
>>> Student.num_students
3
An instance attribute would not be capable of this behavior.
NAMESPACES AND CLASS METHODS
Thus far we’ve worked solely with attribute variables. Class attributes can also be methods, however. I know I’ve been mixing the terms method and function quite a bit, so to clarify — a method is a function bound to a particular object.
For example, going back to our CustomClass
, we could define a public method print_custom_class_string
that prints the string
attribute of an instance of CustomClass
(note that I’ve updated the constructor to increment our count
class attribute):
>>> class CustomClass:
...
...     count = 0
...
...     def __init__(self):
...         self.string = "This is a custom class!"
...         CustomClass.count += 1
...
...     def print_custom_class_string(self):
...         print("String attribute:", self.string)
>>>
>>> cc = CustomClass()
>>> cc.print_custom_class_string()
String attribute: This is a custom class!
The method print_custom_class_string is an instance method: it receives an instance of CustomClass, self, and behaves according to that instance. For an instance method, this first parameter self is mandatory; calling the method on an instance will raise an error otherwise.
Now, what if we wanted to define a method that printed the class attribute count? We could do so with an instance method, but this isn’t necessary, since class attributes are not tied to instances. The more appropriate, or Pythonic, way to do so would be to define a class method.
>>> class CustomClass:
...
...     count = 0
...
...     def __init__(self):
...         self.string = "This is a custom class!"
...         CustomClass.count += 1
...
...     def print_custom_class_string(self):
...         print("String attribute:", self.string)
...
...     @classmethod
...     def print_custom_class_count(cls):
...         print("Custom Class Count:", cls.count)
...
>>>
>>> cc = CustomClass()
>>> CustomClass.print_custom_class_count()
Custom Class Count: 1
Note the new syntax: @classmethod. The @ character applies a decorator to the function defined directly beneath it. Decorators are callables that take a function and return a replacement for it; classmethod, well, it’s a class itself, a special built-in class that wraps methods. In wrapping print_custom_class_count as a class method, we bind the function to the class itself rather than to any instance.
Just like the class attribute count
, as a class method, the function print_custom_class_count
is not tied to any particular instance of a CustomClass
. Accordingly, it does not need to receive a self
instance — instead, it only needs to know the class type it is referring to. This is achieved by passing cls
as the first argument.
Just like self, this first argument referring to a class type is necessary for defining class methods. And just like self, the actual name cls is arbitrary. But just like self, the use of cls is a widely-used standard, so don’t stray from it.
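A common use of class methods beyond reading class attributes is the alternate constructor, where cls is called to build a new instance. The sketch below extends the Student example; from_full_name, describe, and the sample name are hypothetical additions for illustration.

```python
class Student:
    num_students = 0

    def __init__(self, name):
        self.name = name
        Student.num_students += 1

    @classmethod
    def from_full_name(cls, full_name):
        """Hypothetical alternate constructor: keep only the first name."""
        first_name = full_name.split()[0]
        return cls(first_name)  # cls is the class the method was called on

    @classmethod
    def describe(cls):
        return "Students so far: " + str(cls.num_students)

s = Student.from_full_name("Ben Example")

# A class method can be called on the class or on any of its instances;
# in both cases Python passes the class, not the instance, as cls.
via_class = Student.describe()
via_instance = s.describe()
```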
ONE MORE — STATIC METHODS
So, the learning onslaught continues. We’ve learned how to define attributes and methods at the instance-level namespace, which are specific to particular instances. We’ve also learned about class-level namespace attributes and methods, which are not tied to any instance of a class. But what if we wanted to define a method inside a class whose behavior has nothing to do with that class or its instances?
Such a method is a static method.
>>> class CustomClass:
...
...     count = 0
...
...     def __init__(self):
...         self.string = "This is a custom class!"
...
...     def print_custom_class_string(self):
...         print("String attribute:", self.string)
...
...     @classmethod
...     def print_custom_class_count(cls):
...         print("Custom Class Count:", cls.count)
...
...     @staticmethod
...     def print_message():
...         print("You are printing this from a CustomClass.")
...
>>>
>>> CustomClass.print_message()
You are printing this from a CustomClass.
While defined within and accessed through a class, the behavior of static methods is not tied whatsoever to a class. For instance, in the above example, the static method print_message
strictly prints a string literal. This string literal does not rely on class or instance attributes of CustomClass
; it merely prints a constant message. Thus, the function needs no arguments, and does not even need an instance of CustomClass
to exist to be called.
When would you want to implement a static method, you ask? I admit that in this example, the static method print_message
is a bit superfluous — we do not need to define a class to print a string literal. Yet, say you were defining a class Dog
. The class represents a dog with its name, age, and weight. In this class, you implement a public method dog_age_to_human_age
that prints a dog’s age in human years.
Converting a dog’s age to human years is no one-liner. To make this function as clean as possible, you could have the public method dog_age_to_human_age
call another, separate function that calculated the dog’s age in human years. This separate function would not rely on any instance of Dog
; rather, all it needs are the integers representing the dog’s age and weight. Such a function would be useful as a static method.
>>> class Dog:
...
...     def __init__(self, name, age, weight):
...         self.name = name
...         self.age = age
...         self.weight = weight
...
...     @staticmethod
...     def convert_age(age, weight):
...         # Lookup tables mapping a dog's age to human years, by size.
...         small_dogs = {1:15, 2:24, 3:28, 4:32, 5:36, 6:40, 7:44,
...                       8:48, 9:52, 10:56, 11:60, 12:64, 13:68,
...                       14:72, 15:76, 16:80}
...         med_dogs = {1:15, 2:24, 3:28, 4:32, 5:36, 6:42, 7:47,
...                     8:51, 9:56, 10:60, 11:65, 12:69, 13:74,
...                     14:78, 15:83, 16:87}
...         large_dogs = {1:15, 2:24, 3:28, 4:32, 5:36, 6:45, 7:50,
...                       8:55, 9:61, 10:66, 11:72, 12:77, 13:82,
...                       14:88, 15:93, 16:120}
...         # Pick the table based on the dog's weight.
...         if weight <= 20:
...             return small_dogs[age]
...         if weight <= 50:
...             return med_dogs[age]
...         return large_dogs[age]
...
...     def dog_age_to_human_age(self):
...         human_age = self.convert_age(self.age, self.weight)
...         print("Your dog {}'s age in human years is {}"
...               .format(self.name, human_age))
As you can see, static methods make this class much cleaner to implement!
When deciding whether to implement a function as an instance, class, or static method, think about the variables that the function works with. The functionality lines up one-to-one: if a method works with instance attributes, it needs to be implemented as an instance method; with class attributes, as a class method; and with neither, as a static method. Combine all three and you will be well on your way to a versatile custom class!
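As a compact illustration of that rule of thumb, here is a sketch of one class that uses all three kinds of method. The Thermometer class and its method names are hypothetical and unrelated to the examples above.

```python
class Thermometer:
    """Hypothetical class with one method of each kind."""

    readings_taken = 0  # class attribute

    def __init__(self, celsius):
        self.celsius = celsius            # instance attribute
        Thermometer.readings_taken += 1

    def fahrenheit(self):
        # Instance method: depends on this instance's data.
        return self.to_fahrenheit(self.celsius)

    @classmethod
    def total_readings(cls):
        # Class method: depends only on class-level data.
        return cls.readings_taken

    @staticmethod
    def to_fahrenheit(celsius):
        # Static method: depends on neither the class nor an instance.
        return celsius * 9 / 5 + 32

# The static method can be called before any instance exists.
boiling = Thermometer.to_fahrenheit(100)

t1 = Thermometer(0)
t2 = Thermometer(37)
```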
VERY BRIEFLY — INHERITANCE
By now you know I love partially misleading you. So what if I told you that an object isn’t an instance of just one class, but of several classes at once?
>>> x = 1
>>> isinstance(x, int)
True
>>> isinstance(x, object)
True
The built-in function isinstance() returns True if an object is an instance of a particular class, or of any class that inherits from it. As evidenced above, the number one is simultaneously an instance of the classes int and object. How could this be possible?
Remember how everything in Python is an object, and every object is an instance of a class? Well, a class can itself be built on top of another class, which can be built on top of another, which can be built on top of… I’ll stop here. An instance of a class at the bottom of such a chain also counts as an instance of every class above it. This concept is more than just valid logically; it is a key feature of Python classes that is referred to as inheritance.
A class can inherit from an existing class. In such a case, the inheriting class is referred to as a child, or subclass, while the class that was inherited from is referred to as the parent, or super-class. Recall that all Python objects in CPython are built on the struct PyObject. The same idea shows up at the Python level as inheritance: all built-in types are subclasses of the parent type, object.
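These relationships can be inspected directly with built-ins. A brief sketch, using only int, bool, str, and object:

```python
x = 1

# type() reports only the most specific class of an object...
most_specific = type(x)

# ...while isinstance() also accepts any class further up the chain.
checks = (isinstance(x, int), isinstance(x, object), isinstance(x, str))

# issubclass() asks the same question about classes rather than instances.
int_inherits_object = issubclass(int, object)
bool_inherits_int = issubclass(bool, int)

# __mro__ lists a class followed by the classes it inherits from, in the
# order Python searches them.
int_chain = int.__mro__
bool_chain = bool.__mro__
```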
Building on our working example, we can define a new subclass of CustomClass
, CustomSubClass
, using parentheses:
>>> class CustomSubClass(CustomClass):
...
...     def __init__(self):
...         CustomClass.__init__(self)
...         self.string2 = "This is a custom subclass!"
>>>
The subclass CustomSubClass
has access to all the attributes of CustomClass
as well as a new attribute exclusive to itself, string2
. Anything we can do with CustomClass
, we can do with CustomSubClass
, plus more.
>>> csc = CustomSubClass()
>>> csc.string2
'This is a custom subclass!'
>>> csc.print_custom_class_string()
String attribute: This is a custom class!
When accessing an attribute through an instance of a subclass, Python first checks the instance itself, and then the subclass. If the attribute is found there, it is used. Otherwise, Python proceeds to search for the attribute in the parent of the subclass, and so on up the chain.
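This is a simple one-level example, but classes can inherit at multiple levels, with a subclass serving as the parent of yet another subclass. Such an implementation is referred to as multilevel inheritance. A class can also inherit from several parents at once, which is referred to as multiple inheritance. Both are powerful tools for developing versatile object-oriented programs.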
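Here is a minimal sketch of attribute lookup across more than one level, assuming three hypothetical classes named Base, Middle, and Leaf. It also uses the built-in super() to extend a parent's method rather than replace it.

```python
class Base:
    def describe(self):
        return "base"

    def shared(self):
        return "defined on Base"

class Middle(Base):
    def describe(self):
        # Extend the parent's version rather than replacing it outright.
        return "middle > " + super().describe()

class Leaf(Middle):
    pass  # defines nothing of its own

leaf = Leaf()

# describe is found on Middle, the first class in the chain that defines it;
# shared is not found until the search reaches Base.
described = leaf.describe()
inherited = leaf.shared()

# The order in which Python searches the classes.
lookup_order = [cls.__name__ for cls in Leaf.__mro__]
```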
This wraps up my series on Python objects. After these four articles, I hope that you feel confident with a strong understanding of how Python uses objects to represent values. Objects, and more specifically, classes, are foundational not only to how Python works as an object-oriented language, but to how you can take full advantage of its versatility and power.
Of course, there is still much to learn when it comes to object-oriented programming in Python. This series was only intended to provide the background knowledge necessary for learning object-oriented programming, and this article was only intended to give a basic introduction to such object-oriented concepts as attributes, decorators, and inheritance.
Nonetheless, with your newfound knowledge of objects, you are now fully prepared to become an object-oriented Python master!
TL; DR:
- All Python objects are instances of classes, callable objects used to create the objects we’re familiar with.
- The information and methods defined with objects are referred to as attributes; these can be listed using the built-in dir() function.
- Instance attributes are owned by a particular instance of a class and are not shared with other instances. They are stored at the instance-level namespace in a given instance’s __dict__.
- Class attributes are shared by all instances of a given class: any change made to a class attribute through the class registers in all instances. They are stored in the class-level namespace __dict__.
- Methods can be defined as instance methods, or as class or static methods using decorators.
- Classes can inherit from other classes.