ADVANCED PYTHON PROGRAMMING

Objects Incarnate

This time, we go over many more behaviors that objects have to offer, and see just how powerful Python can get.

Dan Gittik
13 min read · Apr 20, 2020

--

Last time, we saw that objects comprise an ID, a type and a value—and that the type is by far the most interesting. We’ve covered display, equality and comparison, so it’s time for some more exciting tricks.

Truth or Dare

Any object can have a boolean value—its state being either “on” or “off”. To make this value easily accessible—for example, to use the object as-is in an if statement—you’d have to implement the __bool__ method (unfortunately named __nonzero__ in Python 2):
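A minimal sketch (the Switch class is my own invention):

```python
class Switch:
    def __init__(self, on=False):
        self.on = on
    def __bool__(self):  # __nonzero__ in Python 2
        return self.on

s = Switch()
if not s:
    print('off')   # the if statement consults __bool__, so this prints
s.on = True
print(bool(s))     # True
```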

This is important not only because it’s neat, but because of an important design guideline, called the Principle of Least Astonishment. The idea behind it is that users expect their tools, which includes your code, to behave in a predictable way; and while meeting those expectations is good design, surprising the user is frowned upon. Consider this first-in, first-out (FIFO) queue:
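A sketch of such a queue:

```python
class Queue:
    def __init__(self):
        self.items = []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        return self.items.pop(0)  # pop from the front: first in, first out
```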

The queue actually uses a list (which is a last-in, first-out [LIFO] stack), but pops items off of its other end, making it handy in some situations. But then:
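A sketch of what goes wrong (restating the queue so the snippet runs on its own):

```python
class Queue:
    def __init__(self):
        self.items = []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        return self.items.pop(0)

q = Queue()
print(bool(q))  # True, even though the queue is empty!
if q:           # so this branch is taken...
    try:
        q.pop()
    except IndexError:
        print('oops')  # ...and reading from the empty queue blows up
```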

By default, every object evaluates to True—and since we didn’t tell it any different, Python assumed that this is the case for our object, too. But containers such as lists, dictionaries, or, indeed, our Queue, are expected to evaluate to False when they’re empty, which led the user to make the mistake of reading from an empty queue. If only we’d…
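…given it a __bool__ that reflects emptiness, like so:

```python
class Queue:
    def __init__(self):
        self.items = []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        return self.items.pop(0)
    def __bool__(self):
        return bool(self.items)  # an empty queue is False, like a list
```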

Then the user would be least astonished:
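With a __bool__ that reflects emptiness (restated in full):

```python
class Queue:
    def __init__(self):
        self.items = []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        return self.items.pop(0)
    def __bool__(self):
        return bool(self.items)

q = Queue()
print(bool(q))  # False
if q:
    q.pop()     # safely skipped while the queue is empty
```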

Scott Meyers has a brilliant talk on the Most Important Principle of Design, where he phrases it thusly: “make interfaces that are easy to use correctly, and hard to use incorrectly”. While the first part is pretty straightforward—I mean, of course anyone would try to make their interface nice and easy—the second one is very thought-provoking. Scott elaborates, and makes the revolutionary statement that users are not stupid: if they’re trying to get your code to work, they’re probably (somewhat) smart or capable, (somewhat) motivated, and are willing to read (some) documentation. Nobody goes to work thinking, “today, I’m going to do a terrible job”—so if they still mess it up, it’s your fault as much as theirs.

Don’t Call Us; We’ll Call You

Then there are callables—functions, first and foremost, but any object can be invoked if we so desire:
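For example, a hypothetical Greeter (the names are mine):

```python
class Greeter:
    def __init__(self, name):
        self.name = name
    def __call__(self, greeting='Hello'):  # makes instances invocable
        return f'{greeting}, {self.name}!'

greet = Greeter('world')
print(greet())         # Hello, world!
print(greet('Howdy'))  # Howdy, world!
```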

There’s not much to say about callable objects in terms of their callability—all the tricks we learnt for functions still apply. However, they’re particularly interesting when combined with decorators—and this time, we’ll start with 2nd order decorators first. Remember that brain spasm, when you first saw this:
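A 2nd order decorator: a function, nested in a function, nested in a function (a sketch, with names of my choosing):

```python
def multiply(n):                 # 1st invocation: @multiply(2)
    def decorator(f):            # 2nd invocation: decorating inc
        def wrapper(*args, **kwargs):
            return n * f(*args, **kwargs)
        return wrapper
    return decorator

@multiply(2)
def inc(x):
    return x + 1

print(inc(1))  # 4: (1 + 1) * 2
```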

Functions can get pretty difficult to read with all that nesting. Luckily, classes are much better at managing state while remaining flat, and they have two invocation points: their constructor (__init__), and their call (__call__):
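A class-based reconstruction of the same decorator:

```python
class Multiply:
    def __init__(self, n):   # invoked first: Multiply(2)
        self.n = n
    def __call__(self, f):   # invoked second, on the decorated function
        def wrapper(*args, **kwargs):
            return self.n * f(*args, **kwargs)
        return wrapper

@Multiply(2)
def inc(x):
    return x + 1

print(inc(1))  # 4
```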

This shouldn’t come as a surprise: remember the decoration line is an expression, so we might as well have defined double = Multiply(2) to be a callable object, and then invoked it to decorate inc.

Switching Babies

More interesting are 1st order decorators. In the previous case, our first invocation (Multiply(2)) constructed our decorator, which was then invoked on the function (__call__(f)) to produce a wrapper. In this case, our first invocation is going to be the decorating—so __init__ would have to accept a single argument, f. As you remember, the way decorators work:
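That is, a decoration like this (deco here is a generic placeholder):

```python
def deco(f):
    return f  # a do-nothing decorator, for illustration

@deco
def f():
    return 1
```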

Is actually syntactic sugar for:
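With the same placeholder deco:

```python
def deco(f):
    return f

def f():
    return 1
f = deco(f)  # exactly what the @deco line does behind the scenes
```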

So if we decorate something with a class, it’d invoke its constructor on it, and replace it with a new instance. That instance better be callable and delegate stuff to the original f—that’s what decorators are for, after all:
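A sketch of such a class, with a name of my choosing:

```python
class Trace:
    def __init__(self, f):                # the decoration: inc = Trace(inc)
        self.f = f
    def __call__(self, *args, **kwargs):  # delegation to the original f
        print(f'calling {self.f.__name__}{args}')
        return self.f(*args, **kwargs)

@Trace
def inc(x):
    return x + 1

print(inc(1))  # prints "calling inc(1,)", then 2
```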

That looks like way more work than a standard 1st order decorator—and it is. But in replacing the function with a callable object, whose __call__ is effectively what we previously called wrapper, we’ve gained all the other benefits of objects: namely, attributes and methods. Remember the memoization decorator we developed for speeding up Fibonacci? How about:
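How about something along these lines (a sketch of such a Memoized class):

```python
class Memoized:
    def __init__(self, f):
        self.f = f
        self.cache = {}
    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.f(*args)
        return self.cache[args]

@Memoized
def fib(n):
    return n if n <= 1 else fib(n - 1) + fib(n - 2)
```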

This way, not only do we get memoization:

But we also get access to the cache, and can access it and clear it:

All that is possible because fib is not actually a function—it’s a Memoized object, whose f points to the original function, which does all the work when called. See for yourself:
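Putting it together (assuming a Memoized class with an f attribute and a cache dictionary, as described above):

```python
class Memoized:
    def __init__(self, f):
        self.f = f
        self.cache = {}
    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.f(*args)
        return self.cache[args]

@Memoized
def fib(n):
    return n if n <= 1 else fib(n - 1) + fib(n - 2)

print(fib(100))        # instantaneous, thanks to the cache
print(len(fib.cache))  # the cache is just an attribute...
fib.cache.clear()      # ...so we can clear it, too
print(type(fib))       # a Memoized instance, not a function!
```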

As a side note, such decorator classes should still use functools.wraps to imitate the original function (e.g. by preserving its __name__ and __doc__). It looks a bit weird, but it works:
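One way to do it (a sketch):

```python
import functools

class Memoized:
    def __init__(self, f):
        functools.wraps(f)(self)  # copies __name__, __doc__, etc. onto self
        self.f = f
        self.cache = {}
    def __call__(self, *args):
        if args not in self.cache:
            self.cache[args] = self.f(*args)
        return self.cache[args]

@Memoized
def fib(n):
    '''The n-th Fibonacci number.'''
    return n if n <= 1 else fib(n - 1) + fib(n - 2)

print(fib.__name__)  # fib
print(fib.__doc__)   # The n-th Fibonacci number.
```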

Arithmetics

OK, this part is boring—so I’m going to breeze through it. Let’s jump straight to an example:
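A sketch in the spirit of the example discussed below (the class A is a stand-in):

```python
class A:
    def __init__(self, x):
        self.x = x
    def __repr__(self):
        return f'<{self.__class__.__name__} x={self.x}>'
    def __add__(self, other):
        if not isinstance(other, A):
            return NotImplemented  # let Python try the other side
        return self.__class__(self.x + other.x)

print(A(1) + A(2))  # <A x=3>
```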

Note that we’ve applied all our previous lessons: the __repr__ uses the dynamically resolved __class__, as does the __add__ when creating a new instance for the sum; and if the other argument is not an A, we just admit that we don’t know by returning NotImplemented. There are quite a few similar operators:
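As a rough cheat sheet (not exhaustive), demonstrated on plain ints:

```python
# Each operator is just sugar for a magic method call:
assert (5).__add__(2)      == 5 + 2    # addition
assert (5).__sub__(2)      == 5 - 2    # subtraction
assert (5).__mul__(2)      == 5 * 2    # multiplication
assert (5).__truediv__(2)  == 5 / 2    # true division
assert (5).__floordiv__(2) == 5 // 2   # floor division
assert (5).__mod__(2)      == 5 % 2    # modulo
assert (5).__pow__(2)      == 5 ** 2   # exponentiation
assert (5).__and__(2)      == 5 & 2    # bitwise and (not the and keyword!)
assert (5).__or__(2)       == 5 | 2    # bitwise or (not the or keyword!)
assert (5).__xor__(2)      == 5 ^ 2    # bitwise xor
assert (5).__lshift__(2)   == 5 << 2   # left shift
assert (5).__neg__()       == -5       # unary negation
assert (5).__abs__()       == abs(5)   # absolute value
```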

Some things to note:

  • There’s no __div__; there used to be, in Python 2, but it proved too confusing. In Python 3, there’s only __truediv__, which does true division, like 5 / 2 == 2.5, and __floordiv__, which does integer division (rounding down), like 5 // 2 == 2.
  • There’s an operator for matrix multiplication, @, backed by __matmul__. It’s the new kid on the block, and it’s kinda weird, but it’s there if you want the syntax.
  • Some operators don’t have “infix notation” with a special symbol; they rather define the behavior of built-in functions like divmod and pow.
  • Specifically, pow can take up to three arguments—the third one being a modulus: pow(x, y, z) computes (x ** y) % z, much more efficiently than doing it in two steps. There’s no way to convey it with the infix x ** y, but your signature should support it nevertheless.
  • The __and__ and __or__ methods don’t actually correspond to the keywords and and or, but to the bitwise operations & and |.
  • Some operators are binary, working on both self and other, while others are unary, working only on self. The obvious ones are inversion (~x, via __invert__) and negation (-x, via __neg__), but emphasizing that a number is positive (+x, via __pos__) or computing its absolute value with the built-in function abs (via __abs__) are also a thing.

Then, there are the r-operators. You know how our proper __add__ implementation back there returned NotImplemented for unfamiliar types? In that case, Python will go ahead and ask the other party—but what method of the other party is it supposed to call? Addition may be commutative, meaning x + y and y + x are the same; but subtraction isn’t: x - y and y - x are usually quite different. The answer is, for __add__, Python will look for an __radd__, and for __sub__, it’d look for __rsub__. Again:
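A sketch, with prints to show who gets called:

```python
class A:
    def __init__(self, x):
        self.x = x
    def __add__(self, other):       # self + other
        print('A.__add__')
        return A(self.x + other)
    def __radd__(self, other):      # other + self, when other gives up
        print('A.__radd__')
        return A(other + self.x)

a = A(1)
print((a + 2).x)  # A.__add__, then 3
print((2 + a).x)  # int.__add__ returns NotImplemented, so A.__radd__: 3
```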

Then, there are the i-operators: +=, -= and the like. For immutable objects, don’t bother—it’s the same:

That’s the reason these operators weren’t mentioned when we were discussing scopes: they have nothing to do with assignment, binding names to values or resolving them; they’re just syntactic sugar for yet another kind of operation. Oh, and the i stands for “in-place”, and that’s how they should take effect—mutating their own instance, and then returning it (usually just self; if __iadd__ returned nothing, x += y would rebind x to None).
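For mutable objects, a sketch of an in-place operator (Vector is a hypothetical example):

```python
class Vector:
    def __init__(self, values):
        self.values = values
    def __iadd__(self, other):    # v += other
        self.values.extend(other)
        return self               # mutate in place, then return self

v = Vector([1, 2])
before = id(v)
v += [3]
print(id(v) == before)  # True: same object, mutated in place
print(v.values)         # [1, 2, 3]
```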

There are a few more arithmetic operators I’ll cover for completeness sake:

  • To determine how your object behaves when it is cast to integer, float or complex, implement __int__, __float__ and __complex__.
  • To determine how your object behaves when it is rounded, implement __round__ (for round(x)), __floor__ (for math.floor(x)), __ceil__ (for math.ceil(x)) and __trunc__ (for math.trunc(x)).
    If you were wondering, like me, what’s the difference between floor and trunc—they both return integers, but floor rounds toward negative infinity, while trunc rounds toward zero; so math.floor(-1.5) == -2, but math.trunc(-1.5) == -1. Hooray for time well spent!
  • Last and definitely least, __index__ determines how your object behaves if it’s being used to index a list, like so: items[x]. This is a crazy level of detail—and that’s how much Python enables you and empowers you to build incredible things on top of it.
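For instance, the floor/trunc difference, and __index__ in action (the First class is a toy of my own):

```python
import math

assert math.floor(-1.5) == -2  # rounds toward negative infinity
assert math.trunc(-1.5) == -1  # rounds toward zero

class First:
    '''An object that always indexes the first element.'''
    def __index__(self):
        return 0

items = ['a', 'b', 'c']
print(items[First()])  # a
```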

Containing Oneself

This wasn’t easy—but it’s all part of Python’s data model. On to more exciting things: let’s talk about containers. You can do anything a list or a dictionary can:
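A sketch of a dict-like object (the Record name is mine):

```python
class Record:
    def __init__(self):
        self.data = {}
    def __getitem__(self, key):         # r[key]
        return self.data[key]
    def __setitem__(self, key, value):  # r[key] = value
        self.data[key] = value
    def __delitem__(self, key):         # del r[key]
        del self.data[key]

r = Record()
r['x'] = 1
print(r['x'])  # 1
del r['x']
```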

This is especially interesting if you work with slices: you know, that funny notation of start:stop, or even start:stop:step. Let’s play with it:
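A class that just echoes whatever it’s indexed with makes this easy to see:

```python
class Echo:
    def __getitem__(self, key):
        return key

e = Echo()
print(e[1:5])    # slice(1, 5, None)
print(e[1:5:2])  # slice(1, 5, 2)
print(e[::2])    # slice(None, None, 2)
```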

These slices are built-in objects, which have start, stop and step attributes—it’s up to you to decide what they mean in your context. Moreover, Python indexing also supports tuples:

And tuples of slices:

And the weird Ellipsis object, whose literal, ..., is actually valid Python syntax:
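An object that echoes its key shows what Python passes in, for all three cases:

```python
class Echo:
    def __getitem__(self, key):
        return key

e = Echo()
print(e[1, 2])       # (1, 2): a plain tuple
print(e[1:2, 3:4])   # (slice(1, 2, None), slice(3, 4, None))
print(e[..., 0])     # (Ellipsis, 0)
print(e[...] is ...) # True
```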

This is pretty extreme—but allows for all sorts of smart n-dimensional indexing when dealing with data science, machine learning and the like. In fact, if you’ve ever worked with numpy or something similar, you’re probably painfully familiar with this notation.

No container is complete without you being able to query its length, and whether it contains some item. Luckily, this is pretty easy:
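A sketch (the Bag name is mine):

```python
class Bag:
    def __init__(self, items):
        self.items = list(items)
    def __len__(self):            # len(bag)
        return len(self.items)
    def __contains__(self, item): # item in bag
        return item in self.items

b = Bag([1, 2, 3])
print(len(b))  # 3
print(2 in b)  # True
```

As a bonus, once __len__ is defined, an empty Bag is falsy: when __bool__ is missing, Python falls back on __len__, which ties right back to the Principle of Least Astonishment.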

And then there’s iteration—but before we get there, let’s pause for a moment to see how our newfound powers can be used for more than a pedagogical anecdote.

Making More Pandas

Pandas is an insanely popular package for data analysis, and it works primarily with Data Frames—very versatile objects that let you express complex filtering and batch operations easily. Here’s an example:
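Something along these lines (assuming pandas is installed):

```python
import pandas as pd

df = pd.DataFrame({'x': range(5), 'y': [i ** 2 for i in range(5)]})
print(df)
#    x   y
# 0  0   0
# 1  1   1
# 2  2   4
# 3  3   9
# 4  4  16
```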

df now represents a table with two columns, x and y, and 5 rows, where the i-th row x is set to i, and y to i squared. Now check that out:
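The filtering looks like this:

```python
import pandas as pd

df = pd.DataFrame({'x': range(5), 'y': [i ** 2 for i in range(5)]})
print(df[df['y'] > 5])
#    x   y
# 3  3   9
# 4  4  16
```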

In one fell swoop, we kept only the rows whose y column is greater than 5—that is, the last two rows. Like Richard Feynman said, “What I cannot create, I do not understand”—so let’s implement it ourselves. First, let’s have a data frame class:
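A sketch, storing the data as a list of row dictionaries (the DataFrame name is a stand-in for pandas’ real thing):

```python
class DataFrame:
    def __init__(self, data):
        self.data = data  # a list of row dictionaries

df = DataFrame([{'x': i, 'y': i ** 2} for i in range(5)])
```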

Now, obviously, this object should support indexing—but for two different scenarios. In the first scenario, like in df['y'], it needs to produce an object representing a filter, which can be narrowed down by e.g. filter > 5; in the second, like in df[filter], it needs to apply it, and return only the rows that match. Let’s begin:
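A sketch of the scaffolding for the first scenario:

```python
class DataFrame:
    def __init__(self, data):
        self.data = data  # a list of row dictionaries
    def __getitem__(self, key):
        return Filter(self, key)  # the first scenario: df['y']

class Filter:
    def __init__(self, df, key):
        self.df = df
        self.key = key
```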

What this filter is going to do is, upon comparison, return a list of booleans, indicating for each row whether for that key, its value was indeed greater or not.

So far, so good:
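Here’s a self-contained sketch, with __gt__ doing the comparison:

```python
class DataFrame:
    def __init__(self, data):
        self.data = data
    def __getitem__(self, key):
        return Filter(self, key)

class Filter:
    def __init__(self, df, key):
        self.df = df
        self.key = key
    def __gt__(self, other):  # filter > 5
        return [row[self.key] > other for row in self.df.data]

df = DataFrame([{'x': i, 'y': i ** 2} for i in range(5)])
print(df['y'] > 5)  # [False, False, False, True, True]
```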

Now to the second scenario—when a data frame object gets a list of booleans, it should apply this filter by returning only those rows:
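The full sketch, with both scenarios handled in __getitem__:

```python
class DataFrame:
    def __init__(self, data):
        self.data = data  # a list of row dictionaries
    def __getitem__(self, key):
        if isinstance(key, str):
            return Filter(self, key)  # scenario 1: df['y']
        rows = [row for row, keep in zip(self.data, key) if keep]
        return DataFrame(rows)        # scenario 2: df[filter]

class Filter:
    def __init__(self, df, key):
        self.df = df
        self.key = key
    def __gt__(self, other):
        return [row[self.key] > other for row in self.df.data]

df = DataFrame([{'x': i, 'y': i ** 2} for i in range(5)])
print(df[df['y'] > 5].data)  # [{'x': 3, 'y': 9}, {'x': 4, 'y': 16}]
```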

(Or, you can use the standard itertools.compress(self.data, key). I don’t know why I know that. Anyway—)

We did it! Of course, this is not exactly how Pandas works, but you get the point: Python really does support, with all its heart, building whatever you want on top of it.

The Last Iteration

Just one more magic method for now: __iter__. This is the method invoked by the built-in iter function, which is invoked by the for loop, and it should return an object that conforms to the iterator protocol.

This is a bit confusing, so follow closely—the object we’re iterating over is called iterable; and what its __iter__ should return is called an iterator, which has a __next__ method (ha! I tricked you), which is called repeatedly to produce results, until it raises a StopIteration. Historically, that’s how iteration was done: two separate classes, with the iterator one nested inside more often than not:
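A reconstruction of that two-class pattern, matching the description below:

```python
class A:
    def __init__(self, x):
        self.x = x
    def __iter__(self):
        return A.Iterator(self)

    class Iterator:
        def __init__(self, a):
            self.a = a
            self.i = 0
        def __next__(self):
            self.i += 1
            if self.i > self.a.x:
                raise StopIteration()
            return self.i

print(list(A(3)))  # [1, 2, 3]
```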

In this case, iterating over an instance of A produces all the integers from 1 to that instance’s x. It does so by returning an Iterator whose i is 0 at first; but every time its __next__ is called, that i is incremented and returned, until it exceeds that x and starts raising StopIteration instead.

That’s pretty simple—and incredibly tedious. You know what else responds to next, and passes through iter unscathed?

That’s right—generators! If __iter__ has a yield statement in it, calling it automatically returns a generator, resuming whenever next is called on it, until it can go no further and gasps out a StopIteration. Cleaner, no?
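The same class, with a generator doing all the heavy lifting:

```python
class A:
    def __init__(self, x):
        self.x = x
    def __iter__(self):  # a yield makes this return a generator
        for i in range(1, self.x + 1):
            yield i

print(list(A(3)))  # [1, 2, 3]
```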

Conclusion

This was a long and arduous journey, but we’re now approaching the gates of Mordor: attribute access, method resolution, descriptors, properties and all that jazz. The next topic is one that, at least in my eyes, truly sets novice and advanced Python programmers apart—so buckle up!

The Advanced Python Programming series includes the following articles:

  1. A Value by Any Other Name
  2. To Be, or Not to Be
  3. Loopin’ Around
  4. Functions at Last
  5. To Functions, and Beyond!
  6. Function Internals 1
  7. Function Internals 2
  8. Next Generation
  9. Objects — Objects Everywhere
  10. Objects Incarnate
  11. Meddling with Primal Forces
  12. Descriptors Aplenty
  13. Death and Taxes
  14. Metaphysics
  15. The Ones that Got Away
  16. International Trade


Dan Gittik

Lecturer at Tel Aviv university. Having worked in Military Intelligence, Google and Magic Leap, I’m passionate about the intersection of theory and practice.