ADVANCED PYTHON PROGRAMMING

Death and Taxes

This time, we’ll see how objects can be used to manage contexts—and how exactly an object is born, lives, and dies.

Dan Gittik
11 min read · Apr 22, 2020


Last time, we talked about the awesome power of descriptors; now, getting back to objects, we still have a few behaviors to cover: namely, context management and creation. After that, I’ll add another article for the ones that got away—but in the meantime:

Bossy, but Sensitive

Let’s start with a background story for motivation, again. You probably did this in the past:
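Say, something like this, where data.txt stands in for whatever file you were reading:

    fp = open('data.txt')
    data = fp.read()
    fp.close()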

And that’s OK; the question is whether you did it like so:
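That is, with the cleanup tucked into a finally clause:

    fp = open('data.txt')
    try:
        data = fp.read()
    finally:
        fp.close()  # runs whether read returned or raised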

The reason being, fp occupies a certain resource—namely, a file descriptor, which keeps an open channel to that file—and should the read method fail and raise an exception, fp might never be properly closed and reclaimed, resulting in a resource leak. It’s even worse with multi-threading:
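A sketch, assuming lock is a threading.Lock, with do_critical_stuff standing in for whatever mustn’t run concurrently:

    import threading

    lock = threading.Lock()

    lock.acquire()
    do_critical_stuff()  # if this raises, release is never reached
    lock.release()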

The critical section is the part of multi-threaded code that has to be mutually exclusive, so threads don’t work on the same resource all at once; and if it happens to raise an exception, we’ll leave the system locked, with all the other threads that want to enter the same critical section waiting for a lock that we’ve acquired, but never released. It should’ve been:
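That is:

    lock.acquire()
    try:
        do_critical_stuff()
    finally:
        lock.release()  # released no matter how the critical section ended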

So that no matter what happens, we’d have at least released the lock. This try...finally shtick works, but it’s pretty verbose—and it’d be nicer if this setup and cleanup logic could be encapsulated in the object itself.

Enter context managers. Python has quite a few by default—activating them is only a question of syntax. In case of reading a file, you could do:
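Like so:

    with open('data.txt') as fp:
        data = fp.read()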

Which would open a file, execute the indented block, sometimes called the managed context—and close the file at the end, whether we left this context naturally or because an exception was raised. Similarly:
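Assuming the same threading.Lock as before:

    with lock:
        do_critical_stuff()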

This would acquire the lock, then enter the critical section, and finally release the lock, no matter how the critical section ended.

Do It Yourselves

It should come as no surprise that Python lets you define your own context managers, using the __enter__ and __exit__ methods; as you’d expect, the first happens when you enter the context, and the second when you exit it. Let’s take them for a spin:
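A toy example, with prints standing in for real setup and cleanup:

    class A:
        def __enter__(self):
            print('before')
        def __exit__(self, *args):
            print('after')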

Now we can do this:
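For instance, in a REPL:

    >>> with A():
    ...     print('during')
    ...
    before
    during
    after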

Or even…
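Raising from inside the context:

    >>> with A():
    ...     raise ValueError('oops')
    ...
    before
    after
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    ValueError: oops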

Note that the ValueError in fact originated between the before and the after—it was just propagated, leaving the context, and got caught and printed by the Python interpreter only after the context’s __exit__ had run.

Now, there are a few things you need to know about context managers: first, whatever __enter__ returns is assigned to the as <name> part of the with statement, if it’s present:
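For example, if __enter__ returns 1, that 1 is what lands in a:

    class A:
        def __enter__(self):
            return 1
        def __exit__(self, *args):
            pass

    >>> with A() as a:
    ...     print(a)
    ...
    1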

It’s customary to return self, so that your context manager can be initialized in the with statement. I mean, if you do this:
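Give the object some state, but keep __enter__ returning 1:

    class A:
        def __init__(self, x):
            self.x = x
        def __enter__(self):
            return 1
        def __exit__(self, *args):
            pass

    >>> with A(2) as a:
    ...     print(a)
    ...
    1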

You’re still going to get 1, not an A object—because the A object is created, its __enter__ is invoked, its return value is bound to a, and the actual object doesn’t have any name you could use. If, on the other hand, __enter__ returned self, that’d be OK:
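Like this:

    class A:
        def __init__(self, x):
            self.x = x
        def __enter__(self):
            return self
        def __exit__(self, *args):
            pass

    >>> with A(2) as a:
    ...     print(a.x)
    ...
    2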

Let’s talk about __exit__: so far I’ve just written it with the *args signature, but in fact it always receives exactly three arguments: the exception class, the exception object, and the traceback. If the context was left naturally, all three are None—but if an exception has occurred, this triplet describes it (albeit, a bit redundantly; the exception and traceback objects would’ve been enough). In some cases, you don’t care about these arguments—a file should be closed and a lock released whether an exception happened or not. But sometimes, like when you’re encapsulating a database transaction, you might want to commit it only if it’s been carried out fully—and roll back otherwise. Here’s some code, assuming a db object with a standard begin/commit/rollback functionality:
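A sketch of such a transaction manager, with db and its begin/commit/rollback methods assumed as stated:

    class Transaction:
        def __init__(self, db):
            self.db = db
        def __enter__(self):
            self.db.begin()
            return self
        def __exit__(self, exc_class, exc, traceback):
            if exc_class is None:
                self.db.commit()    # the context was left naturally
            else:
                self.db.rollback()  # an exception occurred inside it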

Another fun fact about __exit__: if its return value evaluates to True, any exception raised within the context is suppressed. Most __exit__s don’t return anything, which means they implicitly return None—a falsy value—so exceptions are propagated; but watch this:
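Something like:

    class A:
        def __enter__(self):
            pass
        def __exit__(self, *args):
            return True  # swallow whatever was raised inside the context

    >>> with A():
    ...     raise ValueError('oops')
    ...
    >>> # no traceback; the exception was suppressed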

This comes in handy sometimes—for example, you know how you sometimes have an infinite loop wrapped in a try...except, to exit when the user presses control+C? It’d look like this:
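With do_stuff standing in for the loop’s body:

    try:
        while True:
            do_stuff()
    except KeyboardInterrupt:
        pass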

Which is clunky. How about:
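Assuming a Suppress class, which we’ll define in a moment:

    with Suppress(KeyboardInterrupt):
        while True:
            do_stuff()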

The Suppress class is pretty straightforward; we can even support multiple exception classes with some star notation:
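One possible implementation:

    class Suppress:
        def __init__(self, *exceptions):
            self.exceptions = exceptions
        def __enter__(self):
            pass
        def __exit__(self, exc_class, exc, traceback):
            # truthy only if something was raised, and it's one of ours
            return exc_class is not None and issubclass(exc_class, self.exceptions)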

If the raised error is an instance of any of those exception classes, it’s suppressed—otherwise, it’s propagated:
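For example:

    >>> with Suppress(KeyError, IndexError):
    ...     raise KeyError('x')
    ...
    >>> with Suppress(KeyError, IndexError):
    ...     raise ValueError('oops')
    ...
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    ValueError: oops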

Oh, and by the way—Python provides exactly that as contextlib.suppress.

Generator Magic

Let’s write a context manager of our own! How about something for benchmarking, to measure how long a block of code takes?
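Here’s one possible take, timing the block with time.time (the names Benchmark, start and duration are my own):

    import time

    class Benchmark:
        def __enter__(self):
            self.start = time.time()
            return self
        def __exit__(self, *args):
            self.duration = time.time() - self.start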

It’s actually pretty handy:
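For example (the measured duration will vary, of course):

    >>> with Benchmark() as b:
    ...     time.sleep(1)
    ...
    >>> b.duration
    1.0010120868682861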

But it’s so cumbersome! Wish that we could just write a function:
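Something of this shape, if only it could pause in the middle:

    def benchmark():
        start = time.time()
        # ...somehow run the managed block here...
        print(time.time() - start)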

Except, how do we split a function in the middle, so that it runs a while, yields control, and then resumes? Generators, of course—
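A yield is exactly such a pause:

    def benchmark():
        start = time.time()
        yield
        print(time.time() - start)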

We’d then wrap it up in a class, which would execute its first part on __enter__ and its second part on __exit__:
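A minimal wrapper (the name ContextManager is my own):

    class ContextManager:
        def __init__(self, generator):
            self.generator = generator
        def __enter__(self):
            next(self.generator)      # run the first part, up to the yield
        def __exit__(self, *args):
            try:
                next(self.generator)  # resume and run the second part
            except StopIteration:
                pass                  # the generator is expected to end here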

Let’s see that it works:
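In a REPL (again, the exact number will vary):

    >>> with ContextManager(benchmark()):
    ...     time.sleep(1)
    ...
    1.0010120868682861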

However, let’s make it even better—first, let’s turn it into a decorator we can apply to our generators, promoting them to context managers instantly; second, let’s make it so whatever’s yielded from the generator is bound to the as name; and third, let’s make it so that if an exception was raised, the generator is resumed by rethrowing that exception inside it! This way, we can truly and wholly encapsulate the context manager experience in a simple function (well, a decorated generator):
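Putting all three improvements together; this is a simplified take on what contextlib does, and it glosses over some edge cases:

    import functools

    class ContextManager:
        def __init__(self, generator):
            self.generator = generator
        def __enter__(self):
            # run up to the yield; whatever's yielded is bound to the "as" name
            return next(self.generator)
        def __exit__(self, exc_class, exc, traceback):
            if exc_class is None:
                try:
                    next(self.generator)   # resume the second part
                except StopIteration:
                    pass
                return False
            try:
                self.generator.throw(exc)  # rethrow inside the generator
            except StopIteration:
                return True                # the generator handled the exception
            return False                   # it yielded again; propagate anyway

    def contextmanager(function):
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            return ContextManager(function(*args, **kwargs))
        return wrapper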

Now we can do this:
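For instance:

    @contextmanager
    def benchmark():
        start = time.time()
        try:
            yield
        finally:
            print(time.time() - start)

    >>> with benchmark():
    ...     time.sleep(1)
    ...
    1.0010120868682861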

And as it happens, Python provides this, too, by default—just import contextlib and use its contextmanager decorator on your generators.

The Birds and the Bees

One last thing we have to discuss about objects is how they come into this world. Some people immediately think of __init__, and I’ve occasionally called it “the constructor” myself; but in actuality, as its name indicates, it’s an initializer—by the time it’s invoked, the object already exists, seeing as it’s passed in as self. The real constructor is a far less famous function: __new__.

The reason you might never have heard of it, or used it, is that allocation doesn’t mean that much in Python, which manages memory on its own. So if you do override __new__, it’d be just like your __init__—except you’ll have to call into Python to actually create the object, and then return that object at the end:
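Roughly:

    class A:
        def __new__(cls, x):
            self = super().__new__(cls)  # delegate the allocation to object
            self.x = x
            return self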

It works, but it’s pretty weird: we have to call super, which delegates the actual allocation to object—which kinda is the only one that can actually do it; then we have to do __init__’s usual assignment; and then we have to return it. Add to this the fact that __new__ is an implicit class method (so you don’t have to decorate it, but do have to pass cls to super’s __new__), and there’s really no reason to do this instead of just:
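That is:

    class A:
        def __init__(self, x):
            self.x = x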

Unless, of course, you deliberately want to interfere with the object allocation. I mean, you could do this:
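One possibility; since the returned value isn’t an instance of A, it’s handed back as-is:

    class A:
        def __new__(cls):
            return 'surprise!'  # not an A at all

    >>> a = A()
    >>> a
    'surprise!'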

That would be great for trolling; but seriously now, sometimes it is useful. Let’s say we want to minimize our memory footprint, by reusing similar objects—especially if they’re immutable. Take a look:
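That class looked more or less like this:

    class A:
        def __init__(self, x):
            self.x = x
        def __eq__(self, other):
            return isinstance(other, A) and self.x == other.x

    >>> a1, a2 = A(1), A(1)
    >>> a1 == a2
    True
    >>> a1 is a2  # still two separate objects
    False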

This is the simple A class, from when we just started talking about objects. Its whole existence, its entire state, are defined by a single x; and even though we’ve implemented the equality operator, and our two instances are practically identical—they’re still two separate objects, taking twice as much memory as necessary. What if we cache these objects by their x, and if such an object is already allocated—reuse it instead?
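A sketch, with a class-level _cache dictionary keyed by x:

    class A:
        _cache = {}
        def __new__(cls, x):
            if x not in cls._cache:
                self = super().__new__(cls)
                self.x = x
                cls._cache[x] = self  # pool it for the next instantiation
            return cls._cache[x]

    >>> A(1) is A(1)
    True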

Pretty cool, no? Incidentally, we can drop the __eq__, since every object is equal to itself. This practice is sometimes called an object pool, because we effectively create a pool of objects that we keep reusing again and again. It does come with a few gotchas, so let’s have another Q&A session.

1

Q: Wouldn’t _cache grow indefinitely, preventing objects from being garbage collected and causing a memory leak?

A: Why yes, yes it would. The right way to do it would be to use weakref’s WeakValueDictionary, which is like a dictionary, only it holds weak references to its values—references that don’t increase the reference count used by the garbage collector; and when these values do get garbage collected, they simply disappear from the dictionary:
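The same pool, with the cache swapped for a WeakValueDictionary:

    import weakref

    class A:
        _cache = weakref.WeakValueDictionary()
        def __new__(cls, x):
            if x not in cls._cache:
                self = super().__new__(cls)
                self.x = x
                cls._cache[x] = self  # gone once its last strong reference dies
            return cls._cache[x]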

2

Q: Wouldn’t it be better to use __new__ for pooling, but separate the initialization logic into __init__?

A: In theory, that would be nice. However, the way Python works is that when you invoke a class—that is, upon instantiation—it first calls __new__, and then, no matter what __new__ returns, as long as it’s an instance of that class, it’s passed to __init__. So if we pool objects, but still do something in __init__, this logic will be re-executed on the object every time it’s fetched from that pool (while the __new__ logic that put it in that pool only happens once). See for yourself—imagine we’d try implementing a singleton, which is a class that only has one instance:
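A sketch of such a singleton, with an arbitrary bit of state in __init__:

    class Singleton:
        _instance = None
        def __new__(cls):
            if cls._instance is None:
                cls._instance = super().__new__(cls)
            return cls._instance
        def __init__(self):
            self.x = 0  # runs on every instantiation, not just the first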

As far as its singularity goes, this singleton works:
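In a REPL:

    >>> s1 = Singleton()
    >>> s2 = Singleton()
    >>> s1 is s2
    True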

However, __init__ gets rerun on this poor object time and again, potentially resetting its state:
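Continuing the session:

    >>> s = Singleton()
    >>> s.x = 42
    >>> s2 = Singleton()  # the same object is returned...
    >>> s.x               # ...but __init__ has just reset it
    0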

So my advice: stick to either __init__ (in most cases), or __new__ (when you really need it); but don’t mix both.

Death

Every object has its day; although in Python, destructors are not nearly as popular as in other languages. I mean, you have the __del__ method:
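For example:

    class A:
        def __del__(self):
            print('😵')

    >>> a = A()
    >>> del a
    😵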

But it’s not very reliable—primarily because it’s Python’s job to manage memory, and it makes no guarantees as to when exactly an object will be destroyed and its destructor called. The fact that it works in our previous demo means very little; in other cases, it might’ve happened a minute or an hour later, which is not ideal, especially if you’re one of those people who like their code to be deterministic. Instead, I suggest using context managers to deal with setup and teardown logics; and if you really, really have to, do this:
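That is, register a callback with weakref.finalize:

    >>> import weakref
    >>> class A:
    ...     pass
    ...
    >>> a = A()
    >>> weakref.finalize(a, print, '😵')
    <finalize object at 0x...; for 'A' at 0x...>
    >>> del a
    😵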

What this does is keep track of when exactly the object is garbage collected, and then invoke the print function on the argument ‘😵’. More generally, its signature is (obj, func, *args, **kwargs), and it’s pretty similar to the __del__ method; except it’s more explicit about this event happening asynchronously, some time in the future. Passing a callback feels like less of a commitment than defining some actual behavior; but in any case, use context managers instead.

Conclusion

Objects are infinitely versatile—we can even have them manage contexts of nested code, and customize the way they’re created. They still have a few behaviors we haven’t covered, either because they’re pretty minor, or because I couldn’t figure out how to fit them organically into our narrative; but in any case, we’ll have a chapter for miscellanea later—next time, I want to keep this positive energy going and escalate to classes and metaclasses.

The Advanced Python Programming series includes the following articles:

  1. A Value by Any Other Name
  2. To Be, or Not to Be
  3. Loopin’ Around
  4. Functions at Last
  5. To Functions, and Beyond!
  6. Function Internals 1
  7. Function Internals 2
  8. Next Generation
  9. Objects — Objects Everywhere
  10. Objects Incarnate
  11. Meddling with Primal Forces
  12. Descriptors Aplenty
  13. Death and Taxes
  14. Metaphysics
  15. The Ones that Got Away
  16. International Trade


Dan Gittik

Lecturer at Tel Aviv university. Having worked in Military Intelligence, Google and Magic Leap, I’m passionate about the intersection of theory and practice.