Sometimes you spend enough time with a tool that you think you know it well, until something smacks you out of your arrogant stupidity back into the realm of humility. A couple of days ago, someone approached me with a grin, showing me a piece of code that looked like this (simplified):
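The snippet itself isn't shown here, so this is a minimal reconstruction from the description, assuming Python 3. The class name `Foo` and the attribute names `x1` to `x3` follow the discussion; the loop range is a guess.

```python
class Foo:
    # A plain for loop, sitting directly in the class body.
    for i in range(1, 4):
        # exec() builds and runs "x1 = 1", "x2 = 2", "x3 = 3" as statements.
        exec("x%d = %d" % (i, i))
```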
After I got over the exec(), the actual weird thing here dawned on me. You can put a for loop inside a class body? What the hell?
So I loaded up ipython and started exploring:
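Roughly what that exploration might look like as a plain script rather than an ipython session. The class is re-created as described so the snippet runs on its own, and `public_names` is just a name picked for this sketch.

```python
class Foo:
    for i in range(1, 4):
        exec("x%d = %d" % (i, i))

# Keep only the names we created, hiding dunder entries such as __module__.
public_names = sorted(name for name in vars(Foo) if not name.startswith("__"))
print(public_names)   # ['i', 'x1', 'x2', 'x3']
print(Foo.x1, Foo.i)  # 1 3
```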
OK, interesting. “x1=1” really resulted in something like “Foo.x1 = 1”. (Also “i”, the loop variable, is in there — a relatively well known quirk). So it seems like any name definitions in this zone automatically get attached to the class, just like a regular class variable definition would. Wait a second — so are method and attribute declarations within a class not special syntax?
The Python language reference has this to say: “A class definition is an executable statement.”
Which sounds normal and benign enough on its own until you realize what this really entails. Further down, the reference continues:
The class’s suite is then executed in a new execution frame (see Naming and binding), using a newly created local namespace and the original global namespace. (Usually, the suite contains mostly function definitions.)
Yeah, usually. *evil chortle*
When the class’s suite finishes execution, its execution frame is discarded but its local namespace is saved. A class object is then created using the inheritance list for the base classes and the saved local namespace for the attribute dictionary. The class name is bound to this class object in the original local namespace.
So yes, really, class bodies aren’t a syntactically specialized thing that can only have variable and method declarations. In comparison to other languages, for example Java, this certainly is out of the ordinary. A class body is an execution context just like any other, almost like a function body. It’s just that at the end, all the local variables in that context become your class variables. You can put anything in there!!
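To make the reference's description concrete, here is a rough sketch that does by hand what the class statement does: run a suite in a fresh namespace, then build the class from whatever names ended up in it. The `body` text is made-up example content, and the real machinery does more than this (for instance it sets `__qualname__` and consults the metaclass), so treat it as an approximation.

```python
# Source text standing in for a class body (hypothetical example content).
body = '''
greeting = "hello"

def shout(self):
    return self.greeting.upper()
'''

namespace = {}                    # the "newly created local namespace"
exec(body, {}, namespace)         # run the suite; names land in `namespace`
Foo = type("Foo", (), namespace)  # build the class from the saved namespace

print(sorted(namespace))  # ['greeting', 'shout']
print(Foo().shout())      # HELLO
```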
Of course, my first instinct is to see just how silly you can get.
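Something in this spirit, as a sketch. The class name `Silly`, the `lives` attribute and the `log` list are invented for illustration; only the printing of "cats" comes from the text.

```python
log = []

class Silly:
    print("cats")             # executes once, while the class is being defined
    log.append("class body")

    if len("cats") == 4:      # arbitrary statements are allowed here
        lives = 9

    def __init__(self):
        log.append("__init__")

Silly()
Silly()
print(log)          # ['class body', '__init__', '__init__']
print(Silly.lives)  # 9
```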
Pretty silly, it turns out. One thing to note in this example is that “cats” gets printed when the class is defined, but not when a new instance is created. This makes sense — the definitions of class variables and methods (along with any other statements) are executed while the class definition is being executed, all before __init__ ever gets run. Another experiment:
OK, that was kind of pointless. To do this exploration more systematically, one can actually dig into CPython’s Grammar file, which is pleasantly short and readable. I pulled out the relevant bits here:
classdef: 'class' NAME ['(' [arglist] ')'] ':' suite
suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt | import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
stmt: simple_stmt | compound_stmt
compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
So what if …
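Since `import_stmt` is in that list, an import inside a class body is presumably the kind of thing meant here. A sketch, with the choice of modules and the `area` method being mine:

```python
class Foo:
    import math            # binds Foo.math
    import os.path as osp  # binds Foo.osp

    def area(self, r):
        # Class-body names are not in a method's scope, so go through self.
        return self.math.pi * r ** 2

print(Foo.math.sqrt(16))                  # 4.0
print(Foo.osp.basename("/tmp/cats.txt"))  # cats.txt
```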
Uhhhh, seems like it could end up being useful perhaps, but not exactly sure how, honestly, when we have “import foo as bar”. Or perhaps:
Gross, no one do that please. And yet another example:
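Judging by the remarks that follow, this one is a class nested inside a class body (`classdef` is a `compound_stmt`, after all). A sketch with invented names `Order` and `Status`:

```python
class Order:
    class Status:  # ends up as the attribute Order.Status
        PENDING = "pending"
        SHIPPED = "shipped"

    def __init__(self):
        self.status = self.Status.PENDING

order = Order()
print(order.status)          # pending
print(Order.Status.SHIPPED)  # shipped
```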
Cool, that works. Come to think of it, I’ve seen that before — often for small enum classes and such that are only used with the larger class itself. Also, in stuff like Django ORM, with the “Meta” class.
Speaking of “Meta” classes, what about metaclasses (hardy har har)? If you don’t know what those are, the gist of it is simple: What if we wanted to automatically generate classes and their fields? Or what if we wanted to execute some code every time a subclass of our class is defined? (e.g. what Django ORM’s ModelBase does). Is there a way we can hook into and customize the creation process of a class and do stuff in there? Turns out, yes.
The reason I bring this up is because that’s along the lines of what the original piece of code was trying to do — dynamically generate attributes in a class. You might want to take some time to read through Jake Vanderplas’s excellent explanation, or at least look at the examples at the end.
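As a rough taste of the hook, here is a minimal metaclass that runs some code every time a class is created through it. `Registry`, `Model` and `User` are hypothetical names, and this is not how Django's ModelBase is implemented.

```python
class Registry(type):
    """Hypothetical metaclass that records every class created through it."""
    seen = []

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        Registry.seen.append(name)  # runs once per class definition
        return cls

class Model(metaclass=Registry):
    pass

class User(Model):  # subclasses inherit the metaclass
    pass

print(Registry.seen)  # ['Model', 'User']
```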
In comparison to all that, our bizarro for loop could have been a simple and readable alternative, if not for the stupid exec(). The reason it’s there is that it’s a hack that allows you to generate variable names. A more pythonic way to do this would be setattr(), which seems to fail us in this case:
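A sketch of that attempt, assuming an earlier `Foo` is already lying around in the session (which is what the note at the end says happened). `earlier_foo` is a name introduced here only to keep a handle on it.

```python
class Foo:  # an earlier Foo, left over in the session
    pass

earlier_foo = Foo

class Foo:
    for i in range(1, 4):
        # No NameError: `Foo` still resolves to the earlier class here.
        setattr(Foo, "x%d" % i, i)

print(hasattr(Foo, "x1"))  # False
print(earlier_foo.x1)      # 1
print(Foo.i)               # 3
```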
So it seems that the class name is bound and accessible from within the class body during definition time, so no error about Foo not being defined. But somehow it’s not writeable, or perhaps that copy gets thrown away or something. It’s odd. (Update: this was my bad, see note at the end) This of course gives me another bad idea:
What the heck is __qualname__? I won’t go into it, but there’s a PEP about it if you care. Nothing else interesting in here.
Anyway, with anything more complex, stuffing logic within the body of the class seems like it’d probably get messy anyway. Of course, for generating classes, we’re definitely not out of options here, even without custom metaclasses:
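One such option is the three-argument form of the built-in `type()`, which builds a class directly from a name, a tuple of bases and a dict. A minimal sketch, where `make_numbered_class` is a hypothetical helper:

```python
def make_numbered_class(name, count):
    """Hypothetical helper: build a class with attributes x1..x<count>."""
    attrs = {"x%d" % i: i for i in range(1, count + 1)}
    return type(name, (object,), attrs)

Foo = make_numbered_class("Foo", 3)
print(Foo.x1, Foo.x2, Foo.x3)  # 1 2 3
print(hasattr(Foo, "i"))       # False
```

Unlike the loop in the class body, the loop variable doesn't leak into the class this way.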
So all in all, probably not worth it to use it for this purpose, in my humble opinion. So can anyone think of any other useful stuff to do with this “feature”? Does anyone know anything about the history of why it’s this way?
- If you really want to up the heart attack factor, you can literally load up an entire other python file into a class body, no joke. It should go without saying that you should probably never ever do this and I take no responsibility for the results.
- A long introduction to type(), objects and friends.
- CPython’s typeobject.c, which is very long-winded, but there’s a lot of cool stuff in there, like where they got their algorithm for deciding which function actually gets called when you call a method on a class with a bunch of parents (MRO). And here is PyPy’s equivalent (I think?).
- Hey I just met you, this is crazy. Here’s my number, so call_maybe().
- Alex Martelli’s book also has an in-depth explanation of python classes.
Update 1: Looks like my code with setattr() didn’t fail only because I already had defined a Foo class earlier on in the same session. If you start fresh, it errors as expected, since Foo isn’t bound until the end of the body:
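A sketch of the fresh-session case, using a class name (`Fresh`) that has never been defined and catching the exception so the snippet runs to completion:

```python
try:
    class Fresh:
        for i in range(1, 4):
            # `Fresh` is not bound until the whole body has finished running.
            setattr(Fresh, "x%d" % i, i)
except NameError as exc:
    error = exc
else:
    error = None

print(isinstance(error, NameError))  # True
```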
Update 2: My friend Chris Post tipped me off that you can actually modify locals() thus, and since python uses the locals to populate the class variables, this works:
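A sketch of that trick, reusing the `x1` to `x3` naming from earlier. It relies on CPython behavior in class bodies, where `locals()` hands back the namespace that is being populated:

```python
class Foo:
    for i in range(1, 4):
        # In a CPython class body, locals() is the namespace being built.
        locals()["x%d" % i] = i

print(Foo.x1, Foo.x2, Foo.x3)  # 1 2 3
```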
Though the CPython docs tell you to not do this:
The contents of this dictionary should not be modified; changes may not affect the values of local and free variables used by the interpreter.
I’d probably not mess with that.
Update 3: Months later, watching a video by Raymond Hettinger, I found an awesome clue as to why the design is this way. The money quotes start at 9:00:
My understanding is that he spent months building the language with modules and functions and got it working fairly well, but it wasn’t object oriented! He used up all his vacation then he’s like “Oooh, I need to add objects!”. I believe it was done in one day. What he did was he indented all of the defs. He took the functions he already had, and ran them in their own namespace. A class is a namespace. If you actually cut the code out of a class and stick it in a separate module and import it, it looks remarkably like the class dictionary. How many of you knew that what’s inside is effectively a module? Did you know I could put print statements in the class definition and when you define the class, it’ll print? You can put for loops in there. You can actually alongside the defs open and close files, unbelievable! This is against the rules in Java and C++! It’s not Java and C++, just sayin. The way classes work is that definitions run as if they’re in their own module and the module dictionary becomes the class dictionary. You learned something new now.
So there you have it folks!