ADVANCED PYTHON PROGRAMMING
Function Internals 2
This time, we keep exploring bytecode: how it works, what closures are, and whether we can actually rewrite functions on the fly.
Last time, we saw that a function is actually an object, with a few special attributes that encapsulate its “function”-ness—namely, __code__
, which keeps the bytecode describing its algorithm, as well as all the necessary values and names. This time, we’ll take an even closer look, figure out where scopes fit in it all—and see how to modify a function’s code dynamically.
Storing Names
Let’s start with a simple piece of code:
>>> source_code = '''
x = 1
print(x)
'''
>>> code = compile(source_code, filename='', mode='exec')
Now, let’s disassemble it:
>>> import dis
>>> dis.disassemble(code)
2 0 LOAD_CONST 0 (1)
2 STORE_NAME 0 (x)
3 4 LOAD_NAME 1 (print)
6 LOAD_NAME 0 (x)
8 CALL_FUNCTION 1
10 POP_TOP
12 LOAD_CONST 1 (None)
14 RETURN_VALUE
The surprising bit about this code is that to access the value of x
, and to access the built-in print
function, it uses the same LOAD_NAME
instruction; how does it know which scope to look in? Does it traverse the entire namespace hierarchy every time?
The truth is, I cheated—by writing a piece of code without any context, the assignment x = 1
actually did happen in the code’s only—and therefore, global—scope, right where print
is.
If we actually defined a new function with a scope of its own:
>>> def f():
...     x = 1
...     print(x)
We’d see its code is a bit different:
>>> dis.disassemble(f.__code__)
2 0 LOAD_CONST 1 (1)
2 STORE_FAST 0 (x)
3 4 LOAD_GLOBAL 0 (print)
6 LOAD_FAST 0 (x)
8 CALL_FUNCTION 1
10 POP_TOP
12 LOAD_CONST 0 (None)
14 RETURN_VALUE
Specifically, the two load instructions are different: to access the local x
, Python uses LOAD_FAST
; and to access the global print
, it uses LOAD_GLOBAL
. So it does optimize name lookup after all, whenever it compiles a function—and while the abstraction with the hierarchy of namespaces works well for understanding name resolution—reading the tiny letters is better still.
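We can verify this optimization programmatically rather than squinting at listings. Here's a small sketch using dis.get_instructions, which accepts both raw code objects and functions:

```python
import dis

def f():
    x = 1
    print(x)

module_code = compile("x = 1\nprint(x)", filename='', mode='exec')
module_ops = {i.opname for i in dis.get_instructions(module_code)}
function_ops = {i.opname for i in dis.get_instructions(f)}

# Module-level code falls back to the generic *_NAME opcodes.
assert 'LOAD_NAME' in module_ops and 'STORE_NAME' in module_ops

# Function code gets the specialized opcodes instead,
# and generic name lookup disappears entirely.
assert 'LOAD_FAST' in function_ops and 'LOAD_GLOBAL' in function_ops
assert 'LOAD_NAME' not in function_ops
```

The exact opcode sets vary a bit between Python versions, but the local/global split has held steady for a long time.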
But Seriously, Who Cares
Except for being great fun, disassembling functions lets us reassemble them differently, and inject code dynamically—which is pretty crazy, and can lead to very powerful metaprogramming techniques. However, dealing with such primal forces is not easy, as you can see even from this simple example:
>>> def f():
...     locals()['x'] = 1
...     print(x)
Contrary to what you might expect, it does this:
>>> f()
Traceback (most recent call last):
...
NameError: name 'x' is not defined
Which is weird, because we’ve clearly placed x
in the locals
dictionary. Well—besides the minor detail that you can’t actually edit the locals
dictionary in some versions of Python—even if you could, there’d be a problem; and that problem would only become evident from the bytecode:
>>> dis.disassemble(f.__code__)
2 0 LOAD_CONST 1 (1)
2 LOAD_GLOBAL 0 (locals)
4 CALL_FUNCTION 0
6 LOAD_CONST 2 ('x')
8 STORE_SUBSCR
3 10 LOAD_GLOBAL 1 (print)
12 LOAD_GLOBAL 2 (x)
14 CALL_FUNCTION 1
16 POP_TOP
18 LOAD_CONST 0 (None)
20 RETURN_VALUE
As you can see—after invoking the locals
function and storing 1
under its x
, Python actually goes ahead and prints the global x
. It’s a funny misunderstanding, really: it had no idea we were manually fiddling with its scopes, so it went ahead and optimized the lookup away, jumping to the conclusion that x
must be global.
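To drive the point home, here's a minimal reproduction; and note that at exec (or module) scope, where locals() and globals() are one and the same dictionary, the trick actually does work:

```python
def f():
    locals()['x'] = 1   # writes into a throwaway snapshot...
    return x            # ...so the compiler-optimized lookup fails

try:
    f()
    outcome = 'no error'
except NameError:
    outcome = 'NameError'
assert outcome == 'NameError'

# At exec scope there is no separate local namespace:
# locals() *is* the namespace dictionary, so the write sticks.
namespace = {}
exec("locals()['x'] = 1", namespace)
assert namespace['x'] == 1
```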
Getting Some Closure
We’ve seen local names, and we’ve seen global names—but what about non-local ones? Let’s investigate:
>>> def f():
...     x = 1
...     def g():
...         print(x)
...     return g
>>> g = f()
>>> dis.disassemble(g.__code__)
4 0 LOAD_GLOBAL 0 (print)
2 LOAD_DEREF 0 (x)
4 CALL_FUNCTION 1
6 POP_TOP
8 LOAD_CONST 0 (None)
10 RETURN_VALUE
So—it’s done with LOAD_DEREF
. What that means is: since we're supposed to look neither in the local namespace nor in the global one, we need to figure out where exactly the function was defined, and fetch the right value from some in-between, limbo scope.
In truth, Python does it for us: when it creates a function, it keeps track of its own scope—so it can create nested functions that reference its scope, by adding “pointers” to this scope into these functions’ closures. This is a fancy word for “all those scopes surrounding the function”, which “close in” on the function and (hopefully) contain all the information it needs to run. You can see it here:
>>> g.__closure__
(<cell at 0x...: int object at 0x...>,)
Again, a tuple, whose indices are referenced by LOAD_DEREF
; and in it, a cell—a level of indirection into a different scope, where there’s an int
object of value…
>>> g.__closure__[0].cell_contents
1
Exactly—1. So why the extra overhead of cells? Because we can’t bake in the value, of course; it may change. We have to resolve it dynamically, but instead of keeping track of and traversing the whole hierarchy of scopes, we jump straight to where we need, and resolve it immediately. If Python figures it’s a local name, it’d become LOAD_FAST
, which goes straight to the local namespace; if it’s global, that’d be LOAD_GLOBAL
, which goes straight there; and if it’s neither, LOAD_DEREF
will go to the appropriate cell in the closure, providing a portal to that specific value, wherever it may be.
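The indirection is easy to observe: because the inner and the outer function share the cell, mutating the variable in one is immediately visible through the other's __closure__. A small counter sketch:

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

counter = make_counter()
cell = counter.__closure__[0]

# The free variable names line up with the closure tuple's indices.
assert counter.__code__.co_freevars == ('count',)
assert cell.cell_contents == 0

counter()
counter()

# The cell is a live pointer into the enclosing scope, not a frozen copy.
assert cell.cell_contents == 2
```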
Fiat Function
So now that we know all that, the question is: what the hell can we do with knowledge so obscure? There are two schools of thought here; and the first one says, “absolutely nothing”. You should stay away from such… voodoo, from such black magic: it’s completely incomprehensible and unmaintainable to other people, and it can break in some cryptic way at the worst possible moment. Be that as it may, I’m of the second school. You see, growing up, I really liked fantasy, Dungeons and Dragons and the like. I’d always play the mage, throwing fireballs and thunderbolts and whatnot. And never in my life had I encountered a character that wielded magic powers but shied away from using them because it was too dangerous. Sure, a mage wouldn’t light her cigarette with a fireball; but neither would she prefer the somewhat-more-boring-but-oh-so-reliable longsword instead. Same goes for code: you should use your powers wisely, but that decision is orthogonal to acquiring more power.
My own fascination with function internals started when one day, I wrote a Python class:
class A:
    def __init__(self):
        self.x = 1

    def f(self):
        return self.x
And got annoyed with always having to include self
in the signature explicitly. Languages like C++ and Java had an implicit this
—yet my beautiful Python did not?
So I set out on a quest to inject this self
dynamically, calling unto the magic of decorators, descriptors, and even the Python tracer. But for the life of me, I couldn’t change the local variable self
to point to the instance—because, as I looked at the bytecode, I realized there was no local variable self
. When you write it this way:
class A:
    def __init__():
        self.x = 1

    def f():
        return self.x
Python assumes self
is a global variable, and ends up “jumping over” the local scope without even looking there.
So I had no choice but to rewrite the functions’ code! First, let’s find the index of the global name self
, and remove it from that tuple:
index = f.__code__.co_names.index('self')
new_names = tuple(n for n in f.__code__.co_names if n != 'self')
Then, let’s add self
to the end of the local names’ tuple, co_varnames
, and record its index, too:
new_varnames = f.__code__.co_varnames + ('self',)
new_index = new_varnames.index('self')
Finally, let’s add a piece of code that assigns self
to the instance—or, to keep things simpler, to the value 1
. In other words, we’d like this function:
def f():
    print(self)
To print 1. Here goes:
bytecode = []
for instruction in dis.get_instructions(f.__code__):
    if (
        instruction.opname == 'LOAD_GLOBAL'
        and instruction.arg == index
    ):
        bytecode.append(dis.opmap['LOAD_FAST'])
        bytecode.append(new_index)
    else:
        bytecode.append(instruction.opcode)
        bytecode.append(instruction.arg or 0)
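This opcode-then-argument pairing works because, since Python 3.6, bytecode is "wordcode": every instruction occupies exactly two bytes, the opcode followed by a one-byte argument (zero if unused). You can check the invariant directly:

```python
import dis

def f():
    return 1

raw = f.__code__.co_code

# Every instruction is exactly two bytes: opcode, then argument.
assert len(raw) % 2 == 0

pairs = [(dis.opname[raw[i]], raw[i + 1]) for i in range(0, len(raw), 2)]
names = [name for name, arg in pairs]

# However it's spelled on this version, the function ends by returning.
assert any(name.startswith('RETURN') for name in names)
```

One caveat: on Python 3.11 and later the raw stream also interleaves inline CACHE entries (which dis.get_instructions hides by default), so rebuilding co_code from the visible instructions, as we do here, is only faithful on older versions.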
All we need to add is the assignment code:
LOAD_CONST 1
STORE_FAST self
Like so:
if 1 in f.__code__.co_consts:
    const_index = f.__code__.co_consts.index(1)
    new_consts = f.__code__.co_consts
else:
    new_consts = f.__code__.co_consts + (1,)
    const_index = new_consts.index(1)

bytecode = [
    dis.opmap['LOAD_CONST'], const_index,
    dis.opmap['STORE_FAST'], new_index,
] + bytecode
And now, let’s bake it all into a new code object. Full disclaimer: types.CodeType takes a lot of arguments, but we’re just going to copy most of them over from f:
import types

new_code = types.CodeType(
    f.__code__.co_argcount,
    f.__code__.co_posonlyargcount,
    f.__code__.co_kwonlyargcount,
    f.__code__.co_nlocals + 1,  # We added the local 'self'.
    f.__code__.co_stacksize,
    f.__code__.co_flags,
    bytes(bytecode),  # That's the new code.
    new_consts,       # That's the new constants.
    new_names,        # That's the new global names.
    new_varnames,     # That's the new local names.
    f.__code__.co_filename,
    f.__code__.co_name,
    f.__code__.co_firstlineno,
    f.__code__.co_lnotab,
    f.__code__.co_freevars,
    f.__code__.co_cellvars,
)
Recent versions of Python even added a replace
method, which clones an existing code object with just a few changes—so it seems I wasn’t the only one playing with this stuff. It’s a tad cleaner:
new_code = f.__code__.replace(
    co_nlocals=f.__code__.co_nlocals + 1,
    co_code=bytes(bytecode),
    co_consts=new_consts,
    co_names=new_names,
    co_varnames=new_varnames,
)
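In fact, replace is handy even for edits that never touch the raw bytecode. As a minimal, version-independent sketch (the function names here are purely illustrative), here's a constant swap that leaves every opcode in place:

```python
import types

def greet():
    return 'hello'

code = greet.__code__

# Swap one entry in the constants tuple; the bytecode's
# LOAD_CONST index stays valid, only the value it names changes.
new_consts = tuple(
    'goodbye' if const == 'hello' else const
    for const in code.co_consts
)

# Clone the code object with one field changed, then rewrap it.
new_greet = types.FunctionType(
    code.replace(co_consts=new_consts),
    greet.__globals__,
)

assert greet() == 'hello'      # the original is untouched
assert new_greet() == 'goodbye'
```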
And now, all we have to do is wrap it up in a function—again, preserving everything else as-is:
new_f = types.FunctionType(
    new_code,
    f.__globals__,
    f.__name__,
    f.__defaults__,
    f.__closure__,
)
Et voilà:
>>> new_f()
1
Pretty cool, huh?
Injecting Code
Now let’s do something even cooler. Given this function,
def f():
    print('before')
    print('after')
And this function:
def g():
    print('in the middle')
We’re going to inject g
right in the middle of f
—something even decorators can’t do. It’s actually much simpler than our first challenge; the only part missing is determining which line we’re on, and how bytecode even correlates to lines.
Without going into too much detail, __code__
has a co_firstlineno
attribute, which has the number of its first line (which would be the function signature); and co_lnotab
, which is a data structure mapping offsets in the bytecode to new lines. Luckily, we don’t need to work with something as low-level: dis
's get_instructions
returns an iterator of handy Instruction
objects, which have a starts_line
attribute for any instruction that is the first on its line. All we have to do, then, is this:
new_names = f.__code__.co_names + ('g',)
index = new_names.index('g')

line = f.__code__.co_firstlineno + 2

bytecode = []
for instruction in dis.get_instructions(f.__code__):
    if instruction.starts_line == line:
        bytecode.extend([
            dis.opmap['LOAD_GLOBAL'], index,
            dis.opmap['CALL_FUNCTION'], 0,
            dis.opmap['POP_TOP'], 0,
            instruction.opcode, instruction.arg or 0,
        ])
    else:
        bytecode.append(instruction.opcode)
        bytecode.append(instruction.arg or 0)

new_code = f.__code__.replace(
    co_code=bytes(bytecode),
    co_names=new_names,
)

new_f = types.FunctionType(
    new_code,
    f.__globals__,
    f.__name__,
    f.__defaults__,
    f.__closure__,
)
And there you have it:
>>> new_f()
before
in the middle
after
Conclusion
In Python, nothing is impossible. That’s technically true for any Turing-complete language (or, even technically-er, false for any Turing-complete language)—but the point is, Python is an amazingly cool language, and having that sort of power at your fingertips is pretty exhilarating. And that’s just functions! Next time, we’ll get into generators—an important foundation for the rather advanced topic of coroutines—and then on to classes, and all the wonderful deliciousness you can do with object-oriented code.