The Magic Behind One-Line Expressions in Python
An introduction to how human-readable code executes as opcodes.
The other day I was solving a coding problem, Check If It Is a Good Array. After an hour of work, I submitted my solution to check the result. Though correct, my code performed in the bottom 20% of the submitted solutions. To improve my problem solving, I read through the forum to learn how peers had approached the problem. The simplicity of one of the solutions blew me away! In that solution (example code 1), the author puts a one-line expression on line 7 to good use, making the code concise and readable. Though I have been coding in Python for more than four years now, the power of the language in use cases like this still has me in awe.
The purpose of this article is to take a deep dive into how Python evaluates one-line expressions on its stack, and to introduce readers to how opcodes work underneath.
How Python Interprets a One-Line Expression
The snippet in example code 2 uses a one-line expression on line 2. Suppose we evaluated that line on paper. If we calculated b+1 first, stored the result in a, and only then calculated a+b, line 3 would return 7, 13. If we reversed the order, line 3 would return 12, 11. Thankfully, the Python interpreter leaves no room for this ambiguity.
In Python, the right-hand side is always evaluated first, i.e. everything to the right of ‘=’ is evaluated before any assignment takes place. When we separate expressions with a comma (,), we are asking Python to create a tuple. Once calculated, the values on the right-hand side are unpacked into the variables on the left-hand side.
Here a question may arise: in what order are the two expressions on the right-hand side of ‘=’ evaluated? In example code 2 and example code 3, how does Python calculate a+b and b+1 with the original values 5 and 6? Shouldn’t it use the updated values once it has calculated a or b? Let’s look at the opcodes to understand.
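To make the difference concrete, here is a minimal sketch comparing the tuple assignment with the two step-by-step orders one might try on paper. The function names are made up for illustration, and the starting values 5 and 6 are the ones used in this article.

```python
def tuple_assignment(a, b):
    # Both right-hand expressions are computed from the original a and b,
    # and only then unpacked into the names on the left.
    a, b = b + 1, a + b
    return a, b


def a_first(a, b):
    # Store a before computing a + b, so b sees the updated a.
    a = b + 1
    b = a + b
    return a, b


def b_first(a, b):
    # Store b before computing b + 1, so a sees the updated b.
    b = a + b
    a = b + 1
    return a, b


print(tuple_assignment(5, 6))  # (7, 11)
print(a_first(5, 6))           # (7, 13)
print(b_first(5, 6))           # (12, 11)
```

Only the first function behaves like the one-line expression; the other two are what we would get by splitting it into separate statements.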
Disassembling and Analyzing the Opcodes
First column: the line number in the source code (here, the code in example code 2).
Second column: 0, 3, 6, ... are the offsets of the instructions in the bytecode, i.e. the byte index at which each instruction starts.
Third column: LOAD_FAST, LOAD_CONST, ... are the instruction names, or opnames.
Fourth column: the instruction's numeric argument, which Python uses internally to manage the stack, jump to a specific instruction, fetch variables, and so on.
Fifth column: the human-readable form of the argument.
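A listing with these columns can be produced with the standard `dis` module. The sketch below assumes example code 2 has roughly the shape of the hypothetical function `update`. Offsets and opcode names differ between Python versions, so no output is shown here.

```python
import dis


def update(a, b):
    # Assumed shape of example code 2 (the name `update` is hypothetical).
    a, b = b + 1, a + b
    return a, b


# Print the full disassembly table for the function.
dis.dis(update)

# The same data, one Instruction object per row:
# byte offset, opname, raw argument, human-readable argument.
rows = [(ins.offset, ins.opname, ins.arg, ins.argrepr)
        for ins in dis.get_instructions(update)]
for offset, opname, arg, argrepr in rows:
    print(offset, opname, arg, argrepr)
```

`dis.dis` is handy for reading, while `dis.get_instructions` is easier to work with when you want to inspect individual fields programmatically.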
Understanding the Opcodes
In the call-stack diagram, the first three lines are more or less self-explanatory. LOAD_FAST pushes the variable ‘b’ onto the stack and LOAD_CONST pushes the constant 1, in that order. BINARY_ADD then removes the value at the top of the stack (TOS), which is 1, and the value just below it (TOS1), which is b = 6, adds them, and pushes the result back onto the stack. So the stack content after this operation is [7].
Then two LOAD_FAST instructions push ‘a’ and ‘b’ onto the stack, so the stack content is now [7, 5, 6]. Another BINARY_ADD adds TOS and TOS1 and pushes the result, so the stack looks like [7, 11]. ROT_TWO swaps TOS and TOS1, so the stack looks like [11, 7]. The two STORE_FAST instructions then pop the values off the stack and store them in the names on the left-hand side of the assignment, in that order. So we end up storing a = 7 and b = 11. Both right-hand expressions were computed before either name was reassigned, which is why they saw the original values 5 and 6. The opcodes for line 3 simply load the values of a and b, create a tuple, and return it.
I hope this serves as an introduction to how Python uses opcodes to evaluate expressions. A similar approach can be used to disassemble and understand any piece of code written in Python. A good follow-up would be to do the same exercise on the code in example code 3.
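The walkthrough above can be replayed with a toy stack machine. This is only an illustrative model, not how CPython's interpreter is implemented: the `run` function and the `PROGRAM` list are hypothetical, and the opcode names are the ones used in this article (newer Python releases use different names, such as BINARY_OP and SWAP).

```python
def run(program, local_vars):
    """Execute (opname, argument) pairs on a toy stack; top of stack is the list's end."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])      # push a local variable's value
        elif opname == "LOAD_CONST":
            stack.append(arg)                  # push a constant
        elif opname == "BINARY_ADD":
            tos = stack.pop()                  # top of stack
            tos1 = stack.pop()                 # second from top
            stack.append(tos1 + tos)
        elif opname == "ROT_TWO":
            stack[-1], stack[-2] = stack[-2], stack[-1]  # swap TOS and TOS1
        elif opname == "STORE_FAST":
            local_vars[arg] = stack.pop()      # pop TOS into a local variable
        else:
            raise ValueError(f"unsupported opcode: {opname}")
        shown = "" if arg is None else arg
        print(f"{opname:<11} {shown!s:<3} stack={stack}")
    return local_vars


# The instruction sequence described above for: a, b = b + 1, a + b
PROGRAM = [
    ("LOAD_FAST", "b"),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", None),
    ("LOAD_FAST", "a"),
    ("LOAD_FAST", "b"),
    ("BINARY_ADD", None),
    ("ROT_TWO", None),
    ("STORE_FAST", "a"),
    ("STORE_FAST", "b"),
]

result = run(PROGRAM, {"a": 5, "b": 6})
print(result)  # {'a': 7, 'b': 11}
```

Each printed line shows the stack after one instruction, so the intermediate states [7], [7, 5, 6], [7, 11] and [11, 7] from the text can be checked directly.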