A Python Tutorial To Understanding Scopes and Closures.
Most documentation about scopes and closures leans towards front-end development with JavaScript. This tutorial tries to keep the discussion as general as possible while sticking to the eccentricities of Python.
Scope governs access to variables. As defined here, it is the set of rules that determines where and how a variable (identifier) can be looked up. This look-up may be for the purpose of assigning to the variable or of retrieving its value.
To understand what scopes really are, let’s examine what happens when you write a program.
LEXING / TOKENIZING.
Before a program is parsed and executed, the string of characters that makes up the program (the code) is broken up into meaningful, language-specific chunks called tokens. Consider this Python code, for instance:
b = 6
def f1(a):
    print(a)
    print(b)
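To make the idea of tokens concrete, the snippet above can be fed to the standard library's tokenize module. This is only a sketch for illustration: the snippet is held in a string named source (a name chosen here, not part of the original example), and the exact token names printed can differ slightly between Python versions.

```python
import io
import tokenize

# The same snippet as above, held in a string so it can be tokenized.
source = "b = 6\ndef f1(a):\n    print(a)\n    print(b)\n"

# generate_tokens expects a readline-style callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok_name maps the numeric token type to a readable name such as NAME or OP.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Each identifier, operator, number, newline, and indentation change comes out as its own token, which is the input the parser then works on.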
Before this code is executed, it is tokenized, parsed, and compiled into bytecode by the CPython compiler; the bytecode is then executed by the CPython interpreter. Below is the disassembled bytecode of the function f1 above.
from dis import dis
dis(f1)
  2           0 LOAD_GLOBAL              0 (print)    *1
              2 LOAD_FAST                0 (a)        *2
              4 CALL_FUNCTION            1 (1 positional, 0 keyword)    *3
              6 POP_TOP

  3           8 LOAD_GLOBAL              0 (print)
             10 LOAD_GLOBAL              1 (b)        *4
             12 CALL_FUNCTION            1 (1 positional, 0 keyword)
             14 POP_TOP
             16 LOAD_CONST               0 (None)     *5
             18 RETURN_VALUE
*1. Load global name print.
*2. Load local name a.
*3. Call print function with 1 positional argument.
*4. Load global name b.
*5. Load a constant, in this case None.
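The opcodes in the listing can also be read programmatically, which is handy because the exact output of dis changes between CPython versions. The sketch below uses dis.get_instructions and collects, for the names a and b only, which load instruction the compiler chose; the dictionary name loads is just an illustrative choice.

```python
import dis

b = 6

def f1(a):
    print(a)
    print(b)

# Map each variable name to the opcode the compiler emitted to load it.
loads = {
    ins.argval: ins.opname
    for ins in dis.get_instructions(f1)
    if ins.opname.startswith("LOAD_") and ins.argval in ("a", "b")
}
print(loads)
```

On the CPython versions I am aware of, a is loaded with an opcode from the LOAD_FAST family (a local) and b with LOAD_GLOBAL, matching notes *2 and *4 above.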
The concept of lexing provides the foundation for understanding what lexical scope is and where the name comes from. Consider this code, which defines a function f2 inside f1.
def f1(a):
    print(a)
    print(b)
    def f2():
        c = a + b
        return c * 3
    return f2()
Now examine the disassembled bytecode.
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_DEREF               0 (a)
              4 CALL_FUNCTION            1
              6 POP_TOP

  3           8 LOAD_GLOBAL              0 (print)
             10 LOAD_GLOBAL              1 (b)
             12 CALL_FUNCTION            1
             14 POP_TOP

  4          16 LOAD_CLOSURE             0 (a)
             18 BUILD_TUPLE              1
             20 LOAD_CONST               1 (<code object f2 at 0x10d966930, file "<ipython-input-7-e2aa7fecf82d>", line 4>)
             22 LOAD_CONST               2 ('f1.<locals>.f2')
             24 MAKE_FUNCTION            8
             26 STORE_FAST               1 (f2)

  7          28 LOAD_FAST                1 (f2)
             30 CALL_FUNCTION            0
             32 RETURN_VALUE
The function f1 has one variable, a, and f2 also has one variable, c. Considering each function as a block of its own, every variable assigned within a block is bound to that very block, and this is determined from the source text before the code runs. The block created by each function is called a scope.
This gives the idea of lexical scope.
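CPython records which names belong to which block on each function's code object, so the grouping can be inspected directly. The following is a small sketch, not part of the original example: it is a variation of f1 that returns f2 instead of calling it (so the inner function can be examined), and the name inner is chosen here for illustration.

```python
b = 6

def f1(a):
    def f2():
        c = a + b
        return c * 3
    return f2  # returned uncalled so its code object can be inspected

inner = f1(10)

print(f1.__code__.co_varnames)     # names local to f1
print(f1.__code__.co_cellvars)     # f1's names that an inner function uses
print(inner.__code__.co_varnames)  # names local to f2
print(inner.__code__.co_freevars)  # names f2 takes from an enclosing function
print(inner.__code__.co_names)     # names looked up globally, such as b
```

Note that f2 itself is also a local name of f1, since a def statement binds the function's name in the enclosing block.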
Lexical scoping (sometimes known as static scoping) is a convention used in many programming languages that sets the scope (range of functionality) of a variable so that it may only be referenced from within the block of code in which it is defined.
Functions are the most common unit of scope: each function you declare creates a scope for itself. Functions are not the only construct that defines a scope, though. In Python, modules, class bodies, and comprehensions also create scopes, whereas control-flow and loop blocks (if, for, while) do not, unlike in some other languages.
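In Python specifically, it is worth checking how if and for blocks behave, because they do not introduce a scope of their own. A minimal sketch, using a hypothetical function named loop_and_if:

```python
def loop_and_if():
    for i in range(3):
        if i == 2:
            last = i * 10  # assigned inside nested for/if blocks
    # Both names are still visible here: the blocks did not create a scope,
    # so i and last are ordinary locals of loop_and_if.
    return i, last

print(loop_and_if())  # (2, 20)
```

The loop variable and the name assigned inside the if block both outlive their blocks and remain available until the function returns.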
You can think of scopes as containers, defined by where their blocks of code are written. There is an outermost container in which all other containers/blocks of code are written, i.e. where the b variable and f1() are declared. This container is called the global scope.
It is possible for a container to be nested within another container, with the outer one being the parent of the inner one, as illustrated above with f2() being written inside f1().
Variables and functions that are declared inside another function are essentially “hidden” from any of the parent “scopes”, whilst variables within a parent scope are accessible within the inner scope, just as b is accessible from within f1() and a is accessible from within f2(). Accordingly, variable c inside f2() is “hidden” from f1() and the global scope, and a and f2(), which live inside f1(), are likewise “hidden” from the global scope.
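The hiding described above can be observed as a NameError when an outer scope tries to use an inner name. The sketch below adapts the running example; the flag names c_visible_in_f1 and f2_visible_globally are introduced here purely for illustration.

```python
b = 6

def f1(a):
    def f2():
        c = a + b  # a comes from f1, b from the global scope
        return c * 3
    result = f2()
    try:
        c  # c belongs to f2, so f1 cannot see it
    except NameError:
        c_visible_in_f1 = False
    else:
        c_visible_in_f1 = True
    return result, c_visible_in_f1

result, c_visible_in_f1 = f1(10)

try:
    f2  # f2 belongs to f1, so the global scope cannot see it
except NameError:
    f2_visible_globally = False
else:
    f2_visible_globally = True

print(result, c_visible_in_f1, f2_visible_globally)  # 48 False False
```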
VARIABLE LOOK-UPS.
When a variable is referenced, as b was in f2(), the look-up conceptually starts within the innermost scope, the scope of f2(). It won't find b there, so it goes one level up, out to the next nearest scope, the scope of f1(). It won't find b there either, so it goes one more level up to the next scope, which is the global scope, where it finds b and retrieves the value assigned to it. (In CPython, much of this chain is worked out at compile time, which is why the disassembly above loads b with LOAD_GLOBAL.)
Scope look-up stops once it finds the first match. The same identifier name can be specified at multiple layers of nested scope, which is called “shadowing” (the inner identifier “shadows” the outer identifier). Regardless of shadowing, scope look-up always starts at the innermost scope being executed at the time, and works its way outward/upward until the first match, and stops.
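A short sketch of the look-up order and of shadowing follows. All names (x, outer, inner, reader, global_reader) are hypothetical and exist only for this illustration.

```python
x = "global"

def outer():
    x = "enclosing"      # shadows the global x inside outer
    def inner():
        x = "local"      # shadows outer's x inside inner
        return x
    def reader():
        return x         # no local x, so the look-up reaches outer's x
    return inner(), reader(), x

def global_reader():
    return x             # no local or enclosing x, so the global x is found

print(outer())           # ('local', 'enclosing', 'enclosing')
print(global_reader())   # global
```

Each function sees the nearest x, and assigning inside inner leaves the outer and global x untouched.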
EXPLAINING SECOND DISASSEMBLED CODE AND CLOSURES.
In f2(), there is the variable a, which was originally defined in f1(). That is the major idea behind closures: accessing variables that are defined outside the current scope. To fully understand closures, examine this refactored version of the second example.
b = 6

def f1(a):
    print(a)
    print(b)
    def f2():
        c = a + b
        return c * 3
    return f2  # return f2 itself, without calling it

f2 = f1(10)  # prints 10 and 6
c = f2()
print(c)  # prints 48
In the example illustrated above, f2() was executed well outside the lexical scope of f1(), but it still had access to the variable a that was defined in f1().
Considering the fact that memory is automatically garbage-collected in Python, one would think that after f1() was executed, the variable a would be automatically garbage-collected. But surprisingly it wasn't. Why is that?
This is because the inner scope of f1() was still in use by f2(). By virtue of where it was declared, f2() has a lexical scope closure over the inner scope of f1(), which keeps that scope alive for f2() to reference at any later time.
f2() still has a reference to that scope, and that reference is called a closure.
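In CPython, this reference is visible on the function object itself. The sketch below inspects it through the __closure__ attribute, which holds one cell per captured variable; as far as I know, only the variables the inner function actually uses (here a) are kept alive this way, rather than every local of f1.

```python
b = 6

def f1(a):
    def f2():
        c = a + b
        return c * 3
    return f2

f2 = f1(10)

print(f2.__code__.co_freevars)          # ('a',)
print(f2.__closure__[0].cell_contents)  # 10
print(f2())                             # 48
```

The cell still holds 10 after f1 has returned, which is why the later call to f2() can compute (10 + 6) * 3.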
So, a few microseconds later, when the variable f2 is invoked (invoking the inner function also labeled f2), it duly has access to its author-time lexical scope, so it can access the variable a just as we'd expect.
The function is being invoked well outside of its author-time lexical scope. Closure lets the function continue to access the lexical scope it was defined in at author-time.
Whatever facility we use to transport an inner function outside of its lexical scope, it will maintain a scope reference to where it was originally declared, and wherever we execute it, that closure will be exercised. Essentially whenever and wherever you treat functions (which access their own respective lexical scopes) as first-class values and pass them around, you are likely to see those functions exercising closure.
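As a sketch of functions being passed around as first-class values, the hypothetical make_multiplier below builds closures that are stored in a list and called from a different function, apply_all. Neither name comes from the examples above.

```python
def make_multiplier(factor):
    def multiply(value):
        return value * factor  # factor is remembered from make_multiplier
    return multiply

def apply_all(functions, value):
    # The closures are invoked here, far from where they were defined.
    return [fn(value) for fn in functions]

multipliers = [make_multiplier(2), make_multiplier(3)]
print(apply_all(multipliers, 10))  # [20, 30]
```

Each closure keeps its own factor, even though both were produced by the same outer function.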
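Python, however, behaves differently in one respect:
When accessing a variable from outside its scope, you cannot read that variable and then later assign to it in the same function. The assignment makes the name local to the entire function and, unlike a hoisted JavaScript variable, it has no value until the assignment runs. Python raises an error.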
a = 5
def function():
    print(a)
    a = 10

function()  # raises the error below
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-6-2fcbbbc1fe81> in <module>()
----> 1 function()

<ipython-input-5-8e223d9813d8> in function()
      1 a = 5
      2 def function():
----> 3     print(a)
      4     a = 10
      5

UnboundLocalError: local variable 'a' referenced before assignment
When Python compiles the body of the function, it decides that a is a local variable because it is assigned within the function. The generated bytecode reflects this decision and will try to fetch a from the local environment. Later, when the call function() is made, the body of function tries to fetch the value of the local variable a and discovers that a is unbound.
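The compile-time decision can be confirmed without reading bytecode: the code object lists a among the function's local names before the function is ever called. A minimal sketch (the variable message is introduced here for illustration, and the exact error wording differs between Python versions):

```python
a = 5

def function():
    print(a)
    a = 10

# The assignment makes 'a' local for the whole function body.
print(function.__code__.co_varnames)  # ('a',)

try:
    function()
except UnboundLocalError as exc:
    message = str(exc)
    print(message)
```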
To bypass this, you have to explicitly state that you are referring to the global variable.
a = 5
def function():
    global a
    print(a)
    a = 10

function()  # prints 5
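One consequence worth spelling out is that, with the global declaration, the assignment inside the function rebinds the module-level variable. The sketch below is a slight variation of the example above that returns the old value instead of printing it; the name before is an assumption made for this illustration.

```python
a = 5

def function():
    global a
    before = a  # reads the module-level a
    a = 10      # rebinds the module-level a
    return before

before = function()
print(before, a)  # 5 10
```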
The same thing happens in closures. You cannot use a variable from an outer scope and also assign to the same name in the current scope.
def f1():
    a = 1
    b = 2
    def f2():
        a += b
        return a
    return f2()

f1()  # raises an error
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-7-2e4e4113319b> in f2()
      3     b = 2
      4     def f2():
----> 5         a += b
      6         return a
      7     return f2()

UnboundLocalError: local variable 'a' referenced before assignment
To bypass this error, you have to explicitly state, with the nonlocal keyword, that the variable is not local to the current function.
def f1():
    a = 1
    b = 2
    def f2():
        nonlocal a
        a += b
        return a
    return f2()

print(f1())  # prints 3
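A common use of nonlocal is a closure that keeps state between calls. The counter below is a hypothetical sketch (make_counter and increment are names chosen here), showing that each call to the outer function produces an independent captured variable.

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count  # rebind make_counter's count, not a new local
        count += 1
        return count
    return increment

counter = make_counter()
other = make_counter()

first, second = counter(), counter()
other_first = other()
print(first, second, other_first)  # 1 2 1
```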
Other languages implement scopes and variable access differently. For instance, with JavaScript, the two examples above will not raise an error because variables are hoisted. To read more about variable hoisting, please visit this site.
Although this tutorial is strictly based on Python, most of the information provided about scopes and closures holds for most programming languages.
Please feel free to leave a comment below if anything here is misrepresented or unclear.