As Easy As Closure
Closure has been baffling mankind since its inception in JavaScript 😉😁 And yet, if you look at it closely to understand how it works, it proves to be an incredibly elegant and simple construct. In this blog, along with theory, I am going to walk you through a handful of examples to build a mental, conceptual model that (hopefully) sticks around in your head, so that you never have to wonder what closures are again.
These are the concepts that we are going to get comfortable with in this blog:
- Thread of execution
- Execution context
- Call stack
- Higher order functions
- Lexical scope and how JS is a lexically scoped language
- Closure
Embarking on the journey to understand closure
Let us start with a basic example first to understand how JavaScript’s thread of execution works:
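A minimal sketch of such a program, laid out so that its line numbers match the walkthrough below. The exact code is an assumption, reconstructed from the identifiers and line numbers the walkthrough mentions.

```javascript
function createFunction() {
  function multiplyBy2(number) {
    return number * 2;
  }
  return multiplyBy2;
}

const generatedFunction = createFunction();

const result = generatedFunction(5);
console.log(result); // 10
```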
- When this program executes, first the function definition for createFunction from lines 1 to 6 is stored in the global memory, with the identifier createFunction.
- Then in line 8, the identifier generatedFunction is created, and is uninitialised until createFunction runs and returns. We know that createFunction is about to run and is not just being referenced, because of the parentheses after it in line 8.
- When createFunction is about to run, a new execution context is created for it in memory, and createFunction() is added to the call stack. At this moment the call stack holds createFunction() on top of global().
- Note that global() is always at the bottom of the call stack.
- Then the definition of createFunction is looked up in the (global) memory and execution starts. Note that the thread of execution does not go from line 8 back to line 1. The thread just keeps moving ahead. To execute createFunction, its definition is fetched from memory.
- Inside createFunction, first the definition of multiplyBy2 is stored in the local memory of createFunction (you can look at lines 2 to 4 to make a mental model of this, but the thread of execution, as I said before, does not actually go to lines 2 to 4).
- Then the definition of multiplyBy2 is picked up and returned to the calling environment, and stored in the generatedFunction variable. In effect that means that the definition of multiplyBy2 gets a new label: generatedFunction.
- Once the return keyword is hit on line 5 (again, you can look at line 5 to read what is returned, but the thread of execution does not actually go to line 5; it simply returns the multiplyBy2 function when it encounters the return keyword in the definition of createFunction coming from memory), the execution context of createFunction is deleted, and so is its entry on the call stack.
- Next in line 10, the identifier result is created and is uninitialised until generatedFunction runs and returns.
- A noteworthy point here is that the original environment where the definition of generatedFunction was created no longer exists. It is not even needed, because the definition of multiplyBy2 now lives in the global memory under the label generatedFunction.
- generatedFunction is run: its own execution context is created, and its entry is added to the call stack.
- Inside the execution of generatedFunction, first the argument 5 is stored in its local memory with the label number.
- Then the computation number * 2 is done, which evaluates to 10. The value 10 is then returned to the calling environment and stored in the variable result.
- Meanwhile, the execution context and the call stack entry for generatedFunction are deleted.
- In line 11, the value of result is logged out.
A noteworthy point in this discussion is that:
In line 5, the function multiplyBy2 itself was returned, not merely the name it had inside createFunction. Strictly speaking, what travels is a reference to the function object, and that object lives on independently of the local memory of createFunction. That is why its invocation (with the new label generatedFunction) in line 10 is possible, even though the execution context and local memory of createFunction were deleted as soon as the function returned!!
So now we know what the execution context and the call stack are, and how functions are returned from other functions.
By the way, functions that accept and/or return other functions are called higher order functions.
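As a small illustration of both flavours, here is a sketch with two hypothetical helpers (applyTwice and createDoubler are names invented for this example): one accepts a function, the other returns one.

```javascript
// Accepts another function as an argument.
function applyTwice(fn, input) {
  return fn(fn(input));
}

// Returns another function.
function createDoubler() {
  function double(number) {
    return number * 2;
  }
  return double;
}

const double = createDoubler();

console.log(applyTwice(double, 5)); // 20
console.log([1, 2, 3].map(double)); // [2, 4, 6]
```

The built-in Array.prototype.map is itself a higher order function, since it accepts a function as its argument.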
Let us move ahead with another example:
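A sketch of what this second program could look like, arranged so that createFunction spans lines 1 to 10 and is invoked on line 12. The code is an assumption reconstructed from the walkthrough that follows.

```javascript
function createFunction() {
  let value = 5;

  function multiplyBy2() {
    value = value * 2;
  }

  multiplyBy2();
  return value;
}

const result = createFunction();
console.log(result); // 10
```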
This example builds on top of basicExample.js. For the sake of completeness, let us go through the mechanism of the thread of execution once again, a bit more quickly this time, because we covered it in great detail the last time.
- Lines 1–10: the definition of createFunction is stored in memory.
- Line 12: the identifier result is created; it is still uninitialised until createFunction runs and returns.
- createFunction starts to run, together with the creation of its own execution context and its entry in the call stack.
- Inside the execution of createFunction, the identifier value is created with the value 5. The definition of multiplyBy2 is stored with the identifier multiplyBy2. All this is stored in the local memory allotted to createFunction.
Note again: when the function createFunction is run, the thread of execution NEVER goes back from line 12 to line 1. When it has to execute createFunction, it executes it from the definition of createFunction stored in memory. I keep referring back to line 1 from line 12 because this is how we see what is happening.
- Further, when multiplyBy2 is invoked, a new execution context is created for it and its entry is pushed to the call stack.
- Inside the execution context of multiplyBy2, the value of value has to be updated to its double. But value does not exist in the local memory/execution context of multiplyBy2.
What does the JS engine do in this situation?
Does it move down the call stack to look for the definition of value?
You might be tempted to think yes, it does!
Alright, to make you happy I will assume for a minute that it does. So, supposedly, the definition of value is found in the local memory of the parent context (that of createFunction) and updated to double its value.
- Next, the closing brace of multiplyBy2 is hit, the function ends, its execution context is deleted and it is popped off the call stack.
- Further in the execution of createFunction, the value of value is returned to the calling environment.
- Line 12: Now the return value from createFunction() is stored with the identifier result in the global context. Meanwhile, the execution context of createFunction is deleted too, and it is popped off the call stack.
- Line 13: The value of result is printed to the console.
Now coming back to the point where we made a big assumption: when a function cannot find the definition of a variable in its own local memory it keeps going one step at a time down the call stack to stop at the first location where it finds that variable definition.
Did we do the right thing in assuming that?
Instead of building up more suspense, I will simply delve into another example which can clear things up.
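A sketch of that example, arranged so that multiplyBy2 sits on lines 4 to 7, createFunction is invoked on line 11 and doubleItUp on line 13. The code is an assumption reconstructed from the discussion that follows.

```javascript
function createFunction() {
  let value = 5;

  function multiplyBy2() {
    value = value * 2;
    return value;
  }
  return multiplyBy2;
}

const doubleItUp = createFunction();

const result = doubleItUp();
console.log(result); // 10
```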
I would like to leave the discussion of how the thread of execution proceeds and how the execution context and the call stack are handled as an exercise for you. You can refer to the discussions above to do it 😊
The point to focus on here is this:
In line 11, the createFunction function returns the definition of the multiplyBy2 function, which is stored with the new label doubleItUp. This returned function refers to a variable (value) that is not defined inside its own local memory (see lines 4 to 7, but the thread of execution does not actually go there), but in the local memory of the context where multiplyBy2 was defined (i.e., in the context of createFunction, its parent context).
Further, when this returned function is actually invoked, the context of createFunction does not even exist any longer!!!
So how do we expect this returned function, which is now stored in the global memory with the label doubleItUp, to run successfully and process the variable value?
So what happens at lines 13 and 14? If our earlier assumption were true, then when doubleItUp is invoked, it tries to double the value of value and return it. But in the first place, it cannot find the variable in its own execution context, so it (according to our assumption) tries to look one level down the call stack to find the variable. But one level down the call stack at this stage is… drum roll… the global context!! Because the call stack entry of createFunction was deleted long ago, when createFunction returned!!
Digest this point for a moment. If needed, go through the running of the execution context once more for this example…
Ready for the next thing? Alright, now that we are sure that the value variable cannot be found by walking down the call stack, and that this example does run successfully, the question arises: from where does the definition of value come??
The answer, which you probably might have guessed is… from the CLOSURE of multiplyBy2.
Aha! Finally we got there!!
But wait a second. We do not know what a closure is in the first place.
So let us talk about it more formally.
When multiplyBy2, or for that matter any function, is created, it gets a small store of memory that holds the variables which that function refers to in its definition. In JavaScript, this store of memory, and also the overall concept, is called Closure. It is as if the surrounding environment is put inside a box, closed over and shipped together with the function, wherever the function goes.
The values held by the closure of a function come from the lexical scope of the function. The lexical scope of a function is the environment in which the function was created.
Let us correlate this with our example, closuresInFullForm.js.
When createFunction returned multiplyBy2, it did not only return the definition of multiplyBy2, but also a small store of the references it held in its definition. So this store contained the value field and was shipped back into the global context along with the definition of multiplyBy2.
After that, as expected, the definition of multiplyBy2 got a new label, doubleItUp, but the store (aka the closure) remained intact and unchanged. Whenever we wanted to run doubleItUp (formerly multiplyBy2), the JS engine needed the reference to value, which it looked for in:
- the (newly created) execution context of doubleItUp, and did not find it there.
- then the closure, where it found it and processed it. Hence the output in line 14 is 10.
Phew!! That was quite some discussion.
To visualise how a closure is created, you can think of the closure (of, say, a processValues function) as being an octopus with its tentacles sticking out to the references in the environment where processValues was created. This environment is the lexical scope of processValues.
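To make the octopus picture concrete, here is a sketch of a processValues example. The names createFunction, processValues, doCalculations and value1 to value4 come from the discussion in this post; the initial values and the arithmetic are assumptions chosen purely for illustration.

```javascript
function createFunction() {
  let value1 = 1;
  let value2 = 2;
  let value3 = 3;
  let value4 = 4; // never referenced by processValues

  function processValues() {
    // The "tentacles": references into the environment of createFunction.
    value1 *= 10;
    value2 *= 10;
    value3 *= 10;
    return [value1, value2, value3];
  }

  return processValues;
}

const doCalculations = createFunction();

console.log(doCalculations()); // [10, 20, 30]
console.log(doCalculations()); // [100, 200, 300]
console.log(typeof value1);    // "undefined": not reachable from outside
```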
Important points to note in the last example:
- The fields in the environment surrounding the processValues function which were not referenced by processValues were not added to the closure. This is an optimisation performed by JavaScript engines. So value4 is lost forever once createFunction finishes execution.
- The data stored in the closure is private and cannot be accessed directly in any way. Only when doCalculations runs do the values of value1, value2, and value3 get updated. See output above.
- It is the hidden [[scope]] property of the returned processValues function that makes the mechanism of closure possible (the current ECMAScript specification calls this internal slot [[Environment]], and Chrome DevTools displays it as [[Scopes]]).
- This is the order in which the JS engine goes looking for references: first the local memory of the running function, then its closure, and finally the global memory.
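The lookup order used throughout this post (local memory first, then the closure, then global memory) can be sketched as follows. All names here (where, makeLookups, readLocal, readClosure, readGlobal) are hypothetical and exist only for this illustration.

```javascript
const where = "global memory";

function makeLookups() {
  const where = "closure"; // shadows the global variable

  function readLocal() {
    const where = "local memory"; // found first, nothing else is consulted
    return where;
  }

  function readClosure() {
    return where; // not local, so it is found in the closure
  }

  return [readLocal, readClosure];
}

function readGlobal() {
  return where; // neither local nor in a closure, so global memory is used
}

const [readLocal, readClosure] = makeLookups();

console.log(readLocal());   // "local memory"
console.log(readClosure()); // "closure"
console.log(readGlobal());  // "global memory"
```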
Why do we say that JavaScript is a Lexically Scoped Language
This is because each function that is created in JS holds its closure, and this closure originates from the place where the function was originally defined, that is, the lexical scope of the function. So functions have, in a way, some remembrance of their place of birth. This lexical scope has nothing to do with the environment/scope where the function was called/invoked.
Here is an example:
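A sketch of such an example, arranged so that createFunction runs on line 19 and doCalculations on line 21, as in the notes below. The concrete numbers are assumptions chosen for illustration.

```javascript
let value3 = 300; // global; a different variable from the one inside createFunction

function createFunction() {
  let value1 = 1;
  let value2 = 2;
  let value3 = 3;
  let value4 = 4; // not referenced by processValues

  function processValues() {
    value1 *= 2;
    value2 *= 2;
    value3 *= 2;
    return [value1, value2, value3];
  }

  return processValues;
}

const doCalculations = createFunction();

console.log(doCalculations()); // [2, 4, 6]
console.log(value3);           // 300
```

The call on line 21 happens in the global context, yet the global value3 is neither read nor changed.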
- When createFunction runs in line 19, it returns the processValues function definition plus its closure.
- When doCalculations runs in line 21, its closure only remembers the lexical scope where it was defined originally. So its closure contains value1, value2 and value3 from the (now deleted) execution context of createFunction. Remember, the execution context of createFunction was deleted, but the values value1, value2 and value3 persisted in the closure.
- doCalculations was called/invoked in the execution context of global. The data in global has no impact on the running of doCalculations. So value3 in the global scope has no effect on doCalculations.
Just a side note, had JS been a dynamically scoped language, the scope of the environment where the function was invoked would have had an impact on the running of a function.
Flexing our closure understanding-muscles with some more examples
Updating closure variable multiple times
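A sketch of this example, arranged so that outer is invoked on line 11 and newFunction on lines 13 and 14. The code is an assumption reconstructed from the explanation further down; inner also returns the counter here, purely so that the value can be inspected.

```javascript
function outer() {
  let counter = 0;
  function inner() {
    counter++;
    console.log(counter);
    return counter;
  }
  return inner;
}

const newFunction = outer();

newFunction(); // prints 1
newFunction(); // prints 2
```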
Output:
Line 13 prints 1.
Line 14 prints 2.
Need a quick explanation?
Line 11 runs outer and stores the function definition and closure of inner with the label newFunction. When newFunction is run in line 13, it starts looking for the definition of counter and finds it in the closure. The current value of counter is 0, so it updates it to 1 and prints it. When newFunction is run again in line 14, it again finds the definition of counter in the closure. This time the existing value of counter is 1, which is updated to 2 and printed.
This is a handy way of remembering data from the previous runs of a function. The same mechanism is what makes memoization possible, where a function caches the results of its earlier calls. Had newFunction involved a complex and time consuming calculation, it could have remembered the value from the previous run and continued from that point in the next invocation.
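For the caching flavour of this idea, here is a minimal memoization sketch. The names memoize, slowSquare and fastSquare are hypothetical, and the sketch assumes a function of a single primitive argument.

```javascript
function memoize(fn) {
  const cache = new Map(); // lives in the closure of the returned function

  return function memoized(input) {
    if (!cache.has(input)) {
      cache.set(input, fn(input)); // compute only on the first request
    }
    return cache.get(input);
  };
}

let calls = 0;
function slowSquare(n) {
  calls++; // counts how often the real computation runs
  return n * n;
}

const fastSquare = memoize(slowSquare);

console.log(fastSquare(12)); // 144, computed
console.log(fastSquare(12)); // 144, served from the cache
console.log(calls);          // 1
```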
Multiple functions can share the same closure
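A sketch of this situation, using the names inner1, inner2 and inner3 from the note below. What each function does with the shared counter is an assumption made for illustration.

```javascript
function outer() {
  let counter = 0;

  function inner1() {
    counter += 1;
    return counter;
  }
  function inner2() {
    counter += 10;
    return counter;
  }
  function inner3() {
    return counter; // only reads the shared variable
  }

  return [inner1, inner2, inner3];
}

const [addOne, addTen, read] = outer();

console.log(addOne()); // 1
console.log(addTen()); // 11
console.log(read());   // 11
```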
Pretty simple stuff, so I leave the explanation of this example as an exercise for you.
The only special thing to note is that each function (inner1, inner2 and inner3) in the array of returned functions has access to the exact same closure. So multiple functions returned together from a higher order function have the same closure.
Multiple closure instances
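A sketch of this example, arranged so that the line numbers match the explanation below (outer is invoked on lines 11 and 15). The code is an assumption; inner also returns the counter so that the value can be inspected.

```javascript
function outer() {
  let counter = 0;
  function inner() {
    counter++;
    console.log(counter);
    return counter;
  }
  return inner;
}

const newFunction1 = outer();
newFunction1(); // prints 1
newFunction1(); // prints 2

const newFunction2 = outer();
newFunction2(); // prints 1
newFunction2(); // prints 2
```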
Now things are getting more exciting. In line 11, the inner function was returned with its closure and given a new label, newFunction1. Invocations of newFunction1 on lines 12 and 13 update the counter variable in its closure and print it as 1 (in line 12) and 2 (in line 13).
Line 15 contains a fresh invocation of outer, which returns a fresh copy of inner and a fresh instance of closure, which gets a new label, newFunction2. So lines 16 and 17 print 1 and 2 again.
The closures associated with newFunction1 and newFunction2 are isolated from each other.
Another example of multiple closure instances
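A sketch of this variant, where counter is declared inside inner itself. The line numbers are arranged to match the explanation below; the code is an assumption, and inner returns the counter only so that the value can be inspected.

```javascript
function outer() {
  function inner() {
    let counter = 0;
    counter++;
    console.log(counter);
    return counter;
  }
  return inner;
}

const newFunction1 = outer();
newFunction1(); // prints 1
newFunction1(); // prints 1

const newFunction2 = outer();
newFunction2(); // prints 1
newFunction2(); // prints 1
```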
This time the output is: 1 on each of lines 12, 13, 16 and 17.
Line 11: outer returns, and the definition and closure of the inner function are stored with the label newFunction1. Each time newFunction1 is invoked in lines 12 and 13, the function runs, but this time it need not go hunting for the definition of counter in the closure. Its definition is found right there in the local memory of newFunction1. Each time newFunction1 is invoked, counter is initialised to 0 afresh and incremented to 1.
In line 15, a new copy of inner is returned with a new instance of closure to the label newFunction2. The same process repeats. Again, the thread of execution does not have to look beyond the local memory of newFunction2 for the definition of counter. Each invocation of newFunction2 in lines 16 and 17 initialises counter to 0 afresh and increments it to 1.
In both the above cases, the concept of closure is redundant.
Reading global data
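A sketch of this case, with counter declared in global memory. The code is an assumption, and its line numbers are not meant to match the ones quoted in the output below; inner returns the counter only so that the value can be inspected.

```javascript
let counter = 0; // lives in global memory

function outer() {
  function inner() {
    counter++; // not local, not in the closure, so the global one is used
    console.log(counter);
    return counter;
  }
  return inner;
}

const newFunction1 = outer();
const newFunction2 = outer();

newFunction1(); // prints 1
newFunction1(); // prints 2
newFunction2(); // prints 3
newFunction2(); // prints 4
```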
Output:
Line 13: 1
Line 13: 2
Line 13: 3
Line 13: 4
This happens because this time each invocation of the returned inner function, in the form of newFunction1 and newFunction2, updated and printed the global variable counter.
Why? Because when newFunction1 or newFunction2 is invoked, it cannot find the definition of counter in its local memory or in its closure. So the JS engine finally looks for counter in the global memory, finds it there and updates it.
Use cases of closure
Closure is important to understand several features of JavaScript, like the following.
- Memoization: As explained above, giving our functions a persistent memory of their previous inputs and outputs.
- Iterators and Generators: Use lexical scoping and closure to achieve the most contemporary patterns for handling data in JS.
- Module Pattern: Preserve state for the lifetime of an application without polluting the global namespace.
- Asynchronous JavaScript: Callbacks and promises rely on closure to persist state in an async environment.
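As one illustration from this list, here is a minimal sketch of the module pattern. The name counterModule and its methods are hypothetical; the point is only that count stays private inside the closure while the returned object exposes a small public interface.

```javascript
const counterModule = (function () {
  let count = 0; // private state, kept alive by the closure

  return {
    increment() {
      count += 1;
      return count;
    },
    current() {
      return count;
    },
  };
})();

counterModule.increment();
counterModule.increment();

console.log(counterModule.current()); // 2
console.log(counterModule.count);     // undefined, the state is not exposed
```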
That is all, folks. I know that was quite a lengthy read, but hopefully it cleared up this esoteric feature of JS for you. If this feels like a lot, I suggest you work through each of the examples with pen and paper and see how the thread of execution moves around.