Use Pointer of for Range Loop Variable in Go
This article is part of my go deeper series.
Motivation
The other day I tracked down a bug that showed up when I used a for range loop over an array and took a pointer to each element in Go. The following is a simplified version:
This will yield the following output:
$ go run buggyLoop.go
Dog with name <{Ghost}> and pointer: <0xc000010200>
Dog with name <{Bruno}> and pointer: <0xc000010200>
Dog with name <{Lucky}> and pointer: <0xc000010200>
dogPtr with name <Lucky> and pointer: <0xc000010200>
dogPtr with name <Lucky> and pointer: <0xc000010200>
dogPtr with name <Lucky> and pointer: <0xc000010200>
Notice that the first for loop prints a different value in each iteration, but the pointer is always the same!
The second loop, which walks through dogPtrs, shows that every entry points to “Lucky”, the last entry of dogs, although I expected three different dogs.
After some googling, I found this fix on Stack Overflow:
The fix seems really simple but also mysterious: a one-liner, e := e, placed as the first statement of the loop body fixes the problem.
The output now is what we expect:
Dog with name <{Ghost}> and pointer: <0xc00008e1e0>
Dog with name <{Bruno}> and pointer: <0xc00008e200>
Dog with name <{Lucky}> and pointer: <0xc00008e230>
dogPtr with name <Ghost> and pointer: <0xc00008e1e0>
dogPtr with name <Bruno> and pointer: <0xc00008e200>
dogPtr with name <Lucky> and pointer: <0xc00008e230>
So why does buggyLoop.go not work as expected, and what the heck is this one-liner in fixedLoop.go? Now it’s time to go deeper.
Iteration variables
When we use the keyword range, the variables on the left side of the short variable declaration := are called iteration variables. For instance, the first loop in buggyLoop.go starts with:
for _, e := range dogs {
Here _ and e are iteration variables, where _ means we don’t want to use the first iteration variable, which is the index in this case.
A “for” statement with a “range” clause iterates through all entries of an array, slice, string or map, or values received on a channel. For each entry it assigns iteration values to corresponding iteration variables if present and then executes the block.
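So it turns out that such iteration variables are re-used: each iteration essentially just re-assigns the value of the same variable. (This is the behaviour of Go before version 1.22; since Go 1.22 every iteration gets its own variable.) This explains why buggyLoop.go does not work and why dogPtrs ends up filled with “Lucky”, the last entry of the array.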
The iteration values are assigned to the respective iteration variables as in an assignment statement.
The iteration variables may be declared by the “range” clause using a form of short variable declaration (:=). In this case their types are set to the types of the respective iteration values and their scope is the block of the "for" statement; they are re-used in each iteration. If the iteration variables are declared outside the "for" statement, after execution their values will be those of the last iteration.
Function block, for block and inner block in for loop
With that in mind, let’s investigate what the one-liner e := e does and why it fixes the “Lucky” issue.
To explain it, let’s look at another code example:
This yields the following output:
$ go run shadow.go
e has value <0> with ptr <0xc000014080>
e has value <1> with ptr <0xc000014088>
e has value <1> with ptr <0xc0000140a0>
e has value <2> with ptr <0xc000014088>
e has value <2> with ptr <0xc0000140a8>
e has value <0> with ptr <0xc000014080>
Here is what we have done: in the function foo we first create a variable e with value 0. Then we have a for statement where we create an iteration variable e. After that, inside the for loop, we create yet another variable e. These three variables are different but share the same name; we can tell them apart by their pointer values. In the output, the iteration variable keeps the address 0xc000014088 in both iterations, while the e declared inside the loop gets a new address each time (0xc0000140a0, then 0xc0000140a8). Notice that after the for loop, e is again the variable declared at the top of foo: it has the same address as in the first line of output and still has value 0.
This is known as variable shadowing.
In computer programming, variable shadowing occurs when a variable declared within a certain scope (decision block, method, or inner class) has the same name as a variable declared in an outer scope.
I must say that shadowing variables the way shadow.go does is not recommended, because it can significantly reduce the readability of the code.
By the way, if you don’t know how scope works in Go, the language specification explains it:
Go is lexically scoped using blocks (…)
But you should be okay to continue if you understand the previous code example.
The point of this code example is to convince you that in Go a function has a block, the iteration variable of a for statement lives in its own inner block, and the statements within the for loop form yet another inner block. To visualise it:
Through this illustration you should now understand why the block of the for statement is called an “implicit” block: it is not surrounded by matching brace brackets { ... }, whereas the other two blocks in this example are.
A block is a possibly empty sequence of declarations and statements within matching brace brackets.
Block = "{" StatementList "}" .
StatementList = { Statement ";" } .
(…)
Each “if”, “for”, and “switch” statement is considered to be in its own implicit block
And we can redeclare a variable (variable shadowing) in an inner block.
An identifier declared in a block may be redeclared in an inner block. While the identifier of the inner declaration is in scope, it denotes the entity declared by the inner declaration.
— The Go Programming Language Specification
Short variable declarations may appear only inside functions. In some contexts such as the initializers for “if”, “for”, or “switch” statements, they can be used to declare local temporary variables.
So now we understand that the one-liner e := e is a redeclaration that shadows the iteration variable. Because it is a redeclaration, each iteration actually creates a new local variable e, so the pointers to these local variables are different in each iteration. That’s why we get three different dogs in fixedLoop.go.
To make it clearer, we could just use another name like dog and avoid the shadowing altogether, since it really is a different variable anyway:
Summary
buggyLoop.go does not work because the iteration variables of a for range loop are re-used in each iteration. It is the same variable, re-assigned each time, so the pointer array dogPtrs is filled with three pointers to “Lucky”, the last entry of dogs.
The one-liner e := e in fixedLoop.go redeclares the variable and shadows the iteration variable. Since it is a redeclaration, the e inside the for loop is just another variable with the same name as the iteration variable e of the for statement. We could also give it another name, as in dog := e, to make this clearer.
The one-liner fixes the problem because the redeclaration creates a new variable in each iteration, initialised with the current value of the iteration variable. Thus in fixedLoop.go we have different pointers in dogPtrs.
Thank you for reading and stay safe :)
Reference
Copying the address of a loop variable in Go, the Stack Overflow page where I found this one-liner fix. Though it does not explain in detail why the fix works, it pointed me in the direction in which I could later go deeper.
Golang mixed assignment and declaration, a Stack Overflow page that explains variable shadowing in Go with some neat examples.
The Go Programming Language Specification, the official language spec every gopher should read through.