Pointer receiver methods and iteration variables
One more gotcha with re-using range clause variables.
This story will start with a puzzle (source code):
type T struct {
	name string
}

func (t T) M(wg *sync.WaitGroup) {
	fmt.Println(t.name)
	wg.Done()
}

func main() {
	var wg sync.WaitGroup
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		wg.Add(1)
		go v.M(&wg)
	}
	wg.Wait()
}
Think for a moment about what will be written to stdout. The answer isn't that simple, because several outputs are possible. The program always prints foo, bar and baz, each on a separate line, but the order may differ from run to run:
baz
foo
bar
It’s really up to the scheduler how the goroutines run. The scheduler isn’t the topic of this post, though. Let’s modify the program slightly to use a pointer receiver instead (source code):
func (t *T) M(wg *sync.WaitGroup) {
	fmt.Println(t.name)
	wg.Done()
}
The output still depends on the runtime’s scheduler, but I usually get the same line three times:
baz
baz
baz
If you’re intrigued by why this differs from the first example, keep reading. This post explains what is really happening here.
defer vs go statements
It’s worth mentioning that we get similar behaviour if we replace the go statement with a defer statement (source code):
type T struct {
	name string
}

func (t T) M() {
	fmt.Println(t.name)
}

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer v.M()
	}
}
Output:
baz
bar
foo
This differs slightly from the code with the go statement because the output is deterministic: it doesn’t depend on the scheduler (deferred functions are called in LIFO order). If we use a pointer receiver, the behaviour is still deterministic, but the outcome changes (source code):
type T struct {
	name string
}

func (t *T) M() {
	fmt.Println(t.name)
}

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer v.M()
	}
}
Output:
baz
baz
baz
It’s the same line three times again. To make the analysis slightly easier, let’s focus first on the case with the defer statement, since its behaviour is deterministic.
Many people already know that loops and closures can be tricky (source code):
for _, v := range []string{"foo", "bar", "baz"} {
	defer func() {
		fmt.Println(v)
	}()
}
Output:
baz
baz
baz
This isn’t specific to Go. The same happens in other languages, such as JavaScript (source code):
(function() {
    var funcs = [];
    for (var v of ["foo", "bar", "baz"]) {
        funcs.push(function() {
            console.info(v);
        });
    }
    funcs.forEach(f => f());
})();
In the browser console you’ll see baz logged 3 times. With a range clause we need to be extra careful, since Go versions before 1.22 re-use the same iteration variables for every iteration (source code):
type T struct {
	name string
}

func main() {
	names := []T{{"foo"}, {"bar"}, {"baz"}}
	for _, name := range names {
		fmt.Printf("%p\n", &name)
	}
}
Output:
0x40c138
0x40c138
0x40c138
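The closure behaviour can be reproduced without relying on how the compiler scopes range variables. The sketch below is only an illustration: captured is a hypothetical helper, and it declares the variable v outside the loop by hand so that every closure visibly shares it. One closure reads v directly, the other receives the current value as an argument.

```go
package main

import "fmt"

// captured is a hypothetical helper. It builds two closures per word and
// runs them only after the loop has finished.
func captured(words []string) (shared, perCall []string) {
	var fns []func()
	var v string // one variable, declared once, shared by the whole loop
	for i := range words {
		v = words[i]

		// This closure refers to v itself, so it reads whatever v holds
		// at the moment it finally runs.
		fns = append(fns, func() { shared = append(shared, v) })

		// Here the current value is passed as an argument, so the inner
		// closure keeps its own copy in s.
		fns = append(fns, func(s string) func() {
			return func() { perCall = append(perCall, s) }
		}(v))
	}
	for _, f := range fns {
		f()
	}
	return shared, perCall
}

func main() {
	shared, perCall := captured([]string{"foo", "bar", "baz"})
	fmt.Println(shared)  // [baz baz baz]
	fmt.Println(perCall) // [foo bar baz]
}
```

Passing the value as an argument is one common way to give each closure its own copy; it is shown here as an option, not as the only fix.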
Now that we’ve had a refresher on closures and the intricacies of range clause variables, let’s move on to method calls.
x.m()
Such a call is valid if the method set of x’s type contains m. What is the method set of an arbitrary type T?
- The method set of T contains all methods declared with receiver type T.
- The method set of the pointer type *T contains all methods declared with receiver *T or T.
Yes, there is an asymmetry here. The language specification has an additional rule:
If x is addressable and &x's method set contains m, x.m() is shorthand for (&x).m():
For example, map elements aren’t addressable: https://play.golang.org/p/isfZwRiIL2a.
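To make the method set rules and the addressability rule concrete, here is a small sketch. T and M follow the naming used in this post, while the Namer interface is hypothetical and exists only for the illustration. Lines that the compiler would reject are left as comments.

```go
package main

import "fmt"

type T struct {
	name string
}

// M has a pointer receiver, so it belongs to the method set of *T only.
func (t *T) M() string {
	return t.name
}

// Namer is a hypothetical interface used to probe method sets.
type Namer interface {
	M() string
}

func main() {
	v := T{"foo"}

	// v is an addressable variable, so v.M() is shorthand for (&v).M().
	fmt.Println(v.M())

	// The method set of *T contains M, so a pointer satisfies Namer.
	var n Namer = &v
	fmt.Println(n.M())

	// The method set of T does not contain M, so this would not compile:
	//   var bad Namer = v

	// Map elements are not addressable, so this would not compile either:
	//   m := map[string]T{"k": {"bar"}}
	//   m["k"].M()

	// Copying the element into a variable gives us something addressable.
	m := map[string]T{"k": {"bar"}}
	e := m["k"]
	fmt.Println(e.M())
}
```

Note that e is a copy, so a method that modified its receiver would change e and not the value stored in the map.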
This extra rule is relevant in our case. What does it actually mean? Let’s analyse the snippet from above (source code):
type T struct {
	name string
}

func (t *T) M() {
	fmt.Println(t.name)
}

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer v.M()
	}
}
It’s equivalent to:
func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer (&v).M()
	}
}
Now it should be slightly clearer. In the examples where we see duplicated lines, the methods are called on pointers to the iteration variable. Go doesn’t redefine the iteration variable on each iteration; it re-uses the same one. Let’s add a bit more logging (source code):
for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
	fmt.Printf("%p\n", &v)
	defer (&v).M()
}
Output:
0x40c138
0x40c138
0x40c138
baz
baz
baz
The method is called on a pointer, and every iteration uses the same pointer (it references the memory location of variable v). Deferred functions run when the surrounding function returns, by which point v holds the last element of the slice, {"baz"}. This is why we see baz three times in stdout.
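The same effect can be shown with the sharing written out by hand. This is a sketch under one assumption: sharedPointers is a hypothetical helper that declares a single variable outside the loop, which mimics a re-used iteration variable regardless of how the compiler treats range variables. M returns the name instead of printing it so the result is easy to inspect.

```go
package main

import "fmt"

type T struct {
	name string
}

func (t *T) M() string {
	return t.name
}

// sharedPointers is a hypothetical helper: it copies every element into one
// variable and records the address of that variable on each iteration.
func sharedPointers(items []T) []*T {
	var v T // a single variable for the whole loop
	var ptrs []*T
	for i := range items {
		v = items[i]
		ptrs = append(ptrs, &v) // always the same address
	}
	return ptrs
}

func main() {
	ptrs := sharedPointers([]T{{"foo"}, {"bar"}, {"baz"}})
	fmt.Println(ptrs[0] == ptrs[1], ptrs[1] == ptrs[2]) // true true
	for _, p := range ptrs {
		fmt.Println(p.M()) // baz, three times
	}
}
```

All three pointers are equal, so by the time the methods run they all read the value that was assigned last.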
How is this different from the case with a value receiver?
type T struct {
	name string
}

func (t T) M() {
	fmt.Println(t.name)
}

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer v.M()
	}
}
Now we aren’t using pointers; we defer the method on a copy of each slice element. v contains a copy of {"foo"} during the first iteration, {"bar"} during the second, and so forth, and that value is saved as the receiver at the moment the defer statement executes. This is why we don’t get duplicates.
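The difference between the two receivers comes down to what is saved when the defer statement executes: a copy of the value, or a pointer to the variable. The sketch below shows this without any loop. ByValue, ByPointer and run are hypothetical names introduced for the illustration, and the methods append to a slice instead of printing.

```go
package main

import "fmt"

type T struct {
	name string
}

// ByValue has a value receiver: the receiver is copied when the defer
// statement executes.
func (t T) ByValue(out *[]string) {
	*out = append(*out, "value: "+t.name)
}

// ByPointer has a pointer receiver: only the address is saved, so the
// method sees later changes to the variable.
func (t *T) ByPointer(out *[]string) {
	*out = append(*out, "pointer: "+t.name)
}

// run is a hypothetical helper that defers both methods and then changes
// the variable before the deferred calls fire.
func run() []string {
	var out []string
	func() {
		v := T{"foo"}
		defer v.ByValue(&out)   // saves a copy of v, i.e. {"foo"}
		defer v.ByPointer(&out) // shorthand for (&v).ByPointer(&out)
		v.name = "changed"
	}()
	return out
}

func main() {
	for _, line := range run() {
		fmt.Println(line)
	}
	// Deferred calls run in LIFO order, so this prints:
	// pointer: changed
	// value: foo
}
```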
The examples with go statements work in much the same way. The only difference is that the scheduler may run the methods in a different order than the one in which the slice is traversed.
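If a pointer receiver is needed, one way to avoid sharing is to call the method on the slice element itself, which is addressable, instead of on the iteration variable. The following is a sketch and not the only option: collect is a hypothetical helper, and M takes extra parameters (a mutex and a result slice, both assumptions of this example) so that the names can be gathered and sorted instead of printed in scheduler order.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type T struct {
	name string
}

// M has a pointer receiver, as in the second version of the puzzle. It
// records the name instead of printing it so the result can be inspected.
func (t *T) M(wg *sync.WaitGroup, mu *sync.Mutex, seen *[]string) {
	defer wg.Done()
	mu.Lock()
	defer mu.Unlock()
	*seen = append(*seen, t.name)
}

// collect is a hypothetical helper that starts one goroutine per element.
func collect(items []T) []string {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		seen []string
	)
	for i := range items {
		wg.Add(1)
		// items[i] is addressable, so this is (&items[i]).M(...): every
		// goroutine gets a pointer to a distinct slice element.
		go items[i].M(&wg, &mu, &seen)
	}
	wg.Wait()
	sort.Strings(seen) // the scheduler decides the order, so normalise it
	return seen
}

func main() {
	fmt.Println(collect([]T{{"foo"}, {"bar"}, {"baz"}})) // [bar baz foo]
}
```

Because each receiver points into the slice, the methods would also observe any later change to the elements, which may or may not be what you want.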