Pointer receiver methods and iteration variables

One more gotcha with re-using range clause variables.

Michał Łowicki
Published in golangspec
4 min read · Mar 10, 2020

This story will start with a puzzle (source code):

package main

import (
	"fmt"
	"sync"
)

type T struct {
	name string
}

// M prints the name and marks one unit of work as done.
func (t T) M(wg *sync.WaitGroup) {
	fmt.Println(t.name)
	wg.Done()
}

func main() {
	var wg sync.WaitGroup
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		wg.Add(1)
		go v.M(&wg)
	}
	wg.Wait()
}

Think for a moment about what will be written to stdout. The answer is not that simple, because several outputs are possible. The program will always print foo, bar and baz, each on a separate line, but the order might be different each time:

baz
foo
bar

It’s really up to the scheduler how to run the goroutines. The scheduler isn’t the topic of this post, though. Let’s modify the program slightly to use a pointer receiver instead (source code):

// M now has a pointer receiver; the rest of the program is unchanged.
func (t *T) M(wg *sync.WaitGroup) {
	fmt.Println(t.name)
	wg.Done()
}

The output again depends on the runtime’s scheduler, but usually I get the same line three times:

baz
baz
baz

If you’re intrigued by why this differs from the first example, keep reading. This post explains what is really happening here.

defer vs go statements

It’s worth mentioning that we get similar behaviour if we replace the go statement with a defer statement (source code):

package main

import "fmt"

type T struct {
	name string
}

// M has a value receiver.
func (t T) M() {
	fmt.Println(t.name)
}

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer v.M()
	}
}

Output:

baz
bar
foo

This is slightly different from the code with the go statement since the output is deterministic: it doesn’t depend on the scheduler, because deferred functions are called in LIFO order. If we use a pointer receiver, the behaviour is still deterministic but the outcome differs (source code):

package main

import "fmt"

type T struct {
	name string
}

// M has a pointer receiver.
func (t *T) M() {
	fmt.Println(t.name)
}

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer v.M()
	}
}

Output:

baz
baz
baz

It’s again the same line three times. To make our analysis slightly easier, let’s focus on the case with the defer statement first, since it gives deterministic behaviour.

Many people already know that loops and closures can be tricky (source code):

for _, v := range []string{"foo", "bar", "baz"} {
	defer func() {
		fmt.Println(v)
	}()
}

Output:

baz
baz
baz

This isn’t specific to Go. The same happens in other languages, such as JavaScript (source code):

(function() {
    var funcs = [];
    for (var v of ["foo", "bar", "baz"]) {
        funcs.push(function() {
            console.info(v);
        });
    }
    funcs.forEach(f => f());
})();

In the browser console you’ll see baz logged 3 times. With the range clause we need to be extra careful, since in Go versions before 1.22 the iteration variables are re-used for each iteration (source code):

package main

import "fmt"

type T struct {
	name string
}

func main() {
	names := []T{{"foo"}, {"bar"}, {"baz"}}
	for _, name := range names {
		fmt.Printf("%p\n", &name) // address of the iteration variable
	}
}

Output:

0x40c138
0x40c138
0x40c138
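
The re-use can also be observed without printing addresses. The following is a minimal sketch, not taken from the original post: it declares the loop variable outside the loop so that it is shared on every Go version (since Go 1.22, a variable declared by the range clause itself is created anew for each iteration), and contrasts that with taking the address of a per-iteration copy. The helper names sharedPointers and copiedPointers are hypothetical.

```go
package main

import "fmt"

type T struct {
	name string
}

// sharedPointers stores the address of one variable that every
// iteration overwrites, so all stored pointers are equal.
func sharedPointers(items []T) []*T {
	var ptrs []*T
	var v T // declared once, outside the loop: shared on any Go version
	for _, v = range items {
		ptrs = append(ptrs, &v)
	}
	return ptrs
}

// copiedPointers copies the element into a fresh variable in each
// iteration and stores the address of that copy.
func copiedPointers(items []T) []*T {
	var ptrs []*T
	for _, v := range items {
		v := v // new variable for this iteration
		ptrs = append(ptrs, &v)
	}
	return ptrs
}

func main() {
	items := []T{{"foo"}, {"bar"}, {"baz"}}
	for _, p := range sharedPointers(items) {
		fmt.Println(p.name) // baz, three times
	}
	for _, p := range copiedPointers(items) {
		fmt.Println(p.name) // foo, bar, baz
	}
}
```

The `v := v` line is the usual idiom for getting a distinct variable per iteration on older Go versions.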

Now that we’ve had a refresher on closures and the intricacies of range clause variables, let’s move on to method calls.

x.m()

Such a call is valid if the method set of x’s type contains m. What is the method set of an arbitrary type T?

  • The method set of T contains all methods declared with receiver type T.
  • The method set of pointer type *T is the set of all methods declared with receiver *T or T.

Yes, there is an asymmetry here. The language specification has an additional rule:

If x is addressable and &x's method set contains m, x.m() is shorthand for (&x).m().

For example, map elements aren’t addressable: https://play.golang.org/p/isfZwRiIL2a.
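
The method set rules and the addressability shorthand can be sketched together as follows. This is an illustration rather than code from the original post; the Namer interface is a hypothetical name introduced only to probe method sets, and the lines that the compiler would reject are left as comments.

```go
package main

import "fmt"

type T struct {
	name string
}

// M has a pointer receiver, so it is in the method set of *T but not of T.
func (t *T) M() string {
	return t.name
}

// Namer is a hypothetical interface used to probe method sets.
type Namer interface {
	M() string
}

func main() {
	v := T{"foo"}

	// v is addressable, so v.M() is shorthand for (&v).M().
	fmt.Println(v.M(), (&v).M())

	// *T has M in its method set, so a *T can be assigned to Namer.
	var n Namer = &v
	fmt.Println(n.M())

	// T does not have M in its method set; this would not compile:
	// var bad Namer = v

	m := map[string]T{"k": {"bar"}}

	// Map elements are not addressable; this would not compile either:
	// m["k"].M()

	// Copying the element into a variable gives an addressable value.
	e := m["k"]
	fmt.Println(e.M())
}
```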

This extra rule is relevant in our case. What does it actually mean? Let’s analyse the snippet from above (source code):

package main

import "fmt"

type T struct {
	name string
}

// M has a pointer receiver.
func (t *T) M() {
	fmt.Println(t.name)
}

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer v.M()
	}
}

It’s equivalent to:

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer (&v).M()
	}
}

Now it should be slightly clearer. In the examples where we see duplicated lines, methods are being called on pointers to iteration variables. Go (before version 1.22) doesn’t redeclare the iteration variables for each iteration; they are re-used. Let’s add a bit more logging (source code):

for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
	fmt.Printf("%p\n", &v)
	defer (&v).M()
}

Output:

0x40c138
0x40c138
0x40c138
baz
baz
baz

[Figure: State of memory after every iteration]

The method is called on a pointer, and every iteration uses the same pointer (referencing the memory location of variable v). Deferred functions run when the surrounding function returns, by which time v holds the last element of the slice, {"baz"}. This is why we see baz three times in stdout.
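
One way to avoid the shared pointer, sketched here under a few assumptions, is to call the method on the slice element itself, since slice elements are addressable. In this sketch M records names into a slice instead of printing them (the out parameter is a hypothetical addition so the result can be inspected), and deferShared declares v outside the loop so that the shared-variable behaviour shows up on any Go version. The function names are hypothetical too.

```go
package main

import "fmt"

type T struct {
	name string
}

// M has a pointer receiver; it records the name instead of printing it.
func (t *T) M(out *[]string) {
	*out = append(*out, t.name)
}

// deferShared defers M on a pointer to one variable shared by all iterations.
func deferShared(items []T) []string {
	var out []string
	func() {
		var v T // shared across iterations on any Go version
		for _, v = range items {
			defer (&v).M(&out)
		}
	}() // deferred calls run here, when the inner function returns
	return out
}

// deferElements defers M on a pointer to each slice element.
func deferElements(items []T) []string {
	var out []string
	func() {
		for i := range items {
			// Shorthand for (&items[i]).M(&out); the receiver
			// is evaluated when the defer statement executes.
			defer items[i].M(&out)
		}
	}()
	return out
}

func main() {
	items := []T{{"foo"}, {"bar"}, {"baz"}}
	fmt.Println(deferShared(items))   // [baz baz baz]
	fmt.Println(deferElements(items)) // [baz bar foo]
}
```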

How is this different from the case where we have a value receiver?

package main

import "fmt"

type T struct {
	name string
}

// M has a value receiver.
func (t T) M() {
	fmt.Println(t.name)
}

func main() {
	for _, v := range []T{{"foo"}, {"bar"}, {"baz"}} {
		defer v.M()
	}
}

Now we aren’t using pointers. The receiver is evaluated and copied when the defer statement executes, so each deferred call gets its own copy of the slice element: v contains a copy of {"foo"} during the first iteration, {"bar"} during the second, and so forth. This is why we don’t have duplicates.
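
The difference between the two receiver kinds can be isolated without a loop. In the sketch below, which is an illustration and not code from the original post, a variable is modified after two defer statements: the value-receiver call sees the copy taken at defer time, while the pointer-receiver call sees the later modification. The method names Value and Pointer and the function deferThenMutate are hypothetical.

```go
package main

import "fmt"

type T struct {
	name string
}

// Value has a value receiver: it works on a copy of the T.
func (t T) Value(dst *string) {
	*dst = t.name
}

// Pointer has a pointer receiver: it reads the T through its address.
func (t *T) Pointer(dst *string) {
	*dst = t.name
}

// deferThenMutate defers both methods, then changes the variable.
func deferThenMutate() (byValue, byPointer string) {
	func() {
		v := T{"foo"}
		defer v.Value(&byValue)     // v is copied here, while name is "foo"
		defer v.Pointer(&byPointer) // only &v is saved here
		v.name = "changed"
	}()
	return byValue, byPointer
}

func main() {
	byValue, byPointer := deferThenMutate()
	fmt.Println(byValue)   // foo
	fmt.Println(byPointer) // changed
}
```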

The examples with go statements are pretty much the same. The only difference is that the scheduler may run the methods in a different order than the slice is traversed.
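
For the goroutine variant, a hedged sketch of one possible fix is to hand each goroutine a pointer to its own slice element. To make the result checkable, M here reports the name through a callback instead of printing (the record parameter and the runAll helper are hypothetical names), and the collected names are sorted because the scheduling order is not deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type T struct {
	name string
}

// M has a pointer receiver and reports the name through a callback.
func (t *T) M(record func(string)) {
	record(t.name)
}

// runAll starts one goroutine per element and returns the names, sorted.
func runAll(items []T) []string {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []string
	)
	record := func(s string) {
		mu.Lock()
		defer mu.Unlock()
		out = append(out, s)
	}
	for i := range items {
		wg.Add(1)
		// &items[i] is evaluated here, so each goroutine
		// receives a pointer to a different element.
		go func(t *T) {
			defer wg.Done()
			t.M(record)
		}(&items[i])
	}
	wg.Wait()
	sort.Strings(out) // goroutines may finish in any order
	return out
}

func main() {
	fmt.Println(runAll([]T{{"foo"}, {"bar"}, {"baz"}})) // [bar baz foo]
}
```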

