Golang: Cost of using the heap

Anurag Shrinagar · Published in Invalid Memory · 5 min read · Apr 7, 2020

Golang is a stack-preferred language. Unlike other object-oriented languages such as Java, it allocates space for its data on the stack rather than the heap wherever it can. The rationale behind initializing primitives and structs on the stack is to save the garbage collector's operational cost. This significantly impacts coding style compared to other languages, which most newbie Go developers don't realize. There are certain caveats in Go that move data to the heap, and an operation can suddenly become 5-10 times as costly. Many articles online explain those caveats, but before we go there, we feel the need to demonstrate the data limits at which the stack and heap work efficiently.

First, let's understand what data size Go is comfortable keeping on its stack. To find out, we performed an escape analysis of the following code.

package main

func main() {
	u := make([]int, 8191) // does not escape to heap
	_ = u

	v := make([]int, 8192) // escapes to heap at 64 KB
	_ = v

	var w [1024 * 1024 * 1.25]int
	_ = w

	var x [1024*1024*1.25 + 1]int // moved to heap above 10 MB
	_ = x
}

Output:

$ go build -gcflags '-m' ./main.go
./main.go:3:6: can inline main
./main.go:13:6: moved to heap: x
./main.go:4:11: make([]int, 8191) does not escape
./main.go:7:11: make([]int, 8192) escapes to heap

This experiment was carried out on 64-bit hardware with Go 1.13.4. On a 64-bit system, an int is 8 bytes. We analyzed the memory allocations of implicit (slice) and explicit (array) variables. Memory for a slice is allocated on the stack as long as its size stays below 64 KB; at 64 KB the allocation escapes to the heap. For an array, memory is allocated on the stack up to 10 MB; beyond that, it is moved to the heap. We don't fully understand why the limits for implicit and explicit variables differ, although we have confirmed these figures in the Go source code. The key point is that Go usually keeps quite large data on its stack (KBs or MBs, depending on your choice of variable).
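Note that these thresholds apply only to compile-time-constant sizes. A slice whose length is not a constant escapes to the heap no matter how small it turns out to be at run time, because the compiler cannot prove it fits in the frame. A minimal sketch (the helper `sum` and the `//go:noinline` directive are ours for illustration; the reported escape was observed on our toolchain and may vary between Go versions):

```go
package main

import "fmt"

// sum receives its slice length at run time. Because n is not a
// compile-time constant, `go build -gcflags '-m'` reports
// "make([]int, n) escapes to heap" here, even for tiny n.
//go:noinline
func sum(n int) int {
	s := make([]int, n) // escapes: size unknown at compile time
	for i := range s {
		s[i] = i
	}
	total := 0
	for _, v := range s {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(100)) // 0+1+...+99 = 4950
}
```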

To understand the cost of using the heap against the stack, we first need to understand at least one of the caveats that move data to the heap. Let's run an escape analysis on the program below.

package main

func return80b() [10]int {
	var s [10]int
	return s
}

func return80bPointer() *[10]int {
	var s [10]int
	return &s
}

Output:

$ go build -gcflags '-m' ./functions.go
# command-line-arguments
./functions.go:3:6: can inline return80b
./functions.go:8:6: can inline return80bPointer
./functions.go:9:6: moved to heap: s

The first function, return80b, returns the array by value, while the second, return80bPointer, returns a pointer to the array. When a function returns, its frame on the stack becomes invalid. If the function returns a pointer, the caller still has to be able to reference the data at that address, so the compiler moves the array to the heap; the escape analysis reports this as "moved to heap: s" inside return80bPointer.
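A common way to avoid this escape is to let the caller allocate the array and pass a pointer down, instead of returning a pointer to a local. A minimal sketch (the name fill80b is ours, for illustration): since the callee only writes through the pointer and never stores it, escape analysis can keep the array in the caller's frame.

```go
package main

import "fmt"

// fill80b writes into a caller-provided array. The parameter does not
// escape (the pointer is never stored anywhere), so the array can stay
// on the caller's stack.
func fill80b(s *[10]int) {
	for i := range s { // ranging over *[10]int yields indices 0..9
		s[i] = i
	}
}

func main() {
	var s [10]int // allocated in main's frame, not on the heap
	fill80b(&s)
	fmt.Println(s[9]) // 9
}
```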

Now we know at least one way to force data onto the heap rather than the stack. Our benchmarking will show whether it is more efficient to keep the data on the stack or to move it to the heap. We benchmarked arrays of 80 B, 8 KB, and 8 MB in size. Here is our final program:

package main

import "testing"

var kb8 [1024]int
var kb8Pt *[1024]int

var b80 [10]int
var b80Pt *[10]int

var mb8 [1024 * 1024]int
var mb8Pt *[1024 * 1024]int

func BenchmarkCopying80b(b *testing.B) {
	for n := 0; n < b.N; n++ {
		b80 = return80b()
	}
}

func BenchmarkCopying80bPointer(b *testing.B) {
	for n := 0; n < b.N; n++ {
		b80Pt = return80bPointer()
	}
}

func BenchmarkCopying8kb(b *testing.B) {
	for n := 0; n < b.N; n++ {
		kb8 = return8kb()
	}
}

func BenchmarkCopying8kbPointer(b *testing.B) {
	for n := 0; n < b.N; n++ {
		kb8Pt = return8kbPointer()
	}
}

func BenchmarkCopying8mb(b *testing.B) {
	for n := 0; n < b.N; n++ {
		mb8 = return8mb()
	}
}

func BenchmarkCopying8mbPointer(b *testing.B) {
	for n := 0; n < b.N; n++ {
		mb8Pt = return8mbPointer()
	}
}

func return80b() [10]int {
	var s [10]int
	return s
}

func return80bPointer() *[10]int {
	var s [10]int
	return &s
}

func return8kb() [1024]int {
	var s [1024]int
	return s
}

func return8kbPointer() *[1024]int {
	var s [1024]int
	return &s
}

func return8mb() [1024 * 1024]int {
	var s [1024 * 1024]int
	return s
}

func return8mbPointer() *[1024 * 1024]int {
	var s [1024 * 1024]int
	return &s
}

And here is the result of our benchmarking:

BenchmarkCopying80b-12          317421898         3.63 ns/op
BenchmarkCopying80bPointer-12    50306946         23.8 ns/op
BenchmarkCopying8kb-12            6823657          177 ns/op
BenchmarkCopying8kbPointer-12     1510377          765 ns/op
BenchmarkCopying8mb-12                896      1234269 ns/op
BenchmarkCopying8mbPointer-12        1855       728266 ns/op

If you are not familiar with benchmarking in Go, you only need to understand the last column, which shows the time spent in nanoseconds per call of the function. We compared three cases, each measuring the cost of returning a value versus a pointer. When the array is returned by value, all of its entries are copied into a variable in the caller function. When a pointer is returned, the data is moved to the heap and only its reference, about 8 bytes, is copied into the caller's variable. So this is really a comparison between duplicating the data and dereferencing it on the heap.

The analysis shows that at around 80 B, duplication takes only ~3-4 ns against ~23-24 ns for moving the data to the heap and dereferencing, almost 7-8 times as much. When the array size reaches 8 KB, duplication takes about 177 ns against 765 ns for the heap version, roughly 4 times as much. We also ran a comparison at a 4 MB array size, where both operations took almost the same time. Using the heap and dereferencing only becomes more efficient once the array size starts touching 8 MB: as the results show, duplication at that size took about 1.7 times longer.

Conclusion

We came to the conclusion that Go is quite efficient at keeping data on the stack as long as the data size is on the order of 1 MB or less. Note that in this analysis we made only one copy of the data; the results can change drastically depending on the depth of your call stack and the number of duplications performed against a single move to the heap with dereferencing everywhere else. So you need to understand the depth of your own stack and benchmark accordingly. It is also important to understand the caveats of Go that move data from the stack to the heap. The second part of this post will focus on the caveats that developers often ignore, ending up moving a lot of data to the heap or exposing vulnerabilities. Enjoy coding in Go till then!
