Go: How Does the Goroutine Stack Size Evolve?

Vincent Blanchon
Jun 1, 2019 · 8 min read
Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.12.

Go provides light and smart goroutine management. Light because the goroutine stack starts at only 2 KB, and smart because goroutine stacks grow and shrink automatically according to our needs.

The initial size of the stack is defined in runtime/stack.go:

// The minimum size of stack used by Go code
_StackMin = 2048

We should note that it has evolved over time:

  • Go 1.2: the goroutine stack was increased from 4 KB to 8 KB.
  • Go 1.4: the goroutine stack was decreased from 8 KB to 2 KB.

The stack size has changed because of the stack allocation strategy. We will come back to this topic later in this article.

This default stack size is sometimes not enough to run our program. This is when Go automatically adjusts the size of the stack.

Dynamic stack size

package main

import "strconv"

func main() {
	a := 1
	b := 2

	r := max(a, b)
	println(`max: ` + strconv.Itoa(r))
}

// max returns the larger of a and b.
func max(a int, b int) int {
	if a >= b {
		return a
	}

	return b
}

This first example just returns the larger of two integers. To see how Go manages the allocation of the goroutine's stack, we can look at the assembly generated by the compiler with the command: go build -gcflags -S main.go. The output (I kept only the lines related to stack allocation) contains some interesting lines that show what Go is doing:

"".main STEXT size=186 args=0x0 locals=0x70
0x0000 00000 (/go/src/main.go:5) TEXT "".main(SB), ABIInternal, $112-0
[...]
0x00b0 00176 (/go/src/main.go:5) CALL runtime.morestack_noctxt(SB)
[...]
0x0000 00000 (/go/src/main.go:13) TEXT "".max(SB), NOSPLIT|ABIInternal, $0-24

Two elements are related to stack changes:
- CALL runtime.morestack_noctxt: this function grows the stack if the goroutine needs more space.
- NOSPLIT: this flag means that the stack overflow check is omitted for the function. It has the same effect as the compiler directive //go:nosplit.
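To make the second point concrete, here is a minimal sketch of the //go:nosplit directive applied to a small leaf function. The function name add is hypothetical and only serves the illustration. Compiling it with go build -gcflags -S should show NOSPLIT on its TEXT line. The directive is only reasonable for functions with tiny frames, because the overflow check is what allows the stack to grow.

```go
package main

// add is a hypothetical leaf function used only for illustration.
// The directive below asks the compiler to omit the stack overflow
// check from the function prologue.
//
//go:nosplit
func add(a, b int) int {
	return a + b
}

func main() {
	println(add(1, 2))
}
```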

If we look at the function runtime.morestack_noctxt, it ends up calling the function newstack from runtime/stack.go:

func newstack() {
	[...]
	// Allocate a bigger segment and move the stack.
	oldsize := gp.stack.hi - gp.stack.lo
	newsize := oldsize * 2
	if newsize > maxstacksize {
		print("runtime: goroutine stack exceeds ", maxstacksize, "-byte limit\n")
		throw("stack overflow")
	}

	// The goroutine must be executing in order to call newstack,
	// so it must be Grunning (or Gscanrunning).
	casgstatus(gp, _Grunning, _Gcopystack)

	// The concurrent GC will not scan the stack while we are doing the copy since
	// the gp is in a Gcopystack status.
	copystack(gp, newsize, true)
	if stackDebug >= 1 {
		print("stack grow done\n")
	}
	casgstatus(gp, _Gcopystack, _Grunning)
}

The size of the current stack is first calculated from the boundaries gp.stack.hi and gp.stack.lo, which are pointers to the two ends of the stack:

type stack struct {
	lo uintptr
	hi uintptr
}

Then the current size is multiplied by 2 and checked against the maximum allowed size, which depends on the architecture:

// Max stack size is 1 GB on 64-bit, 250 MB on 32-bit.
// Using decimal instead of binary GB and MB because
// they look nicer in the stack overflow failure message.
if sys.PtrSize == 8 {
	maxstacksize = 1000000000
} else {
	maxstacksize = 250000000
}
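The doubling rule and the limit check can be modeled outside the runtime. The sketch below is not the runtime's code: the names grow, doublings, stackMin and maxStackSize are chosen for illustration, and the constants are the 2048-byte minimum and the 64-bit limit quoted above. It counts how many times a stack could double before the next doubling would exceed the limit.

```go
package main

import "fmt"

const (
	stackMin     = 2048       // initial size, as in _StackMin
	maxStackSize = 1000000000 // 64-bit limit quoted above
)

// grow returns the next stack size, or an error once doubling
// would exceed the limit, in the spirit of the check in newstack.
func grow(oldsize uintptr) (uintptr, error) {
	newsize := oldsize * 2
	if newsize > maxStackSize {
		return 0, fmt.Errorf("goroutine stack exceeds %d-byte limit", maxStackSize)
	}
	return newsize, nil
}

// doublings counts how many times a stack starting at stackMin can
// double before hitting the limit, and returns the last valid size.
func doublings() (int, uintptr) {
	size, n := uintptr(stackMin), 0
	for {
		next, err := grow(size)
		if err != nil {
			return n, size
		}
		size, n = next, n+1
	}
}

func main() {
	n, size := doublings()
	fmt.Println(n, size) // 18 536870912
}
```

With these constants, a stack can double 18 times and reach 512 MiB; the next doubling would pass the 1 GB limit and trigger the stack overflow error.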

Now that we know the behavior, we can write a simple example to verify all of that. To get debug output, we set the constant stackDebug that we have seen in the newstack function to 1 and run:

package main

func main() {
	var x [10]int
	a(x)
}

//go:noinline
func a(x [10]int) {
	println(`func a`)
	var y [100]int
	b(y)
}

//go:noinline
func b(x [100]int) {
	println(`func b`)
	var y [1000]int
	c(y)
}

//go:noinline
func c(x [1000]int) {
	println(`func c`)
}

The directive //go:noinline prevents the compiler from inlining these functions into main. If the compiler inlined them, we would not see the dynamic growth of the stack triggered by each function prologue.

Here is part of the debug output we got:

runtime: newstack sp=0xc00002e6d8 stack=[0xc00002e000, 0xc00002e800]
stack grow done
func a
runtime: newstack sp=0xc000076888 stack=[0xc000076000, 0xc000077000]
stack grow done
runtime: newstack sp=0xc00003f888 stack=[0xc00003e000, 0xc000040000]
stack grow done
runtime: newstack sp=0xc000081888 stack=[0xc00007e000, 0xc000082000]
stack grow done
func b
runtime: newstack sp=0xc0000859f8 stack=[0xc000082000, 0xc00008a000]
func c

We can see that the stack has grown 4 times. The function prologue grows the stack as many times as necessary to fit its needs. As we have seen in the code, the stack size is defined by the boundaries of the stack, so we can calculate the stack size in each case. The trace line newstack stack=[...] provides the boundaries of the current stack:

runtime: newstack sp=0xc00002e6d8 stack=[0xc00002e000, 0xc00002e800]
0xc00002e800 - 0xc00002e000 = 2048
runtime: newstack sp=0xc000076888 stack=[0xc000076000, 0xc000077000]
0xc000077000 - 0xc000076000 = 4096
runtime: newstack sp=0xc00003f888 stack=[0xc00003e000, 0xc000040000]
0xc000040000 - 0xc00003e000 = 8192
runtime: newstack sp=0xc000081888 stack=[0xc00007e000, 0xc000082000]
0xc000082000 - 0xc00007e000 = 16384
runtime: newstack sp=0xc0000859f8 stack=[0xc000082000, 0xc00008a000]
0xc00008a000 - 0xc000082000 = 32768
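The subtractions above can be reproduced with a few lines of Go. This is only a sketch: the bounds type is a stand-in for the runtime's stack struct, the addresses are copied from the trace, and it assumes a 64-bit platform since the addresses do not fit in a 32-bit uintptr.

```go
package main

import "fmt"

// bounds mirrors the lo/hi pair of the runtime's stack struct.
type bounds struct {
	lo, hi uintptr
}

// size returns the number of bytes between the two boundaries.
func (b bounds) size() uintptr {
	return b.hi - b.lo
}

// trace holds the boundaries copied from the debug output above.
var trace = []bounds{
	{0xc00002e000, 0xc00002e800},
	{0xc000076000, 0xc000077000},
	{0xc00003e000, 0xc000040000},
	{0xc00007e000, 0xc000082000},
	{0xc000082000, 0xc00008a000},
}

func main() {
	for _, b := range trace {
		fmt.Printf("[%#x, %#x] = %d bytes\n", b.lo, b.hi, b.size())
	}
}
```

Each size is exactly twice the previous one, which matches the oldsize * 2 computation in newstack.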

This look into the internals showed us that the stack of a goroutine starts at 2 KB and grows as much as necessary in the function prologue, which is added at compile time, until there is enough memory or the stack limit is reached.
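A practical consequence is that deep recursion works without any configuration, as long as the stack stays under the limit. The following sketch, with a hypothetical function named depth, recurses 100,000 times; the stack it needs is far larger than the initial 2 KB, so the runtime has to grow it several times along the way.

```go
package main

import "fmt"

// depth recurses n times and returns n. Each frame carries a small
// local array so that the frames take up some space on the stack.
//
//go:noinline
func depth(n int) int {
	var pad [16]int
	pad[n%len(pad)] = n
	if n == 0 {
		return 0
	}
	// pad[n%len(pad)] - n is always 0; it only keeps pad in use.
	return depth(n-1) + 1 + pad[n%len(pad)] - n
}

func main() {
	fmt.Println(depth(100000)) // 100000
}
```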

Stack allocation management

runtime: newstack sp=0xc00002e6d8 stack=[0xc00002e000, 0xc00002e800]
copystack gp=0xc000000300 [0xc00002e000 0xc00002e6e0 0xc00002e800] -> [0xc000076000 0xc000076ee0 0xc000077000]/4096
stackfree 0xc00002e000 2048
stack grow done
runtime: newstack sp=0xc000076888 stack=[0xc000076000, 0xc000077000]
copystack gp=0xc000000300 [0xc000076000 0xc000076890 0xc000077000] -> [0xc00003e000 0xc00003f890 0xc000040000]/8192
stackfree 0xc000076000 4096
stack grow done

The first line shows the boundaries of the current stack, stack=[0xc00002e000, 0xc00002e800]. The runtime then copies it to a new stack twice as big, copystack [0xc00002e000 [...] 0xc00002e800] -> [0xc000076000 [...] 0xc000077000], which is 4096 bytes long as we have seen previously. Finally, the previous stack is freed: stackfree 0xc00002e000. Here is a diagram that helps to visualize what is happening:

Golang stack growth with contiguous stack

The function copystack copies the entire stack and adjusts all the pointers into it so that they refer to the new stack. We can verify that easily with a small modification of our code:

func main() {
	var x [10]int
	println(&x)
	a(x)
	println(&x)
}

It now prints the address of the value:

0xc00002e738
[...]
0xc000089f38

The address 0xc00002e738 lies within the first stack boundaries we saw, stack=[0xc00002e000, 0xc00002e800], while 0xc000089f38 lies within the last stack boundaries, stack=[0xc000082000, 0xc00008a000], that we have in the debug trace. This confirms that all values have been moved from stack to stack.
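The same observation can be made programmatically. The sketch below is an illustration under assumptions, not a guaranteed behavior: it snapshots the address of a local array as a uintptr, forces stack growth through a hypothetical recursive helper named burn, and snapshots the address again. The runtime adjusts real pointers during the copy, but a uintptr is just a number and is left alone, so the two snapshots are expected to differ when the stack has been copied. The names burn, observe, before and after are mine.

```go
package main

import "unsafe"

// before and after are package-level so that the observed addresses
// are stored in memory at the moment they are taken.
var before, after uintptr

// burn uses roughly n KB of stack through recursion to force growth.
//
//go:noinline
func burn(n int) byte {
	var pad [1024]byte
	pad[n%len(pad)] = byte(n)
	if n == 0 {
		return pad[0]
	}
	return burn(n-1) + pad[n%len(pad)]
}

// observe records the address of a local array before and after
// a call that needs much more stack than the initial allocation.
func observe() (uintptr, uintptr) {
	var x [10]int
	before = uintptr(unsafe.Pointer(&x))
	burn(64)
	after = uintptr(unsafe.Pointer(&x))
	return before, after
}

func main() {
	b, a := observe()
	println("before:", b, "after:", a, "moved:", b != a)
}
```

This is also why keeping a stack address in a uintptr across calls is unsafe: once the stack moves, the number points into memory that has been freed.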

It is also interesting to note that the stack will shrink, if needed, when garbage collection is triggered.
In our example, after the function call, there are no valid frames on the stack other than the one of main, so the runtime will be able to shrink it if the garbage collector runs. To see that, we can just force the garbage collector to run:

func main() {
	var x [10]int
	println(&x)
	a(x)
	runtime.GC()
	println(&x)
}

The debug trace now displays the shrink of the stack:

func c
shrinking stack 32768->16384
copystack gp=0xc000000300 [0xc000082000 0xc000089e60 0xc00008a000] -> [0xc00007e000 0xc000081e60 0xc000082000]/16384

As we can see, the stack size has been divided by 2, and the runtime reused a previous stack address range, stack=[0xc00007e000, 0xc000082000]. Here again, shrinkstack() in runtime/stack.go shows that a shrink always divides the current size by 2:

oldsize := gp.stack.hi - gp.stack.lo
newsize := oldsize / 2
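As a rough model of the shrink decision, here is a sketch with a hypothetical function named shrunk. It halves the size as shown above and adds two guards that, as far as I understand the runtime source, shrinkstack also applies: never go below the minimum allocation, and only shrink when the goroutine uses less than a quarter of its stack. The constant fixedStack is assumed to be 2048, and the model ignores the guard space the runtime adds to the used amount.

```go
package main

import "fmt"

// fixedStack is the assumed minimum allocation (2048 bytes here).
const fixedStack = 2048

// shrunk models one shrink attempt: it halves size unless the result
// would drop below fixedStack or a quarter or more of the stack is used.
func shrunk(size, used uintptr) uintptr {
	newsize := size / 2
	if newsize < fixedStack {
		return size
	}
	if used >= size/4 {
		return size
	}
	return newsize
}

func main() {
	// Values from the trace: hi=0xc00008a000, sp=0xc000089e60,
	// so 416 bytes of the 32768-byte stack are in use.
	used := uintptr(0xc00008a000 - 0xc000089e60)
	fmt.Println(shrunk(32768, used)) // 16384
}
```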

Contiguous stack VS segmented stack

func a
runtime: newstack framesize=0x3e90 argsize=0x320 sp=0x7f8875953848 stack=[0x7f8875952000, 0x7f8875953fa0]
-> new stack [0xc21001d000, 0xc210021950]
func b
func c
runtime: oldstack gobuf={pc:0x400cff sp:0x7f8875953858 lr:0x0} cret=0x1 argsize=0x320

The current stack, stack=[0x7f8875952000, 0x7f8875953fa0], is 8 KB in length (8096 bytes plus the size of the top of the stack) and the new stack created is 18864 bytes (18768 bytes plus the size of the top of the stack). The memory to be allocated is computed as follows:

// allocate new segment.
framesize += argsize;
framesize += StackExtra; // room for more functions, Stktop.
if(framesize < StackMin)
	framesize = StackMin;
framesize += StackSystem;

For the constants, StackExtra is set to 2048, StackMin is set to 8192, and StackSystem ranges from 0 to 512 or more depending on the platform.
So, our new stack is composed of: 16016 (frame size) + 800 (arguments) + 2048 (StackExtra) + 0 (StackSystem) = 18864 bytes.
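The same computation can be transcribed to Go to check the arithmetic. This is a sketch of the C snippet above, not the runtime's code: the function name segmentSize is mine, the constants are the values quoted in the text, and stackSystem is assumed to be 0 as in this trace.

```go
package main

import "fmt"

// Constants quoted in the article for the segmented-stack runtime;
// stackSystem is assumed to be 0, as in the trace above.
const (
	stackExtra  = 2048
	stackMin    = 8192
	stackSystem = 0
)

// segmentSize mirrors the C snippet: frame plus arguments plus extra
// room, raised to stackMin if smaller, plus the system reserve.
func segmentSize(framesize, argsize uintptr) uintptr {
	framesize += argsize
	framesize += stackExtra
	if framesize < stackMin {
		framesize = stackMin
	}
	framesize += stackSystem
	return framesize
}

func main() {
	// framesize=0x3e90 (16016) and argsize=0x320 (800) come from the trace.
	fmt.Println(segmentSize(0x3e90, 0x320)) // 18864
}
```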

Once all the functions have returned, the new stack segment is freed (log line runtime: oldstack). This behavior was one of the reasons that pushed the Go team to move to a contiguous stack:

Current split stack mechanism has a “hot split” problem — if the stack is almost full, a call will force a new stack chunk to be allocated. When that call returns, the new stack chunk is freed. If the same call happens repeatedly in a tight loop, the overhead of the alloc/free causes significant overhead

Go had to increase the minimum stack size to 8 KB in 1.2 for this reason, and was later able to reduce it to 2 KB after the implementation of the contiguous stack.
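The "hot split" problem can be illustrated with a toy model that only counts allocations. It is a simplification with hypothetical names (segmentedAllocs, contiguousAllocs) and made-up sizes: a stack that is almost full, and the same call repeated in a tight loop. With segments, every call that overflows allocates a segment that is freed on return; with a contiguous stack, the stack is doubled once and kept.

```go
package main

import "fmt"

// segmentedAllocs counts allocations when every call that overflows the
// current segment gets a fresh segment that is freed when it returns.
func segmentedAllocs(capacity, used, frame, calls int) int {
	allocs := 0
	for i := 0; i < calls; i++ {
		if used+frame > capacity {
			allocs++ // new segment for this call, freed on return
		}
	}
	return allocs
}

// contiguousAllocs counts allocations when an overflow doubles the
// stack and the larger stack is kept after the call returns.
func contiguousAllocs(capacity, used, frame, calls int) int {
	allocs := 0
	for i := 0; i < calls; i++ {
		for used+frame > capacity {
			capacity *= 2
			allocs++
		}
	}
	return allocs
}

func main() {
	// 8000 of 8192 bytes in use, and a 512-byte call repeated 1000 times.
	fmt.Println(segmentedAllocs(8192, 8000, 512, 1000))  // 1000
	fmt.Println(contiguousAllocs(8192, 8000, 512, 1000)) // 1
}
```

The contiguous strategy pays for one copy up front, while the segmented one pays an alloc/free pair on every iteration of the loop.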

Here is an update of our previous graph with the segmented stack:

Golang stack growth with segmented stack

Conclusion

If you want to go deeper into the stack details, I also suggest you read the blog post by Dave Cheney that talks about the redzone, along with the post from Bill Kennedy that explains the frames in the stack.
