Go: What Does a Goroutine Switch Actually Involve?

Vincent Blanchon
Mar 1, 2020 · 5 min read
Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.13.

Goroutines are light; they only need a 2 KB stack to start running. They are also cheap to run; switching from one goroutine to another does not require many operations. Before jumping into the switch itself, let’s review how the switch works at a higher level.

Before continuing this article, I strongly suggest reading my article “Go: Goroutine, OS Thread and CPU Management” to understand the notions explained here.

Cases

Go schedules goroutines onto threads at two kinds of breakpoints: when a goroutine blocks (for example, on a channel), and during function calls, in the function prolog.

In both cases, g0, which runs the scheduler, replaces the current goroutine with another one that is ready to run. Then, the chosen goroutine replaces g0 and runs on the thread.

For more information about g0, I suggest you read my article “Go: g0, Special Goroutine.”

Switching a running goroutine for another involves two switches: from the running goroutine to g0, then from g0 to the next goroutine to run.

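In Go, a goroutine switch is really light. To save the state of the goroutine being switched out, it only needs two things: the program counter, which is the instruction the goroutine was executing, and the goroutine's stack, which holds its local variables.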

Let’s see how it works in practice.

Program counter

For the sake of the example, I will use goroutines that communicate through a channel, one that produces data and some that consume them. Here is the code:

The consumers will basically print the even numbers from 0 to 99. We will focus on the first goroutine — the producer — that adds numbers to the buffer. When the buffer gets full, it will block when sending a message. At this point, Go has to switch to g0 and schedule another goroutine.
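A minimal sketch of a program matching this description is given here. The buffer size, the number of consumers, and every function name are assumptions made for illustration. The consumers also collect the even numbers into a slice instead of printing them directly, so the result is easy to check.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// producer sends 0..n-1 into c and then closes it. Once the buffer is full,
// the send blocks: this is where the runtime switches away from the producer.
func producer(c chan<- int, n int) {
	for i := 0; i < n; i++ {
		c <- i
	}
	close(c)
}

// consumer receives values until c is closed and forwards the even ones.
func consumer(c <-chan int, out chan<- int, wg *sync.WaitGroup) {
	defer wg.Done()
	for v := range c {
		if v%2 == 0 {
			out <- v
		}
	}
}

// evens runs one producer and several consumers (hypothetical helper) and
// returns the even numbers below n in ascending order.
func evens(n, bufSize, consumers int) []int {
	c := make(chan int, bufSize)
	out := make(chan int, n) // large enough so consumers never block on it

	var wg sync.WaitGroup
	for i := 0; i < consumers; i++ {
		wg.Add(1)
		go consumer(c, out, &wg)
	}
	go producer(c, n)

	wg.Wait()
	close(out)

	var res []int
	for v := range out {
		res = append(res, v)
	}
	sort.Ints(res) // consumers run concurrently, so arrival order is not fixed
	return res
}

func main() {
	for _, v := range evens(100, 5, 3) {
		fmt.Println(v)
	}
}
```

With a small buffer such as 5, the producer fills the channel quickly and blocks on a send many times during the run, which is exactly the situation studied in the rest of the article.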

As seen previously, Go first needs to save the current instruction in order to restore the goroutine at the same instruction. The program counter (PC) is saved in an internal structure of the goroutine. Here is an example with the previous code:

The instructions and their addresses can be found with the command go tool objdump, which accepts a -s flag to restrict the output to the symbols matching a regular expression. Here are the instructions of producer:

The program goes instruction by instruction before blocking on the channel in the function runtime.chansend1. Go saves the current program counter to an internal property of the current goroutine. In our example, Go saves the program counter 0x4268d0, an address inside the runtime, in the function runtime.chansend1:

Then, when g0 wakes the goroutine up, it will resume at the same instruction, looping on the values and pushing into the channel. Let’s move now to the stack management during the goroutine switch.
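The saved program counter itself is internal to the runtime, but the parked state of a goroutine can be observed from user code. The sketch below is an illustration rather than the method used in this article: it blocks a producer on a full channel and dumps all goroutine stacks with runtime.Stack. The helper name blockedDump, the buffer size, and the two-second timeout are assumptions.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"time"
)

// blockedDump (hypothetical helper) starts a producer that fills a buffered
// channel and then blocks on the next send. It returns a dump of all
// goroutine stacks taken once the producer is parked, or "" on timeout.
// The producer is intentionally left blocked; this is a demo, not a pattern.
func blockedDump(bufSize int) string {
	c := make(chan int, bufSize)
	go func() {
		for i := 0; ; i++ {
			c <- i // blocks once the buffer holds bufSize values
		}
	}()

	buf := make([]byte, 1<<16)
	deadline := time.Now().Add(2 * time.Second)
	for time.Now().Before(deadline) {
		n := runtime.Stack(buf, true) // true: include every goroutine
		dump := string(buf[:n])
		// A goroutine parked on a channel send is reported as [chan send].
		if strings.Contains(dump, "chan send") {
			return dump
		}
		time.Sleep(time.Millisecond)
	}
	return ""
}

func main() {
	fmt.Print(blockedDump(2))
}
```

The dump lists the producer with the state "chan send" and the source line of the blocked send. By default the traceback hides the runtime frames, so runtime.chansend1 does not appear in it.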

Stack

Before being blocked, the running goroutine has its original stack. This stack contains temporary memory like the variable i:

Then, when it blocks on the channel, the goroutine will be switched to g0 along with its stack, a bigger one:

Before the switch, the stack is saved so that it can be restored when the goroutine runs again:

We now have a complete view of the different operations involved in a goroutine switch. Let’s see now how it impacts performance.

We should note that some architectures, such as arm, need to save one more register: LR, the link register.
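The save and restore steps described in this section can be modelled in a few lines. This is a toy model, loosely inspired by the runtime's gobuf structure, and not the runtime's actual code: the types goroutine and thread, the methods park and resume, and all stack pointer values are hypothetical. Only the address 0x4268d0 comes from the example above.

```go
package main

import "fmt"

// gobuf is a simplified model of the state saved for each goroutine.
type gobuf struct {
	sp uintptr // stack pointer: where the goroutine's stack currently ends
	pc uintptr // program counter: the instruction to resume at
	lr uintptr // link register, only meaningful on architectures such as arm
}

// goroutine is a hypothetical stand-in for the runtime's g structure.
type goroutine struct {
	name  string
	sched gobuf
}

// thread models the registers of the OS thread and the goroutine it runs.
type thread struct {
	regs gobuf
	curg *goroutine
}

// park saves the registers into the running goroutine, then switches to g0.
func (t *thread) park(g0 *goroutine) {
	t.curg.sched = t.regs
	t.regs, t.curg = g0.sched, g0
}

// resume loads the saved registers of next, which becomes the running goroutine.
func (t *thread) resume(next *goroutine) {
	t.regs, t.curg = next.sched, next
}

// roundTrip parks the producer, lets a consumer run, then resumes the
// producer and returns the program counter it resumes at.
func roundTrip() uintptr {
	g0 := &goroutine{name: "g0", sched: gobuf{sp: 0x7000, pc: 0x1000}}
	producer := &goroutine{name: "producer"}
	consumer := &goroutine{name: "consumer", sched: gobuf{sp: 0x5000, pc: 0x2000}}

	t := &thread{curg: producer, regs: gobuf{sp: 0x3000, pc: 0x4268d0}}
	t.park(g0)         // the producer blocks: g to g0
	t.resume(consumer) // g0 picked the consumer: g0 to g
	t.regs.pc += 8     // the consumer makes some progress
	t.park(g0)         // the consumer blocks in turn
	t.resume(producer) // the producer is scheduled again
	return t.regs.pc
}

func main() {
	fmt.Printf("producer resumes at %#x\n", roundTrip()) // prints "producer resumes at 0x4268d0"
}
```

The point of the model is that a switch copies a handful of register values and nothing else. The contents of the stack stay where they are in memory; only the pointer to it is saved and restored.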

Operations

To measure the time a switch could take, we will use the program seen previously. However, it will not give a perfect view of the performance, since the result also depends on the time it takes to find the next goroutine to schedule. The way the goroutine is switched out could also impact the performance: a switch from a function prolog has more operations to do than a switch from a goroutine blocking on a channel.

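Let’s summarize the operations we are going to measure: the switch from the current goroutine to g0, the search by g0 for the next goroutine to run, and the switch from g0 to the selected goroutine.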

Here are some results:

The switches from g to g0 or g0 to g are the fastest phases. They contain a small fixed number of instructions contrary to the scheduler that checks many sources to find the next goroutine to run. This phase could even take more time, according to the running program.

This benchmark gives an order-of-magnitude estimate of the performance. It should be taken with a pinch of salt, since there is no standard tool to measure this. Also, the performance depends on the architecture, the machine (I’m running it on my Mac with a 2.9 GHz Dual-Core Intel Core i5), and the running program.
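For a rough figure without instrumenting the runtime, a ping-pong between two goroutines can be timed from user code. This is not the benchmark used above. It cannot separate the three phases, and it also includes the cost of the channel operations, so it only gives an upper bound on a full handoff. The function name pingPong and the iteration count are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// pingPong (hypothetical helper) bounces a token between two goroutines n
// times over unbuffered channels and returns the total elapsed time. Each
// round trip forces both goroutines to block and be rescheduled.
func pingPong(n int) time.Duration {
	ping, pong := make(chan struct{}), make(chan struct{})
	go func() {
		for range ping {
			pong <- struct{}{}
		}
		close(pong) // signals that the helper goroutine has exited
	}()

	start := time.Now()
	for i := 0; i < n; i++ {
		ping <- struct{}{}
		<-pong
	}
	elapsed := time.Since(start)

	close(ping)
	<-pong // wait for the helper goroutine to finish
	return elapsed
}

func main() {
	const n = 1000000
	d := pingPong(n)
	fmt.Printf("%d round trips in %v (%v per round trip)\n", n, d, d/time.Duration(n))
}
```

Running it with GOMAXPROCS=1 keeps both goroutines on the same thread, which is closer to the scenario described in this article. With several processors, the two goroutines may run on different threads and the figure then reflects cross-thread wakeups as well.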
