Go: Goroutine and Preemption

ℹ️ This article is based on Go 1.13.
ℹ️ Go 1.14 introduces asynchronous preemption, which makes some parts of this article obsolete. Those sections are marked as such, so the article remains useful for understanding why asynchronous preemption was needed.
For more details about asynchronous preemption, I suggest you read “Go: Asynchronous Preemption.”
Go manages goroutines with an internal scheduler. This scheduler switches between goroutines and makes sure each of them gets running time. To keep this rotation fair, the scheduler sometimes needs to preempt a running goroutine.
Scheduler and preemption
Let’s use a simple example to show how the scheduler works:
For ease of reading, the examples will not use atomic operations.
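package main

import (
	"math/rand"
	"sync"
)

func main() {
	// total is updated without synchronization on purpose (see the note above).
	var total int
	var wg sync.WaitGroup

	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			for j := 0; j < 1000; j++ {
				total += readNumber()
			}
			wg.Done()
		}()
	}

	wg.Wait()
}

// readNumber draws from the shared generator of math/rand.
//go:noinline
func readNumber() int {
	return rand.Intn(10)
}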
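To reproduce a trace like the ones shown in this article, the workload can be wrapped with the standard runtime/trace package. The following is only a minimal sketch: the helper names captureTrace and workload and the file name trace.out are choices made for this illustration, and the workload uses no shared counter so that it stays free of data races.

```go
package main

import (
	"bytes"
	"fmt"
	"math/rand"
	"os"
	"runtime/trace"
	"sync"
)

// captureTrace records an execution trace while work runs and returns
// the raw trace data. It is a helper written for this sketch.
func captureTrace(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		return nil, err
	}
	work()
	trace.Stop() // returns once all trace data has been written to buf
	return buf.Bytes(), nil
}

// workload mirrors the shape of the example above: ten goroutines that
// each draw 1000 numbers from the shared generator of math/rand.
func workload() {
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				_ = rand.Intn(10)
			}
		}()
	}
	wg.Wait()
}

func main() {
	data, err := captureTrace(workload)
	if err != nil {
		fmt.Fprintln(os.Stderr, "trace:", err)
		return
	}

	// "trace.out" is an arbitrary file name chosen for this sketch.
	f, err := os.Create("trace.out")
	if err != nil {
		fmt.Fprintln(os.Stderr, "create:", err)
		return
	}
	defer f.Close()

	if _, err := f.Write(data); err != nil {
		fmt.Fprintln(os.Stderr, "write:", err)
		return
	}
	fmt.Printf("wrote %d bytes of trace data\n", len(data))
}
```

The resulting file can then be opened with go tool trace trace.out.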
Here is the tracing:

We can clearly see that the scheduler rotates goroutines across the processors, giving running time to all of them. To alternate the running time, Go reschedules a goroutine whenever it stops: on a system call, when blocking on a channel, sleeping, waiting on a mutex, etc. In the previous example, the scheduler takes advantage of the mutex in the number generator to give running time to all of the goroutines. This can also be visualized in the tracing:

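Blocking on a channel is one of the scheduling points listed above. As a small illustration, the sketch below lets two goroutines hand a token back and forth over unbuffered channels: each one blocks while waiting for the token, which is what lets the other one run. The function name pingPong and the labels "A" and "B" are made up for this example.

```go
package main

import "fmt"

// pingPong starts two goroutines that pass a token back and forth and
// returns the order in which they ran. Every access to order happens
// while holding the token, so the channels also synchronize the slice.
func pingPong(rounds int) []string {
	if rounds <= 0 {
		return nil
	}

	var order []string
	ping := make(chan struct{})
	pong := make(chan struct{})
	done := make(chan struct{})

	go func() { // goroutine A
		for i := 0; i < rounds; i++ {
			<-ping // blocks here until the token arrives
			order = append(order, "A")
			pong <- struct{}{}
		}
	}()

	go func() { // goroutine B
		for i := 0; i < rounds; i++ {
			<-pong // blocks here until A hands the token over
			order = append(order, "B")
			if i < rounds-1 {
				ping <- struct{}{}
			}
		}
		close(done)
	}()

	ping <- struct{}{} // hand the first token to A
	<-done
	return order
}

func main() {
	fmt.Println(pingPong(3)) // prints "[A B A B A B]"
}
```

Neither goroutine is ever preempted here; they simply give up the processor each time they block on a receive or a send.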
However, Go also needs a way to stop a running goroutine that never pauses on its own. This action, called preemption, allows the scheduler to switch goroutines. Any goroutine running for more than 10ms is marked as preemptible. The preemption then actually happens at the next function prolog, where the goroutine checks whether its stack needs to grow.
Let’s look at an example of this behavior by modifying the previous program so that the number generators no longer rely on a lock:
package main

import (
	"math/rand"
	"sync"
)

func main() {
	var total int
	var wg sync.WaitGroup

	for i := gen(0); i < 20; i++ {
		wg.Add(1)
		go func(g gen) {
			for j := 0; j < 1e7; j++ {
				total += g.readNumber()
			}
			wg.Done()
		}(i)
	}

	wg.Wait()
}

// One generator per goroutine, so no lock is shared between them.
var generators [20]*rand.Rand

func init() {
	for i := int64(0); i < 20; i++ {
		generators[i] = rand.New(rand.NewSource(i).(rand.Source64))
	}
}

type gen int

//go:noinline
func (g gen) readNumber() int {
	return generators[int(g)].Intn(10)
}
Here is the tracing:

However, the goroutines are preempted at the function prolog:

This check is automatically added by the compiler; here is an example of the asm code generated by the previous example:

The compiler inserts instructions in each function prolog so that the runtime can grow the stack when needed. This same check also gives the scheduler a chance to run if necessary.
Most of the time, the goroutines will give the scheduler the ability to run all of them. However, a loop without function calls could block the scheduling.
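One way to observe this is to measure how long a sleeping goroutine waits before it gets the processor back while another goroutine spins in a loop without function calls. The sketch below is only an illustration under stated assumptions: the names spin and wakeLatency are made up, and the program is restricted to a single processor with runtime.GOMAXPROCS(1). On Go 1.13, I would expect the reported delay to be close to the full duration of the loop, and on Go 1.14 or later to be much shorter; the exact figures depend on the machine.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// spin increments a counter n times. The loop body contains no function
// call, so it offers no cooperative preemption point.
func spin(n int) int {
	total := 0
	for j := 0; j < n; j++ {
		total++
	}
	return total
}

// wakeLatency reports how long the calling goroutine takes to resume
// after a 1ms sleep while another goroutine spins on the only processor.
func wakeLatency(n int) time.Duration {
	prev := runtime.GOMAXPROCS(1) // one processor: the goroutines must share it
	defer runtime.GOMAXPROCS(prev)

	done := make(chan int, 1)
	go func() { done <- spin(n) }()

	start := time.Now()
	time.Sleep(time.Millisecond) // the caller blocks; the spinner gets the processor
	latency := time.Since(start)

	<-done // wait for the spinner before returning
	return latency
}

func main() {
	fmt.Println("resumed after", wakeLatency(1000000000))
}
```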
Forcing preemption
Let’s start with a simple example that shows how a loop could block the scheduling:
package main

import "sync"

func main() {
	var total int
	var wg sync.WaitGroup

	for i := 0; i < 20; i++ {
		wg.Add(1)
		go func() {
			// No function call and nothing that blocks in this loop.
			for j := 0; j < 1e6; j++ {
				total++
			}
			wg.Done()
		}()
	}

	wg.Wait()
}
Since there are no function calls and the goroutines will never block, the scheduler does not preempt them. We can see that in the tracing:

However, Go provides several solutions to fix this issue:
- Forcing the scheduler to run with the function runtime.Gosched():
for j := 0; j < 1e8; j++ {
	if j%1e7 == 0 {
		runtime.Gosched()
	}
	total++
}
Here is the new tracing:

- Using the experiment that allows loops to be preempted. It can be activated by rebuilding the Go toolchain with GOEXPERIMENT=preemptibleloops set, or by adding the flag -gcflags -d=ssa/insert_resched_checks/on when running go build. This time, the code does not need to be modified; here is the new tracing:

When preemption is activated in the loops, the compiler will add a pass when generating the SSA code:

This pass will add instructions to call the scheduler from time to time:

For information about the Go compiler, I suggest you read my article “Go: Overview of the Compiler.”
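Conceptually, the effect of this pass resembles a hand-written check on each loop iteration that yields only when a yield has been requested. The sketch below illustrates the idea and is not the code the compiler generates: preemptRequested is a hypothetical flag that stands in for the runtime's internal signal, and requestPreempt and countWithBackEdgeCheck are names invented for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// preemptRequested is a hypothetical flag standing in for the runtime's
// internal "please yield" signal. It is not a real runtime variable.
var preemptRequested int32

// requestPreempt simulates the scheduler asking the goroutine to yield.
func requestPreempt() {
	atomic.StoreInt32(&preemptRequested, 1)
}

// countWithBackEdgeCheck increments a counter n times. At the end of
// each iteration it performs a cheap load of the flag and yields with
// runtime.Gosched only when a yield was requested. It returns the
// counter and the number of times it yielded.
func countWithBackEdgeCheck(n int) (total, yields int) {
	for j := 0; j < n; j++ {
		total++
		if atomic.LoadInt32(&preemptRequested) != 0 {
			atomic.StoreInt32(&preemptRequested, 0)
			runtime.Gosched()
			yields++
		}
	}
	return total, yields
}

func main() {
	requestPreempt()
	total, yields := countWithBackEdgeCheck(1000)
	fmt.Println(total, yields) // prints "1000 1"
}
```

The check runs on every iteration even though a yield is rarely requested, which gives an intuition for the overhead discussed next.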
However, this approach could slow the code down a bit since it probably makes the scheduler run more often than necessary. Here is a benchmark comparing the two versions:
name     old time/op  new time/op  delta
Loop-8   2.18s ± 2%   2.05s ± 1%   -6.23%
ℹ️ The issue raised in this section is fixed in Go 1.14 by asynchronous preemption. However, the two solutions explained here are still valid: runtime.Gosched() can be used to trigger the scheduler, and the preemptible loops option is still part of the Go toolchain.
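As a self-contained version of the runtime.Gosched() solution, the sketch below runs several workers that yield at a fixed interval. Unlike the snippets in this article, each worker keeps its own counter and the results are summed at the end, so the program has no data race. The names countWithYield and runWorkers and the yield interval are choices made for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// countWithYield increments a local counter n times and calls
// runtime.Gosched every yieldEvery iterations so that other goroutines
// get a chance to run. A non-positive yieldEvery disables yielding.
func countWithYield(n, yieldEvery int) int {
	total := 0
	for j := 0; j < n; j++ {
		if yieldEvery > 0 && j%yieldEvery == 0 {
			runtime.Gosched()
		}
		total++
	}
	return total
}

// runWorkers starts the given number of goroutines and returns the sum
// of their counters. Each goroutine writes only to its own slot.
func runWorkers(workers, n, yieldEvery int) int {
	results := make([]int, workers)
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = countWithYield(n, yieldEvery)
		}(i)
	}
	wg.Wait()

	sum := 0
	for _, r := range results {
		sum += r
	}
	return sum
}

func main() {
	fmt.Println(runWorkers(20, 1000000, 100000)) // prints "20000000"
}
```

Yielding less often keeps the cost of the extra calls low, at the price of longer stretches during which a worker cannot be interrupted on Go 1.13.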
Incoming improvements
As of now, the scheduler uses cooperative preemption techniques that cover most of the cases. However, in some unusual cases, it can become a real pain point. A proposal for a new “non-cooperative preemption” has been submitted that aims to solve this problem as explained in the document:
I propose that the Go implementation switch to non-cooperative preemption, which would allow goroutines to be preempted at essentially any point without the need for explicit preemption checks. This approach will solve the problem of delayed preemption and do so with zero runtime overhead.
The document suggests several techniques, along with their advantages and drawbacks, and the change could land in an upcoming version of Go.