Go: How Does Go Stop the World?

Vincent Blanchon
Jan 15, 2020
Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.13.

A “Stop the World” (STW) is a crucial phase in some garbage collector algorithms to keep track of the memory. It suspends the execution of the program so that the collector can scan the memory roots and enable write barriers. Let’s review how it works internally and the potential issues it can face.

Stop the World

Stopping the program means stopping the running goroutines. Here is a simple program that will “Stop the World”:
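A minimal sketch of such a program, assuming that an explicit call to `runtime.GC()` is what forces the collection. The helper name `forceGC` and the use of `runtime.MemStats` to count completed cycles are illustrative additions:

```go
package main

import (
	"fmt"
	"runtime"
)

// forceGC (hypothetical helper) forces one garbage collection and returns
// how many GC cycles completed while it ran.
func forceGC() uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before) // ReadMemStats itself briefly stops the world
	runtime.GC()                  // blocks until the collection is complete
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	fmt.Println("GC cycles completed:", forceGC())
}
```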

Running the garbage collector will trigger two “Stop the World” phases.

For more information about the garbage collector cycle, I suggest you read my article “Go: How Does the Garbage Collector Mark the Memory?”

The first step of this phase is to preempt all running goroutines:

Goroutine preemption

Once the goroutines are preempted, they stop at a safe point. Meanwhile, every processor P, whether it is running code or sitting in the idle list, is marked as stopped so that it cannot be used to run any code:

Ps are marked as stopped

Then, the Go scheduler runs, detaches each M from its P, and puts it in the idle list:

Ms are moved to the idle list

As for the goroutines that were running on each M, they wait in the global queue:

Goroutines are waiting in the global queue
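The steps above can be summarized with a toy model. This is only a sketch for illustration: the types `proc` and `world`, their fields, and the method `stop` are hypothetical and do not exist in the Go runtime, which handles these steps with far more care. The model also assumes that the goroutine requesting the stop keeps its M, so that something is left to do the work.

```go
package main

import "fmt"

// Toy model only: none of these types exist in the Go runtime.
type proc struct {
	id      int
	stopped bool
	m       string // thread attached to this P ("" if none)
	g       string // goroutine running on this P ("" if none)
}

type world struct {
	procs       []*proc
	idleMs      []string
	globalQueue []string
}

// stop applies the steps described above to every P. The P at index caller
// keeps its M and its goroutine (an assumption of this model).
func (w *world) stop(caller int) {
	for i, p := range w.procs {
		p.stopped = true // running or idle, every P is marked as stopped
		if i == caller {
			continue
		}
		if p.g != "" { // the preempted goroutine waits in the global queue
			w.globalQueue = append(w.globalQueue, p.g)
			p.g = ""
		}
		if p.m != "" { // the M is detached and moved to the idle list
			w.idleMs = append(w.idleMs, p.m)
			p.m = ""
		}
	}
}

// newWorld builds a small example: two busy Ps and one idle P.
func newWorld() *world {
	return &world{procs: []*proc{
		{id: 0, m: "M0", g: "G1"}, // the goroutine that stops the world
		{id: 1, m: "M1", g: "G2"},
		{id: 2}, // idle P
	}}
}

func main() {
	w := newWorld()
	w.stop(0)
	fmt.Println("idle Ms:", w.idleMs)           // prints "idle Ms: [M1]"
	fmt.Println("global queue:", w.globalQueue) // prints "global queue: [G2]"
}
```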

Then, once the world is stopped, the only goroutine still active can safely do its work and start the world again when it is done. The execution trace shows when this phase happens:

Stop the World “STW” phase in the tracing
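A trace like this one can be recorded with the standard `runtime/trace` package. The sketch below assumes a forced collection is what we want to observe; the helper name `traceGC` and the file name `trace.out` are arbitrary choices made for this example.

```go
package main

import (
	"bytes"
	"log"
	"os"
	"runtime"
	"runtime/trace"
)

// traceGC (hypothetical helper) records an execution trace around one
// forced garbage collection and returns the raw trace data.
func traceGC() ([]byte, error) {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		return nil, err
	}
	runtime.GC()
	trace.Stop() // returns once all trace data has been written to buf
	return buf.Bytes(), nil
}

func main() {
	data, err := traceGC()
	if err != nil {
		log.Fatal(err)
	}
	// trace.out is an arbitrary file name chosen for this example.
	f, err := os.Create("trace.out")
	if err != nil {
		log.Fatal(err)
	}
	if _, err := f.Write(data); err != nil {
		log.Fatal(err)
	}
	if err := f.Close(); err != nil {
		log.Fatal(err)
	}
}
```

The resulting file can then be opened with `go tool trace trace.out`.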

System calls

The “Stop the World” phase can affect system calls as well, since they may return while the world is stopped. Let’s take the example of a program that makes system calls intensively and see how this case is handled:
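A sketch of such a program, under the assumption that repeatedly opening and reading a file is a good enough source of system calls. The helpers `readHeader` and `syscallsDuringGC`, the choice of the running executable as the file to read, and the numbers passed in `main` are all illustrative.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"runtime"
	"sync"
)

// readHeader opens the file at path, reads up to 64 bytes and closes it
// again, which costs several system calls per invocation.
func readHeader(path string) (int, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	return f.Read(make([]byte, 64))
}

// syscallsDuringGC (hypothetical helper) starts `workers` goroutines that
// each call readHeader `reads` times, and forces `gcCycles` collections in
// the meantime. It returns the total number of bytes read.
func syscallsDuringGC(workers, reads, gcCycles int) (int, error) {
	path, err := os.Executable() // any readable, non-empty file would do
	if err != nil {
		return 0, err
	}
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		total    int
		firstErr error
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < reads; i++ {
				n, err := readHeader(path)
				mu.Lock()
				total += n
				if err != nil && firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
				if err != nil {
					return
				}
			}
		}()
	}
	for i := 0; i < gcCycles; i++ {
		runtime.GC() // each cycle stops the world while system calls are in flight
	}
	wg.Wait()
	return total, firstErr
}

func main() {
	total, err := syscallsDuringGC(4, 1000, 10)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("bytes read:", total)
}
```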

Here is the tracing:

A system call exiting while the world is stopped, as seen in the tracing

Here, the system call returns while the world is stopped. However, no P is available, since they are all marked as stopped as seen in the previous section, so the goroutine is put in the global queue and runs later, when the world resumes.

Latencies

The third step of the “Stop the World” requires all the Ms to be detached from their P. However, Go waits for them to stop voluntarily: when the scheduler runs, during a system call, etc. Waiting for a goroutine to be preempted should be fast, but in some cases it can introduce latency. Let’s take an example that shows an extreme case:
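A sketch of such an extreme case, assuming a goroutine stuck in a tight loop without any function call while the main goroutine forces a collection. The helpers `spin` and `gcDurationWhileSpinning` and the iteration count are illustrative, and the program needs more than one P to show the effect.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// spin is a tight loop without any function call, so on Go 1.13 nothing in
// the loop gives the scheduler a chance to preempt the goroutine running it.
func spin(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i
	}
	return sum
}

// gcDurationWhileSpinning (hypothetical helper) starts one spinning
// goroutine and measures how long a forced collection takes to return.
// The measurement covers the whole runtime.GC() call, STW wait included.
func gcDurationWhileSpinning(n int) time.Duration {
	done := make(chan int)
	go func() { done <- spin(n) }()
	time.Sleep(10 * time.Millisecond) // give the goroutine time to start spinning

	start := time.Now()
	runtime.GC()
	elapsed := time.Since(start)

	<-done // wait for the spinning goroutine before returning
	return elapsed
}

func main() {
	// 2000000000 iterations is an arbitrary value picked to keep the loop busy.
	fmt.Println("runtime.GC() returned after", gcDurationWhileSpinning(2000000000))
}
```

The measured duration depends on the Go version, the machine, and the iteration count.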

Here, the “Stop the World” phase takes 2.6 seconds:

A “Stop the World” phase of 2.6 seconds in the tracing

In Go 1.13, a goroutine without function calls cannot be preempted, and its P is not released before the end of the task. This forces the “Stop the World” to wait for it.

ℹ️ The issue raised in this section was fixed in Go 1.14 with asynchronous preemption: the runtime is now able to send a signal to the thread running a goroutine in order to preempt it. For more details about asynchronous preemption, I suggest you read “Go: Asynchronous Preemption.”

