A goroutine leak can easily be detected with an APM (application performance monitoring) tool that tracks the number of live goroutines. Here is an example from New Relic of a graph that monitors the goroutines:
A leak would lead to a continuous increase of this number until the server crashes. However, there are ways to prevent leaks even before the code is deployed.
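To make the "continuous increase" concrete, here is a minimal sketch that samples runtime.NumGoroutine, the standard library call that reports the number of live goroutines, while a leak is simulated. The helper steadilyGrowing and the sampling loop are illustrative assumptions, not how any particular APM computes its graph.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// steadilyGrowing reports whether every sample is larger than the one
// before it, which is the pattern a leak produces on a goroutine graph.
// (Hypothetical helper, for illustration only.)
func steadilyGrowing(samples []int) bool {
	if len(samples) < 2 {
		return false
	}
	for i := 1; i < len(samples); i++ {
		if samples[i] <= samples[i-1] {
			return false
		}
	}
	return true
}

func main() {
	block := make(chan struct{}) // never closed, so receivers never return

	var samples []int
	for i := 0; i < 5; i++ {
		// Simulated leak: one more stuck goroutine per iteration.
		go func() { <-block }()
		samples = append(samples, runtime.NumGoroutine())
		time.Sleep(10 * time.Millisecond)
	}
	fmt.Println("samples:", samples, "growing:", steadilyGrowing(samples))
}
```

In a real service the samples would be exported to the monitoring system rather than printed.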
The Go team at Uber, which is very active on GitHub, built its own goroutine leak detector (goleak), a tool designed to be integrated with unit tests. The package checks for goroutines leaked by the code under test. Here is an example of a function with a goroutine leak:
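A minimal sketch of such a function could look like the following. The names leak and extraGoroutines, as well as the one-minute sleep, are assumptions chosen for illustration. The second helper only uses runtime.NumGoroutine from the standard library to show that a goroutine outlives the call. It is not the detector itself.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leak returns immediately but leaves a goroutine behind: nothing waits
// for it and nothing can cancel it.
func leak() error {
	go func() {
		time.Sleep(time.Minute)
	}()
	return nil
}

// extraGoroutines runs f and returns how many more goroutines are alive
// afterwards than before. (Hypothetical helper, for illustration only.)
func extraGoroutines(f func()) int {
	before := runtime.NumGoroutine()
	f()
	return runtime.NumGoroutine() - before
}

func main() {
	n := extraGoroutines(func() { _ = leak() })
	fmt.Println("goroutines still alive after leak():", n)
}
```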
And here is the test for that function:
Running the tests highlights the leak:
The error message provides two useful pieces of information:
- The top of the stack of the leaking goroutine, along with the state of the goroutine. This information can help to debug quickly and understand which goroutine is leaking.
- The goroutine ID, useful when visualizing the execution with the tracer. Here is an example of the traces generated with the tests via go test ./... -trace trace.out:
Then, from those traces, you can access the detailed execution of that goroutine.
The leaking goroutine has been detected, and we also have information about that leak. Let’s now understand how it works so we can be aware of the limitations of that detection.
The only requirement to enable leak detection is to call the library at the end of the test. Strictly speaking, it checks for any extra goroutine rather than only leaked ones.
The leak detector first lists all the existing goroutines. Here is the list with the previous example:
The goroutine stacks are provided by the exported function runtime.Stack from the Go standard library, so they are accessible to anyone who needs this information. However, the goroutine IDs are not exported by any API; they only appear in this textual output.
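As a rough illustration of this step, the sketch below dumps every goroutine with runtime.Stack and recovers the ID and state by parsing header lines of the form "goroutine 18 [running]:". The helpers allStacks and parseHeader are hypothetical names, and the parsing is an assumption about the textual format rather than the detector's actual code.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
)

// allStacks returns the stack dump of every goroutine, growing the
// buffer until the whole dump fits.
func allStacks() string {
	buf := make([]byte, 64<<10)
	for {
		n := runtime.Stack(buf, true) // true = include all goroutines
		if n < len(buf) {
			return string(buf[:n])
		}
		buf = make([]byte, 2*len(buf))
	}
}

// parseHeader extracts the ID and state from a header line such as
// "goroutine 18 [running]:". It returns false for any other line.
func parseHeader(line string) (int, string, bool) {
	if !strings.HasPrefix(line, "goroutine ") {
		return 0, "", false
	}
	rest := strings.TrimPrefix(line, "goroutine ")
	fields := strings.Fields(rest)
	if len(fields) == 0 {
		return 0, "", false
	}
	id, err := strconv.Atoi(fields[0])
	if err != nil {
		return 0, "", false
	}
	open, end := strings.Index(rest, "["), strings.LastIndex(rest, "]")
	if open < 0 || end < open {
		return 0, "", false
	}
	return id, rest[open+1 : end], true
}

func main() {
	for _, line := range strings.Split(allStacks(), "\n") {
		if id, state, ok := parseHeader(line); ok {
			fmt.Printf("goroutine %d is in state %q\n", id, state)
		}
	}
}
```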
Then, from this list, the leak detector runs its analysis by parsing those lines and removing the goroutines that belong to the standard library, such as:
- The goroutines created by the test package to run the tests — the second goroutine (#1) in the previous example.
- The goroutines created by the runtime, such as those that listen for received signals. For more details, I suggest you read “Go: gsignal, Master of Signals.”
- The current running goroutine — the first goroutine (#18) in the previous example.
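The filtering described in the list above can be sketched roughly as follows. This is a simplified approximation under stated assumptions: the current goroutine is taken to be the first block of the dump, and a goroutine is ignored when its top frame starts with one of a few hard-coded prefixes. The names filterStacks, ignored and ignoredPrefixes are hypothetical, and the real detector's rules are more precise.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
	"time"
)

// ignoredPrefixes lists top-of-stack function prefixes treated as noise.
// Assumption: a deliberately crude stand-in for the detector's own rules.
var ignoredPrefixes = []string{"testing.", "os/signal.", "runtime."}

// ignored reports whether the top frame belongs to an ignored package.
func ignored(topFrame string) bool {
	for _, p := range ignoredPrefixes {
		if strings.HasPrefix(topFrame, p) {
			return true
		}
	}
	return false
}

// filterStacks splits a runtime.Stack dump into per-goroutine blocks and
// returns the ones that are neither the current goroutine nor ignored.
func filterStacks(dump string) []string {
	var extra []string
	blocks := strings.Split(strings.TrimSpace(dump), "\n\n")
	for i, block := range blocks {
		if i == 0 {
			continue // the goroutine that took the dump is listed first
		}
		lines := strings.Split(block, "\n")
		// lines[0] is the header, lines[1] the top frame of the stack.
		if len(lines) < 2 || ignored(lines[1]) {
			continue
		}
		extra = append(extra, block)
	}
	return extra
}

func main() {
	go time.Sleep(time.Minute) // deliberately left running
	time.Sleep(10 * time.Millisecond)

	buf := make([]byte, 1<<20)
	dump := string(buf[:runtime.Stack(buf, true)])
	for _, g := range filterStacks(dump) {
		fmt.Println(g)
		fmt.Println()
	}
}
```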
Finally, once the list is filtered, if no goroutine remains, no leak occurred. However, this approach has some limitations:
- A third-party library or internal code that starts a background goroutine without properly handling it will raise a false positive.
- A false positive also occurs if a goroutine leaks in another test that does not use the leak detector. If that goroutine is still running the next time the detector is used, the detector will wrongly report it as a leak of the current test.
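One possible way to soften the second limitation is to record which goroutine IDs already exist before the test starts and to report only the IDs that appear afterwards. The sketch below shows just that comparison on plain integers. The function newSince and the sample IDs are hypothetical, and this is an illustration of the idea rather than a description of what the detector does.

```go
package main

import "fmt"

// newSince returns the IDs present in after but absent from before.
// (Hypothetical helper, for illustration only.)
func newSince(before, after []int) []int {
	seen := make(map[int]bool, len(before))
	for _, id := range before {
		seen[id] = true
	}
	var fresh []int
	for _, id := range after {
		if !seen[id] {
			fresh = append(fresh, id)
		}
	}
	return fresh
}

func main() {
	before := []int{1, 18, 19}    // 19 leaked from an earlier test
	after := []int{1, 18, 19, 23} // 23 was started by the current test
	fmt.Println(newSince(before, after)) // prints [23]
}
```

The trade-off is that a goroutine leaked by an earlier test is then silently ignored instead of being reported against the wrong test.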
This tool is not perfect, but knowing its possibilities and limitations can help detect leaks from the tests, avoiding the need to debug code already pushed to production.
It is interesting to note that this technique is also widely used in the net/http package to spot leaking goroutines. Here is an example of some tests:
Here again, the internal function afterTest looks at the goroutine stacks to find the ones that have potentially leaked.
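A check in that spirit can be sketched as a scan of the stack dump for telltale substrings, retried a few times because goroutines may still be shutting down when the test ends. This is only loosely modeled on the idea. The type suspect, the functions firstSuspect and checkWithRetry, and the sample substring are assumptions for illustration, not the net/http test code.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// suspect pairs a substring to look for in the stacks with a label
// describing what probably leaked. (Hypothetical type.)
type suspect struct{ substr, what string }

// firstSuspect returns the label of the first suspect whose substring
// appears in stacks, or "" when none matches.
func firstSuspect(stacks string, suspects []suspect) string {
	for _, s := range suspects {
		if strings.Contains(stacks, s.substr) {
			return s.what
		}
	}
	return ""
}

// checkWithRetry polls dump up to attempts times, waiting between
// polls, and returns "" as soon as no suspect is found.
func checkWithRetry(dump func() string, suspects []suspect, attempts int, wait time.Duration) string {
	bad := ""
	for i := 0; i < attempts; i++ {
		if bad = firstSuspect(dump(), suspects); bad == "" {
			return ""
		}
		time.Sleep(wait)
	}
	return bad
}

func main() {
	suspects := []suspect{{").readLoop(", "a Transport"}}

	// Fake dump: the suspicious goroutine disappears on the third poll.
	calls := 0
	dump := func() string {
		calls++
		if calls < 3 {
			return "goroutine 7 [select]:\nnet/http.(*persistConn).readLoop(0x1)"
		}
		return ""
	}
	fmt.Printf("leaked: %q after %d polls\n",
		checkWithRetry(dump, suspects, 5, time.Millisecond), calls)
}
```

In a real test the dump function would wrap runtime.Stack instead of returning canned text.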