A pattern for limiting the number of goroutines in execution.
The goal of this post is to show a pattern for allocating a fixed number of goroutines to a job queue and cleaning up after the jobs complete. I will concentrate in particular on how to work around deadlocks and how to clean up after goroutines, and I will show a dangerous pattern that can cause deadlocks at runtime.
The goal is to send a set number of jobs to a pool of workers. Our motivation arose from using net/http to perform API calls to a third-party service and noticing that at runtime we would get a "too many open files" error.
The root of the problem is that too many goroutines run at the same time, each holding an open connection, and together they hit a system limit. This kind of limit can be hit in many different kinds of services. While in some cases it is possible to raise a system limit, it is far better practice to control the use of resources at the software level. You could have hard limits on database connections, memory, or CPU resources that cost money or are impractical to lift.
In Go, a send operation on an unbuffered channel blocks until there is a matching receive operation, and vice versa. The following snippet halts in a deadlock because, when true is sent to a bool channel, the program waits forever for a receive operation that is never reached.
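The snippet did not survive in this copy of the post, so here is a minimal sketch of what it describes. The function names are mine, not from the original. The deadlocking version is kept in its own function that main does not call, so the program still runs; a non-blocking select shows that the send really has nowhere to go.

```go
package main

import "fmt"

// sendWithoutReceiver reports whether a send on an unbuffered channel can
// proceed while no goroutine is receiving from it.
func sendWithoutReceiver() bool {
	a := make(chan bool)
	select {
	case a <- true:
		return true
	default:
		return false // nobody is receiving, so the send cannot complete
	}
}

// deadlock performs the same send without the escape hatch. Called from main
// with no other goroutine running, it makes the runtime abort with
// "fatal error: all goroutines are asleep - deadlock!".
func deadlock() {
	a := make(chan bool)
	a <- true        // blocks forever: no receiver exists yet
	fmt.Println(<-a) // never reached
}

func main() {
	fmt.Println("send can proceed:", sendWithoutReceiver())
	// deadlock() // uncomment to reproduce the crash
}
```
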
The use of the term deadlock is justified by the fact that it can be caused by two goroutines waiting for each other to release a lock on a
send/receive operation. This resembles what is commonly understood as a deadlock in programming: think about deadlocks in a relational database.
To resolve the deadlock it is sufficient to use a goroutine. Any function call in Go can be run as a goroutine by placing the go keyword in front of the call. A goroutine is a lightweight thread, and starting one lets the program continue running past the function invocation. Hence, if we perform the send on the channel a in a goroutine, we allow execution to continue and reach the receive operation that releases the lock on the send operation. Fixing the previous snippet is straightforward.
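A minimal sketch of the fix, assuming the channel is called a as in the text (the function name sendAndReceive is my own): the send moves into a goroutine, so the calling goroutine is free to reach the receive.

```go
package main

import "fmt"

// sendAndReceive sends on the channel from a goroutine, so the calling
// goroutine can continue to the matching receive.
func sendAndReceive() bool {
	a := make(chan bool)
	go func() {
		a <- true // blocks only this goroutine until the receive below runs
	}()
	return <-a // pairs with the send and lets the goroutine finish
}

func main() {
	fmt.Println(sendAndReceive())
}
```
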
Given a set number of tasks, numberOfJobs, we will execute them with a controlled number of workers, say numberOfWorkers. Our worker listens on a channel and sends its result to another channel.
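A worker along these lines could look like the sketch below. The id parameter, the int job type, and the choice to echo the job number back as the result are illustrative assumptions, not details from the original code.

```go
package main

import "fmt"

// worker receives job numbers from q and reports each finished job on done.
func worker(id int, q <-chan int, done chan<- int) {
	for job := range q {
		fmt.Println("worker", id, "finished job", job)
		done <- job
	}
}

func main() {
	q := make(chan int)
	done := make(chan int)
	go worker(0, q, done)

	q <- 7              // hand one job to the worker
	fmt.Println(<-done) // wait for its result
	close(q)            // lets the worker's range loop end
}
```
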
Queueing the jobs.
In main(), we spin up a set number of workers and then queue the jobs in the channel q. Notice that we also read the results of the jobs performed by the workers from the channel done; otherwise we would have no way to know when the workers have exhausted the jobs.
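Putting the pieces together, a sketch of the whole flow might look like this. The channel names q and done and the two counters come from the text; wrapping the logic in a runJobs function and the values 2 and 17 are my own choices. I assume each job is queued from its own goroutine.

```go
package main

import "fmt"

// worker takes jobs from q and signals each completion on done.
func worker(id int, q <-chan int, done chan<- bool) {
	for job := range q {
		fmt.Println("worker", id, "finished job", job)
		done <- true
	}
}

// runJobs starts numberOfWorkers workers, queues numberOfJobs jobs, and
// returns the number of completed jobs.
func runJobs(numberOfWorkers, numberOfJobs int) int {
	q := make(chan int)
	done := make(chan bool)

	for i := 0; i < numberOfWorkers; i++ {
		go worker(i, q, done)
	}

	// Each send runs in its own goroutine, so queueing never blocks main.
	for j := 0; j < numberOfJobs; j++ {
		go func(j int) {
			q <- j
		}(j)
	}

	// Read one result per job; this is how main knows the queue is exhausted.
	completed := 0
	for completed < numberOfJobs {
		<-done
		completed++
	}
	return completed
}

func main() {
	fmt.Println("completed:", runJobs(2, 17))
}
```

Note that the workers are still blocked waiting on q when runJobs returns. Nothing here stops them yet.
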
It is very educational to try replacing the goroutines with regular function calls in the queueing part of the code. Just remove the keyword go from the above snippet.
Can you guess what happens before running the code? Try two cases:
- Case 1, numberOfWorkers >= numberOfJobs.
- Case 2, numberOfWorkers < numberOfJobs.
In case 1, all the jobs are executed and the program exits as expected.
In case 2, at runtime, the program will hit a deadlock after a few jobs have completed successfully! This happens because no worker goroutine is ready to receive from q at the time we are sending: each worker is already blocked sending its result on done, which main() does not read until all the jobs are queued. The program halts in a deadlock.
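The two cases can be reproduced without crashing the program. In the sketch below the jobs are queued synchronously, and a timeout on each send (an addition of mine, as are the names queueSynchronously and stallTimeout) turns the would-be deadlock into a reported stall. A plain send would block forever instead.

```go
package main

import (
	"fmt"
	"time"
)

// stallTimeout is how long a send may wait before we call it stalled.
const stallTimeout = 200 * time.Millisecond

// worker takes a job from q and then blocks until its result is read.
func worker(q <-chan int, done chan<- bool) {
	for range q {
		done <- true // blocks until main reads the result
	}
}

// queueSynchronously sends jobs from the calling goroutine, without `go`.
// It returns how many jobs were accepted and whether queueing stalled.
func queueSynchronously(numberOfWorkers, numberOfJobs int) (int, bool) {
	q := make(chan int)
	done := make(chan bool)
	for i := 0; i < numberOfWorkers; i++ {
		go worker(q, done)
	}

	queued, stalled := 0, false
	for j := 0; j < numberOfJobs && !stalled; j++ {
		select {
		case q <- j:
			queued++
		case <-time.After(stallTimeout):
			stalled = true // every worker is stuck sending on done
		}
	}

	close(q) // no more jobs will be sent
	for i := 0; i < queued; i++ {
		<-done // release the workers so they can return
	}
	return queued, stalled
}

func main() {
	queued, stalled := queueSynchronously(3, 2) // case 1
	fmt.Println("case 1: queued", queued, "stalled", stalled)
	queued, stalled = queueSynchronously(2, 5) // case 2
	fmt.Println("case 2: queued", queued, "stalled", stalled)
}
```

With two workers and five jobs, only the first two sends find a free worker; the third stalls because both workers are waiting for main to read done.
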
We want to stress how dangerous this pattern is. The compiler does not catch it, and in complex applications the deadlock could appear at runtime only rarely, if the size of the job queue does not commonly exceed the number of workers.
A safe way to think about channels is to treat sending and receiving as one operation, in order to avoid the pitfall of leaving an unmatched operation at runtime.
This is yet another reminder that even if channels are great for tackling concurrency problems, the complexity of designing and implementing a program with concurrency does not magically disappear.
Cleaning up after your goroutines.
After the job queue is emptied, we want to stop the workers. Goroutines are lightweight, but not free. It’s good practice not to leave anything unnecessary hanging, because at scale even a small overhead can have a huge footprint. We can clean up by using a kill channel to return from the goroutines.
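Here is a sketch of the kill channel idea, assuming the worker selects between the job queue and a channel that main closes when the work is done. The name kill, the struct{} element type, and the sync.WaitGroup used to confirm that the workers have returned are my own choices and may differ from the original code.

```go
package main

import (
	"fmt"
	"sync"
)

// worker handles jobs from q until the kill channel is closed.
func worker(q <-chan int, done chan<- bool, kill <-chan struct{}, wg *sync.WaitGroup) {
	defer wg.Done()
	for {
		select {
		case <-q:
			done <- true
		case <-kill:
			return // clean exit: nothing is left hanging
		}
	}
}

// runAndCleanUp runs the jobs, then stops every worker and waits for them.
func runAndCleanUp(numberOfWorkers, numberOfJobs int) int {
	q := make(chan int)
	done := make(chan bool)
	kill := make(chan struct{})
	var wg sync.WaitGroup

	for i := 0; i < numberOfWorkers; i++ {
		wg.Add(1)
		go worker(q, done, kill, &wg)
	}
	for j := 0; j < numberOfJobs; j++ {
		go func(j int) { q <- j }(j)
	}
	completed := 0
	for ; completed < numberOfJobs; completed++ {
		<-done
	}

	close(kill) // a closed channel is readable by every worker at once
	wg.Wait()   // returns only after all workers have returned
	return completed
}

func main() {
	fmt.Println("completed:", runAndCleanUp(3, 10))
}
```

Closing the channel, rather than sending one value per worker, means main does not need to know how many workers are still listening.
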
This pattern for cleaning up loose goroutines is a staple of Go programs. Similar patterns can be encountered in Go when using the context package for handling request cancellation or timeouts in client-server communication in RPC services.
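For comparison, the same clean-up can be sketched with the context package, where ctx.Done() plays the role of the kill channel. This is only an illustration of the similarity, not code from an RPC service; the function name runWithContext is hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// worker handles jobs from q until the context is cancelled.
func worker(ctx context.Context, q <-chan int, done chan<- bool, wg *sync.WaitGroup) {
	defer wg.Done()
	for {
		select {
		case <-q:
			done <- true
		case <-ctx.Done():
			return // the context was cancelled: stop working
		}
	}
}

// runWithContext runs the jobs, cancels the context to stop the workers,
// and returns the number of completed jobs with the context's error.
func runWithContext(numberOfWorkers, numberOfJobs int) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	q := make(chan int)
	done := make(chan bool)
	var wg sync.WaitGroup

	for i := 0; i < numberOfWorkers; i++ {
		wg.Add(1)
		go worker(ctx, q, done, &wg)
	}
	for j := 0; j < numberOfJobs; j++ {
		go func(j int) { q <- j }(j)
	}
	completed := 0
	for ; completed < numberOfJobs; completed++ {
		<-done
	}

	cancel()  // closes ctx.Done() for every worker
	wg.Wait() // all workers have returned
	return completed, ctx.Err()
}

func main() {
	completed, err := runWithContext(2, 5)
	fmt.Println("completed:", completed, "context error:", err)
}
```

Swapping context.WithCancel for context.WithTimeout would stop the workers after a deadline instead of on an explicit call.
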