Write a Go Worker Pool in 15 minutes

At my job, I was building a service in Go that needed to process a large number of user requests quickly. After moving the processing into a goroutine worker pool, throughput increased roughly 20x. This post will go over how to create a worker pool for your own projects.

If you want to download the final result, check it out here.

In this post we will:

  • Create some mock work to do
  • Create the Worker Pool
  • Run Benchmark tests

This is the file structure:

/go_worker_pool
    /work
        work.go
    /pool
        worker.go
        dispatcher.go
    bench_test.go
    main.go

The directory path where I'll be building this:

go/src/github.com/Lebonesco/go_worker_pool

The final results will look something like this:

$ go run main.go
2018/10/06 15:53:43 starting application...
2018/10/06 15:53:43 starting worker: 1
2018/10/06 15:53:43 starting worker: 2
2018/10/06 15:53:43 starting worker: 3
2018/10/06 15:53:43 starting worker: 4
2018/10/06 15:53:43 starting worker: 5
2018/10/06 15:53:43 creating jobs...
worker [2] - created hash [2376065843] from word [iCMRAjWw]
worker [4] - created hash [121297580] from word [xhxKQFDa]
worker [1] - created hash [3193224551] from word [XVlBzgba]
worker [3] - created hash [1481401259] from word [hTHctcuA]
worker [5] - created hash [166906897] from word [FpLSjFbc]
worker [5] - created hash [1752784812] from word [QYhYzRyW]
...

Create Some Mock Work

To simulate a process that takes some time to complete, we'll create a batch of jobs, each represented as a random string, and do some work on each of them. In this case, the work will be generating a string hash.

DoWork() takes in a string and a worker ID, computes a string hash, and sleeps for a second before printing the result. To print a line as each job finishes, set:

$ export DEBUG="true"

We make this configurable because when we later run benchmark tests this output will be unnecessary and clutter up the results.
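The work.go source isn't embedded here, so below is a minimal sketch of it. I'm assuming a 32-bit FNV hash (any string hash works), and I have it return the hash as well so the result is easy to inspect; only the signature described above comes from the post.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"os"
	"time"
)

// DoWork hashes a word, sleeps to simulate an expensive job, and
// prints the result when DEBUG is set. The 32-bit FNV hash is an
// assumption; it returns the hash so the result is easy to check.
func DoWork(word string, id int) uint32 {
	h := fnv.New32a()
	h.Write([]byte(word)) // fnv's Write never returns an error
	sum := h.Sum32()

	time.Sleep(time.Second) // pretend this took real effort

	if os.Getenv("DEBUG") == "true" {
		fmt.Printf("worker [%d] - created hash [%d] from word [%s]\n", id, sum, word)
	}
	return sum
}
```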


Create the Worker Pool

Next, let's dive into building the actual worker pool. It will be made up of three parts: the dispatcher, which instantiates the workers and wires them to the pool; the workers, which grab waiting jobs and do the work; and the collector, which receives incoming jobs and distributes them to the workers.

The important things to note in the code below are the WorkerChannel in the Worker struct and the Start() function.

A popular phrase when it comes to Go is "Do not communicate by sharing memory; instead, share memory by communicating." That is the purpose of the WorkerChannel, and it is how the worker pool communicates with the workers. The WorkerChannel holds the channels of all available workers. When a job arrives, the collector pulls a worker's channel off WorkerChannel and sends the job down it, where the worker picks it up. Inversely, when the worker finishes, it pushes its channel back onto WorkerChannel and waits for the next job.
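The worker.go source isn't reproduced here, so here's a sketch of the worker under that design. Only WorkerChannel and Start() come from the description above; the Channel and End field names, the Stop() method, and the doWork stand-in for the earlier DoWork are my guesses.

```go
package main

import (
	"hash/fnv"
	"log"
	"time"
)

// Job is the unit of work from work.go.
type Job struct {
	Name string
}

type Worker struct {
	ID            int
	WorkerChannel chan chan Job // shared registry of idle workers
	Channel       chan Job      // this worker's private job channel
	End           chan struct{} // closed to stop the worker
}

// Start runs the worker loop in its own goroutine: register as
// idle, then wait for either a job or a stop signal.
func (w Worker) Start() {
	log.Printf("starting worker: %d", w.ID)
	go func() {
		for {
			w.WorkerChannel <- w.Channel // announce availability
			select {
			case job := <-w.Channel:
				doWork(job.Name, w.ID)
			case <-w.End:
				return
			}
		}
	}()
}

// Stop shuts the worker down after its current job.
func (w Worker) Stop() {
	close(w.End)
}

// doWork stands in for the DoWork from work.go, with a much
// shorter sleep so examples run quickly.
func doWork(word string, id int) uint32 {
	h := fnv.New32a()
	h.Write([]byte(word))
	time.Sleep(10 * time.Millisecond)
	return h.Sum32()
}
```

Note that Start() returns immediately; the loop alternates between registering the worker as idle and waiting on the select, so a worker that finishes a job automatically makes itself available again.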

Awesome! We are close to being done. To finish up the worker pool, let's complete the dispatcher and collector. Before going into the design, be aware that there are many ways to structure a worker pool in terms of how the pieces fit together.

Some designs work better than others depending on readability and how you want the code to be used. As long as you have the main pieces (worker, dispatcher, and collector), you should be able to design your own. For example, some people keep their collector separate from their dispatcher. In my case, I return a Collector struct from the dispatcher because I felt it improved readability on the user's end, the user being anyone who wants to use this worker pool in their own code.

Finally, let's create the driver that will run the application. It will consist of a main() function that instantiates the worker pool and creates the jobs.
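The real main.go would import the pool package and call StartDispatcher; to keep this sketch self-contained, the pool is condensed into a few inline goroutines, and the driver logic lives in a run() function that main() would simply call, e.g. run(5, 100). randomWord and the counts are my inventions.

```go
package main

import (
	"log"
	"math/rand"
	"sync"
	"sync/atomic"
)

const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// randomWord builds a random word for a mock job.
func randomWord(n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

// run starts the pool, submits jobCount jobs, and blocks until they
// all finish; it returns how many jobs were actually processed.
func run(workerCount, jobCount int) int {
	log.Println("starting application...")

	jobs := make(chan string)
	var wg sync.WaitGroup
	var processed int64

	// Condensed pool: stands in for StartDispatcher and the workers.
	for id := 1; id <= workerCount; id++ {
		log.Printf("starting worker: %d", id)
		go func(id int) {
			for word := range jobs {
				_ = word // the real code calls DoWork(word, id) here
				atomic.AddInt64(&processed, 1)
				wg.Done()
			}
		}(id)
	}

	log.Println("creating jobs...")
	for i := 0; i < jobCount; i++ {
		wg.Add(1)
		jobs <- randomWord(8)
	}
	close(jobs)
	wg.Wait() // block until every job has been processed
	return int(atomic.LoadInt64(&processed))
}
```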

Run Benchmark Tests

To make sure this actually provides a speedup, let's run a simple benchmark test. Go makes this easy with its built-in testing framework.
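The repo's bench_test.go presumably drives the dispatcher itself; the self-contained sketch below benchmarks a stripped-down channel pool against a serial loop instead, with a 20 ms sleep standing in for the 1-second job so it finishes quickly.

```go
package main

import (
	"sync"
	"testing"
	"time"
)

const jobCount = 20

// work stands in for DoWork; the sleep simulates the expensive part.
func work() { time.Sleep(20 * time.Millisecond) }

// BenchmarkConcurrent pushes the jobs through a 5-worker pool.
func BenchmarkConcurrent(b *testing.B) {
	for n := 0; n < b.N; n++ {
		jobs := make(chan int)
		var wg sync.WaitGroup
		for w := 0; w < 5; w++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				for range jobs {
					work()
				}
			}()
		}
		for i := 0; i < jobCount; i++ {
			jobs <- i
		}
		close(jobs)
		wg.Wait()
	}
}

// BenchmarkNonConcurrent runs the same jobs back to back.
func BenchmarkNonConcurrent(b *testing.B) {
	for n := 0; n < b.N; n++ {
		for i := 0; i < jobCount; i++ {
			work()
		}
	}
}
```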

Then we can run:

$ go test -bench=.
starting worker: 1
starting worker: 2
starting worker: 3
starting worker: 4
starting worker: 5
goos: windows
goarch: amd64
pkg: tutorials/concurrent-limiter
BenchmarkConcurrent-4 1 3001744600 ns/op
BenchmarkNonConcurrent-4 1 20006911100 ns/op
PASS
ok tutorials/concurrent-limiter 23.291s

Well, there we have it. The evidence is in: the worker pool provides a significant performance improvement. And the longer each individual job takes to complete, and the more workers you have running, the bigger the improvement you'll see.

Thank you for taking the time to read my article.

If you found this article helpful, let me know 👏👏👏.

If you want to read more click the “follow” button below.