Go Sync or Go Home: WaitGroup

Jun 15, 2023

Introduction

Go’s goroutines, channels, and mutexes make it easy to develop complex concurrency systems. Most problems can be solved using these three mechanisms, but you might be asking yourself — what else is out there?

That’s what I was wondering when I stumbled upon the lesser-known features of the sync and x/sync packages. In this blog series, I will explore some of these niche features, focusing on practical use cases and how they can be used to boost performance and reduce latency.

To give some background before jumping into the more advanced concepts, let’s start off by delving into WaitGroup.

WaitGroup

WaitGroup can be used to wait for the completion of multiple concurrent tasks. Let’s take a look at its API:

Creation

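WaitGroup doesn’t have a special initializer or constructor function, and its zero value is ready to use. To create one, simply declare a value of the type from the sync package:

wg := sync.WaitGroup{}

A WaitGroup must not be copied after first use, so pass it to other functions by pointer.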

Add

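Call the Add method with the number of tasks to wait for. It increases the WaitGroup’s internal counter by that amount, and it should be called before the task goroutines are started.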
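There are two common ways to call Add: once up front with the total number of tasks, or with 1 right before each goroutine is started. The sketch below shows the second style, which is handy when the number of tasks isn’t known in advance. The processAll helper and its counter are hypothetical names I made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// processAll is a hypothetical helper, not part of the sync package. It calls
// Add(1) right before starting each goroutine and returns how many tasks ran.
func processAll(items []string) int64 {
	var wg sync.WaitGroup
	var processed int64

	for range items {
		wg.Add(1) // register the task before its goroutine starts
		go func() {
			defer wg.Done()
			atomic.AddInt64(&processed, 1) // stand-in for real work
		}()
	}

	wg.Wait()
	return atomic.LoadInt64(&processed)
}

func main() {
	fmt.Println(processAll([]string{"a", "b", "c"})) // prints 3
}
```

Note that Add is called in the loop, not inside the goroutine. If it were called inside, Wait could run before any task had been registered and return too early.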

Done

Call the Done method in the task goroutine after the task has been completed. Each call decrements the WaitGroup’s counter by one.
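Every Done needs a matching Add, because the counter is not allowed to drop below zero. As a small sketch of what happens otherwise, the hypothetical doneWithoutAdd helper below calls Done on a fresh WaitGroup and recovers the resulting panic:

```go
package main

import (
	"fmt"
	"sync"
)

// doneWithoutAdd is a hypothetical helper for illustration. It calls Done on a
// WaitGroup whose counter is still zero and returns the recovered panic value.
func doneWithoutAdd() (recovered interface{}) {
	defer func() { recovered = recover() }()

	var wg sync.WaitGroup
	wg.Done() // the counter would go from 0 to -1, which panics
	return nil
}

func main() {
	fmt.Println("recovered:", doneWithoutAdd())
}
```

In real code you wouldn’t recover from this. The point is only that calling Done more often than the counter allows is a bug, which is why deferring a single Done at the top of each task goroutine is the usual pattern.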

Wait

Call the Wait method to block until all tasks have been completed, that is, until the counter drops back to zero.

Flow

Now that we’re familiar with the WaitGroup methods, the flow for using them generally looks like this:
1. Create a new WaitGroup
2. Add the number of tasks to be executed
3. In the task goroutine: call Done after completing the task
4. In the main goroutine: Wait for all tasks to finish
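The four steps above can be sketched as a small, complete program. The runTasks helper and the results slice are names I made up for illustration, and the "task" here is just setting a flag.

```go
package main

import (
	"fmt"
	"sync"
)

// runTasks is a hypothetical helper. It starts n goroutines, waits for all of
// them, and returns how many reported completion.
func runTasks(n int) int {
	// 1. Create a new WaitGroup (the zero value is ready to use).
	var wg sync.WaitGroup

	// 2. Add the number of tasks to be executed, before starting them.
	wg.Add(n)

	results := make([]bool, n)
	for i := 0; i < n; i++ {
		go func(i int) {
			// 3. In the task goroutine: call Done once the task has completed.
			defer wg.Done()
			results[i] = true // each goroutine writes only its own slot
		}(i)
	}

	// 4. In the calling goroutine: Wait for all tasks to finish.
	wg.Wait()

	completed := 0
	for _, ok := range results {
		if ok {
			completed++
		}
	}
	return completed
}

func main() {
	fmt.Println("completed tasks:", runTasks(5)) // prints "completed tasks: 5"
}
```

Reading results after Wait returns is safe here, since every write happens before the matching Done and Wait only returns after all of them.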

Example

Say we have an AgentController that controls multiple Agents:

[Figure: AgentController diagram, showing one AgentController connected to multiple Agents]

The AgentController sends each task it receives to all agents and waits for their response. Once all agents have completed the task, the controller can continue to the next task.

Using goroutines and channels, we can implement the controller’s logic:

func ExecuteTask(task Task, agents []Agent) {
   // Create channel buffering up to len(agents) values so that we don't block when trying to insert a value
   agentsDone := make(chan struct{}, len(agents))
  
   // Execute tasks
   for _, agent := range agents {
      go func(agent Agent) {
         // Send value over channel to signal the agent has finished
         defer func() { agentsDone <- struct{}{} }()
         agent.Execute(task)
      }(agent)
   }
  
   // Wait for all agents to complete the task
   for range agents {
      <-agentsDone
   }
}

While this implementation works, using a WaitGroup simplifies the code and makes it easier to read:

func ExecuteTask(task Task, agents []Agent) {
   wg := sync.WaitGroup{}
   // Wait for len(agents) to finish
   wg.Add(len(agents))
  
   // Execute tasks
   for _, agent := range agents {
      go func(agent Agent) {
         // Update the waitgroup that this agent has finished
         defer wg.Done()
         agent.Execute(task)
      }(agent)
   }
  
   // Wait for all agents to complete the task
   wg.Wait()
}
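The snippets above leave Task and Agent undefined, so here is one way to make the WaitGroup version runnable end to end. The Task struct, the Agent interface, and recordingAgent are all my own stand-ins for illustration; a real controller would have its own types.

```go
package main

import (
	"fmt"
	"sync"
)

// Task and Agent are hypothetical stand-ins for the types used above.
type Task struct{ Name string }

type Agent interface {
	Execute(task Task)
}

// recordingAgent is an illustrative Agent that records the tasks it has run.
type recordingAgent struct {
	mu   sync.Mutex
	done []string
}

func (a *recordingAgent) Execute(task Task) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.done = append(a.done, task.Name)
}

// ExecuteTask sends the task to every agent and waits for all of them.
func ExecuteTask(task Task, agents []Agent) {
	var wg sync.WaitGroup
	wg.Add(len(agents))

	for _, agent := range agents {
		go func(agent Agent) {
			defer wg.Done()
			agent.Execute(task)
		}(agent)
	}

	wg.Wait()
}

func main() {
	a1, a2 := &recordingAgent{}, &recordingAgent{}
	ExecuteTask(Task{Name: "deploy"}, []Agent{a1, a2})
	fmt.Println(len(a1.done), len(a2.done)) // prints "1 1"
}
```

By the time ExecuteTask returns, every agent has finished, so the controller can safely move on to the next task.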

Benchmarks

Another reason to use WaitGroup instead of channels is improved performance. To demonstrate this, I created a benchmark test.

The Task

The sample task I used for the test is the sha1.Sum function on the string "hello world":

func task() {
   sha1.Sum([]byte("hello world"))
}

The Test

I created two tests, one for WaitGroup and one for channels. Each test ran the task concurrently runCount times:

func testChannel(runCount int) {
   doneChan := make(chan struct{}, runCount)
          
   // Execute tasks
   for i := 0; i < runCount; i++ {
      go func() {
         defer func() { doneChan <- struct{}{} }()
         task()
      }()
   }
    
   // Wait for all tasks to complete
   for i := 0; i < runCount; i++ {
      <-doneChan
   }
}

func testWaitGroup(runCount int) {
   wg := sync.WaitGroup{}
   wg.Add(runCount)

   // Execute tasks
   for i := 0; i < runCount; i++ {
      go func() {
         defer wg.Done()
         task()
      }()
   }
        
   // Wait for all tasks to complete
   wg.Wait()
}

The runCount Variations

To check the results over varying runCount values, I used a slice of benchmark cases:

var cases = []struct {
   runCount int
}{
   {runCount: 100},
   {runCount: 1000},
   {runCount: 10000},
   {runCount: 100000},
}

The Benchmarks

All that was left was to build the benchmarks. The benchmarks run a sub-benchmark for each benchmark case, running each test b.N times.


func BenchmarkChannel(b *testing.B) {
   for _, c := range cases {
      b.Run(fmt.Sprintf("runCount%d", c.runCount), func(b *testing.B) {
         for i := 0; i < b.N; i++ {
            testChannel(c.runCount)
         }
      })
   }
}

func BenchmarkWaitGroup(b *testing.B) {
   for _, c := range cases {
      b.Run(fmt.Sprintf("runCount%d", c.runCount), func(b *testing.B) {
         for i := 0; i < b.N; i++ {
            testWaitGroup(c.runCount)
         }
      })
   }
}

The Results

❯ go test benchmark_test.go -bench=.
goos: linux
goarch: amd64
cpu: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz
BenchmarkChannel/runCount100-12                    26094             45337 ns/op
BenchmarkChannel/runCount1000-12                    4104            283114 ns/op
BenchmarkChannel/runCount10000-12                    445           2649410 ns/op
BenchmarkChannel/runCount100000-12                    44          26031409 ns/op
BenchmarkWaitGroup/runCount100-12                  38778             30834 ns/op
BenchmarkWaitGroup/runCount1000-12                  4998            230853 ns/op
BenchmarkWaitGroup/runCount10000-12                  562           2147943 ns/op
BenchmarkWaitGroup/runCount100000-12                  54          21059307 ns/op
PASS
ok      command-line-arguments  10.742s

The results show that the WaitGroup tests consistently outperformed the channel tests: ns/op was about 32% lower at a runCount of 100 and about 18 to 19% lower at the larger runCounts. This isn’t surprising, since WaitGroup was built with this specific use case in mind. The benchmarks illustrate why it pays to pick the synchronization primitive that best fits each situation rather than defaulting to channels.
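To put numbers on the gap, the ns/op figures from the output above can be compared directly. This is just a quick sketch of the arithmetic; timeSaved is a helper name I made up, and the figures are copied from my run, so yours will differ.

```go
package main

import "fmt"

// timeSaved returns the percentage by which waitGroupNs is lower than channelNs.
func timeSaved(channelNs, waitGroupNs float64) float64 {
	return (channelNs - waitGroupNs) / channelNs * 100
}

func main() {
	// ns/op figures copied from the benchmark output above.
	results := []struct {
		runCount    int
		channelNs   float64
		waitGroupNs float64
	}{
		{100, 45337, 30834},
		{1000, 283114, 230853},
		{10000, 2649410, 2147943},
		{100000, 26031409, 21059307},
	}

	for _, r := range results {
		fmt.Printf("runCount %d: WaitGroup took %.1f%% less time per op\n",
			r.runCount, timeSaved(r.channelNs, r.waitGroupNs))
	}
}
```

The gap is widest at the smallest runCount (roughly a third) and settles at a bit under a fifth for the larger ones.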

You can see the full benchmark tests here.

Summary

In summary, WaitGroup has one very specific use, and that is to wait for concurrent tasks to be completed. Its API is simple, and I highly recommend using it when the need arises.

What’s Next?

While WaitGroup is a great concurrency mechanism, sometimes it’s not enough. Stay tuned for the next post in this series, where we will explore errgroup from the x/sync package and see how it saves the day at times when you need a few extra features!

Thank you so much for reading my first ever blog post! If you have any questions please reach out :)