Stop and Go — Regulating goroutines with a traffic light

Sean Yu
Published in Xendit Engineering
4 min read, Nov 8, 2022

In Go, you can cap the number of goroutines running concurrently by using a buffered channel as a semaphore.

Also published on my blog.

Context: Batch Jobs

At Xendit, many cron jobs process batch work at fixed intervals. One of them synchronizes payment statuses (successful or not) by making HTTP calls to our bank partners. Simplified for this post, the code goes like this:
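As a rough sketch of such a sequential job (not the production code): `Payment`, `fetchStatus`, and `syncPayments` are hypothetical names, and the HTTP call to the bank partner is simulated with a short sleep.

```go
package main

import (
	"fmt"
	"time"
)

// Payment is a hypothetical, stripped-down payment record.
type Payment struct {
	ID     string
	Status string
}

// fetchStatus stands in for the HTTP call to a bank partner.
// The sleep simulates network latency; a real job would use net/http.
func fetchStatus(p Payment) string {
	time.Sleep(10 * time.Millisecond)
	return "SUCCEEDED"
}

// syncPayments processes payments one at a time, so the total run time
// grows with both the number of payments and the partner's latency.
func syncPayments(payments []Payment) {
	for i := range payments {
		payments[i].Status = fetchStatus(payments[i])
	}
}

func main() {
	payments := []Payment{{ID: "p1"}, {ID: "p2"}, {ID: "p3"}}
	syncPayments(payments)
	fmt.Println(payments)
}
```

Because every call waits for the previous one, the job duration is roughly the number of payments multiplied by the average latency.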

Problem: Scalability Issue

One morning, our team got alerted that some payments were not being synchronized in time. Some of my colleagues found that the cron job was hitting its maximum run duration, in this case the activeDeadlineSeconds set for the Kubernetes CronJob resource. Digging deeper, we discovered that the delay was caused by an influx of payments, as it was payday, combined with a sudden latency spike from the third-party service API.

In short, it was an unexpected scalability issue on our end, triggered by higher volume and a latency spike in third-party services.

Naive Solution

The first naive attempt was to increase activeDeadlineSeconds, the maximum running duration of our cron job. That would only let payments take longer to synchronize, which might hurt user experience.

The second naive solution was to spin up as many goroutines as needed to increase throughput and make full use of our server resources. It goes something like below. Just fire as many goroutines as needed 🚀

The above has the following flaws:

  1. Our code above does not wait for all work/goroutines to be done before exiting the process
  2. Our code above does not consider the load capacity of downstream services. We have no control over 3rd party’s maximum server capacity, so we have to control the load we send to them.

Guide Questions

Let’s convert the flaws to these questions to give us a compass in implementing a solution:

  1. How do we wait for all goroutines to be finished before the process exits?
  2. How do we limit the number of concurrent goroutines/requests running?

Question 1: How do we wait for all goroutines to be finished before the process exits?

A quick answer is sync.WaitGroup. It is a very convenient utility for waiting until a given number of tasks are done. Let’s quickly see below:
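Here is one way this could look, as a sketch rather than the exact production code. `syncPayment` and `syncPaymentsAndWait` are hypothetical names, and the returned count is only there to show that every goroutine finished.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// syncPayment stands in for the HTTP call plus the status update.
func syncPayment(id string) {
	time.Sleep(10 * time.Millisecond)
}

// syncPaymentsAndWait starts one goroutine per payment, blocks until all
// of them have finished, and returns how many completed.
func syncPaymentsAndWait(ids []string) int64 {
	var wg sync.WaitGroup
	var done int64

	for _, id := range ids {
		wg.Add(1) // register the task before starting its goroutine
		go func(id string) {
			defer wg.Done() // mark the task finished, even on early return
			syncPayment(id)
			atomic.AddInt64(&done, 1)
		}(id)
	}

	wg.Wait() // blocks until the counter drops back to zero
	return atomic.LoadInt64(&done)
}

func main() {
	n := syncPaymentsAndWait([]string{"p1", "p2", "p3"})
	fmt.Println("synced:", n) // prints "synced: 3"
}
```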

Done!

If you are up for a challenge, you can also use Go channels to do so. 👀 See the end of the post for how.

Question 2: Limiting # of running concurrent goroutines

There is an abstract data type called semaphore in concurrent programming we can use. Quoting from Wikipedia:

Semaphore is a variable or abstract data type used to control access to a common resource by multiple processes in a concurrent system such as a multitasking operating system.

“Multiple processes” in this case are concurrently running goroutines. A semaphore is a shared resource that goroutines consult before proceeding, so we can use it to coordinate their behavior. Here, we use it to limit the number of goroutines doing “active work” by controlling when each goroutine is allowed to start. Active work in this case means calling downstream services and updating the payment.

Active Work or Unit of work

Let’s say we want only 10 max concurrent requests, but we have 11 payments. We spin up a goroutine for each payment. What a semaphore needs to do is to not let the 11th goroutine do its work yet, until one goroutine finishes.

We can use Go’s buffered channel as a semaphore. We leverage one characteristic of a buffered channel: a send blocks when the buffer is full, that is, when the channel’s length has reached its capacity. Example:

(Note: the Go runtime will detect the above as a deadlock if you try to run it.)
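To observe the same blocking behavior without tripping the deadlock detector, here is a runnable variation. It uses a hypothetical `trySend` helper built on `select` with a `default` case, which reports whether a send could proceed instead of blocking.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it succeeded.
func trySend(sem chan struct{}) bool {
	select {
	case sem <- struct{}{}:
		return true
	default:
		return false // a plain `sem <- struct{}{}` would block here
	}
}

func main() {
	sem := make(chan struct{}, 2) // buffer capacity of 2

	fmt.Println(trySend(sem)) // true: 1 of 2 slots used
	fmt.Println(trySend(sem)) // true: 2 of 2 slots used
	fmt.Println(trySend(sem)) // false: buffer is full

	<-sem                     // a receive frees one slot
	fmt.Println(trySend(sem)) // true: the freed slot is taken again
}
```

A full buffer plays the role of the red light: a sender has to wait until someone receives from the channel.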

We can then modify our code to the below. I marked new lines of code with NEW:

Bonus: Implementing it as a Generic Function

To solidify our understanding of this pattern, we can try to create a “generic function” for learning purposes or for re-use if you think that’s best for your team and codebase.

Here is the code. You can run the code below at https://go.dev/play/p/oOk489XFQBk

Question 1 Challenge: Using Go channels to wait for goroutines

We can use a Go channel as a medium for communication from the running goroutines to the main function. Each goroutine sends a signal to the channel when it is done processing its work. The main function waits for X signals, one per unit of work, before it exits.
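A minimal sketch of this idea, again with a hypothetical `syncPayment` that simulates the partner call:

```go
package main

import (
	"fmt"
	"time"
)

// syncPayment stands in for the HTTP call plus the status update.
func syncPayment(id string) {
	time.Sleep(10 * time.Millisecond)
}

// syncPaymentsAndWait waits with a channel instead of sync.WaitGroup and
// returns the number of completion signals it received.
func syncPaymentsAndWait(ids []string) int {
	done := make(chan struct{})

	for _, id := range ids {
		go func(id string) {
			syncPayment(id)
			done <- struct{}{} // signal that this unit of work is finished
		}(id)
	}

	received := 0
	for range ids { // wait for exactly one signal per unit of work
		<-done
		received++
	}
	return received
}

func main() {
	n := syncPaymentsAndWait([]string{"p1", "p2", "p3"})
	fmt.Println("signals received:", n) // prints "signals received: 3"
}
```

The receive loop must run exactly as many times as there are goroutines: with fewer receives some goroutines stay blocked on their send, and with more the main function waits forever.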
