A Refactor Story: Reducing Downstream Traffic with Go’s singleflight Package

Minhaj U. Khan
Published in OneFootball Tech, Oct 5, 2023

Optimizing microservice communication is paramount in building scalable and efficient distributed systems. Effective communication between microservices is crucial for maintaining system reliability and performance. Let’s begin the story with this beautiful note.

The Beginning 🔍

The application I work on is a microservice — and just like any regular microservice out there — it talks to a bunch of other microservices to achieve its goal.
One random day, I was looking at our application’s traces, and I found out that our application was making the same request to another service in the system, multiple times.

Digging around 🕵

After digging around the code and eyeing the trace, I figured out that HttpClient was being called from two different places to get []Items at the same time, or, to use a better word, concurrently.

package client

import "context"

// Client fetches items from the downstream service.
type Client interface {
	GetItems(ctx context.Context, x string) ([]Items, error)
}

// HttpClient implements Client.
type HttpClient struct{}

func (c *HttpClient) GetItems(ctx context.Context, x string) ([]Items, error) {
	// MAKE HTTP CALL
	// ...
}

After giving it much thought and consideration, I came to the unfortunate conclusion that there was no simple way, if any, to make those two different places coordinate with each other to call the GetItems API only once. 😢

Figure 1

ChitChat 💬

I decided to discuss this problem with a fellow engineer, Kshitij. After a productive discussion and some drawing on paper, we decided to go with Go’s singleflight package (golang.org/x/sync/singleflight), as it fits the problem perfectly.

Here is how the package documentation describes its Do method:

Do executes and returns the results of the given function, making sure that only one execution is in-flight for a given key at a time. If a duplicate comes in, the duplicate caller waits for the original to complete and receives the same results.
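
To make that behaviour concrete, here is a minimal sketch of the idea behind Do, written with the standard library only. It is not the library's implementation: the Group and call types, the waiting helper, and countExecutions are names made up for this illustration, and panic handling is left out. One thing it does share with the real package is that deduplication is per Group value, so callers only share a result when they call Do on the same group.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call holds the state of one in-flight execution for a key.
type call struct {
	wg      sync.WaitGroup
	val     interface{}
	err     error
	waiters int // duplicate callers that joined this call
}

// Group is a simplified stand-in for singleflight.Group (assumption: no panic handling).
type Group struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fn at most once per key at a time; duplicates wait and get the same result.
func (g *Group) Do(key string, fn func() (interface{}, error)) (interface{}, error) {
	g.mu.Lock()
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	if c, ok := g.calls[key]; ok {
		c.waiters++
		g.mu.Unlock()
		c.wg.Wait() // block until the original caller finishes
		return c.val, c.err
	}
	c := &call{}
	c.wg.Add(1)
	g.calls[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()

	g.mu.Lock()
	delete(g.calls, key) // later callers start a fresh execution
	g.mu.Unlock()
	c.wg.Done() // wake the waiting duplicates
	return c.val, c.err
}

// waiting reports how many duplicates have joined the in-flight call for key.
func (g *Group) waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.calls[key]; ok {
		return c.waiters
	}
	return 0
}

// countExecutions fires n concurrent callers at one key and returns how many
// times the wrapped function actually ran.
func countExecutions(n int) int64 {
	var (
		g          Group
		executions int64
		wg         sync.WaitGroup
	)
	release := make(chan struct{})
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("items", func() (interface{}, error) {
				atomic.AddInt64(&executions, 1)
				<-release // simulate a slow downstream call
				return "payload", nil
			})
		}()
	}
	// Wait until the other n-1 callers have joined, then let the call finish.
	for g.waiting("items") < n-1 {
		time.Sleep(time.Millisecond)
	}
	close(release)
	wg.Wait()
	return atomic.LoadInt64(&executions)
}

func main() {
	fmt.Println("executions:", countExecutions(5))
}
```

Running it prints "executions: 1": five callers ask for the same key, but the slow function runs only once.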

Action 👷

Here is what we did:

package client

func (c *HttpClient) GetItems(ctx context.Context, x string) ([]Items, error) {
	// c.group is assumed to be a singleflight.Group field added to HttpClient.
	key := fmt.Sprintf("client.get_it_%sid", x)
	v, err, _ := c.group.Do(key, func() (interface{}, error) {
		// MAKE HTTP CALL
		// ...
	})
	if err != nil {
		return nil, err
	}
	return v.([]Items), nil
}

The above implementation wraps the actual HTTP call in singleflight’s Group.Do, so that all duplicate callers receive the result of the original call.

Now, when the two different places try to get []Items at the same time, the HttpClient does not make two HTTP calls. It makes one and hands the same result to the duplicate invocation. Problem solved!

Figure 2

Almost there 🏁

Pretty cool, right? However, there was a small catch.
After this change, our test suite started reporting data races, and, if you think about it, that makes sense.

The same []Items was now being used by two different goroutines at the same time. When two goroutines access the same memory location concurrently and at least one of those accesses is a write, that is a data race.

Simple Solution? Send a copy of the data.

package client

func (c *HttpClient) GetItems(ctx context.Context, x string) ([]Items, error) {
	key := fmt.Sprintf("client.get_it_%sid", x)
	v, err, _ := c.group.Do(key, func() (interface{}, error) {
		// MAKE HTTP CALL
		// ...
	})
	if err != nil {
		return nil, err
	}

	// Make a copy of the result so each caller gets its own slice
	shared := v.([]Items)
	copied := make([]Items, len(shared))
	copy(copied, shared)

	return copied, nil
}

Here, we simply return a copy of []Items, so that both concurrent pieces of code have separate copies to mess around with.

No more data races, all tests pass! 🎉
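
A caveat worth keeping in mind with this approach: make plus copy duplicates the slice elements, but not anything those elements point to. The sketch below assumes a hypothetical Items shape with a nested Tags slice (the real type is not shown here) to illustrate the difference between that shallow copy and a deep copy; shallowCopy and deepCopy are names invented for the example.

```go
package main

import "fmt"

// Items is a hypothetical shape used only for this illustration.
type Items struct {
	ID   string
	Tags []string
}

// shallowCopy does what make+copy does: a new outer slice whose elements
// still share any nested slices with the original.
func shallowCopy(in []Items) []Items {
	out := make([]Items, len(in))
	copy(out, in)
	return out
}

// deepCopy also duplicates the nested Tags slice of every element.
func deepCopy(in []Items) []Items {
	out := make([]Items, len(in))
	for i, it := range in {
		out[i] = it
		out[i].Tags = append([]string(nil), it.Tags...)
	}
	return out
}

func main() {
	shared := []Items{{ID: "a", Tags: []string{"x"}}}

	s := shallowCopy(shared)
	s[0].ID = "changed"      // isolated: the element struct was copied
	s[0].Tags[0] = "changed" // not isolated: the backing array is shared
	fmt.Println(shared[0].ID, shared[0].Tags[0])

	d := deepCopy(shared)
	d[0].Tags[0] = "deep" // isolated: Tags was duplicated
	fmt.Println(shared[0].Tags[0])
}
```

If the element type holds only plain values, the shallow copy is enough. If it holds slices, maps, or pointers that callers may modify, a deep copy along these lines is the safer choice.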

Retrospective 🤓

But while we solved one problem, we created another one: violation of the single responsibility principle for the method GetItems.

The implementation for GetItems now contains two responsibilities:

  1. Being thread-safe.
  2. Making the HTTP request.

Time for Refactor 🧹✨

Ideally, what we want is a separation of concerns. The client should only be responsible for making requests. Being smart enough to reuse an in-flight result, and being thread-safe on top of that, is outside the scope of the client’s responsibilities.

Solution? Decorators.

We create a new decorator that takes on the responsibility of making the underlying client thread-safe.

// sf_client.go
package client

import (
	"context"
	"fmt"

	"golang.org/x/sync/singleflight"
)

// sfClient decorates a Client, collapsing concurrent duplicate calls.
type sfClient struct {
	next  Client
	group singleflight.Group
}

func WithSingleFlight(next Client) Client {
	return &sfClient{next: next}
}

func (c *sfClient) GetItems(ctx context.Context, x string) ([]Items, error) {
	key := fmt.Sprintf("client.get_it_%sid", x)
	v, err, _ := c.group.Do(key, func() (interface{}, error) {
		return c.next.GetItems(ctx, x)
	})
	if err != nil {
		return nil, err
	}

	// Make a copy of the result
	shared := v.([]Items)
	copied := make([]Items, len(shared))
	copy(copied, shared)

	return copied, nil
}

Figure 3

And we keep the original client’s implementation thin and to the point, no fancy singleflight stuff here.

// client.go
package client

func (c *HttpClient) GetItems(ctx context.Context, x string) ([]Items, error) {
	// MAKE HTTP CALL
	// ...
}

Now, let's go to the places where this client is used.

These two places use the singleflight version of the client by decorating the original client with additional functionality.

// fetcher_one.go
package fetcher

type fetcherOne struct {
	c client.Client
}

func NewFetcherOne(c client.Client) *fetcherOne {
	return &fetcherOne{c: client.WithSingleFlight(c)}
}

// fetcher_two.go
package fetcher

type fetcherTwo struct {
	c client.Client
}

func NewFetcherTwo(c client.Client) *fetcherTwo {
	return &fetcherTwo{c: client.WithSingleFlight(c)}
}

Some More Refactoring ✨

If you take a closer look, the implementation for singleflight lives inside the client package — which seems a bit off. The only thing the client package should be responsible for is to provide a client with which HTTP requests can be made.

The problem of concurrent invocations was introduced by the fetcher package, hence the solution should also live in the fetcher package.

As a final touch, we moved the sfClient decorator from the client package to the fetcher package.

Figure 4

Summing Up the Refactoring Steps:

  • Add singleflight in the client code to solve the problem.
  • Make sure there are no data races.
  • Refactor the code to separate concerns.
  • Place the code in its appropriate package.

Notice that after these changes, the client implementation stays as it was. No changes are committed to the client package at the end of the day.

Learnings 📖

Finally, the most important part of this story — Learnings.

  • The identification of the problem would not have been possible without observability and monitoring in place. It is due to the tracing of HTTP transactions that we identified the problem.
  • It is important to discuss the problem with fellow engineers. Had I been looking at this problem in a silo, I would have implemented a different solution, probably a bad one.
  • While running tests, always make sure that they check for race conditions (in Go, by running go test -race). All the assertions passed; however, the test suite failed because the code had a data race.
  • Check for code smells after new additions. If you find any, plan for possible refactors.
