Carlo: carbon-aware load balancing

Oct 17, 2022

Carbon awareness is an emerging concept gaining traction in the tech community. It sits at the convergence of sustainability and information technology. This convergence is often referred to as Green Tech or Sustainable Tech, and it provides a model for how technology can take a more environmentally sustainable approach to meeting business needs.

One of the fundamental components of modern distributed architectures is the load balancer. A load balancer is a layer of software and hardware that sits in front of a pool of servers. Its main job is to distribute incoming client requests efficiently across the available backend servers. To do so, it runs an algorithm that determines which server should handle each request.

In the basic diagram below, you can see how client requests arrive at the load balancer and how it acts as a proxy to the nodes sitting behind it, which ultimately handle those requests.

Under the hood, the load balancer runs an algorithm to decide which available server should receive the next request. One of the best known is round robin, a technique where the load balancer routes requests to each server in sequence and then loops back to the first server.
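To make the baseline concrete, here is a minimal round robin sketch in Go. It is an illustration only, not code taken from carlo: the roundRobin type, the pick method, and the backend addresses are all assumptions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// roundRobin cycles through a fixed list of backend addresses.
// The type and its fields are hypothetical, not part of carlo.
type roundRobin struct {
	backends []string
	next     atomic.Uint64 // number of picks made so far
}

// pick returns the next backend in sequence and wraps around at the end.
// It assumes the backends slice is not empty.
func (r *roundRobin) pick() string {
	n := r.next.Add(1) - 1
	return r.backends[n%uint64(len(r.backends))]
}

func main() {
	// Hypothetical local addresses standing in for backend servers.
	rr := &roundRobin{backends: []string{"localhost:8081", "localhost:8082", "localhost:8083"}}
	for i := 0; i < 4; i++ {
		fmt.Println(rr.pick()) // the fourth pick loops back to the first backend
	}
}
```

Note that round robin knows nothing about where a server runs or how its electricity is produced, which is exactly the gap a carbon-aware algorithm tries to fill.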

The premise for this work is that replacing such a load balancing algorithm could be a starting point for developing a carbon-aware approach to routing traffic. Instead of determining server availability using round robin, carlo would rely on an external service to determine the carbon intensity of the different regions hosting available servers.

Carbon-aware approach for load balancing

This way, the load balancer can select the server with the lowest carbon intensity, i.e. the one in the region that uses the greenest electricity to power its cloud servers.

Carlo is a reference implementation for achieving this use case. It’s built in an experimentally naive way, abstracting away complex networking and cloud implementation details to allow for a main goal: understanding the concept of carbon-awareness in this particular niche of distributed computing.

It relies on the simplicity of the Go language for its implementation and uses the net/http package to simulate load balancing while running the code locally. If you start from main.go, you will see that it defines three different regions, each initialized with an arbitrary number of servers.

Architecture

Both the region and the server are part of the domain of this problem, and the relationship between them is modeled as a region hosting N available servers. In terms of Go constructs, the code defines a server interface that is implemented by a backend type. The region is represented by yet another type.

Latency and carbon intensity are the two inputs to this carbon-aware algorithm. The naive algorithm selects the optimal server as follows: it retrieves all available regions and chooses the one with the lowest carbon intensity. In real life this value would be consumed from an external service; in the context of this experiment it is just a randomly generated int. Once carlo selects the region with the lowest carbon intensity, latency becomes the deciding factor, and the application chooses the server with the smallest latency in that region. Latency is also implemented as a randomly generated number.

search lowest carbon intensity (LCI) region from available regions
search smallest latency server (SLS) from the LCI
forward request to SLS
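The three steps above can be sketched in Go as follows. This is a simplified illustration under my own assumptions rather than carlo's actual code: the server and region structs, the function names, and the random value ranges are hypothetical, and the carbon intensity is a simulated number (real grids typically report it in gCO2eq/kWh).

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// server and region are simplified stand-ins for carlo's domain types.
type server struct {
	name    string
	latency int // simulated latency in milliseconds
}

type region struct {
	name            string
	carbonIntensity int // simulated value, randomly generated in this sketch
	servers         []server
}

// lowestCarbonRegion returns the region with the smallest carbon intensity.
// On a tie, the region that appears first wins.
func lowestCarbonRegion(regions []region) (region, error) {
	if len(regions) == 0 {
		return region{}, errors.New("no regions available")
	}
	best := regions[0]
	for _, r := range regions[1:] {
		if r.carbonIntensity < best.carbonIntensity {
			best = r
		}
	}
	return best, nil
}

// fastestServer returns the server with the smallest latency in a region.
func fastestServer(r region) (server, error) {
	if len(r.servers) == 0 {
		return server{}, errors.New("region " + r.name + " has no servers")
	}
	best := r.servers[0]
	for _, s := range r.servers[1:] {
		if s.latency < best.latency {
			best = s
		}
	}
	return best, nil
}

// selectServer chains both steps: greenest region first, then fastest server.
func selectServer(regions []region) (region, server, error) {
	r, err := lowestCarbonRegion(regions)
	if err != nil {
		return region{}, server{}, err
	}
	s, err := fastestServer(r)
	return r, s, err
}

func main() {
	// Three hypothetical regions with one, two, and three servers.
	var regions []region
	for i, name := range []string{"region-a", "region-b", "region-c"} {
		r := region{name: name, carbonIntensity: rand.Intn(500)}
		for j := 0; j <= i; j++ {
			r.servers = append(r.servers, server{
				name:    fmt.Sprintf("%s-server-%d", name, j),
				latency: rand.Intn(200),
			})
		}
		regions = append(regions, r)
	}

	r, s, err := selectServer(regions)
	if err != nil {
		fmt.Println("selection failed:", err)
		return
	}
	fmt.Printf("region %s (intensity %d) -> %s (latency %d ms)\n",
		r.name, r.carbonIntensity, s.name, s.latency)
}
```

The last step, forwarding the request, is left out here on purpose; the sketch only covers the selection logic.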

Since I am running this simulated distributed environment locally on my laptop, I needed a way to deploy different instances of this server interface implementation. For that, carlo relies on the HTTP server multiplexer type that allows it to run multiple HTTP servers at the same time on different ports. Each local server is analogous to a cloud server deployed in a region.

After servers are initialized and assigned a specific region, the load balancer itself starts with the appropriate handler to forward requests to the different servers.

The code is available here where you can find more information and provide feedback on the design decisions made for this project. The result of executing this application turns out to be a simple representation of what real load balancing could look like.

If I start it up, I first see a message that the load balancer is listening on http://localhost:8000.

Every time I hit that URL with curl, the load balancer executes its carbon-aware algorithm and forwards the request to the optimal server.

In the terminal tab running the load balancer, I will also get some output regarding the current state of regional carbon intensity and server latency. Thus, you can verify the choices the algorithm makes.

Console output for selecting lowest carbon intensity

If you hit it again, you are going to get different (random) results, which mimics the way carbon intensity varies over time in the real-world electricity grids that power data centers.

Console output for selecting lowest carbon intensity, and latency

In this last case, the lowest carbon intensity region has two available servers, so carlo selects the one with the smallest latency, and that server’s response is what you get in the client.

Carbon-aware response in client

Assumptions and trade-offs

As I mentioned before, the intended goal is to bring a better understanding of carbon awareness to the tech industry. That being said, there are some considerations to take into account.

Load balancing usually takes place among nodes within the same region but in different availability zones. I’m abstracting away from this aspect, but it is clearly a challenge. On top of that, carbon intensity varies widely between countries and significantly less within them. Therefore, one could argue that it would be quite challenging to be carbon aware in this space under such limitations.

The algorithm can definitely be further improved. For example, responses from a carbon intensity API could be cached given that regional intensity might not change too frequently.

It could also be the case that the selected lowest carbon intensity region has the fewest available servers to cope with the current load. This is something to iterate upon, as it’s a clear trade-off between sustainability as an architectural characteristic and server availability.
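One way to explore this trade-off, sketched under my own assumptions, is to only consider regions that have a minimum number of healthy servers and pick the greenest among those. The minServers threshold, the healthyServers field, and the sample numbers are hypothetical and not part of carlo.

```go
package main

import "fmt"

// region is a simplified view that only tracks what this sketch needs.
type region struct {
	name            string
	carbonIntensity int // simulated value
	healthyServers  int // servers currently able to take traffic
}

// greenestWithCapacity returns the lowest carbon intensity region among those
// with at least minServers healthy servers. ok is false if none qualifies.
func greenestWithCapacity(regions []region, minServers int) (best region, ok bool) {
	for _, r := range regions {
		if r.healthyServers < minServers {
			continue // too little capacity, skip it even if it is greener
		}
		if !ok || r.carbonIntensity < best.carbonIntensity {
			best, ok = r, true
		}
	}
	return best, ok
}

func main() {
	regions := []region{
		{name: "region-a", carbonIntensity: 90, healthyServers: 1},
		{name: "region-b", carbonIntensity: 180, healthyServers: 4},
		{name: "region-c", carbonIntensity: 320, healthyServers: 6},
	}
	if r, ok := greenestWithCapacity(regions, 2); ok {
		fmt.Printf("selected %s (intensity %d, %d servers)\n",
			r.name, r.carbonIntensity, r.healthyServers)
	}
}
```

With a threshold of 2, the greenest region is skipped because it has a single server, which makes the sustainability versus availability tension explicit and tunable.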

Moreover, relying on external services adds another layer and dependency to a software/hardware component that usually has a very demanding requirement in terms of availability and responsiveness. This is certainly an aspect to consider, regardless of how carbon awareness is implemented in distributed load balancing systems.

On a personal level, I am not an expert in Go (so expect some code smells), but its simplicity helps me understand, and hopefully helps others understand, how carbon awareness and load balancing could be explored together in the future.

Conclusion

Carbon awareness is critical for software-intensive systems to tackle the growing energy consumption of IT infrastructure. There are multiple aspects to addressing carbon accountability in modern tech teams, as Chris Ford and I describe in our Green Team Topologies article.

In the tech space in particular, answering some of these questions is not trivial. However, I personally hope that this baseline implementation helps shake some ideas out of the community, both in building technically on top of it and in highlighting the importance of advocacy in the IT industry.


Written by Daniel Fratte