How Consistent Hashing Is Used by Load Balancers to Distribute Requests

Load balancers and consistent hashing in six minutes

Zeng Hou Lim
Feb 14 · 6 min read
Photo by Chaitanya Tvs on Unsplash

Vertical vs. Horizontal Scaling

In a monolithic architecture, clients typically make requests to one single server. As the number of requests starts to scale, the single server does not have sufficient capacity to serve all the incoming requests.

Vertical v.s. horizontal scaling

Redirecting Requests With a Load Balancer

When we scale horizontally, the requests are directed to the load balancer instead of the servers directly.

Hash function and modulo (%)

All incoming requests, which will have a unique identifier (e.g. IP address), are assumed to be uniformly random.

  1. Output % number of servers -1 → server ID

Visualizing the mapping

Let’s take a step back to visualize how we could possibly use an array as a data structure to map each request to the server.

Array data structure to map output to servers

What Happens When We Add an Additional Server?

So far, we have assumed a fixed number of servers. However, since we opted for horizontal scaling, we should be able to add or remove servers as we wish.

Cost of change

This redirection may be seemingly trivial, but there are costs involved when servers are not stateless. Although HTTP is a stateless protocol, some servers may choose to store some user-related data in their cache for optimizations.


Consistent Hashing

The solution is to use consistent hashing. Let’s first try to visualize the concept in three steps.

Step 1: Map request to location on the ring

Now, instead of a regular array, let’s imagine a circular array. Similar to an array, each request would now map to a location on the hash ring.

A hash ring where each client request maps to an index

Step 2: Map server to location on the ring

Since each server has an ID, we can apply the same hash and modulo function that was applied to the IP address to the server IDs. Let’s assume that the chosen hash function is optimal and we do not have collisions between the IP address and server ID.

Certain indices are now mapped to servers

Step 3: Move clockwise

Now that we have requests and servers mapped out on a ring, the final step is simple.


How Does Consistent Hashing Minimize Impact on Other Servers?

Since requests are served by the immediate right-most server, at most one other server will be impacted by a change in the number of servers.


Virtual Nodes

Requests are not uniformly random

In an ideal world, the requests are uniformly random and each server has a uniform load.

Use multiple hash functions

The short answer is yes. Recall that each hash function is different and returns a different output.


Final Notes

I’m really new to system design myself but, lately, I’ve taken an interest in understanding these high-level architectures. Through writing and explaining, I get a deeper understanding of the topic, and I hope that it helped you in understanding the concepts too.


Better Programming

Advice for programmers.

Zeng Hou Lim

Written by

Software Engineer at LeanData. Excited about living my best life and becoming a better engineer. I like taking complex ideas and breaking them down.

Better Programming

Advice for programmers.

More From Medium

More from Better Programming

More from Better Programming

More from Better Programming

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade