Hashing in context of Load Balancing
One of the slightly tricky concepts to understand is hashing in the context of load balancing.
In order to understand this, please check out how hashing works at a conceptual level. The TL;DR is that hashing converts an input into a fixed-size value, often an integer value (the hash).
One of the key principles for a good hashing algorithm is that the function must be deterministic, which is a fancy way for saying that identical inputs will generate identical outputs when passed into the function. So, deterministic means — if I pass in the string “Hello” (case sensitive) and the function generates a hash of 11002, then every time I pass in “Hello” it must generate “11002” as an integer. And if I pass in “hello” it will generate a different number (consistently).
you might be wondering how to handle a situation where the hashing function generate’s the same hash for more than one input — this is not the end of the world and there are ways to deal with it. In fact it becomes more likely the more the range of unique inputs are. But when more than one input deterministically generates the same output, it’s called a collision.
few methods to handle collision are Chaining, Rehashing, Linear Probing, Quadratic Probing. feel free to read about them before moving on
Here we are going to only discuss about Server selection with hashing though there are other methods as well..
IP Hashing based Server selection
You can configure your load balancer to hash the IP address of incoming requests, and use the hash value to determine which server to direct the request too. If I had 5 servers available, then the hash function would be designed to return one of five hash values, so one of the servers definitely gets nominated to process the request.
IP hash based routing can be very useful where you want requests from a certain country or region to get data from a server that is best suited to address the needs from within that region, or where your servers cache requests so that they can be processed fast.
In the latter scenario, you want to ensure that the request goes to a server that has previously cached the same request, as this will improve speed and performance in processing and responding to that request.
If your servers each maintain independent caches and your load balancer does not consistently send identical requests to the same server, you will end up with servers re-doing work that has already been done in as previous request to another server, and you lose the optimization that goes with caching data.
Before we go further i want to point out that selecting a Server using hashing is just one of the methods discussed and there are few other ways like Round Robin , Weighted Round Robin, Load-based server selection, Path or Service based selection, Mixed Bag etc …
Hashing for Routing in LB:
With the above knowledge firmly in mind, let’s apply it to routing and directed requests to servers. Let’s say you have 4 servers to allocate loads across. An easy to understand method would be to hash incoming requests (maybe by IP address, or some client detail), and then generate hashes for each request. Then you apply the modulo operator to that hash, where the right operand is the number of servers.
request#1 => hashes to 34
request#2 => hashes to 23
request#3 => hashes to 30
request#4 => hashes to 14
// You have 4 servers => [Server A, Server B ,Server C ,Server D]
// so modulo 4for each request...
request#1 => hashes to 34 => 34 % 4= 3=> send this request to servers[4] => Server D
request#2 => hashes to 23 => 23 % 4= 2=> send this request to servers[2] => Server B
request#3 => hashes to 30 => 30 % 4= 4=> send this request to servers[4] => Server D
request#4 => hashes to 14 => 14 % 4= 3=> send this request to servers[3] => Server C
As you can see, the hashing function generates a spread of possible values, and when the modulo operator is applied it brings out a smaller range of numbers that map to the server number.
You will definitely get different requests that map to the same server, and that’s fine, as long as there is “uniformity” in the overall allocation to all the servers.
Adding Servers, and Handling Failing Servers
So — what happens if one of the servers that we are sending traffic to dies? The hashing function (refer to the pseudo code snippet above) still thinks there are 5 servers, and the mod operator generates a range from 0–4. But we only have 4 servers now that one has failed, and we are still sending it traffic. Oops.
Inversely, we could add a sixth server but that would never get any traffic because our mod operator is 5, and it will never yield a number that would include the newly added 6th server. Double oops.
/ Let's add a 5th server
servers => [Server A, Server B ,Server C ,Server D ,Server E]
// let's change the modulo operand to 5
request#1 => hashes to 34 => 34 % 5= 3=> send this request to servers[3] => Server D
request#2 => hashes to 23 => 23 % 5= 4=> send this request to servers[4] => Server E
request#3 => hashes to 30 => 30 % 5= 0 => send this request to servers[0] => Server A
request#4 => hashes to 14 => 14 % 5= 1=> send this request to servers[1] => Server B
We note that the server number after applying the mod changes (though, in this example,
In effect, the result is that the requests are now being routed to new servers altogether, and we lose the benefits of previously cached data on the servers.
For example, let’s assume request#4 used to go to Server E, but now goes to Server C. All the cached data relating to request#4 sitting on Server E is of no use since the request is now going to Server C. You can calculate a similar problem for where one of your servers dies, but the mod function keeps sending it requests.
It sounds minor in this tiny system. But on a very large scale system this is a poor outcome. #SystemDesignFail.
So clearly, a simple hashing-to-allocate system does not scale or handle failures well.
A popular solution — Consistent Hashing
Unfortunately this is the part where I feel word descriptions will not be enough. Consistent hashing is best understood visually.
The key problem with naive hashing, as we discussed, is that when
(A) a server fails, traffic still gets routed to it,
and
(B) you add a new server, the allocations can get substantially changed, thus losing the benefits of previous caches.
There are two very important things to keep in mind when digging into consistent hashing:
- Consistent hashing does not eliminate the problems, especially B. But it does reduce the problems a lot. At first you might wonder what the big deal is in consistent hashing, as the underlying downside still exists — yes, but to a much smaller extent, and that itself is a valuable improvement in very large scale systems.
- Consistent hashing applies a hash function to incoming requests and the servers. The resulting outputs therefore fall in a set range (continuum) of values. This detail is very important.
Please keep these in mind as you watch the below recommended video that explains consistent hashing, as otherwise its benefits may not be obvious.
I strongly recommend this video as it embeds these principles without burdening you with too much detail.
well its been a slightly long post than usual but if u have read so far then u deserve this 🍪
and as always …