Consistent Hashing made easy

Sutanu Dutta
Feb 21, 2023


We have covered quite a few distributed systems concepts in earlier articles, where we tried to solve the issues that arise from dealing with more than one node.

Today we will dive deep into what fuels the web caches that make the internet faster while dealing with petabytes of data. Web caches use consistent hashing intensively to manage huge volumes of data on commodity hardware by distributing the load evenly.

What is hashing anyway?

Before we dive into the concept of consistent hashing, we need an understanding of what hashing is.

Hashing is a process that transforms one value into another and creates a deterministic mapping between the two.

There are standard hashing algorithms available, such as MD5, SHA and CRC.
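To make that concrete, here is a minimal Python sketch (the hash_value helper name is mine, not part of any library) that turns a key into a large integer using MD5 from the standard hashlib module:

```python
import hashlib

def hash_value(key: str) -> int:
    # MD5 digest interpreted as a big integer; fine for load distribution,
    # though not something to use for security-sensitive work.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

print(hash_value("user:42"))  # a large, deterministic integer
print(hash_value("user:42"))  # the same input always produces the same value
```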

Why do we need something complicated-sounding like consistent hashing?

As we now understand, a hash algorithm returns a hash value, and we use that value for further operations. We always bound the hash range based on our requirement; for example, if we want the hash function to return a value that maps to one of our server ids and we only have 100 servers, we need to make sure the hash range is 100 (using a modulo function). Once we identify the server, we can perform any kind of operation on it: retrieving cached data, cached web pages, cached files or any other information we prefer to store on that server.
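A hedged sketch of that modulo scheme, reusing the hash_value helper from above (the fleet size and function name are illustrative):

```python
NUM_SERVERS = 100  # illustrative fleet size

def server_for(key: str) -> int:
    # Bound the hash range with a modulo so the result is a valid server id.
    return hash_value(key) % NUM_SERVERS

print(server_for("user:42"))  # always the same server id for this key
```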

This is a very efficient way to handle a volume of data that cannot fit on a single server without compromising performance, but only if we live in an ideal world where there is a 0% chance of failure.

Unfortunately we don't live in an ideal world, so let's consider a failure scenario. Say we have 10 TB of data across all our keys and 10 servers to serve our clients; we can evenly distribute 1 TB of data to each server, assuming all the servers have the same configuration. Then one day, 1 of the 10 servers crashes due to a hardware failure. Earlier we were dealing with 10 servers and our hash address range was limited to 10, but now we have only 9 servers at hand. Naturally we need to rehash the entire dataset to map every key to the correct server, and that includes physically moving data from one server to another.

The issue described above is not acceptable in any real-world application: the data copy would take a long time, and the large number of incoming client requests hitting cache misses would bring down the backend services.
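A quick, illustrative experiment (again reusing the hash_value sketch; the key names and counts are made up) shows just how bad plain modulo hashing is here: shrinking from 10 servers to 9 remaps far more than the 10% of keys that actually lived on the failed server.

```python
keys = [f"key-{i}" for i in range(10_000)]  # made-up keys for the experiment
moved = sum(hash_value(k) % 10 != hash_value(k) % 9 for k in keys)
print(f"{moved / len(keys):.0%} of keys now map to a different server")  # roughly 90%
```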

Let’s analyse the problem statement:

  1. Rehashing the entire dataset is an expensive operation: it takes a long time and it is CPU intensive.
  2. Rehashing also moves data off servers that are functioning perfectly well; an effort that yields no value.
  3. In real-world applications it is perfectly normal for services to scale up and down (adding and removing nodes), which would trigger this rehashing again and again and repeat the problem described in point 1.

Consistent hashing solves these issues very efficiently, and in the next section we will dissect its inner workings.

What the heck is consistent hashing?

In order for our solution to work in the real world, we need to avoid rehashing the full dataset. Instead of maintaining a map-like structure where every data key yields a definite server id from the hash function, we maintain a ring-shaped hash space (a range of hash values), which can be as large as needed, close to MAX_INT. We also run the same hash function on the server names, server IP addresses or any other such identifier, and based on the result we place each server at a specific position on the hash ring. In this way all the servers are mapped to distinct values on the ring based on their identifiers.

Now the natural question is: this all sounds good, but what is the logic for storing data on a server, given that we no longer have a hash function that returns a server id directly?

We don't need to find the actual server id, because consistent hashing works on ranges: any key whose hash value falls between the previous server's position on the ring (moving in a fixed direction, clockwise or anticlockwise) and the current server's position is stored on the current server.
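Here is a minimal sketch of such a ring in Python (the HashRing class, its method names and the server names are my own choices, assuming clockwise traversal and MD5 for positions):

```python
import bisect
import hashlib

def ring_hash(value: str) -> int:
    # Deterministic position on the ring for both server identifiers and keys.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    """Minimal consistent-hashing ring; names here are illustrative."""

    def __init__(self, servers):
        # Global mapping data: ring position -> server id, plus a sorted
        # list of positions for binary search.
        self.position_to_server = {ring_hash(s): s for s in servers}
        self.positions = sorted(self.position_to_server)

    def server_for(self, key: str) -> str:
        h = ring_hash(key)
        # Binary search for the first server position at or after the key's
        # hash; the modulo wraps around the end of the ring to the start.
        i = bisect.bisect_left(self.positions, h) % len(self.positions)
        return self.position_to_server[self.positions[i]]

ring = HashRing(["server-a", "server-b", "server-c"])
print(ring.server_for("user:42"))  # one of the three servers, consistently
```

The first use case below is essentially what server_for does: hash the key, binary-search the sorted positions, and hand back the owning server.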

Now let’s look at some of the use cases:

Use case 1: A client makes a request with a new key:

  1. We run the hash function on the key and derive its hash value.
  2. Using a nearest-neighbour search, we identify the nearest hash value that maps to a server (utilising the global mapping of hash values to server ids).
  3. We add the data to that server the first time; from then on, the data gets fetched from this server.

Use case 2: A new server gets added:

  1. We calculate the hash value of the new server based on its identifier.
  2. We add the server's hash value and identifier to the global mapping data.
  3. We traverse the ring in the chosen direction until we find the next hash value that maps to an existing server.
  4. The data in the new server's range, which currently lives on that adjacent server, gets copied over to the new server (see the sketch below).
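Continuing the HashRing sketch from earlier (the add_server helper is mine), adding a server is just one more entry in the global mapping plus a bounded data copy:

```python
import bisect

def add_server(ring: HashRing, server: str) -> None:
    # Register the new server's position in the global mapping data.
    h = ring_hash(server)
    ring.position_to_server[h] = server
    bisect.insort(ring.positions, h)  # keep the positions sorted for lookups
    # Only the keys hashing between the previous position and h need to be
    # copied here from the clockwise successor; everything else stays put.
```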

Use case 3: A server goes down:

  1. We calculate the hash value of the failed server based on its identifier.
  2. Starting from that hash value, we move around the ring in the chosen direction until we find the next hash value that maps to a live server, and copy the failed server's data onto that adjacent server (see the sketch below).
  3. We remove the failed server's hash value from the global mapping data.
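Removal, in the same hedged sketch (the remove_server helper is likewise mine), is the mirror image: the clockwise successor absorbs the range and the dead position disappears from the mapping.

```python
def remove_server(ring: HashRing, server: str) -> None:
    # After copying the failed server's range onto its clockwise successor,
    # drop its position so future lookups resolve to that successor.
    h = ring_hash(server)
    ring.positions.remove(h)
    del ring.position_to_server[h]
```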

Key points to note:

To make the process efficient, we maintain global mapping data of hash values and server ids.

We use binary search to identify the nearest-neighbour server efficiently (O(log N)).

We must have a predefined direction for traversing the ring, either clockwise or anticlockwise, and stick to it.

We have drastically reduced the data movement caused by rehashing, as we only move the impacted server's data, which is a fraction of the total.

Issues with this approach and solution:

As we can see, if any node goes down, the adjacent server takes on 2x the load, which may cause an outage due to a capacity shortfall, and the ripple effect would eventually bring down the entire system.

The solution to this problem is as follows:

  1. Split each individual physical node into multiple segments (i segments). We address these segments as virtual nodes.
  2. Use multiple hash functions, or call the same hash function i times on derived identifiers, to assign each segment its position.
  3. The hash ring ends up with the virtual nodes of all servers distributed randomly around it.

Workflow:

  1. Use binary search to identify the nearest virtual node.
  2. Use the virtual-node-to-physical-server mapping to identify the physical server.
  3. Copy data to, or fetch data from, that physical server (see the sketch below).
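Putting the solution and the workflow together, a hedged sketch of a ring with virtual nodes might look like this (the class name, the choice of 100 virtual nodes per server and the "#i" suffix scheme are all my own illustrative choices):

```python
import bisect
import hashlib

VIRTUAL_NODES = 100  # illustrative choice of i, the number of segments per server

def ring_hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class VirtualHashRing:
    """Sketch of a ring with virtual nodes; names and counts are illustrative."""

    def __init__(self, servers):
        self.position_to_server = {}  # virtual-node position -> physical server
        for server in servers:
            for i in range(VIRTUAL_NODES):
                # One ring position per virtual node, derived from the server
                # identifier plus a segment index.
                self.position_to_server[ring_hash(f"{server}#{i}")] = server
        self.positions = sorted(self.position_to_server)

    def server_for(self, key: str) -> str:
        h = ring_hash(key)
        # Nearest virtual node by binary search, then map to its physical server.
        i = bisect.bisect_left(self.positions, h) % len(self.positions)
        return self.position_to_server[self.positions[i]]

ring = VirtualHashRing(["server-a", "server-b", "server-c"])
print(ring.server_for("user:42"))
```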

If any server goes down, all of its virtual nodes go down with it, but because they are distributed randomly around the ring, the load of those segments also gets distributed evenly among the remaining servers.

Fun facts:

Consistent hashing is used for web caching (Akamai).

DynamoDB uses consistent hashing for load distribution.

Further reading:

  1. Stanford consistent hashing lecture
  2. The research paper by David Karger et al.: Consistent Hashing and Random Trees


Sutanu Dutta

Senior software engineer and system design enthusiast. I am passionate about computer science and write about data structures and software architecture.