Consensus, Paxos and Consistent Hashing

Gaurav Kumar Srivastava · stackspacearena · Apr 24, 2024

Consensus

Consensus in a distributed system refers to the process by which multiple nodes or processes reach agreement on a value or decision, even in the presence of faults or failures. The goal is to ensure that all correct nodes settle on the same value, even if some nodes crash or behave unpredictably.

Consensus is crucial in distributed systems because it enables coordinated actions among nodes, which is essential for maintaining consistency and reliability. Some common examples of decisions that require consensus in distributed systems include electing a leader, committing a transaction, or agreeing on the state of shared data.

Here are some key concepts related to consensus in distributed systems:

  1. Fault Tolerance: Consensus algorithms are designed to tolerate failures and ensure that the system continues to operate correctly even if some nodes fail or exhibit faulty behavior.
  2. Safety and Liveness: Consensus algorithms must satisfy two important properties: safety and liveness. Safety ensures that all correct nodes agree on the same value, while liveness ensures that the system eventually makes progress and reaches a decision.
  3. Quorum: In many consensus algorithms, nodes communicate with each other and make decisions based on a quorum, which is a subset of nodes that must agree on a value for the decision to be considered valid. Quorums help ensure that decisions are made even if some nodes are unreachable or fail.
  4. Leader Election: Many consensus algorithms use a leader election process to coordinate the decision-making process. The leader is responsible for proposing values and coordinating the agreement among nodes.
  5. Atomicity: Consensus algorithms ensure that decisions are made atomically, meaning that either all nodes agree on the same value or none of them do. This prevents partial or inconsistent decisions from being made.

Some well-known consensus algorithms used in distributed systems include Paxos, Raft, and the Byzantine Fault Tolerance (BFT) family of algorithms. These algorithms provide different trade-offs in terms of simplicity, fault tolerance, and performance, but they all aim to achieve consensus among distributed nodes in the system.
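To make the quorum idea from point 3 concrete, here is a tiny Python sketch (purely illustrative; the majority helper is a name invented for this example). With n = 2f + 1 nodes, a quorum of f + 1 still exists after f crashes, and any two quorums overlap in at least one node, which is what prevents two different values from both being chosen.

```python
# Why majority quorums work: with n = 2f + 1 nodes, a quorum of f + 1 still
# exists after f crashes, and any two quorums overlap in at least one node.
def majority(n: int) -> int:
    return n // 2 + 1


for n in (3, 5, 7):
    f = (n - 1) // 2
    print(f"n={n}: quorum size={majority(n)}, tolerates {f} crashed node(s)")
# n=3: quorum size=2, tolerates 1 crashed node(s)
# n=5: quorum size=3, tolerates 2 crashed node(s)
# n=7: quorum size=4, tolerates 3 crashed node(s)
```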

Paxos

Proposed by Leslie B. Lamport, Paxos is a consensus algorithm used in distributed computing to ensure consistency among a network of computers. Here’s a simplified explanation of how Paxos works:

  • Proposal (Prepare) Phase: A node (called the proposer) selects a proposal number and sends a prepare request to a majority of acceptors.
  • Promise Phase: If an acceptor receives a prepare request with a higher proposal number than any it has seen before, it responds with a promise not to accept any proposal with a lower number, and it reports the highest-numbered proposal it has already accepted, if any. If it has already promised on a higher proposal number, it ignores the request.
  • Accept Phase: If the proposer receives promises from a majority of acceptors, it sends an accept request with its proposal number and value to those acceptors; if any promise reported a previously accepted proposal, the proposer must adopt the value of the highest-numbered one. An acceptor accepts the request unless it has since promised on a higher proposal number, and informs the proposer.
  • Learn Phase: Once a proposal has been accepted by a majority of acceptors, its value is considered chosen, and all nodes learn the chosen value. (A minimal code sketch follows this list.)
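To make these phases concrete, here is a minimal, in-memory sketch of single-decree Paxos in Python. It is illustrative only: the Acceptor class, the run_round helper, and the synchronous, failure-free message passing are assumptions made for this example, not part of the algorithm's specification.

```python
# Minimal single-decree Paxos sketch (illustrative only; class and function
# names are hypothetical and all "messages" are ordinary function calls).
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Acceptor:
    promised_n: int = -1                         # highest proposal number promised
    accepted: Optional[Tuple[int, str]] = None   # (number, value) last accepted

    def prepare(self, n: int):
        """Promise phase: promise to ignore proposals numbered below n."""
        if n > self.promised_n:
            self.promised_n = n
            return True, self.accepted           # report any previously accepted value
        return False, None

    def accept(self, n: int, value: str) -> bool:
        """Accept phase: accept unless a higher-numbered prepare was promised."""
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted = (n, value)
            return True
        return False


def run_round(acceptors, n: int, value: str):
    """One proposer round: prepare, then accept, against a majority quorum."""
    quorum = len(acceptors) // 2 + 1

    promises = [a.prepare(n) for a in acceptors]
    granted = [prior for ok, prior in promises if ok]
    if len(granted) < quorum:
        return None                              # no quorum of promises

    # If any acceptor already accepted a value, the proposer must adopt it.
    prior = [p for p in granted if p is not None]
    if prior:
        value = max(prior, key=lambda nv: nv[0])[1]

    accepted = sum(a.accept(n, value) for a in acceptors)
    return value if accepted >= quorum else None


acceptors = [Acceptor() for _ in range(5)]
print(run_round(acceptors, n=1, value="leader=node-A"))  # -> leader=node-A
print(run_round(acceptors, n=2, value="leader=node-B"))  # -> leader=node-A (already chosen)
```

Note that the second round, even with a different proposed value and a higher proposal number, returns the value that was already chosen: this is exactly the safety property described above.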

Consistent Hashing

Consistent hashing is a technique used to distribute data across multiple nodes in a distributed system while minimizing the need to rebalance when nodes are added or removed. It’s not directly derived from Paxos, but they are both used in distributed systems to address different challenges.

Figure: a consistent hash ring (source: https://cassandra.apache.org/)

In consistent hashing, each node in the system is assigned a range of hash values. When a key needs to be stored or retrieved, it is hashed, and the corresponding node responsible for that hash range handles the operation. This ensures that only a fraction of the keys need to be remapped when a node is added or removed, reducing the amount of data migration required.

Consistent Hash Ring: a consistent hash ring is a specific implementation of consistent hashing in which the hash values are arranged in a ring topology. Here’s how it works:

  1. Hashing Nodes: Each node in the system is assigned a position on the ring, typically by hashing its IP address or some other unique identifier.
  2. Hashing Keys: When a key needs to be stored or retrieved, it is hashed to produce a position on the same ring.
  3. Locating Nodes: Starting from the key’s position, the ring is traversed clockwise until the first node at a position greater than or equal to the key’s position is found (wrapping around past the end of the ring). That node is responsible for the key.
  4. Adding or Removing Nodes: When a node is added, it takes over only the keys that previously mapped to its clockwise successor; when a node is removed, its keys move to its successor. Either way only a small fraction of the keys is remapped, minimizing the amount of data migration required. (A minimal code sketch follows this list.)
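Here is a minimal consistent hash ring sketch in Python. It is illustrative only: the HashRing class, its method names, and the use of MD5 purely as a stable, well-spread hash function are assumptions made for this example.

```python
# Minimal consistent hash ring sketch (illustrative; HashRing and its method
# names are hypothetical, and MD5 is used only as a stable, well-spread hash).
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=()):
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        bisect.insort(self._ring, (stable_hash(node), node))

    def remove_node(self, node: str) -> None:
        self._ring.remove((stable_hash(node), node))

    def node_for(self, key: str) -> str:
        """Walk clockwise to the first node at or after the key's position."""
        positions = [pos for pos, _ in self._ring]
        idx = bisect.bisect_left(positions, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-A", "node-B", "node-C", "node-D"])
print(ring.node_for("example_key"))  # stable answer across calls
ring.add_node("node-E")              # only keys on E's arc move off its successor
print(ring.node_for("example_key"))  # unchanged unless the key lands on E's arc
```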

While consistent hashing and Paxos are both used in distributed systems, they serve different purposes. Consistent hashing is used for distributing data across nodes efficiently, while Paxos is used for achieving consensus among distributed nodes.

Let’s go through a couple of examples to illustrate how consistent hashing works and how redistribution works when nodes are added or removed.

Example 1: Consistent Hashing with Key Distribution

Let’s say we have a system with 4 nodes in a consistent hash ring, labeled A, B, C, and D. Each node is assigned a range of hash values in the ring.

- Node A: Hash range [0, 100)
- Node B: Hash range [100, 200)
- Node C: Hash range [200, 300)
- Node D: Hash range [300, 400)

Now, let’s suppose we want to store a key “example_key”. We hash the key to determine which node should be responsible for it:

- Hash(“example_key”) = 150

Starting from the hash value (150), we traverse the ring clockwise until we reach the node whose range contains it. Here 150 falls in [100, 200), so Node B is responsible for storing and retrieving “example_key”.
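Written as code against the toy ranges above (the ranges dictionary and owner function are names invented for this sketch), the lookup looks like this:

```python
# Toy version of Example 1: each node owns a half-open hash range, and a key
# goes to the node whose range contains its hash value.
ranges = {"A": (0, 100), "B": (100, 200), "C": (200, 300), "D": (300, 400)}


def owner(key_hash: int) -> str:
    for node, (lo, hi) in ranges.items():
        if lo <= key_hash < hi:
            return node
    raise ValueError("hash value outside the ring")


print(owner(150))  # "B", as in the example
```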

Example 2: Redistribution in Consistent Hashing

Now, let’s see how redistribution works when a new node is added to the system. Let’s add a new node E with the hash range [250, 350).

- Node A: Hash range [0, 100)
- Node B: Hash range [100, 200)
- Node C: Hash range [200, 250) <- Affected by redistribution
- Node E: Hash range [250, 350)
- Node D: Hash range [350, 400) <- Affected by redistribution

When Node E joins the system, only the keys whose hashes now fall into E’s range [250, 350) need to be redistributed: those that previously belonged to the upper part of Node C’s range and the lower part of Node D’s range. For example, if a key “new_key” originally belonged to Node C but now falls into the range of Node E, it will be moved to Node E.

- Hash(“new_key”) = 270 (originally in Node C’s range [200, 300), now in Node E’s range [250, 350))
- Node E takes responsibility for “new_key”.

After redistribution, the consistent hash ring is updated, and the keys are redistributed accordingly, minimizing the impact on the overall system and reducing the need for data migration.
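A small sketch comparing owners before and after Node E joins (reusing the hypothetical owner function and toy ranges from Example 1) shows that only keys landing on E’s new arc actually move:

```python
# Toy version of Example 2: only keys whose hashes land in Node E's new range
# change owner; everything else stays where it was.
old_ranges = {"A": (0, 100), "B": (100, 200), "C": (200, 300), "D": (300, 400)}
new_ranges = {"A": (0, 100), "B": (100, 200), "C": (200, 250),
              "E": (250, 350), "D": (350, 400)}


def owner(key_hash, ranges):
    return next(node for node, (lo, hi) in ranges.items() if lo <= key_hash < hi)


for key_hash in (150, 220, 270, 320):
    before, after = owner(key_hash, old_ranges), owner(key_hash, new_ranges)
    print(f"hash {key_hash}: {before} -> {after} "
          f"({'moved' if before != after else 'stays'})")
# hash 150: B -> B (stays)   hash 220: C -> C (stays)
# hash 270: C -> E (moved)   hash 320: D -> E (moved)
```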

In summary, consistent hashing allows for efficient distribution of keys across nodes in a distributed system while minimizing the need for data redistribution when nodes are added or removed.

Paxos and consistent hashing in the distributed systems tech stack

Distributed Databases:

  • Paxos: Distributed databases often use Paxos or variations of it (like Raft) for leader election, coordination, and ensuring consistency across nodes. For example, Apache Cassandra uses a Paxos-based protocol for its lightweight transactions (conditional, compare-and-set updates) across nodes in the cluster.
  • Consistent Hashing: Distributed databases use consistent hashing to partition data across nodes in the cluster. For instance, Apache Cassandra assigns each node one or more token ranges on a consistent hash ring, and MongoDB’s hashed sharding applies a related idea by hashing the shard key to spread data evenly across shards.

Content Delivery Networks (CDNs):

  • Paxos: CDNs may use Paxos or similar consensus algorithms where their control planes need strongly consistent, fault-tolerant state, for example when agreeing on configuration or routing metadata that is then pushed out to edge locations.
  • Consistent Hashing: CDNs leverage consistent hashing to map request URLs to cache servers within an edge location, so that repeated requests for the same object land on the same cache and hit rates stay high even as servers are added or removed.

Distributed Key-Value Stores:

  • Paxos: Distributed key-value stores often use Paxos for achieving consensus on writes and replication. For example, Google’s Spanner runs a Paxos group per data shard to replicate writes consistently across its globally distributed storage infrastructure, layering two-phase commit on top for cross-shard transactions.
  • Consistent Hashing: Key-value stores utilize consistent hashing to partition data across multiple nodes, enabling horizontal scalability and fault tolerance. For instance, Riak, a distributed key-value store, uses consistent hashing to distribute data across its cluster of nodes.
