Apr 6, 2014 · 3 min read

Paxos comes up often while talking about distributed systems. A number of interesting papers refer it. I finally got an opportunity to read the Paxos paper. Here’s my attempt to make sense of it:

The problem statement:

Paxos algorithm proposes a solution to the consensus problem encountered in distributed systems.

Briefly described: how can a number of nodes in a distributed network arrive at a consensus regarding a value? This is important for nodes to have a consistent view of the data. For example: imagine you deposited \$1000 in your bank account. The data is stored on multiple machines and they all need agree to increase your balance by \$1000.

And this is difficult to ensure because:

• The nodes may fail or temporarily stop working. The nodes might operate at different speeds.

Solution proposed:

Paxos splits the nodes into three groups and they work in two phases or arrive at consensus.

There are three groups of nodes -

• proposers: They put forward a proposal to the acceptors. The proposal is in the format — proposal( proposal number, value). The proposal number is a natural monotonically increasing number.

In practice, one node can belong to multiple groups.

The proposals are put forward to the acceptors, by the proposers, in the form of requests. These can be of two types:

• prepare request: It has a number n telling the acceptor not to accept a proposal less than number n. And it asks the acceptor to respond with proposal with the highest number less than n that it has accepted.

The algorithm works in two phases:

1. A proposer selects a number n and send a ‘prepare request’ to a majority of the acceptors. The acceptors agree to not accept any proposals lower than number n. The acceptors respond with the highest-numbered proposal less than n that they have accepted in the past.

A value is chosen by the Paxos algorithm when a learner discovers that a majority of acceptors have accepted a value.

This is the basic algorithm. The paper goes into some detail describing the correctness of the algorithm.

References and related reading:

Written by

Written by