Your Journey To Consensus (Part 1) — Crash Fault Tolerance and Paxos

Rachid CHAMI
6 min readOct 3, 2019

--

In every decentralized system, and today in every blockchain, the notion of consensus is always present and the protocols behind are always being used. In fact, the importance of consensus in a decentralized system is that it lets you have the same version of data on the whole network. Thus, a whole system acts like one computer but with more capabilities.

So, in order to get a better hold of how a blockchain works, or more generally how a decentralized system with many nodes manages to have the same version of the data being traded, you will need to understand consensus and how it works.

In this series of articles we will try to demystify the notion of consensus via going over couple algorithms, how they work and some details about them.

To do so, we will follow the following outline:

  • What is consensus
  • Different kinds of consensus algorithms
  • Crash Fault Tolerant algorithms: Raft and Paxos
  • Byzantine Fault Tolerant algorithms: PBFT, HotStuff

Hope you will enjoy this journey.

What is consensus?

A consensus algorithm is a mechanism used in distributed systems to agree on the same data among different computers. This allows a system to operate like a single computer but with more resources like number of requests it can handle.

Such mechanisms are used widely whenever there is a need to have a single truth. So we find them when it comes to a distributed database like hadoop with Apache Zookeeper, Cassandra with Paxos, and recently with Blockchains like Bitcoin with Proof of Work, Facebook Libra with HotStuff and so on… These systems require consistency over the network in order for clients request to be safe and have the same response on every node.

Different Kinds of consensus algorithms

With the birth of different systems that require consensus, different solutions have emerged solving the problems they face to have an agreement on the same truth over the whole network.

This ended up giving us a consensus algorithms zoo, each solving a certain type of problems in a very specific way and with different implementation specifications, messaging complexity etc.

https://101blockchains.com/wp-content/uploads/2018/08/Different_Consensus_Algorithms.png

this image resumes somehow the type of existant algorithms.

In our articles we will be focusing on two types of algorithms:

  • Crash Fault Tolerant algorithms
  • Byzantine Fault Tolerant algorithms

Crash Fault Tolerant Algorithms

Crash fault tolerant is a name given the algorithms that solve the problem of nodes crashing simply. So we have a decentralized system where nodes can halt or disconnect from the network and yet we maintain the same state of truth on the system.

To solve this kind of failures, Paxos family of algorithms appeared in the 80s by Leslie Lamport. And then after years, another Crash Fault Tolerant algorithm appeared named RAFT which is built mainly for understandability.

Paxos

Single Decree Paxos

Single Decree Paxos assumes a collection of processes that can propose values and it manages to ensure that only a single one gets chosen. Its correction has been proven and is efficient in the normal case.

Its safety requirements are as follow:

  • Only a value that has been proposed may be chosen;
  • Only a single value is chosen;
  • A process never learns that a value has been chosen until it actually has been.

And to achieve these requirements, Single-Decree Paxos defines three node’s roles:

  • Proposer: Chooses a value and sends it to a set of acceptor
  • Acceptor: may accept the received value or reject it
  • Learners: They adopt the value when a large enough set of acceptors have accepted it.

And we also add the following assumptions:

  • Acceptors can accept more than just one value. This is due to the fact that acceptors might receive different values and end up in a non consistent state, ie each acceptor waits to verify the value it received while no value is being chosen by the majority of the acceptors.
  • Assign numbers to each proposal for the algorithm to have a criteria to switch from value to value. Eg, if an acceptor receives value x with proposal number y, but after a period of time it receives another value with proposal number higher than y, it will switch to the new value. This ensures that the system will eventually converge to the same value.
  • If a proposal with value x is chosen, then every higher-numbered proposal that is chosen has value x. This means that if an acceptor chooses a value then every next value it will receive that can be accepted, ie having a higher-numbered proposal, should have the same value or else it should switch to it.
  • If a proposal with value x is chosen, then every higher-numbered proposal accepted by any acceptor has value x.

These assumptions rely on the fact that single-decree paxos only tries to converge to a single value. With this in mind, we can see that in a legitimate system, there will be eventually a time where proposers will cease to propose and all of them will start waiting for a value to be chosen… In this time, the acceptors will start switching to higher-numbered proposals until they reach the maximum it can get, ie highest proposal number. Then the network will converge to this value.

The algorithm

Single decree Paxos relies on the following two phases algorithm:

Phase1

  • A proposer selects a proposal number n and sends a “prepare” request with number n to a majority of acceptors.
  • If an acceptor receives a prepare request with number n greater than that of any prepare request to which it has already responded, then it responds to the request with a promise not to accept any more proposals numbered less than n and with the highest-numbered proposal that it has accepted.

Phase2

  • If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with the highest-numbered proposal among the responses.
  • If an acceptor receives an accept request for a proposal numbered n, it accepts the proposal unless it has already responded to a prepare request having a number greater or equal than the proposal number.
https://blog.the-pans.com/paxos-explained/

So as far as i understood from this algorithm, though i’m still too far to understand paxos and how it works, the one who sent the proposal with the highest proposal number will end up winning the game unless he fails before sending the requests to the other nodes of the network or some network issue occurs resulting in messages loss.

Multi Decree Paxos

There exist the multi-decree paxos that lets a system decide on a serie of values. It combines multiple instances of single decree paxos to take decisions over mulitple value.

https://blog.the-pans.com/paxos-explained/

Conclusion

In the next article, we will be seeing a more simpler Crash Fault Tolerant algorithem called RAFT which is build for understandability purposes and is a good point for people who wanna understand consensus. I didn’t start with it in this serie to give readers the opportunity to see how much consensus can get complicated and how it can also be as simple as it is with RAFT.

The link to the following part of the series:

References

--

--