Your Journey To Consensus (Part 2) — RAFT

Rachid CHAMI
Oct 3, 2019


As promised in the previous part of this series, this article will focus on Raft as a consensus algorithm for establishing consistency over a network of nodes that may fail.

Previous articles:

Raft

https://www.freecodecamp.org/news/in-search-of-an-understandable-consensus-algorithm-a-summary-4bc294c97e0d/

Raft is designed for understandability. It is a consensus algorithm equivalent in its results to multi-Paxos, and just as efficient, but its structure is much easier to understand.

Raft has several novel features:

  • Strong Leader: log entries only flow from the leader to other servers.
  • Leader Election: Raft uses randomized timers to elect leaders.
  • Membership changes: Uses a new joint-consensus approach where the majorities of the two different configurations overlap during transitions. This allows the cluster to continue operating normally during configuration changes.
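The randomized timers mentioned above are what make split votes rare: each follower picks its own timeout, so after a leader failure one server usually times out first and wins before the others even start. A minimal sketch, assuming the 150–300 ms range suggested in the Raft paper (real systems tune these bounds):

```python
import random

def election_timeout_ms(low=150, high=300):
    """Each follower draws its own random election timeout, so that after
    a leader failure one server usually times out first and can win the
    election before the others start competing ones (avoiding split votes)."""
    return random.uniform(low, high)

# Five followers end up with five different timeouts within the range.
timeouts = [election_timeout_ms() for _ in range(5)]
print(timeouts)
```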

What’s wrong with Paxos

Paxos, as formulated in Leslie Lamport's original paper, has revealed several drawbacks over time:

  • It’s extremely hard to understand, even the experts suffer from its complexity.
  • It doesn't provide a good foundation for building practical systems, because it leaves implementation details unspecified. Companies therefore end up building systems that are similar to Paxos but not quite the same, which leads to time-consuming development, error-prone implementations, and unproven protocols: each deviation from Paxos effectively produces a new protocol whose correctness has not been proved.
  • Paxos's opaqueness, according to the team behind Raft, derives from its choice of the single-decree subset as its foundation.
  • Single-decree Paxos is dense and subtle: it is divided into two stages that have no simple intuitive explanations and cannot be understood independently. This makes it hard to develop intuitions about why the single-decree protocol works.

How does Raft implement consensus

Raft implements consensus via:

  • Electing a distinguished leader.
  • Giving the leader complete responsibility for managing the replicated log.
  • Having the leader accept log entries from clients and replicate them to the other servers.
  • Having the leader tell the servers when it is safe to apply log entries to their state machines.
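The reason the leader can safely tell servers to apply entries is the majority-commit rule: an entry is committed once it is stored on a majority of the cluster. A minimal sketch of this rule (function names are illustrative, not from any Raft implementation):

```python
def majority(cluster_size):
    """Smallest number of servers that forms a majority of the cluster."""
    return cluster_size // 2 + 1

def is_committed(replicated_on, cluster_size):
    """An entry is committed once it is replicated on a majority of
    servers (the count includes the leader's own copy)."""
    return replicated_on >= majority(cluster_size)

# In a 5-node cluster, 3 copies are enough to commit an entry:
assert majority(5) == 3
assert is_committed(3, 5)
assert not is_committed(2, 5)
```

Because any two majorities of the same cluster overlap in at least one server, a committed entry can never be silently lost by a later leader.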

Raft decomposes the consensus problem into three separate problems:

  • Leader election: a new leader must be chosen when the existing one fails.
  • Log replication: the leader accepts log entries from clients and replicates them across the cluster.
  • Safety: if any server has applied a log entry at a given index, no other server may apply a different command at that same index.

To achieve this, Raft defines three roles, and each server must be in exactly one of them:

  • Leader: handles all client requests.
  • Follower: passive; followers issue no requests on their own and simply respond to requests from leaders and candidates.
  • Candidate: the role used to elect a new leader.
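The three roles above can be encoded directly; an illustrative sketch (not taken from any real Raft library):

```python
from enum import Enum

class Role(Enum):
    """The three Raft server roles; a server is always in exactly one."""
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"

# Every server starts its life as a follower.
state = Role.FOLLOWER
print(state)
```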

How does leader election work

Leader election in Raft works as follows:

  • When servers start up, they begin as followers and remain so as long as they receive valid RPCs from a leader or candidate.
  • Raft uses a heartbeat mechanism to trigger leader election: a follower keeps receiving heartbeat messages from the leader as long as one exists. If the heartbeats stop, the follower waits for a randomized election timeout and then starts an election of its own.
  • To begin an election, a follower increments its current term and transitions to the candidate state.
  • While waiting for votes, a candidate may receive an AppendEntries RPC from another server claiming to be the leader. If that leader's term is at least as large as the candidate's current term, the candidate accepts the request and returns to the follower state. Otherwise, it rejects the RPC and continues its election.
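The steps above can be sketched as a tiny state transition, under stated assumptions: the class, field names, and string roles here are illustrative, not an actual Raft implementation.

```python
class Server:
    """Minimal sketch of the follower -> candidate transition and the
    step-down rule when a valid leader is heard from."""

    def __init__(self):
        self.term = 0
        self.role = "follower"
        self.voted_for = None

    def start_election(self):
        # On election timeout: increment the current term,
        # become a candidate, and vote for yourself.
        self.term += 1
        self.role = "candidate"
        self.voted_for = "self"

    def on_append_entries(self, leader_term):
        # A candidate that receives AppendEntries from a leader whose term
        # is at least as large as its own recognizes that leader and
        # reverts to follower; a stale leader's RPC is rejected.
        if leader_term >= self.term:
            self.term = leader_term
            self.role = "follower"
            return True   # accept the RPC
        return False      # reject: stale leader

s = Server()
s.start_election()
print(s.term, s.role)  # the server is now a candidate in term 1
```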

Conclusion

So, as promised, Raft is far simpler than Paxos and requires less intuition to see why it converges.

In the next article we will look at another family of algorithms, the Byzantine Fault Tolerant algorithms. These can operate in a network whose nodes are not just crash-prone but may behave arbitrarily, even maliciously, while the network still keeps making progress.

Link to the next article in the series:

References:
