Forming Consensus with Byzantine Fault Tolerance

Published in

ppio

6 min readOct 18, 2019

A look at how Byzantine Fault Tolerance works when applied to Tendermint by Cosmos.

In our discussions on technology, both our own and technology that’s impacting the digital world and blockchains, we have discussed how different chains can communicate with each other. At PPIO we’re paying close attention to innovators and pioneers in the field in order to improve our own technology. Cosmos’s Tendermint is one we’re paying close attention to having introduced them as part of our Code Talks series. Our article today will analyze perhaps the most complex and important aspect of Tendermint — their consensus algorithm.

Distributed consistency algorithms can be generally divided into two types: those that use Byzantine Fault Tolerance (BFT) and those that are Non-Byzantine Fault Tolerant. Non-Byzantine Fault Tolerant algorithms such as Paxos and Raft have been widely used in distributed systems, whereas the practical application of BFT is relatively small (especially before the advent of blockchain). Tendermint uses a BFT algorithm but optimizes the traditional practical Byzantine Fault Tolerance (pBFT) algorithm by only needing two rounds of voting to reach a consensus. At present, this modification to BFT is used in the Tendermint algorithm which is mainly used in a blockchain system.

Round-Based Protocol

There are three types of voting in Tendermint: prevote, precommit, and commit. When a block is committed by the whole network, it means that the block has been signed and broadcast by more than 2 / 3 of that network’s validators.

The vote data structure is as follows:

When the chain reaches a new height, the system runs a round-based protocol to determine the next block. The round-based protocol consists of three steps: proposal, prevote, and precommit; and the addition of two special steps: commit and NewHeight. Among them, proposal, prevote, and precommit will take up ⅓ of the round time. The time per round will be a little longer than the previous round. This is to make the network reach a consensus in the case of partial synchronization.

The round-based protocol runs as follows:

The round-based protocol is a state machine, which consists of five states: new height, proposal, prevote, precommit, and commit. Each of these states is referred to as a step. The first and last steps, new height and commit, are called special steps, while the remaining three steps in the middle are called a round, which is the consensus stage and the core principle of the algorithm. The final commit of a block may require multiple rounds. Repeated rounds may occur due to either the block node being offline, the proposed block being invalid, the number of prevote or precommit votes received is less than ⅔ of the total, etc. In these cases, the solution is to move to the next round or increase the timeout time.

When the blockchain reaches a new height, it enters the new height stage. The next step involves the submission of a proposal which will occur in the propose stage. Following that, the prevote stage is entered where a prevote will be taken once the proposal is received. In the following precommit stage, if ⅔+ consensus is achieved, the commit stage will follow. If ⅔+ is not reached, then the proposed stage is repeated until consensus is formed. Once in the commit stage, the new block is committed.

The above is the overall process of the algorithm operation. The stages involved are explained below in more detail.

Proposal

A proposer will be selected by round-robin strategy before each round, and the selected proposer will submit that round of proposal. For a detailed insight into the selection rules of the proposer, please check our definite article on the subject: Tendermint Introduction and Analysis.

The data structure of the proposal is as follows:

The first step of Tendermint’s round-based protocol is to propose. At the beginning of the proposal, the selected proposer will broadcast a proposal to the whole network. If the proposer is locked on the block from the previous round, the proposal initiated by the proposer in the current round will be the locked block, and the proof-of-lock field will be added to the proposal.

Prevote

At the beginning of the prevote stage, each validator will determine whether it is locked in from the previous round of the proposed blocks. If it is locked in from the previous round, it will continue to sign and broadcast the previous prevote vote. Otherwise, the proposal block received in the current round is signed and the prevote vote is broadcast. If for some reason, the current validator does not receive any proposed blocks, then an empty prevote vote is signed and broadcast.

Precommit

At the beginning of the precommit stage, each validator will determine that if more than ⅔ prevote votes are collected, then the block is signed and the precommit vote is broadcast. The current validator will also be locked onto the current block, and the previously locked block will be released. If a validator collects a vote for more than ⅔ of the empty blocks (NIL), the previously locked blocks are released. The validator, who is now locked, will collect the prevote votes for the locked block, and package these votes into the proof-of-lock which will be used in the proposal phase. If a validator doesn’t collect more than ⅔ of the prevote votes, it won’t lock on any blocks.

It’s important to introduce: Proof of Lock Change (PoLC). This refers to the prevote voting set whose height and round number exceed ⅔ of the summary point for either a block or nil (an empty block). In short, PoLC is the prevote voting set.

Later in the precommit phase, if the validator collects more than ⅔ of the precommit votes, then the validator enters the commit phase. Otherwise, another proposal round will occur.

Commit

The commit phase is divided into two parallel steps:

The validator has received the block that has been committed by the whole network, then will broadcast a commit vote for this block.
The validator needs to collect more than ⅔ commit votes for blocks precommitted by the whole network.

Once both conditions are met, the node will set the commit time to the current time and enter the new height phase.

At any stage of the whole consensus process, once a node receives more than ⅔ of consensus, it will immediately enter the commit stage.

Why Doesn’t A Fork Occur?

Assume that, at most, less than ⅓ of the voting power of validators is byzantine. If a validator commits block B at round R, it’s because it saw ⅔+ of precommits at round R. This implies that ⅓+ of honest nodes are still locked at round R’ > R. These locked validators will remain locked until they see a PoLC at R’ > R, however, this won’t happen because ⅓+ are locked and honest, so at most less than ⅔ are available to vote for anything other than B.

In other words, Tendermint doesn’t fork because it uses a BFT consensus, along with a PoLC.

Closing

We’re excited by Tendermint and what we can learn from it. We hope that our examination of the process of the round based protocol, and the specific implementation steps help shine a light on how consensus using BZT is formed.

Want more? Join the PPIO community on Discord or follow us on Twitter.