The consensus mechanism is the cornerstone of distributed technology, such as the blockchain. Since the decentralized nature of systems based on such technologies does not provide for a single center to validate transactions, we must have a mechanism that, on the one hand, allows us to reliably perform this function; and on the other, has a distributed nature. In other words, such a mechanism should provide confirmation of each transaction’s validity by reaching an agreement between all network participants (or rather all active participants — full nodes) on this issue, providing protection against a possible double-spend problem, and ideally providing the immutability of blockchain’s distributed ledger.
There are many variations of such mechanisms, and they have also evolved in some ways. We are starting a series of articles describing the main stages of this evolution: from classical methods of achieving consensus in distributed databases, to Nakamoto consensus mechanisms used in modern decentralized blockchain systems, and finally to sharding and local consensus, which becomes the foundation principle of some Layer 2 and Layer 3 multi-hop projects working on the basis of state channels and trustlines.
Classical consensus methods
Classical consensus methods emerged long before the advent of blockchain technology in its modern form, back in the 1980–90s. Since inception, these methods have mainly been used in distributed databases of varying degrees of decentralization.
Byzantine Fault Tolerance (BFT)
In the context of distributed systems, Byzantine Fault Tolerance is the ability of a distributed network to function as expected and to achieve sufficient consensus, despite the fact that malicious components of the system may not work correctly or intentionally spread incorrect information to the rest of the network participants.
The goal is to protect against catastrophic system failures, reducing the impact that these malicious nodes have on the proper functioning of the network and the accurate consensus that honest participants in the system achieve. Originating from the problem of the Byzantine generals, this dilemma has been thoroughly researched and optimized with a diverse set of practical solutions and remains in active development.
Several decisions were described by Lamport, Shostak, and Pease back in 1982. They began by saying that the “problem of the Byzantine generals” can be reduced to solving the problem of “Commander and Lieutenants”, when all loyal lieutenants should act in unison and that their actions should be consistent with what the commander ordered, provided that the commander himself is loyal.
In the 1980s, several system architectures were developed in which BFT was implemented. These include Draper’s FTMP, Honeywell’s MMFCS and SRI’s SIFT.
Practical Byzantine Fault Tolerance (pBFT)
One of the classical methods of reaching a consensus (which arose as a theory even before the appearance of today’s most popular Proof of Work method as described later) is this practically improved version of the original BFT. The principle of operation is based on active communication between nodes and the assumption that the number of malicious nodes in the system cannot simultaneously exceed ⅓. At the same time, the system guarantees stable operation, provided that the percentage of “bad” nodes remains under this value.
The advantages of this consensus method over others are that agreement on one or another block is achieved without the need for confirmation, as in Proof of Work for instance. If the agreement between pBFT nodes upon the state of the proposed block is reached, this block is considered to be final. This is made possible by the fact that all honest nodes agree on the state of the system at the particular moment as a result of active communication between them.
Since the communication between the nodes requires much less computational resources and energy than the execution of complex calculations (as occurs in the case of a PoW method), the second main advantage of this method of achieving consensus compared to the PoW is energy efficiency.
There are, of course, disadvantages as well. In particular, this method in its classical form is more suitable for small networks because of the significant amount of heavy communication that is needed between the nodes (in large networks, like modern cryptocurrency networks, this can be a problem). Also, the classic pBFT model is subject to a “Sybil attack”: one side can create and manipulate a large number of virtual identities (nodes in the system), thus compromising the entire network.
However, there are many improvements to the classic pBFT model aimed at overcoming these and other shortcomings (see below), or one can use the pBFT method in combination with others.
Other variations of the BFT
Today, there are a number of blockchain platforms that use optimized or hybrid versions of the pBFT algorithm as a consensus model, or at least parts of it, in combination with other consensus mechanisms.
After pBFT, several BFT protocols were developed in order to improve its reliability and performance. For example, protocols such as Q/U, HQ, ABsTRACT, Zyzzyva, etc. address issues of performance and costs.
In particular, Zyzzyva applies a highly optimized version of the classic pBFT in conjunction with a consistent round of Proof of Work approximately every 100 blocks. This method uses multi-signatures to reduce the communication costs of the classic pBFT, and in its own testing environments it has reached TPS of several thousand, hoping to scale even more as more nodes are added. It is also a direct result of this implementation of pBFT in the segmentation architecture (sharding, which we’ll cover later), meaning that the pBFT consensus groups remain smaller within certain segments, thereby maintaining a high throughput mechanism while limiting the size of the consensus group.
Meanwhile, other protocols such as Aardvark and RBFT, have solved reliability problems. In addition, Adapt attempted to use existing BFT protocols by switching between them in an adaptive way in order to improve the reliability and performance of the system when base conditions change. In addition, BFT protocols have been introduced that use trusted components to reduce the number of replicas, for example, A2M-PBFT-EA and MinBFT.
Raft stands for Reliable / Replicated / Redundant And Fault-Tolerant. This algorithm was developed by researchers at Stanford University. Unlike BFT, Raft is a CFT consensus (Crash Fault Tolerant), because here it is assumed that the leader always acts honestly. All followers blindly replicate the entries proposed by the leader, no questions asked. If a leader fails, the rest of the network will automatically select a new leader after a predetermined period of time has passed, and the network continues to function. When the failed (ex-leader) node is restored, it becomes a follower and starts replicating the missing blocks offline.
Blocks in Raft consensus are not created if there are no transactions in standby mode. This can lead to significant memory savings, especially with a low number of transactions, since empty blocks containing zero transactions are not processed.
Another advantage of using Raft is a faster blocking time than, let’s say, IBFT. The leader creates a block within 50 ms after receiving the transaction (as configured by default), and sending the proposed block through the Raft cluster and collecting most of the confirmations is a very fast process. Under most typical network conditions, the average block creation time is usually less than a second.
However, blocks created in a Raft consensus network are protected neither by a unique hash, as in Proof of Work, nor by cryptographic signatures, as in other consensus algorithms such as BFT. As a result, distributed table data can be quite easily re-written by changing historical data (transaction input) and recalculating the block hash (as well as other relevant fields that need to be consistent, such as a transaction tree, etc.). For the correct operation of the network based on Raft, other means of data protection may be required to ensure the integrity of the blockchain data. Overall, the Raft consensus algorithm is more suitable for distributed databases with a low number of transactions.
In the next part, we will look at the Nakamoto consensus used in modern decentralized block systems.