On Safety and Liveness Trade-offs in Consensus Protocols

4 min readMay 23, 2019

I wanted to (very) briefly discuss how consensus protocols can trade-off safety and liveness.

First, a brief primer into the two major classes of consensus, longest-chain and voting-based (quorum-based) protocols. Longest chains protocols work under the premise that the “true” chain (the globally distributed ledger) can be identified via the “heaviest-chain” rule, which is really a function that is defined differently based on the underlying consensus mechanism. Some longest-chain consensus protocols may use hash-power to attribute weight. This is what Nakamoto (Bitcoin) does. Due to some mathematical artifacts of their constructions, longest-chain protocols can only be safe and live if and only if a) we have a strictly synchronous network and b) the Byzantine presence is less than 1/2 of total network. This upper bound turns out to be fundamental, and can’t be broken. So, for example, Bitcoin is entirely unsafe under asynchrony. Voting-based protocols, on the other hand, work under a very different mechanism, which happens to be voting. There’s no weight and no longest chain, instead we simply finalize if we achieve a minimum required quorum. So while voting-based protocols work quite differently from longest-chain protocols, they similarly have fundamental limitations: in synchronous and asynchronous settings, they provide liveness and safety if and only if Byzantine presence is less than 1/2 and 1/3 of network size, respectively.

This now brings us to the main topic of this post (and which seems to be unfamiliar with the community): these bounds are trade-able.

To demonstrate this briefly, I will focus on voting-based protocols in an asynchronous setting, whose bound is 1/3. Suppose that I have 100 nodes in the network. Then, I can construct a safe and live protocol if Byzantine nodes are at most 33/100, which means that I require a quorum of 67 nodes or more to agree on the same thing. To see why this works: let Byzantine nodes be 33, and correct nodes be 67. The Byzantine nodes can split the correct nodes into two groups of 34 and 33 nodes. Then, they can lie and create two groups of 67 and 66 nodes that agree on the same thing (34 correct + 33 Byz and 33 correct + 33 Byz, respectively). However, since quorum is at 67, only one of these two quorums can ever “win out” (the first one). And in fact, since Byz ≤ 33, and my quorum is at most 67, even if Byzantine nodes always just say random things, we can ignore them and have correct nodes still agree (note: this hides a lot of complexity on how to get them to actually agree).

But, suppose I want higher tolerance for Byzantine nodes in regards to safety. Then I can request higher quorum intersection, say 80 nodes (instead of 67). What does this mean? It means I can be safe even if Byzantine nodes are higher than 33 (as high as 59), but not be live. To see why this is the case, suppose Byz is 59, and correct is 41. Then, the Byz nodes can again split the correct nodes into two groups of 21 and 20, and create two quorum of 80 and 79 (59 Byz + 21 correct and 59 Byz + 20 correct, respectively). But, since quorum is minimum 80, again *only one of these two can win out*. Conversely, however, since the quorum is 80, I can only tolerate at most 21 Byzantine nodes that just say random things and prevent us from making progress. In summary: I can boost my quorum to 80/100, and get safety for at most 59/100 Byzantine nodes and liveness for at most 21/100 Byzantine nodes.

Now, this takes me to the final part of this post, which is in regards to the Avalanche consensus protocol. Avalanche consensus takes classical quorum-based voting protocols and makes them probabilistic. At a super high level, the subsampling of the core primitives buys you huge performance gains, but it also means that the bounds above are dampened. So, for example, if you parametrize the system for a quorum of 67%, instead of getting both safety and liveness with an adversary of 33%, you get slightly less (say around 30%, under some choice of sub-sampling and probability of failure). Avalanche chooses a purposefully higher quorum threshold (80%, minimum). The liveness bound, as above, is maximally about 20%, but due to dampening it is slightly lower (about 16%).

On Safety and Liveness Trade-offs in Consensus Protocols

Written by Kevin Sekniqi