Tenderand: Randomized leader election in Tendermint
Tendermint is the most popular and widely used consensus algorithm for PoS-based blockchains. The popularity comes from the simplicity of the system. When choosing the proposer for a certain round, a committee simply utilizes a weighted round-robin method. With this method, any validator can easily calculate the eligible proposer for a given round.
However, even malicious entities who are not validators can easily calculate who will be the next proposer. For these malicious attackers, targeting all their bots on the predetermined proposer is a piece of cake. Since every round has a unique validator, block generation may be halted by attackers who attack validators throughout the Tendermint rounds. One possible solution to the issue is randomizing the leader election process. This idea was already filed as an issue in the Tendermint repository, but did not come to fruition.
Even if randomized leader election is adopted, it must not break the stake-proportional leader election property. Furthermore, proposals from other validators should be verifiable. One simple probability model satisfying this condition was proposed by Algorand. As a verifiable random number generator, they utilized the VRF(Verifiable Random Function), and the random values from the VRF determine sortition results. The process is considering each node’s stake delegation in order to obey the stake-proportional property and also to prevent Sybil attacks.
However, this simple probability model has a limitation. The nature of this model is unable to rule out the situations in which no user is eligible to be a proposer. The no-proposer case directly implies consensus failure in the round, but does not defy safety and liveness properties. We thought reducing the probability under a certain level is enough. Both the Tendermint consensus algorithm and Algorand’s sortition algorithm are strong due to their simplicity. We combined these two schemes to construct Tenderand. The original proposal can be found on the CodeChain research page.
Implementation
Implementation was a three step process. First, implement a secp256k1 compatible ECVRF. Second, implement a cryptographic sortition algorithm in Rust. Third, insert the sortition process into Tendermint’s “propose” step.
Since the “hash-to-curve” method is the only curve-specific part among the VRF component functions, the first step involved following the “draft-irtf-cfrg-hash-to-curve-05”. There is a recommended hash-to-curve method supporting our secp256k1 curve in the draft. The name of the method is SVDW(Shallue-van de Woestijne). Not to leak any information from computation times, the “hash-to-curve” methods consist of constant-time operations. We followed the recommended way described in the draft and some unresolved problems are filed as issues.
The second step was easy because the probability model is so simple that importing binomial cdf function was enough. You can find the code in the CodeChain VRF-election branch.
The third step is the most important part. We already have a well performing Tendermint implementation so I didn’t want to break the correctness and liveness properties. In the previous vanilla Tendermint, only a proposal signed by the predetermined block proposer can be accepted as a valid one. Now, any node that can prove it is eligible to be a proposer can propose a block and other nodes are able to verify the proposal. This is considerably a substantial change.
There may be some concerns regarding the change. What if a correct node fails to observe the round’s highest proposal, resulting in a vote for the second-highest proposal in a round? The same question is also possible in vanila tendermint. How if a correct node fails to observe the unique valid proposal in a round even though the correct proposer broadcasted its proposal? That should be guaranteed by the gossip communication property. If it fails, it implies a system parameter failure, not a failure in the consensus algorithm itself, and the property will be recovered by increasing the timeout duration. The nodes that fail to receive the valid proposal will vote for Nil in vanila Tendermint. If we consider the valid proposals except for the highest proposal as “Nil”s, the previous behavior does not change.
There is also another possible concern. In the previous Tendermint, proposals can be validated immediately after the proposals are received, but now they can’t. Yes, now proposals can be validated only after the “propose” timeout. Although the timing is delayed,validation will be taken care of by the gossip communication property. Furthermore, the change does not defy the correctness and liveness properties.
Experiment & Result
From CodeChain binaries with randomized leader elections turned on, we organized the local network with a total of six nodes. The Propose timeout was loosely set to 4000ms to mitigate system parameter failures. This parameter defines the proposal exchange time between nodes. Every node will pre-vote for the highest priority proposal it has ever seen within the timeout period. For sortition parameters, I normalized each node’s stake, treating the total as 1000 and set the expected elected_sub_users as 7.0. Lastly, I produced the consensus and sortition data for 99829 blocks. In the first case, delegations are equal between the nodes.
In the ideal environment, each node’s frequency is expected to be equal. From the viewpoint of a normal distribution, each node’s election frequency follows N(np, np(1-p)) where the probability of being the leader of a round is 1/6. The expected frequency and the standard deviation are 16638 and 117.8 respectively. Ideal values are included in the 95% confidence interval, which is marked by the above error bar.
In the second case, the delegations vary between nodes. For nodes from node 1 to node 6, the delegations are 100,000, 200,000, 300,000, 400,000, 500,000, 600,000 respectively.
Also in this case, election frequencies are expected to follow the normal distribution. Ideal values are included in the 95% confidence interval, which is marked by the above error bar. The result is statistically reliable to be interpreted as random.
The average number of elections also followed the initial theoretical parameters. The average number of elected sub-users and the standard deviation was expected to be 7.0 and 2.637 respectively. In the uniform delegation case, the average number of elected sub-users per round is 6.96, and the standard deviation is 2.631. In various delegation cases, the average number of elected sub-users per round is 6.98 and the standard deviation is 2.630.
Conclusion
We examined how two simple systems can be combined to produce a randomized leader election based Tendermint implementation. We also showed that the random quality of our VRF implementation is comparable to “dev/urandom”’s. Therefore, the secp256k1 based ECVRF is favorable to be used in randomized leader elections. With Tenderand, we can effectively defend against DOS attacks in public blockchains.