I Want Your Vote! (Oh Wait I Already Know It)
By Paul Madsen
At the heart of Distributed Ledger Technologies (DLTs) is the mechanism by which the different instances of the ledger are kept consistent, i.e. how do the nodes holding those copies come to a shared agreement or consensus on the ordering of the various transactions within the ledger.
A consensus mechanism will generally have the following steps
- Nodes create transactions they want recorded
- Nodes share transactions amongst other nodes
- Consensus is established on the order of valid transactions
- Nodes update their local state to the consensus result
The goal is to get to step #4 as quickly as possible, but do so while minimizing the chance of the process being disrupted, corrupted or manipulated. Generally, consensus models rely on a mix of democratic mechanisms, economic principles to encourage good behavior or discourage bad behavior & random aspects which together make it impractical or unpredictable for an attacker to achieve such manipulation.
Proof of Work (PoW) models like Bitcoin effectively run a contest to determine which node is given the power to make the consensus decision. The miner that first completes the hashing puzzle wins the right to add its block. Other nodes effectively endorse that block by subsequently adding their blocks to that chain. Proof of Stake (PoS) models like Ethereum’s Casper proposal layer on economic incentives — requiring that those nodes that would participate in consensus make a deposit of coins as a surety on their good behavior. Leader based models like Paxos & Raft rely on a specialized node (that may change over time) to unilaterally decide the consensus result.
Voting models for consensus have desirable security properties (namely that they can be asynchronous Byzantine Fault Tolerant and achieve consensus even when some nodes are malicious and some messages are significantly delayed) relative to the consensus models listed above. On the other hand these voting models generally require that a large number of messages be sent amongst nodes in order to get to consensus. Consequently, voting models have been seen as impractical for any real deployments.
The inefficiency of voting models derives from the necessity of multiple messages being sent between nodes to tally up the votes and so determine consensus. But consider what would happen if this weren’t necessary, that each node would know what other nodes had voted, without actually having that other node share its ballot?
What if voting were ‘virtual’, i.e., that nodes were not only able to determine what their vote on a given question will be, but also what other nodes would vote on the same question? No longer would Alice need to tell Bob her vote, Bob would be able to put himself in Alice’s shoes and work out what her vote would have been if she actually cast the ballot and told Bob.
The situation is similar to a political election. The candidates could save themselves some time and effort on the day by simply indicating to some volunteer at the polling station ‘You know how I’m going to vote, do it for me’. (though it is admittedly hard to imagine any politician being willing to forgo the photo opportunity of casting their ballot.) In fact, we could imagine all citizens voting virtually in this manner — every citizen would calculate how every other citizen would vote, and after adding up, they would all come to the same answer.
A virtual vote to establish consensus on the order of transactions obviously requires that Bob is able to have the same view as Alice of the transactions. Without that view Bob would be unable to put himself in Alice’s shoes and see the world from her PoV — and so be able to determine what Alice’s vote would be.
The hashgraph is a data structure that provides exactly that shared view of a set of transactions. But critically, not only does it provide Alice & Bob an identical view of the transactions, it gives them an identical view of how each of them (and all other nodes) learned of those transactions. The hashgraph provides each node a consistent history of exactly how each node has talked to others, and in what order they have talked, and exactly what they talked about. A hashgraph is a record of the complete history of how all nodes communicated, including what transactions were communicated at each step. With Alice and Bob both creating their own local copy of the hashgraph (built up over time via the nodes gossiping amongst themselves), both are able to see all the data & metadata they need to be able to cast their votes, but also cast a virtual vote on behalf of the other. No longer does Alice need to tell Bob how she voted, Bob just looks at his own copy of the hashgraph (which he knows Alice also has) and determines how Alice would vote (if he asked her).
With virtual voting based on the hashgraph, we benefit from the great security characteristics of voting models, but without the associated messaging overhead. Constructing the hashgraph requires not only that nodes gossip about transactions, but also gossip about how and with whom they previously gossiped. I’ll go into greater detail on this ‘gossip about gossip’ in a subsequent blog post.