Building with HashGraph Part 1: Introduction
The HashGraph is a super cool technology that claims to be the next generation of distributed ledger technology, i.e. ‘Blockchain 2.0’. First conceptualised by Leemon Baird in 2015, HashGraph is still a very young technology but it’s been gaining a lot of attention in the last few years. Using the concept of ‘gossip’, the HashGraph promises superior performance, security, stability and fairness. Additionally, the Hedera HashGraph is an in-development platform that leverages HashGraph technology to create a virtual community with a native cryptocurrency, distributed applications, smart contracts and ‘fair’ governance. Whilst there’s been a lot of criticism of the patents that restrict the use of the HashGraph, there is an SDK which is available for developers to play around with and the potential for HashGraph applications in the future is huge. In the next two blog posts I will give an introduction to the HashGraph and then work through the SDK so you can start playing around with your own toy HashGraph applications.
If you’d like to skip ahead, here’s a link to Part 2: SDK workthrough.
How HashGraph works
Warning: I’m about to present some pretty crazy ideas but I’m not going to dive into them in massive detail. If you get scared at any point while reading this then don’t worry — it’s absolutely fine to skip this post and go straight into the ‘Properties’ section or even the next blog post. It’s more important to know the properties that HashGraph guarantees than how it guarantees them.
HashGraph works on a gossip based protocol which is fundamentally not a new idea. In gossip based systems, nodes communicate with each other and tell each other information that they know. For instance, this could be a history of transactions which each node recognises. In this case, nodes that want to add transactions will put them in their local history and randomly ‘gossip’ to multiple other nodes in the network. Upon receiving this gossip, these nodes will then gossip this information out to other nodes, and so on. Ultimately, information spreads to an exponentially increasing number of nodes and even if a few adversarial nodes choose not to gossip, they cannot prevent the exponential spread of information by the good nodes.
The innovative idea that distinguishes HashGraph from traditional gossip systems, is that rather than just gossiping arbitrary data, nodes gossip about gossip. I.e. the data structure in the image on the left is stored in memory and sent to others. Intuitively, if you imagine each node in the diagram to represent a new transaction being added to the graph, then rather than just sending that node, when a host gossips, they will send a whole tree of nodes starting from the new one.
So how can we achieve consensus? Usually in systems, if two nodes add a transaction at the same time, then we have a ‘collision’. Often the ordering of the transactions in a collision is decided by a complex scheme of voting (cf. Paxos or Raft). Unfortunately this requires complex combinations of node states with high network traffic too. Alternatively Bitcoin uses Proof of Work and sidesteps this issue by making sure that blocks are added very seldom so such collisions are unlikely and when they do happen, the community will choose a block arbitrarily, prune the other block and move on. HashGraph achieves consensus by running a process called virtual voting where a node can decide whether or not a transaction is agreed upon by voting on behalf of all the other nodes. In a network with participants Alice and Bob, Alice never asks Bob to send her a vote because she will work out his vote for him. This is possible because all nodes know what other nodes know and can therefore speculate what the node would have voted for without actually transferring a single packet over the network. This not only eliminates the network traffic of voting but it makes it impossible for nodes to vote adversarially because they never actually vote!
Like I said before, there are some pretty crazy ideas here and it will take some time to wrap your head around them, so you now have one of two options:
- Option 1: read through the HashGraph paper to get a deeper understanding of how HashGraph works. The science behind this is super interesting and the paper is very accessible too so if you are scientifically inclined I can highly recommend this.
- Option 2: take my word that this is a viable approach to consensus and just use the term ‘gossip about gossip’ to impress your friends. If you’re not scientifically inclined then this is a perfectly acceptable approach and isn’t going to stop you understanding the rest of these blogs. It’s arguably more important to understand the properties of the HashGraph than to understand the way it achieves these properties.
Whatever you’ve chosen to do, hopefully I still have your attention so now we can move on to discussing the properties of the HashGraph and what makes them so special.
Properties of the HashGraph
So what properties can the HashGraph guarantee that something like Paxos or Proof-of-Work+Blockchain can’t?
HashGraph has shown some outstanding performance statistics in some of the current tests. Evaluating a distributed system is hard and there is no single number that can be used to assess a systems performance so for a detailed breakdown of performance you should see the whitepaper. If that seems a bit tedious to you then suffice to say that HashGraph can achieve throughputs of up to 1m transactions per second and latencies of less than 10s which is extremely impressive. In contrast, Bitcoin has latencies of at least 10 minutes and throughputs of around 7 transactions per second and Ethereum has latencies of around 15s and throughputs of around 15 transactions per second.
So how is this performance achieved? HashGraph is able to achieve superior performance because it eliminates a lot of the sources of wasted work that you see in other consensus schemes such as proof of work:
- Mining — In Bitcoin, each node needs to be capable of performing many redundant computations to try to win the ‘Proof of Work’ lottery. This wastes a huge amount of energy which is both bad for the environment and also means most nodes are high performance, expensive mining rigs. HashGraph doesn’t use this process and this means that nodes can be any sort of cheap, readily available hardware like a phone, laptop or RaspberryPi.
- Pruning — The HashGraph is also more efficient than Bitcoin in that no transaction is ever discarded. Earlier I mentioned how Bitcoin occasionally has to choose between two different blocks which are mined at the same time. This means all the computational work that was used to put together the unused block is discarded. HashGraph uses a voting scheme that means if transactions collide, they will be ordered by consensus but not discarded. This means less work is wasted!
- Network — In any consensus scheme, there is a lower bound on the amount of bandwidth required which is limited by the fact that all transactions need to be broadcast. HashGraph only adds a very small amount of bandwidth on top of this limit because the HashGraph data structure which is transmitted has a size which is only slightly greater than the size of all the transactions. I.e. bandwidth is not wasted on large data structures or sequences of votes.
- HashGraph uses cryptography to ensure that transactions are secure and can be validated by other participating nodes.
- HashGraph also guarantees Asynchronous Byzantine Fault Tolerance — don’t worry I know it’s a scary term. This basically means that so long as 2/3 of the network are legitimate, then colluding adversaries will never be able to stop or alter the final consensus reached. This is pretty much a gold standard for distributed systems.
- Fairness refers to time-stamping and assigning a consensus order to transactions. The timestamps which transactions are assigned (after they are agreed upon by consensus) reflects the median time that nodes received the transaction. This means that if one adversarial node delays the receiving of a transaction by a really long time, it will barely effect the consensus time stamp. Therefore, transactions will be assigned fair timestamps and fair ordering.
- Fairness also refers to how transactions can’t be prevented from entering the system by any adversarial node. This is because if a node gossips to an adversary that doesn’t pass on information about some transactions, this wont matter. This is because the node will soon after gossip to a genuine node due to the random nature of gossip, and then the information will be passed on.
In summary, HashGraph is a gossip based protocol that transmits ‘gossip about gossip’ to allow nodes to run virtual voting without ever sending voting messages over the network. This simple scheme can guarantee security, fairness and high performance. In the next post I will be showing how you can get started with the HashGraph SDK. Knowing how the HashGraph works will be helpful for this, but not necessary.
Here’s the link to Part 2…