Binary Merkle Trie
There’s a SPECTRE on the horizon, looming over the Hycon blockchain like an intangible shadow, with its promises of vast improvements in both throughput and speed.
Before we can escape from that shadow, there are some preliminary foundations that need to be laid to support this large upgrade. The foundations are twofold, migration of the codebase to a low level language to maximise performance; and improvement of the actual data structures in use on the Hycon blockchain.
This post is going to go into some detail about the second of these foundations.
Preliminary — Merkle Trees & Merkle Patricia Tries
A merkle tree is, quite simply, a hashed binary tree, where every node is the hashed representation of its two child nodes. This structure is useful in blockchain as it allows for easy validation of the expected transactional content of a block. The transactions are typically put into a merkle tree with the root included in the block.
A Merkle Patricia Trie is a more complicated structure, with the version used in Ethereum being hexadecimal in nature rather than binary. An advantage of this structure is that unlike a standard merkle tree it does not require balance in the overall number of nodes, and that it also allows for compression of paths in areas of sparse data. In Hycon we found that our use of this data structure was one of the major system bottlenecks and so we sought to improve on it, leading us to some research on a data structure called a sparse merkle tree. However after looking at the problem for some time, we decided on a hybrid approach that takes some of the path compression used in a Patricia Trie and the speed of access of a binary tree.
The development process for this data structure began in September and at the time there did not appear to be anything similar in use barring a few POCs of sparse merkle trees, however a quick Google before writing this piece led me to this by Aergo, a state trie that looks remarkably similar to what we have just built. It’s quite interesting to see that two projects looked at the same structure, and came up with remarkably similar solutions to its shortcomings. I guess great minds think alike.
HYCON Binary Merkle Trie
The key to how this structure works is how inserts to the trie are performed. The insert process follows the paths set out by the keys, which are split into paths by the requisite bit at each node until a dead end is reached and a new leaf node is created. Any path not travelled is added to the set of Proof Nodes, which are the nodes required to generate a tree combining the unchanged data and the new data.
The set of tree nodes then becomes the concatenation of the inserted leaves and the proof nodes, which are then sorted by key. However this is not sufficient alone, we also need a mapping of where the adjacent keys split from each other to allow us to know at what depth in the tree they should be hashed together. In our implementation we call this the split index, and in a fully filled tree this would be the same as the depth, however in a sparse structure with branch compression permitted this is not the case. The following example illustrates this process more clearly.
An Example of Branch Compression and Extension
Take three keys with binary representation, obviously real keys will be longer than 1 byte, however for illustrative purposes these short keys suffice.
A = 01101101, B = 11011011, and C =11001011
These keys when inserted into our trie will produce the following structure
What happens if we add an extra key E = 10011011 here?
We first get our root node which gives us the leaf node A and the Branch node D. The path to E does not include A, therefore we include it as a proof node for our updated tree. However the path to E also does not include D, as E branches off from the path before reaching D, which means that a new branch is required above the position of D in our tree. D for the same reason as A before it, is added to our proof nodes as a path not taken. When we sort our keys this gives us an ordering of A, E, D, however we still need to determine the order of hashing.
As stated before we calculate the split index between adjacent sets of keys to determine the order of hashing. It is important to note that split index is transitive for all nodes in a particular path with respect to any other node not in that path.
To take the example of node D here. The path to D and any nodes in Subtree(D) will always begin with 11… thus we can infer that split index of between any node X not in the path to D and D is equal to the split index between X and any representative node in Subtree(D).
SPLIT_INDEX(X,D) == SPLIT_INDEX(X,d)
where d∈ Subtree(D)
The split index is the bit at which two keys diverge. In the case of these three nodes, A diverges from E at the first bit, which we designate as a split index of zero. E diverges from D at the second bit, giving a split index of 1. We start the hashing procedure at the pair of nodes with the highest split index, i.e, the nodes that are adjacent to each other at the deepest level in the tree and work our way up. With our aforementioned relationship for nodes that are not in the same path we can see that:
Split_Index(A,F) == Split_Index(A,E)== Split_Index(A,D)
We can infer a hashing order of F = Hash(E+D) and then Root = Hash(A+F)
As we move to larger scales, we believe that this improved data structure which is to initially be used to replace our merkle patricia state trie will give us the boost required to be able to scale efficiently to thousands of transactions per second in a decentralised permissionless environment.