The following is my interpretation of trie database structures and their implementation within the Ethereum network. I’ve also included a glossary of keywords used throughout the article. Enjoy!
Hello, avid learners and blockchain enthusiasts!
The core data structures that exist in Ethereum consist of an optimized implementation of Merkle Patricia tries. This article aims to provide readers with an improved understanding of how Ethereum’s data storage layer operates and the different intricacies that make up the ‘world state’. Specifically, I’ll be discussing different trie structures, their beneficial properties, Ethereum’s modified Merkle Patricia trie, the different tries within Ethereum, and their corresponding purposes. In addition, there are helpful exercises and resources at the bottom of this article to solidify a reader’s understanding.
Trust me, this will be fun and informative. :)
Before we discuss Ethereum’s modified Merkle Patricia trie, let’s get a brief introduction of the varying tries that make up the unique properties of Ethereum’s implementation.
Trie Data Structures | Radix Trees
The origins of the term trie comes from retrieve; for clarity’s sake, article the terms “trie” and “tree” will be used interchangeably. According to Paul Black , the use of the term trie was simply to distinguish it from the word tree, but there will be no such distinction made in this article…sorry, Paul. (Didn’t I tell you this would be fun?) Alright, let’s get serious.
Figure A. Source: https://en.wikipedia.org/wiki/Trie
A trie is a tree-like data structure, also referred to as a digital tree, radix tree or prefix tree, that is used to retrieve a string value by traversing down a branch of nodes that store associated references (keys) that together lead to the end value that can be returned . Specifically, if we begin from the top of the tree at the root node, each character in the key indicates which child node to follow resulting in a path to the corresponding value. The keys can contain N characters, thus each node can contain N children, and the max depth of the tree is the max length of a key . For example, if we were to use the English alphabet to comprise the keys, each node in the tree could have up to 26 children, as there are 26 letters in the alphabet. In order to reduce the time it takes to travel down a branch to get to the end value, keys that begin in the same sequence are grouped in close proximity to one another. In doing so, our search for the end value is optimized and more efficient. To emphasize this point, I will display a radix trie that does NOT group same sequence keys together and another one that does.
To illustrate, the image below depicts a dataset (Figure B) and an inefficient radix trie (Figure C). In order to retrieve the value 5 using our key BLOCK, we need to descend down the path five internal nodes with null values before reaching our desired value. A lot of wasted space is consumed with this inefficient implementation.
Figure B. Dataset for inefficient radix trie
Figure C. An inefficient radix tree implementation
An improved usage of the same radix tree with the same dataset can be made by combining same sequence keys. Rather than single character keys that represent the path to the value, we create a string. See Figure D.
Figure D. An efficient radix tree implementation
Instead of traversing down the path of five null value nodes, we merely have to go down one to get to our desired value of 5, and only two nodes to get to the value of 6, where the key is BLOCK(S).
Okay, so we’ve covered the basics of tries, specifically radix tree data structures. Let’s discuss Merkle trees.
Trie Data Structures | Merkle Trees
Merkle trees’ primary purpose is to prove the consistency of data, and is essentially a tree of hashes. The name Merkle Tree comes from Ralph Merkle, who patented the concept in 1979 . Wikipedia defines Merkle Trees more precisely as the following:
Figure E. Source: http://en.wikipedia.org/wiki/hash_tree
Let’s examine Figure E above to get a more concrete understanding of Merkle Trees. The bottom leaves (C,D,E,F) represent the hashes of the below data blocks, and these data blocks are the values in which we are storing. The parents of leaves ( [C, D] , [E, F] ) will be equal to Hash(valueOfChild1, valueOfChild2). For instance, node A is the concatenated hashes of its two children C and D. This process of re-hashing the concatenation of the child nodes to create the parent node is performed until the top of the tree is reached, called the ‘root hash’ . The parent-to-child hashing process/structure described above results in a key property of Merkle Trees: they provides a means to prove the integrity and validity of your data.
For example, if we were to change the value of a data block, the entire path leading to the ‘root hash’ would also be changed. Therefore, if we hold the value of the root hash, we could verify the consistency of data by rebuilding the trie to get the root hash and then compare it with the root hash value in which we are holding . It is impossible to fake data without changing the value of the root.
Let’s explore more benefits of Merkle Trees, as it will further our understanding of the rationale behind using tries for Ethereum’s ‘world state’.
Merkle Trees | Beneficial Properties
The key benefits of Merkle Trees consist of the following properties :
- Data Consistency / Verification
- Merkle Tree proofs are computationally easy and fast
- Merkle Tree proofs require only a small chunks of data to be broadcasted across a network
We briefly discussed how Merkle Trees are able to provide a means to prove the integrity of data, but let’s go a little bit deeper to accentuate the importance of data consistency as it relates to distributed and peer-to-peer (P2P) systems like Ethereum.
In these distributed systems, data exists in multiple locations; if some piece of data is modified in one location, those changes need to be reflected everywhere else. Merkle Trees are used to verify the data and make sure the data is the same — everywhere. Again, as we described earlier, we can do this by sending a hash for comparison. Consider the below sequence between two computers :
- Computer A sends a hash of the file to computer B.
- Computer B checks that hash against the root of the Merkle tree.
- If there is no difference, we’re done! Otherwise, go to step 4.
- If there is a difference in a single hash, computer B will request the roots of the two subtrees of that hash.
- Computer A creates the necessary hashes and sends them back to computer B.
- Repeat steps 4 and 5 until you’ve found the data block(s) that are inconsistent. It’s possible to find more than one data block that is wrong because there might be more than one error in the data.
The above sequence illustrates how Merkle Trees facilitate data verification across a network. To clarify, we are only sending the hash of the leaf that was identified as changed over the network. The importance of this is that it allows nodes in a distributed system to quickly and efficiently identify data that has changed without having to send over all the data in order to make the comparison. It’s possible to achieve data verification and consistency without the use of Merkle Trees, but to do so would be time-consuming and computationally intensive, requiring the network to check the entirety of each piece of data whenever nodes in the network request to verify data.
In Ethereum, merkle proofs make it easy to verify the inclusion of a transaction and enable SPV (simple payment verification). Furthermore, now we have the ability to utilize light clients on the network, as nodes are no longer required to a hold a copy of the entire chain. Light clients receive the block headers which contain a merkle root (more on this later) that can be used to query full nodes to verify if a transaction is included in a particular block. However, full nodes are still required to maintain a complete list of transaction and update their local state accordingly.
Sweet! So that was our introduction into trie data structures and their benefits, specifically radix trees and merkle trees. They’re pretty awesome data structures, and Ethereum managed to make them even better.
Ethereum’s Modified Merkle Patricia Trie | The Differences
Ethereum’s trie data structure implementation is different than traditional trie implementations, as modifications have been made to increase the performance and efficiency, hence Modified Merkle Patricia Trie. We will go over these improvements, including the encoding systems used by the Ethereum network, and also go over its special node types. However, let’s start with Ethereum’s own definition of its modified Merkle Patricia Trie from their wiki entry:
Merkle Patricia tries provide a cryptographically authenticated data structure that can be used to store all (key, value) bindings……They are fully deterministic, meaning that a Patricia trie with the same (key,value) bindings is guaranteed to be exactly the same down to the last byte and therefore have the same root hash, provide the holy grail of O(log(n)) efficiency for inserts, lookups and deletes, and are much easier to understand and code…
To summarize, tries are cryptographically secure, and each node is referenced by its hash, which is used for lookups in a Leveldb database (Go-Ethereum, CPP-Ethereum, Py-Ethereum) or RocksDB (Parity) . In this structure, root nodes become a cryptographic fingerprint of the entire data structure, achieving optimal efficiencies for inserts, lookups and deletes. These tries are stored in a Leveldb instance for Ethereum’s Go, C++ and Python clients. Below is a brief description of Leveldb and some of its features. (RocksDB is out of scope for this article as it pertains mainly to the Parity implementation of Ethereum.)
Leveldb is an open-sourced Google key-value storage library which provides a variety of cool features that include but are not limited to :
- Data is stored and sorted by keys
- Callers can provide a custom comparison function to override the sort order
- Ordered mapping from string keys to string values
- Multiple changes can be made in one atomic batch
- Forward and backward iteration is supported over the data
- Data is automatically compressed using the Snappy compression library
So how do you get to the values of trie data structures? Why, you explore the hex-prefixed (HP) encoded path from their respected root hashes, of course…
HP encoding is an Ethereum encoding system used solely on the paths that connect nodes of each trie data structure. This encoding system serves a dual purpose: to disambiguate between node types (in particular between leaf and extension nodes — more on those later) and convert the path to an even length . In doing so, the merkle tree will always be a whole number of bytes which is important for machines.
Let’s further our understanding of how HP encoding is utilized. To begin, Leaf and Extension are two distinct node types. Leaf nodes contain a terminator; think of it as a flag. The terminator is the last byte of the path and has a value of 16 in decimal or 0x10 in hex . A nibble, which is just one hex character, is appended to the front of a key to determine parity (even or odd in length) and if there is a terminator. More precisely, the lowest significant bit encodes the odd-even length, while the next lowest encodes the terminator. When the HP encoding function encounters a terminator it will remove it and then continue to create a prefix into the path. If the prefix is 0x0 or 0x2, the function will add a nibble of 0 followed by the prefix, thus the full prefix would be 0x00 or 0x20. Again, the reasoning is to maintain the even length of any given path.
Consider the table in Figure F: row B is an odd-length extension node beginning with hex 1, thus we prefix the path with 1 and now the path is even-length. Row C is an even-length leaf node; it has a terminator at the end (hex 10) and the path begins with the hex 0. The terminator is removed, and hex 2 is prefixed in front (20) followed by hex 0 in front of hex f, creating an even-length path.
Figure F. Hex-Prefix Example
While HP encodes the path of a given value, Ethereum’s own encoding system known as Recursive Length Prefix (RLP) encodes the actual value. Note that at the time of this writing, while Ethereum still utilizes this methodology for serializing objects, RLP will be replaced by a more optimal form of serialization in the future, as mentioned here. We will discuss its role in the current implementation.
Storing data properly requires a form of data serialization, meaning, translating data structures or object state into a format that can be stored or transmitted across a network to be reconstructed later. Ethereum utilizes RLP for its data serialization. The purpose of RLP is to encode nested arrays of binary data or to encode structure. The RLP encoding function will take in an item and encode the passed in item accordingly.
The Ethereum wiki defines an item as follows:
- A string (ie. byte array) is an item
- A list of items is an item
The Ethereum wiki also provides an in-depth definition of RLP encoding based on the length of bytes among other factors and provides strong examples.
The optimizations of the Modified Merkle Patricia Trie are a result of added complexity, namely the introduction of different node types. It contains four distinct node types :
- NULL Nodes (represented as an empty string)
- BRANCH Nodes (17-item node, [ v0…..v15, vt ] )
- LEAF Nodes (2-item node, [ encodedPath, value ])
- EXTENSION Nodes (2-item node, [ encodedPath, key ])
We briefly discussed leaf and extension nodes above, however we did not specify their importance. Without each of the distinct node types, if we were to travel down a 64 character path (in the case of the state trie) we may encounter a node with no differing paths available and this node could have empty values in each index (one for each of the 16 hex characters) besides the target index . To prevent these occurrences and reduce the distance travelling down such a path, Ethereum utilizes extension nodes [ encodedPath, key ], where the encodedPath (recall HP encoding above) consists of a ‘partial path’ that allows us to skip ahead, and the key is then used for the next database lookup in Leveldb  .
Leaf nodes [encodedPath, value], if you recall from when we discussed hex-prefixes, can be identified by their terminator flag in the first nibble of the encodedPath. The same situation described above occurs, the encodedPath contains the ‘partial path’ which fast-forwards us to the complete end of the path where the value is the target value itself . To further our understanding of these node types, let’s look at the example in Figure G.
Figure G. Source: Lee Thomas, https://ethereum.stackexchange.com/questions/268/ethereum-block-architecture
In the top-right, we have a simplified world state. The keys represent addresses and the values represent balances. In this example, the root node is an extension node [encodedPath, key] and we know that based on the hex-prefix. Also note that in this example, all the keys share the same nibble of a7, thus they’re grouped accordingly, an attribute of radix tries. Since the root is an extension node [encodedPath, key] the ‘next node’ bucket will be a hash pointing to the next node, which in this case is a branch node [ v0…..v15, vt ]. If we follow the first key in our simplified state (a711355), we use the 1 after a7, to look in the first index of the branch node. This leads us to create a leaf node where if we compare the key and the path, they’re the same, allowing us to skip ahead and leading us to the value or the account balance of 45.0 ETH.
Ethereum’s Trie Data Structures | The Details
Ethereum utilizes its ‘Modified Merkle Patricia Trie’ for each of its 4 trie data structures:
- Receipt Tree
- State Tree
- Storage Tree
- Transaction Tree
The root for each of the tries is a Keccak 256-bit hash, and three out of four of the tries listed above exist within the block header, Receipt Tree, State Tree and Transaction Tree. The Storage Tree’s root lives within the RLP encoded data value within the State Trie. See Figure H below.
As mentioned above, the transaction root exists within the block header, and as you can guess, its purpose is to record transactions. Since the ordering of the data is mostly decided upon by the miner, we do not know what the data looks like until it is mined. We do know what parameters are used to compose the transaction trie, and they’re as follows:
- Account nonce
- Gas price
- Gas limit
- Transfer value
- Transaction signature values
- Account initialization (if transaction is of contract creation type), or transaction data (if transaction is a message call)
Once the block is mined, the transaction trie is never updated .
Again, the receipt root lives within the block header and its purpose it to record the outcome of a transaction. According to the yellow paper, the receipt trie can also be useful for zero-knowledge proofs or searches. The parameters that make up the receipt trie are as follows:
- Post-transaction state
- Cumulative gas used
- Bloom filter created from the information of the above logs
This trie never updates.
The one and only one global state trie. It contains a key-value pair for every Ethereum account on the network, where the key is an ethereum address and the value is RLP encoded ethereum account. An ethereum account and the state trie is comprised of the following fields:
- Storage Root
Unlike the transaction and receipt tries, the state trie updates over time….constantly.
The storage trie is where all the contract data lives, and there is a separate storage trie for each account .
We have now covered each trie, its purpose and contents. Figure I below displays a strong representation of exactly what we covered above. As you can see, we have the block header on the right which contains the root for the ‘World State Trie’, ‘Transaction Trie’, and ‘Transaction Receipt Trie’. It also depicts the contents within each of those trees, including the storage root for the ‘Account Storage Trie’.
Figure I. Source: Lee Thomas, https://ethereum.stackexchange.com/questions/268/ethereum-block-architecture
This article’s purpose was to provide readers with an improved understanding of trie data structures in Ethereum. I hope this articles purpose has been fulfilled, but if not, I have included a number of resources and exercises below. Thank you for taking the time to read.
Branch: A 17-item structure whose first sixteen items correspond to each of the sixteen possible nibble values for the keys at this point in their traversal. The 17th item is used in the case of this being a terminator node and thus a key being ended at this point in its traversal .
Extension: A two-item structure whose first item corresponds to a series of nibbles of size greater than one that are shared by at least two distinct keys past the accumulation of nibbles keys and branches as traversed from the root. The hex-prefix encoding method is used and the second parameter to the function is required to be false .
Leaf: A two-item structure whose first item corresponds to the nibbles in the key not already accounted for by the accumulation of keys and branches traversed from the root. The hex-prefix encoding method is used and the second parameter to the function is required to be true .
Nibble: Hex form of 4 bits (0x1, 0x4, 0xf)
Helpful Resources and Exercises
Exercises and Ethereum Trie Walk-Through
Trie Data Structures