Diving into Ethereum’s world state
Ethereum comprises of many constituent parts. This post seeks to de-construct Ethereum to provide you with an understanding of its data storage layer. We will introduce the concept of blockchain “state”. We will cover the theory behind the Patricia Trie data structure and demonstrate Ethereum’s concrete implementation of tries using Google’s leveldb database. This post also links to a step by step guide which will assist you in installing and configuring your very own Ethereum private network (including mining). From this point you will be able to execute transactions and explore how Ethereum’s “state” responds to activities such as transactions.
What is blockchain “State”?
Bitcoin’s “state” is represented by its global collection of Unspent Transaction Outputs (UTXOs). The transfer of value in bitcoin is actioned through transactions. More specifically, a bitcoin user can spend one or more of their UTXOs by creating a transaction and adding one or more of their UTXOs as the transaction’s input.
A full explanation of UTXOs is out of scope for this article. However, we mention UTXOs in the following paragraph, to point out a fundamental difference between bitcoin and Ethereum.
The following two bitcoin examples will provide contrast between bitcoin’s UTXO model and the concept of Ethereum’s world state.
Firstly, bitcoin UTXOs can not be partially spent. If a bitcoin user spends 0.5 bitcoin (using their only UTXO which is worth 1 bitcoin) they have to deliberately self-address (send themselves) 0.5 bitcoin in return change. If they don’t send themselves change, they will loose the 0.5 bitcoin change to the bitcoin miner who mines their transaction.
Secondly, at the most fundamental level, bitcoin does not maintain user account balances. With bitcoin, a user simply holds the private keys to one or more UTXO at any given point in time. Digital wallets make it seem like the bitcoin blockchain automatically stores and organizes user account balances and so forth. This is not the case.
A user account balance in bitcoin is an abstract notion. Realistically a user’s account balance is the sum total of each individual UTXO (for which that user holds the corresponding private key). The key[s] which a user holds can be used to individually sign/spend each of the UTXOs.
The UTXO system in bitcoin works well, in part, due to the fact that digital wallets are able to facilitate most of the tasks associated with transactions. Including but not limited to:
a) handling UTXOs
b) storing keys
c) setting transaction fees
d) providing return change addresses
e) aggregating UTXOs (to show available, pending and total balances)
Interestedly, a back-up of a non-deterministic wallet (like the bitcoin core wallet pictured above) only provides a snapshot of the UTXOs (at that point in time). If a user performs any transactions (sending or receiving), the original back up, which they made, will be out of date.
To summarize, we know that:
- the bitcoin blockchain does not hold account balances
- bitcoin wallets hold keys to UTXOs
- if included in a transaction, an entire UTXO is spent (in some cases partially received back as “change” in the form of a brand new UTXO)
In contrast to the information above, the Ethereum world state is able to manage account balances, and more. The state of Ethereum is not an abstract concept. It is part of Ethereum’s base layer protocol. As the yellow paper  mentions, Ethereum is a transaction-based “state” machine; a technology on which all transaction based state machine concepts may be built.
Let’s start at the beginning. As with all other blockchains, the Ethereum blockchain begins life at its own genesis block. From this point (genesis state at block 0) onward, activities such as transactions, contracts and mining will continually change the state of the Ethereum blockchain. In Ethereum, an example of this would be an account balance (stored in the state trie) which changes every time a transaction, in relation to that account, takes place.
Importantly, data such as account balances are not stored directly in the block headers of the Ethereum blockchain. Only the root node hashes of the transaction trie, state trie and receipts trie are stored directly in the block headers. This is illustrated in the diagram below.
You will also notice, from the above diagram, that the root node hash of the storage trie (where all of the smart contract data is kept) actually points to the state trie, which in turn is hashed and stored in the block header. We will zoom in and cover all of this in more detail soon.
There are two vastly different types of data in Ethereum; permanent data and ephemeral data. An example of permanent data would be a transaction. Once a transaction has been fully confirmed, it is recorded in the transaction trie; it is never altered. An example of ephemeral data would be the balance of a particular Ethereum account address. The balance of an account address is stored in the state trie and is altered whenever transactions against that particular account occur. It makes sense that permanent data, like mined transactions, and ephemeral data, like account balances, should be stored separately. Ethereum uses trie data structures (as broadly outlined above), to manage data. Here is a quick overview on tries.
The trie (or tree)
The trie (or tree) is a well known data structure which is used for storing sequences of characters. Ethereum exclusively uses what is known as the “Practical algorithm to retrieve information coded in alphanumeric”(Patricia) trie. The main advantage of the Patricia trie is its compact storage. We will now analyze the inner workings of the standard (more traditional) trie vs the Patricia trie.
Rules for adding a word to the trie
We follow the search path for the word we are adding. If we encounter a null pointer, we create a new node. When we have finished adding our word we create a null pointer (terminator). When adding a (shorter) word which is contained in another (longer) word, we just exhaust all of the characters and then add a null pointer (terminator)
Rules for deleting a word from the trie
We search for for a leaf (the very end of a branch) on the trie that represents the string (which we are wanting to delete). We then start deleting all nodes from the leaf back to the root of the trie. Unless we hit a node with more than one child; in this case we stop.
Rules for searching for a word in the trie
We examine each of the characters in the string for which we are searching and follow the trie for as long as it provides our path (in the right sequence). If we encounter a null pointer before exhausting all of the characters in the string (which we are searching for) then we can conclude that the string is not stored in the trie. On the contrary, if we reach a leaf (the end of a branch) and that path (from the leaf back to the root of the trie) represents out string. We conclude that the string is stored in the trie.
Rules for adding a word to the Patricia trie
Patricia tries group all common characters into a single branch. Any non common characters will constitute a new branch in the path. When adding a word to a Patricia trie we exhaust all of the characters and then add the null pointer (terminator).
Rules for deleting a word from the Patricia trie
Same as with a traditional trie, except for when deleting nodes (from the leaf back to the root) we must ensure that all parent nodes must be in possession of at least 2 child nodes. It is OK for a single child node to just have characters and a null pointer (this occurs in the diagram above, at the very end of every word). It is also OK for a single node to just have a null pointer (this occurs if a shorter word is contained in a longer word). See diagram above which illustrates how “wood” and “wooden” co-exist in the same trie.
Importantly, when deleting from a trie, a path can not be left with a parent nodes which just connects to single child node. If this occurs (when deleting, we need to concatenate the appropriate characters to resolve this). This is illustrated in the diagram below (where we delete the word “word” from the trie).
Rules for searching for a word in the Patricia trie
The rules for searching the Patricia trie are the same as for searching the standard trie.
Similarities between the trie and Patricia trie
The run time “O” for adding is O(mN) where “m” is the length of the string we are adding and “N” is the size of the available alphabet
The run time for deleting is O(mN) where “m” is the length of the string which we want to delete and “N” is again the size of the available alphabet
The run time for searching is O(m) where “m” is the length of the string we are searching for
Main difference between the trie and Patricia trie
The main advantage of using the Patricia trie is in relation to storage.
The storage requirement “O” for the standard trie is O(MN) where “M” is the total length of all strings in the trie and “N” is the size of the available alphabet.
The storage requirement “O” for the Patricia trie is O(nN+M) where “n” is the number of strings stored in the Patricia trie, “N” is the size of the available alphabet and “M” is the total length of all strings in the trie.
In short, you will have noticed a marked difference in the depth of the tries (pictured in the first 2 trie images above). The Patricia trie is less deep (more shallow). This is due to the Patricia trie’s ability to group common characters (and concatenate null pointers to leafs).
A closer look at the trie structure in Ethereum
Let’s look at the state, storage and transaction tries in a bit more depth.
State trie — the one and only
There is one, and one only, global state trie in Ethereum.
This global state trie is constantly updated.
The state trie contains a key and value pair for every account which exists on the Ethereum network.
The “key” is a single 160 bit identifier (the address of an Ethereum account).
The “value” in the global state trie is created by encoding the following account details of an Ethereum account (using the Recursive-Length Prefix encoding (RLP) method):
The state trie’s root node ( a hash of the entire state trie at a given point in time) is used as a secure and unique identifier for the state trie; the state trie’s root node is cryptographically dependent on all internal state trie data.
Storage trie — where the contract data lives
A storage trie is where all of the contract data lives. Each Ethereum account has its own storage trie. A 256-bit hash of the storage trie’s root node is stored as the storageRoot value in the global state trie (which we just discussed).
Transaction trie — one per block
Each Ethereum block has its own separate transaction trie. A block contains many transactions. The order of the transactions in a block are of course decided by the miner who assembles the block. The path to a specific transaction in the transaction trie, is via (the RLP encoding of) the index of where the transaction sits in the block. Mined blocks are never updated; the position of the transaction in a block is never changed. This means that once you locate a transaction in a block’s transaction trie, you can return to the same path over and over to retrieve the same result.
Concrete examples of tries in Ethereum
The main Ethereum clients use two different database software solutions to store their tries. Ethereum’s Rust client Parity uses rocksdb. Whereas Ethereum’s Go, C++ and Python clients all use leveldb.
Ethereum and rocksdb
Rocksdb is out of scope for this post. This may be covered at a later date, but for now, let’s explore how 3 out of the 4 major Ethereum clients utilize leveldb.
Ethereum and leveldb
LevelDB is an open source Google key-value storage library which provides, amongst other things, forward and backward iterations over data, ordered mapping from string keys to string values, custom comparison functions and automatic compression. The data is automatically compressed using “Snappy” an open source Google compression/decompression library. Whilst Snappy does not aim for maximum compression, it aims for very high speeds. Leveldb is an important storage and retrieval mechanism which manages the state of the Ethereum network. As such, leveldb is a dependency for the most popular Ethereum clients (nodes) such as go-ethereum, cpp-ethereum and pyethereum.
Whilst the implementation of the trie data structure can be done on disk (using database software such as leveldb) it is important to note that there is a difference between traversing a trie and simply looking at the flat key/value database.
To learn more, we have to access the data in leveldb using the appropriate Patricia trie libraries. To do this we will need an Ethereum installation.
In light if this, an additional post has been written (to accompany this one). The additional post entitled “Running a quick Ethereum private network for experimentation and testing” provides a step by step guide which will assist you in installing and configuring your very own Ethereum private network.
Once you have set up your Ethereum private network, you will be able to execute transactions and explore how Ethereum’s “state” responds to network activities such as transactions, contracts and mining. If you are not in a position to install an Ethereum private network, that’s OK. We will provide our code examples and screen captures from our Ethereum private network.
Analysing the Ethereum database
As we mentioned previously there are many Merkle Patricia Tries (referenced in each block) within the Ethereum blockchain:
- State Trie
- Storage Trie
- Transaction Trie
- Receipts Trie
The following sections assume that you have installed and configured your very own Ethereum private network, or that you are happy to just follow along as we run some software and talk to Ethereum’s leveldb database.
To reference a particular Merkle Patricia Trie in a particular block we need to obtain its root hash, as a reference. The following commands allow us to obtain the root hashes of the state, transaction and receipt tries in the genesis block.
Note: If you would like the root hashes of the latest block (instead of the genesis block), please use the following command.
Installing npm, node, level and ethereumjs
From this point, running the following code will print a list of the Etherem account keys (which are stored in the state root of your Ethereum private network). The code connects to Ethereum’s leveldb database, enters Ethereum’s world state (using a stateRoot value from a block in the blockchain) and then accesses the keys to all accounts on the Ethereum private network.
Interestingly, accounts in Ethereum are only added to the state trie once a transaction has taken place (in relation to that specific account). For example, just creating a new account using “geth account new” will not include that account in the state trie; even after many blocks have been mined. However, if a successful transaction (one which costs gas and is included in a mined block) is recorded against that account, then and only then will that account appear in the state trie. This is clever logic which protects against malicious attackers continuously creating new accounts and bloating the state trie.
Decoding the data
You will have noticed that querying leveldb returns encoded results. This is due to the fact that Ethereum uses its own specially “Modified Merkle Patricia Trie” implementation when interacting with leveldb. The Ethereum Wiki provides information in relation to the design and implementation of both Ethereum’s Modified Merkle Patricia Trie and Recursive Length Prefix (RLP) encoding. In short, Ethereum have extended on the trie data structures described above. For example the Modified Merkle Patricia contains a method which can shortcut the descent (down the trie) through the use of an “extension” node.
In Ethereum, a single Modified Merkle Patricia trie node is either:
- an empty string (referred to as NULL)
- an array which contains 17 items (referred to as a branch)
- an array which contains 2 items (referred to as a leaf)
- an array which contains 2 items (referred to as an extension)
As Ethereum’s tries are designed and constructed with rigid rules, the best way to inspection them is through the use of computer code. The following example uses ethereumjs. The ethereumjs repositories are easy to install and use; they will be perfect for us to quickly peer into Ethereum leveldb database.
The code below (when provided with a particular block’s stateRoot as well as an Ethereum account address) will return that account’s correct balance in a human readable form.
We have demonstrated above that Ethereum has the ability to manage its “state”. This clever upfront design has many advantages.
Given that mobile devices and Internet of Things (IoT) devices are now ubiquitous, the future of e-commerce depends on safe, robust and fast mobile applications.
As we acknowledge advances in mobility, we also acknowledge that the constant increase in blockchain size is inevitable. It is not practicable to store entire blockchains on everyday mobile devices.
Speed, without compromising security
The design of Ethereum’s world state and its use of the Modified Merkle Patricia Trie provides many opportunities in this space. Every function (put, update and delete) performed on a trie in Ethereum utilizes a deterministic cryptographic hash. Further, the unique cryptographic hash of a trie’s root node can be used as evidence that the trie has not been tampered with.
For example any changes to a trie’s data, at any level (such as increasing an accounts balance in the leveldb database) will completely change the root hash. This cryptographic feature provides an opportunity for light clients (devices which do not store the entire blockchain) to quickly and reliably query the blockchain i.e. does account “0x … 4857" have enough funds to complete this purchase at block height “5044866”?
“The size complexity of a Merkle proof is logarithmic in the quantity of data stored. This means that, even if the entire state tree is a few gigabytes in size, if a node receives a state root from a trusted source that node has the ability to know with full certainty the validity of any information with the tree by only downloading a few kilobytes of data in a proof.”
An interesting idea, mentioned in the Ethereum white paper  is the notion of a savings account. In this scenario two users (perhaps a husband and wife, or business partners) can each withdraw 1% of the accounts total balance per day. This idea is only mentioned in the “further applications” section of the white paper, but it sparks interest because of the fact that, in theory, it could be implemented as part of Ethereum’s base protocol layer (as apposed to having to be written as part of a second layer solution or third-party wallet). You may recall our discussion about bitcoin UTXOs at the start of this article. UTXOs are blind to blockchain data, and as we discussed, the bitcoin blockchain does not actually store a users account balance. For this reason the base protocol layer of bitcoin is far less likely (or perhaps unable to) implement any sort of daily spend limits.
As work continues in this space we will see a lot of development in light clients. More specifically, safe, robust and fast mobile applications, which can interact with blockchain technologies.
A successful blockchain implementation in the e-commerce space must bolster speed, safety and usability. It is always possible to improve consumer confidence as well as increase mainstream adoption by providing superior usability, safety and performance through smart design. All things which the CyberMiles team are focusing on as their work moves forward.
 Wood, G., 2014. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Project Yellow Paper, 151.