Under The Hood : Extending Merkle Tree

Guillaume Bonnot
Caasiope Labs
Published in
4 min readDec 17, 2018

This article is part of the Under The Hood series, where we will take an in-depth look at blockchain architecture and how blockchains work under the hood.

In this article, we will focus on how merkle trees are implemented by different blockchain protocols and what their features bring to the whole project.

Bitcoin

https://blog.ethereum.org/2015/11/15/merkling-in-ethereum/

Bitcoin is using a merkle tree to compute the merkle root hash of the transactions included in a block which allows partial integrity verification using merkle proofs.

The merkle tree is finalized and the hash is computed when the block is created. The merkle tree of a given block will never be updated.

The Bitcoin merkle tree is implemented using as a binary tree that is populated at creation time (GetRoot). The hash of the tree is the hash of the root node. The hash of a node is computed using the hash of the child nodes. The hash of the tree is computed in the same time the tree is constructed.

Bitcoin is making a very basic use of the possibility offered by merkle trees, due to the elegant simplicity of the bitcoin protocol. More complex blockchain protocols will be using this tool support more advanced features.

Ethereum

Ethereum is using its own implementation called Merkle Patricia Tree which is basically a radix trie modified to compute the merkle hash of the tree.

https://blog.ethereum.org/2015/11/15/merkling-in-ethereum/

Ethereum is making an extensive use of this custom data structure, with no more than 4 different usages of their implementation of merkle tree :

  1. State Trie : global state containing every account, updated every block
  2. Storage Trie : one for every account, containing all contract data
  3. Transaction Trie : one per block, containing all transactions, never updates
  4. Receipts Trie : one per block, containing all receipts, never updates

The Ethereum implementation is using a compact radix-16 trie with 4 different node types :

  1. Value Node : a leaf containing the data to store
  2. Full Node : a branch that is fully loaded
  3. Short Node : a compact branch linking to a node deeper in the trie
  4. Hash Node : a node that need to be loaded from the database

Due to the huge amount of data that needs to stored in the state trie, Ethereum is using a lazy-loading architecture for the trie, where node are loaded on demand from the database.

In consequences, the trie is not thread-safe and cannot be concurrently explored by different threads.

A Hash Node can be resolved to be replaced by the real node by fetching the data associated to the hash in the database.

The trie is mutable and can be modified on the fly by a single thread but it will only be persisted after the changes are committed and saved in the database. This allows to rollback the changes made to the trie, which is mainly used to simulate contract executions.

The architecture choice made by Ethereum is to maintain the global state in a single oversized trie that is partially loaded and can only be accessed by a single thread.

Caasiope

The Caasiope Network is also using a custom implementation of a merkle tree that is used to store the ledger state like Ethereum but keeps some of the simplicity from bitcoin.

The ledger state has a merkle tree that contains the state of every account at a given height. Because the protocol doesn’t support smart-contracts but only built-in features, the size of the state is reduced and can be fully loaded in memory.

The Caasiope Network implementation is using a radix-16 trie with fixed depth, where items are located on a path with fixed length.

The tree can be finalized to become immutable and be explored by multiple concurrent threads.

Once finalized, the immutable tree can be cloned to create a new mutable tree that will be modified when the ledger state needs to be updated. In order to optimize space and time, the nodes from the previous tree are reused in the new tree.

When an item is added or updated in the new tree, the nodes from the previous tree are replaced by new nodes so the previous tree remains the same.

Only a trie that has not been finalized can update the item located on the path determined by the fixed length key.

The hash is computed when the trie is finalized and only nodes that where updated will recompute their new hash.

The Caasiope Network is using an implementation of the merkle tree that combines the advantages of the data structures while also enabling different thread to concurrently update their own version based on original one.

Looking for an opensource blockchain community ? Join us on www.caasiope.net

--

--

Guillaume Bonnot
Caasiope Labs

Software & Blockchain Architect, CEO @ Helios Services