The trie is one of the main data structures used in Ethereum.

Data structure in Ethereum | Episode 2: Radix trie and Merkle trie.

Feb 8, 2018

In episodes 1 and 1+, we got familiar with some of the encoding/decoding algorithms used to construct Ethereum data. Now we move on to how data is organized in Ethereum. I will introduce two kinds of trie: the Radix trie and the Merkle trie. Ethereum does not actually use either of them in raw form; it combines them into a new, more optimized trie called the Patricia trie. This episode is preparation for understanding the Patricia trie.

If you have never heard of a trie (or tries), don’t worry 👌. They are easy to understand.

Trie

Trie is the computer science term for a digital tree. Sometimes you will see “tree” used instead; that’s fine, because a trie is a kind of tree.

In other words, a trie is an ordered data structure used to store a dynamic set or an associative array of key-value pairs, where the keys are usually strings.

Trie.

The image above introduces some trie terminology. From here on, the root, internal nodes, and leaves will all simply be called nodes.

Dataset

We will use this data sample for all examples.

Dataset.

In the dataset, the keys are strings and the values are integers.

Radix trie

A radix trie is optimized for searching 🔭.

In a radix trie, a key from the dataset is itself the path that leads to its value.

Basic radix trie.

As you can see, each edge of the trie represents a single ASCII character, and following the edges one by one leads to the value.

For example, suppose we are looking for the value of the key dodo. Start from the root, take the d edge first, and keep descending along the rest of the path. The result is the red line ending at the green node with value 4.
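
The lookup described above can be sketched in a few lines of Python. This is only an illustration: the names TrieNode, insert, and lookup are made up for this sketch, and apart from dodo → 4, the values in the sample data are placeholders rather than the values from the dataset image.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # one edge per character: char -> TrieNode
        self.value = None   # None marks a node that stores no value


def insert(root, key, value):
    """Walk the key one character at a time, creating nodes as needed."""
    node = root
    for ch in key:
        node = node.children.setdefault(ch, TrieNode())
    node.value = value


def lookup(root, key):
    """Follow the key's characters from the root; return None if the path breaks."""
    node = root
    for ch in key:
        node = node.children.get(ch)
        if node is None:
            return None
    return node.value


root = TrieNode()
# Only dodo -> 4 comes from the text; the other values are assumed placeholders.
for k, v in {"dodo": 4, "house": 5, "houses": 6}.items():
    insert(root, k, v)

print(lookup(root, "dodo"))  # 4
```

Looking up a prefix such as "hou" returns None here, because the node exists but holds no value. That is exactly the kind of null internal node discussed next.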

Radix trie.

However, the branch for the keys house and houses is degenerate: it has too many internal nodes with null values. To reach the value of houses, we have to descend one character at a time through many nodes, which wastes space.

This can be improved by merging the degenerate path. An edge is then no longer labeled with a single character, but with a string.

The improved radix trie is shown in the image.

To reach the houses node, we now only need to descend twice, which is much better for searching.
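
Here is a minimal sketch of the improved version, where edges carry strings and an edge is split only when two keys diverge. The names RadixNode, insert, and lookup are hypothetical, and only the value for dodo comes from the text; the others are assumed. The lookup also counts how many edges it follows, so the "descend twice" claim can be checked.

```python
class RadixNode:
    def __init__(self, value=None):
        self.children = {}  # edge label (a string) -> RadixNode
        self.value = value


def common_prefix_len(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def insert(node, key, value):
    if key == "":
        node.value = value
        return
    for label in list(node.children):
        n = common_prefix_len(label, key)
        if n == 0:
            continue
        child = node.children[label]
        if n < len(label):
            # The key diverges in the middle of this edge: split the edge.
            middle = RadixNode()
            middle.children[label[n:]] = child
            del node.children[label]
            node.children[label[:n]] = middle
            child = middle
        insert(child, key[n:], value)
        return
    # No edge shares a prefix with the key: add one edge for the whole rest.
    node.children[key] = RadixNode(value)


def lookup(node, key):
    """Return (value, number of edges followed)."""
    steps = 0
    while key:
        for label, child in node.children.items():
            if key.startswith(label):
                node, key = child, key[len(label):]
                steps += 1
                break
        else:
            return None, steps
    return node.value, steps


root = RadixNode()
# Only dodo -> 4 comes from the text; the other values are assumed placeholders.
for k, v in {"dodo": 4, "house": 5, "houses": 6}.items():
    insert(root, k, v)

print(lookup(root, "houses"))  # (6, 2): one "house" edge, then one "s" edge
```

With one character per edge, the same lookup would follow six edges instead of two.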

Merkle trie

A Merkle trie is used to authenticate data.

In a Merkle trie, the data is used to compute a deterministic cryptographic hash that helps authenticate that data.

For the details, let’s move to an example:

Merkle trie.

All the data is stored in the leaves. The value of each parent of those leaves is Hash(valueOfChild1, valueOfChild2, …), and the same rule applies at every level up to the root.
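
The rule above can be sketched as follows. Several things here are assumptions made for illustration: SHA-256 stands in for the hash function (the text does not name one), the trie is binary, the eight leaf values 1 to 8 are placeholders, and the names hash_children and build_levels are made up.

```python
import hashlib


def hash_children(*child_values: bytes) -> bytes:
    """Parent value = Hash(valueOfChild1, valueOfChild2, ...)."""
    # Assumption: SHA-256 over the concatenated child values.
    return hashlib.sha256(b"".join(child_values)).digest()


def build_levels(leaf_values):
    """Return every level of the trie, from the leaves up to the root."""
    if not leaf_values:
        raise ValueError("need at least one leaf")
    level = [str(v).encode() for v in leaf_values]  # raw data lives in the leaves
    levels = [level]
    while len(level) > 1:
        # Pair up neighbours; a lone last node is hashed on its own.
        level = [hash_children(*level[i:i + 2]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


levels = build_levels([1, 2, 3, 4, 5, 6, 7, 8])
root = levels[-1][0]
print(root.hex())
```

Because the hash is deterministic, running this twice on the same leaves always prints the same root.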

Deterministic cryptographic hash.

Suppose we change the value of the 4th node to 44. Every parent on the path from the 4th node to the root then changes: H2 → H’2 | H5 → H’5 | Root → NewRoot.

Thus, if we hold the root value, we can verify the consistency of the data by rebuilding the trie and comparing the resulting root with ours. In practice, it is infeasible to fake data without changing the root value.
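
A small sketch of this check, under the same kind of assumptions as before: SHA-256 as a stand-in hash, a binary trie, and eight placeholder leaves 1 to 8. The names build_levels and verify are hypothetical. It changes the 4th leaf to 44, lists which nodes differ, and then compares roots.

```python
import hashlib


def build_levels(leaf_values):
    """All levels of a binary Merkle trie, leaves first, root last."""
    level = [str(v).encode() for v in leaf_values]
    levels = [level]
    while len(level) > 1:
        level = [hashlib.sha256(b"".join(level[i:i + 2])).digest()
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def verify(leaf_values, trusted_root):
    """Rebuild the trie and compare its root with the root we hold."""
    return build_levels(leaf_values)[-1][0] == trusted_root


original = build_levels([1, 2, 3, 4, 5, 6, 7, 8])
trusted_root = original[-1][0]

# Change the 4th leaf from 4 to 44.
tampered = build_levels([1, 2, 3, 44, 5, 6, 7, 8])

# (level, index) of every node whose value differs between the two tries.
changed = [(depth, i)
           for depth, (old, new) in enumerate(zip(original, tampered))
           for i, (x, y) in enumerate(zip(old, new)) if x != y]

print(changed)  # [(0, 3), (1, 1), (2, 0), (3, 0)]
print(verify([1, 2, 3, 4, 5, 6, 7, 8], trusted_root))   # True
print(verify([1, 2, 3, 44, 5, 6, 7, 8], trusted_root))  # False
```

Only the changed leaf and the nodes on its path to the root differ; every other hash stays the same. In this assumed layout those three parents correspond to H2, H5, and Root.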

Conclusion & References

With these two kinds of trie understood, we are close to the final target.

Why is this series so long? Because it is a SERIES 😑

In the next episode, I will show how Ethereum combines the two, with some improvements, to create the Patricia trie, which is more optimized while keeping their main properties: efficient searching and deterministic cryptographic hashing.


A lucky guy was born in the Age of Cryptocurrency Boom