Applications of the Merkle Tree Data Structure

Bee Davis
5 min read · Jun 16, 2020

Merkle trees have many applications, especially in Bitcoin and other recent crypto technologies. The Merkle tree is one of the most effective data structures for verifying data integrity. This piece covers the basics of the Merkle tree data structure and some of its applications.

Basic Explanation of the Merkle Tree Data Structure

A Merkle tree is a way of associating a group of elements such that nobody can forge or alter their identity, qualities, or association without detection. A Merkle tree binds the elements together by computing a single value, the root hash, that serves as an unforgeable signature for that specific group of elements. The tree works by applying a mathematical algorithm known as a hash function to the data or properties of each item (such as its file name, file size, or any other unique property), then repeatedly hashing pairs of the resulting hashes together until only one hash, the root, remains.

For example, the ingredients of “Baked Alaska” are quite simple: milk, cream, eggs, sugar, flour, and so on. Yet Baked Alaska is one of the more difficult desserts to make because of the expertise required to assemble the ingredients. Any small change in the amount of one ingredient or in the timing of a step usually results in a catastrophe.

If you forget something or try to add something “extra,” it will also ruin the recipe. It doesn’t work, for example, to make the dish without sugar and then try to add it later. The process of cooking the dish binds the ingredients into a special configuration that you cannot reverse.

In a Merkle tree, the ingredients are the elements, and the hashing function is the cooking process. A skilled engineer could tell, by recomputing the hashes, if the number or quality of the elements changed in any way.
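To make the "cooking process" concrete, here is a minimal sketch of how a root hash could be computed. It assumes SHA-256 as the hash function and assumes that an odd-sized level is padded by repeating its last hash; both choices, along with the helper names `sha256` and `merkle_root`, are illustrative rather than part of any particular system.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(items: list[bytes]) -> bytes:
    """Hash every item, then hash pairs of hashes until one root remains."""
    if not items:
        raise ValueError("need at least one item")
    level = [sha256(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad an odd-sized level by repeating its last hash.
            level = level + [level[-1]]
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


ingredients = [b"milk", b"cream", b"eggs", b"sugar", b"flour"]
root = merkle_root(ingredients)
print(root.hex())

# Leaving out the sugar produces a completely different "dish".
no_sugar = [b"milk", b"cream", b"eggs", b"flour"]
print(merkle_root(no_sugar) == root)  # False
```

The root is always 32 bytes long, no matter how many ingredients go in, which is what makes it practical to publish and compare.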

Applications of the Merkle Tree

An application that needs to keep track of daily stock trading activity is a perfect example where a Merkle tree might be useful. An organization that facilitates stock trades relies on the absolute accuracy of the daily record.

The details of the activity are important down to the millisecond, since large transactions executed within seconds of each other could mean a difference of millions of dollars. Creating a new Merkle tree every time a new transaction record comes in could be one way to ensure that the set of records is safe.

Additionally, a second Merkle tree could keep a log of every time anyone accesses the records. Every time you add a new log record, the application would create a new Merkle tree with all the log entries. The root of the transaction record tree and the root of the access log tree could then be combined to create a “Mega Meta” tree.

That way, the access logs and the transaction records are bound together in one unalterable, verifiable record. To return to my analogy above, each tree would produce a different unique dish, and each mega-meta tree would produce a different unique full-course meal.

Trying to replace or change the ingredients would ruin/invalidate the dish, and the ruined/invalidated dish would ruin/invalidate the meal. In this application, the Merkle tree helps to bind important elements together in a way that preserves their properties and timing so you can trust the records.
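The stock trading example could be sketched roughly as follows. The record formats, the names, and the choice of SHA-256 are all hypothetical; the point is only that the two roots become the leaves of a third, "mega-meta" tree.

```python
import hashlib


def merkle_root(items: list[bytes]) -> bytes:
    """Hash every item, then hash pairs of hashes until one root remains."""
    level = [hashlib.sha256(item).digest() for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad an odd-sized level by repeating its last hash.
            level = level + [level[-1]]
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical records, made up for illustration only.
transactions = [
    b"09:30:00.001 BUY 100 ACME",
    b"09:30:00.004 SELL 50 ACME",
    b"09:30:00.009 BUY 25 ACME",
]
access_log = [
    b"09:31:12 alice viewed transactions",
    b"09:45:03 bob viewed transactions",
]

tx_root = merkle_root(transactions)    # the first "dish"
log_root = merkle_root(access_log)     # the second "dish"

# The two roots are the leaves of the mega-meta tree (the "meal").
mega_meta_root = merkle_root([tx_root, log_root])
print(mega_meta_root.hex())
```

In this sketch the tree is rebuilt from scratch each time, matching the description above; a production system would more likely update only the affected branch.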

How the Merkle Tree Can Guarantee the Security of an Application

In the example above, the trustworthiness of the records relies on the inability of anyone, including the company that owns the application, to tamper with the records.

An attacker seeks to tamper with the files and to hide the record of their access to the files, so that the records still seem credible. In other data structures, where confidentiality is more important, securing the data means hiding it from view.

For data in a Merkle tree to be more secure, it is best kept public, so that there are several independent sources of consensus on the record of the tree itself. The layers of the Merkle tree data structure secure the integrity of the records and link them with the log of who accessed the records. Since the Merkle tree makes it nearly impossible to change anything about any of the elements without invalidating all the dependent records up the chain, you can expect the records as a set to be 99.99% safe.

Trying to forge a record without a trace would mean finding a collision of a hash of what might be millions of records. Furthermore, a forgery of a full day of records would surface when creating and verifying the mega-meta tree. The security that comes with the use of a Merkle tree allows the users to rest assured that the records they are viewing are accurate.
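A small sketch shows how such a forgery would surface. It assumes the day's root was published somewhere the attacker cannot change, and it uses SHA-256 with made-up trade records; anyone holding the records can recompute the root and compare.

```python
import hashlib


def merkle_root(items: list[bytes]) -> bytes:
    """Hash every item, then hash pairs of hashes until one root remains."""
    level = [hashlib.sha256(item).digest() for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad an odd-sized level by repeating its last hash.
            level = level + [level[-1]]
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical records; the root is assumed to have been published earlier.
records = [
    b"trade 1: BUY 100 ACME @ 10.00",
    b"trade 2: SELL 40 ACME @ 10.05",
    b"trade 3: BUY 10 ACME @ 10.02",
]
published_root = merkle_root(records)

# An attacker quietly changes one quantity in the second record.
forged = list(records)
forged[1] = b"trade 2: SELL 400 ACME @ 10.05"

print(merkle_root(records) == published_root)  # True: untouched records verify
print(merkle_root(forged) == published_root)   # False: the forgery is detected
```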

Are There Restrictions On The Accuracy Of The Merkle Tree Data Structure?

While the Merkle tree protects against forgery or alteration after the creation of a record, it cannot protect against inaccurate records entered into the tree at the time of creation.

Merkle trees are not indestructible. Weaknesses can compromise their integrity, for example in the way Bitcoin (BTC) wallets use them to verify a transaction. One known attack relies on the fact that in Bitcoin Merkle trees it is not easy, and sometimes impossible, to tell the difference between the hash of an inner node and the hash of a leaf node: an inner node is the hash of two concatenated 32-byte hashes, so a leaf whose data is exactly 64 bytes long hashes in the same way.

If a BTC wallet uses a simplified form of verification, one that looks only at a branch of the Merkle tree rather than the whole block, an attacker can use a fake proof to trick the wallet into thinking that a leaf node is an inner node.
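The ambiguity can be illustrated in a few lines. This sketch uses double SHA-256, the hash Bitcoin applies to both transactions and tree nodes, and shows that 64 bytes of leaf data hash to the same value as an inner node built from two 32-byte child hashes. The second half shows one common countermeasure, prefixing leaves and inner nodes with different bytes as Certificate Transparency does (RFC 6962); it is included as an illustration and is not how Bitcoin itself works. All function names and inputs are hypothetical.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin transaction and Merkle node hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def leaf_hash(tx_bytes: bytes) -> bytes:
    return dsha256(tx_bytes)


def inner_hash(left: bytes, right: bytes) -> bytes:
    return dsha256(left + right)


left = dsha256(b"hypothetical transaction A")
right = dsha256(b"hypothetical transaction B")

# 64 bytes that can be read either as leaf data or as two child hashes.
ambiguous = left + right
print(leaf_hash(ambiguous) == inner_hash(left, right))  # True: indistinguishable


# Countermeasure sketch: tag leaves and inner nodes with different prefixes.
def tagged_leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def tagged_inner_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


print(tagged_leaf_hash(ambiguous) == tagged_inner_hash(left, right))  # False
```

With the prefixes in place, a verifier can no longer be handed leaf data that doubles as an inner node, because the two kinds of hash are computed over different inputs.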

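Also, performance is a consideration. Building a tree requires hashing every record, so the approach described above, rebuilding the whole tree whenever a record is added, grows more expensive as the data set grows. The Merkle tree performs well for a smaller body of data, but it is not as fast to update or traverse as a simpler structure such as a heap or a stack.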

Conclusion

Aside from torrents and blockchains, the Merkle tree has applications in scalable databases (Dynamo DB and Apache Cassandra), in Certificate Transparency for verifying the certificates issued by certificate authorities, and in digital signatures. Thus, Merkle trees work in any system that needs to check for inconsistency.

Bee Davis

Socially Aware Data Science and CyberSecurity Engineering Leadership