Merkle Trees: the Origins of Cryptocurrency

6 min readApr 27, 2022

Tree in a field — Photo by Simon Wilkes on Unsplash

A Certified Digital Signature, dubbed that antique paper from 1979, was Ralph C. Merkle’s solution for an efficient and easily distributable digital signature system. Merkle sought to solve two basic questions:

How can we verify the authenticity of a digital item?
How can we do so in an efficient, lightweight, and secure manner?

By building up from a trusted encryption function Merkle theorized that the creation and implementation of his system would be “pre-certified,” avoiding the years that it would take to develop and trust another system. It seems this theory was correct as today Merkle Trees are the basis of digital transmission protocol. They can be found in places including but not limited to Git, DynamoDB, BitTorrent, and of course Blockchain technology.

He starts by exploring the idea of one-way functions. If you are not familiar, these one-way functions form the basis of what we understand to be hashing. Differing from encryption, which is meant to be reverted using a key (ergo a two-way function), hashing is never meant to be undone. A huge part of cyber security is based on the principle that it is computationally impossible to derive the contents from a hash. If it were possible, every password you have ever created could be available to anyone.

A one-way function is defined as such: given F(x) = y, y is easy to compute given F and x, but x is functionally impossible to compute given y and F. It generally takes a larger amount of data and compresses it to a fixed size. Hashing algorithms such as Sha-256 and Sha-512 compress your data into blocks with… you guessed it 256 and 512 bits respectively. Generally, these one-way functions leverage a certified encryption function C(k,p) = c where k is the private key, p is the plaintext message, and c is your coded message aka hash. Using these functions as a starting point we can take data, of any size, and output a unique hash*.

*It is technically possible to have collisions, yet it is mathematically very very very unlikely — if you are curious about that you can read up on the pigeonhole principle.

So if we have a person who computed F(x) = y we can be confident that, no matter who knows y and F, only one person knows x. This is useful to create unalterable and distributed contracts between A and B parties. A will compute the hash and give F and y to B. B will be told that they can carry out their portion of the contract when they have x. Since it is impossible to calculate x, the only way they will ever know it is if A tells them. A also cannot change what the value of x is, because only that value will hash to y.

If B violates their side of the deal and executes their action without A’s permission, any 3rd party will be able to validate that B did not have x. No matter how many times B tries to guess x before they go to the judge, math is against them and they will not be able to guess x. It will be clear that they did not have the go-ahead.

Merkle examines this using the Lamport-Diffie One Time Signature Method. The problem with just using a cryptographic function C(k,p) on each bit in the message is that B could mischievously alter x once they had received it by changing 1s to 0s. Lamport and Diffie got around this by concatenating the x with the one’s complement of itself to the end before hashing. Since 0s cannot be changed to 1s, x becomes immutable. All the 1s in the previous message are encoded in the second half of the second message as unalterable 0s.

Although this is quite clever, it has now doubled the size of x that both A and B will have to hold on to. If x is small this isn’t too big of a deal, but if B has contracts with thousands of people and each one of those hashes is derived from files that are GBs in size, B will run out of storage quite quickly.

Merkle optimized this space usage by electing to append the count of zeroes to the end of the message instead of the one’s complement. This nearly halves the space requirement of the Lamport-Diffie method as the size of the s appended to the end will be Log2(n) where n is the length of x. For example, if x is 8 bits long, we can append a 3-bit number that could represent the possible 0s in the string. Thus if x is 01010101 we would instead hash 01010101,100.

Winternitz further improves upon this by trading time for space. In this method, y is computed by running F on x repeatedly. For example, y = F16(x). Doing it in this manner allows y to sign 4 bits of information instead of just 1. Therefore, the space complexity can be decreased by a factor of 4. Unfortunately, since F16(x) is public a malicious party could compute F17(x) and claim that to be the real y.

This left Merkle with the problem: how can a protocol be designed that eliminates large storage requirements, while also ensuring that A’s y is the correct y? Tree authentication. Or as it is known today, a Merkle Tree. Using a binary tree allowed Merkle to apply a divide-and-conquer approach to hashing. by adding just log2(n) transmissions.

Merkle Tree Diagram — Photo from GeeksforGeeks

A will therefore not just calculate each hash, but also create a recursive tree combining each of the previous hashes so that there is one top hash or root hash. This root hash encompasses all information below it on the chain, so a change in any value will cause the root hash to change. Now A can provide both y and the authentication path. B will then be able to validate x as well as be sure that y came from party A.

Data is solely stored in the bottom nodes of the tree, known as leaf nodes. Every level above the leaves does not have any unique information, just hashes.

This data model, which was derived for digitally signing data, was the very first piece of the puzzle to creating modern-day cryptocurrencies. This digital signature system allows us to trust data transmissions in a no-trust system. This has fueled the decentralized nature of blockchains; eliminating the need for a “trusted intermediary” to validate the transaction.

Further, a Merkle Tree can be built from just its leaf nodes, allowing piecemeal distribution of data that can come from different parties. When you request a copy of a blockchain, all the other participants (also called nodes) will send you pieces of the chain so that any one party is not responsible for sending it all to you. If the pieces do not add up to the desired root node, the incorrect piece will be identified and re-requested from a different source.

Other applications of Merkle Trees are working behind the scenes everywhere:

Git uses these trees to store your commit history and to identify unstaged diffs on your indexed items.
BitTorrent uses this strategy in a P2P manner. Other peers on the network will send you pieces of the puzzle and then your computer will compute the root hash to see if you got what was expected.
Dropbox moves data in a similar manner to speed up file transfers. They also split individual files up into pieces so that if part of a file change, only the piece that contains the change needs to be re-uploaded. For example, you are re-uploading a 10 GB video that you edited and that video is divided amongst 10 Merkle leaf nodes. If only two of those nodes were modified, you will only need to upload 2 GB of information.
AWS even does this with their Dynamo DB.

If you liked this article find more on my website!

Merkle Trees: the Origins of Cryptocurrency

Written by Colin Mills