Merkle Trees in 3 Minutes or Less
Because blockchain is still such an emerging ecosystem, there are still many things holding back it’s widespread adoption. Among these factors is an inherent understanding of the core technological concepts that comprise some of these projects. It is almost a right of passage to have had at least one conversation with someone skeptical about cryptocurrencies and/or blockchain technology but when pressed on it, concede as their uninformed arguments collapse. For this reason, I am making it my goal to simplify some of the more complex terms and concepts. This way they can become part of everyday discourses and we can begin to clear up some of the stigma surrounding this new technology.
Ok so, let’s start with Merkle trees.
Merkle trees are structures used to validate large amounts of data in an efficient manner. They are able to not only verify that the data received from other peers in a peer-to-peer network like Bitcoin or Ethereum are unaltered but also that the blocks being sent are legitimate. Merkle roots, however, can be understood as the signature of all the transactions included within a single block. In Bitcoin, for example, the Merkle root can be found in the block header (along with the hash of the previous block, the timestamp and the nonce). How the Merkle root is determined, what it’s purpose is and how it relates to a Merkle tree will be the fundamental questions I hope to shed some light on.
In order to understand how Merkle trees work, it is first important to understand what cryptographic hash functions (CFH) are.
In a broad sense, a hash function takes a certain input and returns an output (a hash) often in the form of an alphanumeric string.
Here’s a quick sumary of the five fundamentals of a CFH (will not go in depth on this here):
1) The hash must be easy (computationally) to get from a function
2) It must be computationally infeasible to work backwards from the hash to get the original input
3) It has to be deterministic — the same input must give the same hash
4) A small change in the input would give a completely different hash (see Figure 2)
5) And it must also be infeasible to find two messages that produce the same hash
Now that you have an understanding of a CHF, the intricacies of a Merkle tree become a little more palatable.
The first step in the creation of a Merkle tree is taking the transactions included in the block and puting them through the CHF to get their hash. After this, all these transaction hashes get paired with one another and get hashed again. This process happens continually until there is only one hash remaining: this is the Merkle root. By hashing in pairs, Merkle trees effectively make transactions tamper proof as any change to a previous transaction would travel up the tree changing each hash along the way — which would inevitably change the Merkle root (see pillar 4 of CHF). In order to verify a transaction, all you would have to do is to follow its branch on the tree.
I hope you now can understand the basis of cryptographic hash function, its relationship to the Merkle tree and Merkel root, as well as their overall connection to blockchain technology.
Oh, and hope you enjoyed this quick summary (and very transparent attempt to procrastinate any and all final exam efforts).
More to come.