A Simple Explanation of Merkle Trees

Alexandre Laplante
2 min readNov 3, 2018

--

Suppose we’re downloading a torrent. We got the torrent file or magnet link from a trusted source, and we go off to the internet to get pieces of the file we’re torrenting from random untrusted strangers.

If the original torrent file has a hash of the larger file we’re downloading, then once we’re done downloading all pieces of the file, we can confirm whether all the pieces we got were in fact valid. However, we can’t tell which piece was incorrect if it turns out our file does not hash to the expected value. The best we can do is throw away our entire file and try again, hoping we don’t get any incorrect pieces this time!

Enter Merkle trees.

Suppose the hash in the torrent file we receive is not a direct hash of the file being distributed, but instead this is how it was generated:

  • The file is partitioned into lots of little pieces.
  • The hash of each piece is calculated.
  • These hashes are used as the leaves of a tree where every node contains a hash of the concatenation of all its children.
  • The root hash of this tree is what is distributed.

Now if we download the root hash from a trusted source, we can go out and get its direct children in the Merkle tree (HA and HB in the diagram) from untrusted sources. We can quickly verify that these newly downloaded hashes are correct by concatenating them, taking their hash, and confirming that it’s equal to our root hash. We can repeat this process to download the entire Merkle tree from untrusted sources.

Once we have the Merkle tree, we have a hash of each of the individual pieces that we’re going to go out and download (H1 to H8 in the diagram). This means we can easily tell exactly which pieces are correct and which are incorrect.

--

--

Alexandre Laplante

Co-Founder of Passthrough (We’re hiring!). ex-Google, ex-Carta.