MMRs : PoW Solutions to The Super-Light Verification of Cross-chain
In the <Poly Network Intelligence> episode 1 “The reluctant information due to Merkle tree”, the Poly team introduced the basic structure of Merkle trees and the rules for generating parent nodes. When faced with a situation where the number of leaf nodes does not meet the requirement of a full binary tree, Satoshi Nakamoto’s approach is to directly copy the last node to force a full binary tree. Although it seems not that elegant, it really solved the problem and introduced the unique ID of UTXO which effectively avoid double-spending attacks.
Merkle tree was originally invented to make a commitment for a large amount of data, so that even if the data is split into several parts and saved, the correctness of the final combined one can be overall confirmed by Merkle tree. In another word, if a certain part of the data is modified, the synthesized data will not end up with the same root hash as before.
Not surprisingly, the ability to use Merkle tree requires that all the data is known in advance. But in the blockchain domain, where every chain is generating new blocks all the time. Bitcoin core developer Peter Todd ran into a similar problem in 2012. When it came to finding the root hash for a constantly growing pile of data, he was urgent to find a way to add new leaf nodes to the Merkle tree while keeping the process of computing the root as simple as possible. For this reason, he invented Merkle Mountain Ranges (aka. MMR) — — a structure designed to find the root hash of the whole of the constantly growing data. In this article, the Poly team will introduce Merkle Mountain Ranges and its application.
The Necessity of MMRs
In order to explain the necessity of MMR, we need to review the disadvantages of Merkel trees. The use of Merkle trees presupposes that all data is known. Similarly, in order to declare all transactions within a block, all transactions to be declared must be known in advance. Once the SPV node obtains the block header, it is informed of the declaration of all recorded transactions within the block, which is the Merkel root.
At this point, even if the SPV node questions different full nodes about the different transactions recorded in the block, it still have confidence that its conclusion based on the evidence is equally valid. With Merkle roots, the integrity of transactions from different nodes is committed, which prevents the SPV node from depending on a single full node, thus avoiding the failure risk of a single point.
Merkle roots can provide an integrity commitment to “static data”, which means that the data does not increase or decrease or be modified. As an example, a finished movie can be considered as static data. When a user downloads a movie through a P2P network, he/she first obtains the Merkle root of that movie file from a trusted node, and then downloads different pieces of that movie from other unfamiliar nodes and completes the combination locally. As long as the Merkle root of the combined file remains the same as the root provided by the trusted node, the combined file can be confirmed to be complete and correct.
However, in the blockchain domain, “static data” is not common, and each chain always keeps generating new blocks, i.e., new content is constantly added to the chain. At this point, SPV nodes choose to synchronize only headers of the blockchain in order to alleviate the storage requirements, and verify the correctness of transactions within the block through the Merkle root.
With the booming blockchain industry, there are increasing numbers of blockchain projects applied to different scenarios. In order to support cross-chain transactions between different blockchains, more and more header chain of different blockchains need to be saved, and the storage pressure faced by the verification nodes will become quite more intense.
Thus, in a paper published in the International Association for Cryptographic Research (IACR, the top association in the field of cryptography), Stanford PhD student Benedikt et al. proposed a novel block header structure in which a field is used to save the root hash of the Merkel Mountain Range, and all the block information before the latest block can be committed by the MMR root hash. In this way, the verification node eventually needs to save just the latest block of a certain blockchain, which can almost subvert the dilemma faced by the SPV nodes.
Next, let’s introduce the Merkle Mountain Ranges based on Merkle’s tree.
What are MMRs?
Peter Todd invented the Merkel Mountains Ranges with the following idea: since sometimes there is a situation where a full binary tree cannot be built, it could be simply split into several full binary trees (aka. mountains) of lower height, and then the final root is calculated through the peaks. The rule of parent node generation is like this: when there are 2 sibling nodes at the same height, generate 1 parent node. And the bottom nodes with height 0, i.e., represent the new added data. The structure of an MMR with 7 leaves is shown in the following figure:
Starting from 1, the nodes are numbered in the order in which they come out. We name each full binary tree as a mountain, and the node at the top of the mountain is the peak. Finally, the root hash of MMR is calculated in the order from the lowest peak to the highest peak. In the above example, let the lowest peak be P11, the second lowest peak be P10, and the highest peak be P7, then the root hash is as follows:
root = Hash(Hash(P11+P10) + P7)
When one more node (number 12) comes, the new MMR structure with the number of leaves 8 is shown below:
Now the root hash is the highest peak P15.
root = P15
If you want to prove that a leaf node exists in a MMR, the rules are simple.
1. construct the merkle proof of the subtree from the leaf node to the peak
2. bagging the right peaks, calculate its MMR root, and add the result to the proof
3. add the left peaks to the proof in the right-to-left order
As long as it is verified that the node can finally compute the same MMR root according to the proof, it is proved to be valid.
Applications of MMRs
The MMR structure will change dynamically as the number of leaves grows, naturally adapting to the characteristics of blockchain. Assuming all the blocks before the latest block as leaf nodes, the root hash of the MMR structure it constitutes is kept at the header of the latest block. This means that each block actually includs some information of all previous blocks. Thus the validation node can determine whether the latest block is valid or not based on the MMR root hash. The validation process is described below.
Suppose when the verifying node receives a latest block, in order to prove that the block does belong to the latest block of the longest honest chain, the verifying node will continue to query the proving node for several previous blocks (i.e., leaves), and the proving node must provide MMR evidence about the queried blocks, which is used to prove that queried block does belong to the same chain as the latest block. A malicious node cannot deceive the verifying node by preparing blocks on the honest longest chain in advance because the malicious node cannot generate MMR evidence belonging to blocks on an honest fork.
Only if all the queried blocks can be provided with correct MMR evidence, the validity of that blocks (e.g., correct PoW) is verified, so that the verifying node will choose to believe that the latest block it received indeed belongs to the latest block on the honest longest chain.
Benedikt et cl describe a sampling algorithm inside the paper that is able to hit invalid blocks with a very high probability, thus concluding that the latest received block does not belong to the honest longest chain.
Applying the MMR to the blockchain requires modifying the data structure in the block header, which means a hard fork. Due to autonomy situation of the Bitcoin community, it is quite difficult to reach such a consensus. For now, although the possibility of MMR gaining unanimous community acceptance and support is not yet high, the need for convenient and efficient block information verification will one day raise the interest in MMR due to the increasing number of cross-chain transactions that the blockchain industry will face.
1. FlyClient: Super-Light Clients for Cryptocurrencies
2. Merkle Mountain Ranges, https://github.com/opentimestamps/opentimestamps-server/blob/master/doc/merkle-mountain-range.md