Why for CodeChain’s transactions_root and invoices_root, which require data validation, Skewed Merkle trees are a more appropriate solution.
As explained in the previous article, Ethereum has 4 different usages of the Modified Merkle Patricia Trie. Among these, State Trie and Storage Trie are used to efficiently store and verify changing data. However, the Transaction Trie and Receipts Trie do not exist to support changing data. These two trie are volatile data used only for the verification of the transactions used in one block and the receipt generated from that block.
The reason Ethereum uses identical data structure for two different objectives is to share identical implementations. However, since the objective is different, the optimized code to create a Transaction Trie or a Receipts Trie is different from code that creates a State Trie or a Storage Trie. In reality, Parity uses different code for each objective. If that’s the case, then wouldn’t there be no reason to use identical Trie for a Transaction Trie or a Receipts Trie?
This kind of consideration is reflected upon when looking at CodeChain’s transactions_root and invoices_root, which is being developed at KodeBox. Thus, we decided to take a different approach from a Merkle Patricia Trie that stores UTXO. Then what kind of properties would transactions_root and invoices_root need?
The minimum requirement would be that it needs to verify data. This verification should not simply verify data sets but also check whether data order is in agreement. This is what you need in a typical blockchain, including Ethereum. In CodeChain, there are additional incorporated requirements.
Firstly, implementation should be simple. By simple, it means that not only the code has to be simple, but it should also require less memory usage and less time to execute. All of this is mentioned because CodeChain is keeping light clients into consideration.
Next, transactions should be entirely verifiable. This property only exists in CodeChain, and not in any other existing blockchain technology. In order for CodeChain to synchronize, it uses a technique called Snapshot Sync. Snapshot Sync is a way to ensure that only one reliable header is required for a block to synchronize as if it has received the entire history. For this, all transactions must be verifiable with only a single header.
Lastly, it must be incremental. Since CodeChain’s transactions_root must be able to verify the entire transaction list, if it’s not incremental, then as blocks stack on and on, it would require a longer time to verify. In order to prevent this delay, the incremental property is crucial.
In order to satisfy these conditions, Ethereum’s Merkle Patricia Trie was the first to be excluded. This can make the entire data incremental, but the overhead of constructing the Trie was too large.
Then we considered the traditional Merkle Tree. However, this did not satisfy the requirement that the overhead should be small. It also did not satisfy the incremental property.
Thus, we decided to use the Skewed Merkle Tree which leans toward the left. In a Skewed Merkle Tree, you can create a new tree using the transactions of the current block on the root of the entire block, and quickly and lightly create a hash that contains a list of all transactions executed so far. That is, you can make the entire transaction list incremental, and the time and memory it takes to create is linearly proportional to the number of transactions. Furthermore, since we are only interested in root values for verification, we can optimize the used memory to use an amount times n(where n is a constant number).
Of course, the Skewed Merkle Tree is not good at everything. The Skewed Merkle Tree is inefficient in creating the Merkle path, a basic feature of the Merkle Tree. Thus, if you need to create a Merkle path, the classic balanced Merkle Tree is more appropriate. However, given the use of CodeChain’s transactions_root and invoices_root, the Merkle-proof path is not used. We decided that the Skewed Merkle Tree is more appropriate because it was used solely for data validation.
(For the original Korean version, click here)