A Hackers Guide to Layer 2: Zero Merkle Trees from Scratch

Carter Feldman
22 min read · Jun 15, 2023

This is the second weekly installment of A Hackers Guide to Layer 2: Building a zkVM from Scratch.

Follow @cmpeq on twitter to stay up to date with future installments.
Tutorial code snippets:
github.com/cf/zero-merkle-tree-tutorial

In this tutorial, we will:

  • Learn how to build an efficient zero merkle tree implementation capable of processing changes in trees with quadrillions of leaves, and implement it in TypeScript from scratch.
  • Learn how to build an append-only merkle tree which only requires O(log(n)) storage, and implement it in TypeScript from scratch.
  • Learn about Spiderman proofs and how to efficiently generate/verify batch updates to a merkle tree with a single, succinct proof.

Note

Before reading this tutorial, it is highly recommended to work through the first installment, where we cover merkle tree fundamentals, as we will frequently reference what we learned in that guide.

What I cannot create, I do not understand.
Richard Feynman

Introduction

In the previous installment of A Hackers Guide to Layer 2, we covered the basics of merkle trees and built a simple merkle tree database implementation.

While what we built was simple and usable for small trees, our previous merkle tree implementation will be very slow when working with large trees:

  • Every time we called getRoot (to compute the tree’s merkle root), we had to rehash the whole merkle tree (O(2^height) time complexity)
  • Creating a new merkle tree instance required us to pass the tree’s height and an array of leaves with size 2^height to the constructor (memory is also O(2^height))

Unfortunately, our zkVM will need to work with merkle trees that are both large (height > 32) and sparse (most of the leaves are zero).

We will explain why this is the case in future installments, but it is intuitively related to the fact that our zkVM will use merkle trees to verifiably store and organize persistent data (think of merkle trees as a verifiable hard drive + file system for our zkVM).

That being said, we are not out of luck, as we can take advantage of a merkle tree’s self-similarity to build a much more efficient database for our use case.

The Zero Hashes

To begin tackling the problem of building a merkle tree better suited for our large/sparse use case, we can gain some insight by first examining an empty merkle tree and looking for patterns that we can take advantage of:

An Empty Merkle Tree (TreeHeight = 3)

If we recall that a node’s value is equal to the hash of its children, we can draw rectangles around the nodes which have the same value:

Since the nodes of the bottom level all have the same value, the nodes on the level above will also have the same value, since each node’s value depends only on the hash of the nodes below it.

Aha! When a merkle tree is empty, all the nodes on each level have the same value. The values of the nodes on each level of an empty merkle tree are known as the Zero Hashes and are often denoted as Zₙ where n = TreeHeight - Level.

Since we know that the bottom nodes have a value Z₀ = 0, let’s write an expression for Zₙ.

Recalling the formula for computing the value of a node in a merkle tree that we learned in the previous installment:

N(level, index) = Hash(N(level + 1, 2·index), N(level + 1, 2·index + 1))

The formula for computing the value of a merkle tree node

We can just plug in zero for the leaves and use the level-wise self-similarity that we observed in the tree to write an expression for Zₙ:

Z₀ = 0
Zₙ = Hash(Zₙ₋₁, Zₙ₋₁) for n > 0

Hence we can say that, for an empty merkle tree with height = TreeHeight:

N(level, index) = Zₙ, where n = TreeHeight - level

Great, now let’s demonstrate our knowledge of the zero hashes by writing a TypeScript function that computes the array of zero hashes for a given treeHeight:
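The embedded snippet is not reproduced in this text, so here is a minimal sketch of what such a function could look like. It assumes that node values are 32-byte hex strings and that hash is SHA-256 over the two hex-decoded children concatenated together (an assumption that is consistent with the sibling values printed later in this article); the Node.js crypto import is likewise an assumption about the runtime.

```typescript
import { createHash } from "node:crypto";

type MerkleNodeValue = string;

// Assumed hash: SHA-256 over the two children, each decoded from hex and concatenated
function hash(left: MerkleNodeValue, right: MerkleNodeValue): MerkleNodeValue {
  return createHash("sha256").update(left + right, "hex").digest("hex");
}

// zeroHashes[n] = Zₙ, the value of a node n levels above the leaves of an empty tree
function computeZeroHashes(treeHeight: number): MerkleNodeValue[] {
  let currentZeroHash: MerkleNodeValue = "00".repeat(32); // Z₀ = 0
  const zeroHashes: MerkleNodeValue[] = [currentZeroHash];
  for (let n = 1; n <= treeHeight; n++) {
    // Zₙ = Hash(Zₙ₋₁, Zₙ₋₁)
    currentZeroHash = hash(currentZeroHash, currentZeroHash);
    zeroHashes.push(currentZeroHash);
  }
  return zeroHashes;
}
```

The returned array has treeHeight + 1 entries, so for a height-3 tree it holds Z₀ through Z₃. Under the hash assumption above, Z₃ is the c78009fd… value that shows up as the empty tree’s oldRoot later in this article.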

Running this code, we get the output:

Great! Now that we have written a function to compute the zero hashes, we can write a function to obtain the value of any node in an empty merkle tree using our zero hashes:

function getNodeValue(treeHeight: number, level: number): MerkleNodeValue {
  const zeroHashes = computeZeroHashes(treeHeight);
  return zeroHashes[treeHeight-level];
}

Implementing a ZMT Database

We now know how to compute the value of any node in an empty merkle tree, so let’s use our knowledge to write our efficient merkle tree database.

In the previous installment of our series, we already proved that, when you update a leaf in a merkle tree, only the nodes along the leaf’s merkle path will change:

Using this principle, we can make a merkle tree database which stores only the nodes in the tree that aren’t zero hashes, and follows the procedure:

To find the value of node N(level, index):

  • Check if it exists in the database. If it does, return the value.
  • If it does not exist, return the corresponding zero hash for the node’s level.

To update a leaf, first set the leaf, then recompute the nodes along the leaf’s merkle path, and save all the new node values in the merkle path to the database.
In this way, we will create a database which only needs to store the non-zero hash nodes, perfect for our large, sparse merkle trees!

Let’s start by writing a simple data store which follows our rule for getting nodes from the datastore (return if it exists, else fallback to a zero hash):

class NodeStore {
  nodes: {[id: string]: MerkleNodeValue};
  height: number;
  zeroHashes: MerkleNodeValue[];
  constructor(height: number){
    this.nodes = {};
    this.height = height;
    this.zeroHashes = computeZeroHashes(height);
  }
  contains(level: number, index: number): boolean {
    // check if the node exists in the data store
    return Object.prototype.hasOwnProperty.call(this.nodes, level+"_"+index);
  }
  set(level: number, index: number, value: MerkleNodeValue){
    // set the value of the node in the data store
    this.nodes[level+"_"+index] = value;
  }
  get(level: number, index: number): MerkleNodeValue {
    if(this.contains(level, index)){
      // if the node is in the datastore, return it
      return this.nodes[level+"_"+index];
    }else{
      // if the node is NOT in the data store, return the correct zero hash for the node's level
      return this.zeroHashes[this.height-level];
    }
  }
}

Great! Now we can write a ZeroMerkleTree class which uses the NodeStore:

class ZeroMerkleTree {
  height: number;
  nodeStore: NodeStore;
  constructor(height: number){
    this.height = height;
    this.nodeStore = new NodeStore(height);
  }
}

Following what we learned in the previous installment, we know that we can compute the merkle path of a node by dividing the index by 2 each time we go up a level in the merkle path:

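function getMerklePathOfNode(level: number, index: number): {level: number, index: number}[] {
  const merklePath: {level: number, index: number}[] = [];
  for(let currentLevel=level;currentLevel>=0;currentLevel--){
    merklePath.push({
      level: currentLevel,
      index: index,
    });
    // the parent's index is the child's index divided by 2, rounded down
    index = Math.floor(index/2);
  }
  return merklePath;
}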
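As a quick illustration, here is a small self-contained sketch that prints the merkle path of leaf 5 in a tree of height 3, along with the index of each path node’s sibling. The helper getSiblingIndex and the IMerklePathNode interface are names made up for this example; they are not part of the tutorial code.

```typescript
interface IMerklePathNode {
  level: number;
  index: number;
}

// Same walk as getMerklePathOfNode: halve the index each time we move up a level
function getMerklePathOfNode(level: number, index: number): IMerklePathNode[] {
  const merklePath: IMerklePathNode[] = [];
  for (let currentLevel = level; currentLevel >= 0; currentLevel--) {
    merklePath.push({ level: currentLevel, index });
    index = Math.floor(index / 2);
  }
  return merklePath;
}

// Hypothetical helper: even nodes have their sibling on the right, odd nodes on the left
function getSiblingIndex(index: number): number {
  return index % 2 === 0 ? index + 1 : index - 1;
}

const pathOfLeaf5 = getMerklePathOfNode(3, 5);
// path indices from leaf to root: 5, 2, 1, 0
console.log(pathOfLeaf5.map((node) => node.level + "_" + node.index).join(" -> "));
// sibling indices for every path node except the root: 4, 3, 0
console.log(pathOfLeaf5.slice(0, -1).map((node) => getSiblingIndex(node.index)).join(", "));
```

The even/odd check is used instead of a bitwise XOR so that the helper keeps working for indices above 2^31, where JavaScript’s bitwise operators would truncate the value.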

Applying this knowledge, let’s write a setLeaf function which sets a leaf in the tree and updates its merkle path according to the same logic as our getMerklePathOfNode function and the even-odd hashing rule for siblings:

class ZeroMerkleTree {
  height: number;
  nodeStore: NodeStore;
  constructor(height: number){
    this.height = height;
    this.nodeStore = new NodeStore(height);
  }
  setLeaf(index: number, value: MerkleNodeValue){
    // start traversing the leaf's merkle path at the leaf node
    let currentIndex = index;
    let currentValue = value;
    // don't set the root (level = 0) in the loop, as it has no sibling
    for(let level = this.height; level > 0; level--){
      // set the current node in the tree
      this.nodeStore.set(level, currentIndex, currentValue);
      if(currentIndex % 2 === 0){
        // if the current index is even, then it has a sibling on the right (same level, index = currentIndex+1)
        const rightSibling = this.nodeStore.get(level, currentIndex+1);
        currentValue = hash(currentValue, rightSibling);
      }else{
        // if the current index is odd, then it has a sibling on the left (same level, index = currentIndex-1)
        const leftSibling = this.nodeStore.get(level, currentIndex-1);
        currentValue = hash(leftSibling, currentValue);
      }
      // set current index to the index of the parent node
      currentIndex = Math.floor(currentIndex/2);
    }
    // set the root node (level = 0, index = 0) to current value
    this.nodeStore.set(0, 0, currentValue);
  }
}

Great! Let’s check if it’s working by adding a getRoot function which returns the merkle root:

class ZeroMerkleTree {
  ...
  getRoot(): MerkleNodeValue {
    return this.nodeStore.get(0, 0);
  }
  ...
}

Putting everything together we get:
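The embedded example is not reproduced in this text, so the following is only a compact sketch of how the pieces fit together. It makes a few assumptions: hash is SHA-256 over the concatenated hex-decoded children, the leaves are the ones used by example5 later in this article (which reports the same root), and the SparseTree class is a condensed stand-in for NodeStore + ZeroMerkleTree rather than the tutorial’s exact code. It also cross-checks the root against a brute-force rehash of the full leaf array.

```typescript
import { createHash } from "node:crypto";

type MerkleNodeValue = string;

// Assumed hash: SHA-256 of the two hex-decoded children concatenated together
function hash(left: MerkleNodeValue, right: MerkleNodeValue): MerkleNodeValue {
  return createHash("sha256").update(left + right, "hex").digest("hex");
}

function computeZeroHashes(height: number): MerkleNodeValue[] {
  const zeroHashes: MerkleNodeValue[] = ["00".repeat(32)];
  for (let n = 1; n <= height; n++) {
    zeroHashes.push(hash(zeroHashes[n - 1], zeroHashes[n - 1]));
  }
  return zeroHashes;
}

// Condensed stand-in for NodeStore + ZeroMerkleTree, keyed by "level_index"
class SparseTree {
  private nodes = new Map<string, MerkleNodeValue>();
  private zeroHashes: MerkleNodeValue[];
  private height: number;
  constructor(height: number) {
    this.height = height;
    this.zeroHashes = computeZeroHashes(height);
  }
  get(level: number, index: number): MerkleNodeValue {
    // fall back to the zero hash of the node's level when nothing is stored
    return this.nodes.get(level + "_" + index) ?? this.zeroHashes[this.height - level];
  }
  setLeaf(index: number, value: MerkleNodeValue): void {
    let currentIndex = index;
    let currentValue = value;
    for (let level = this.height; level > 0; level--) {
      this.nodes.set(level + "_" + currentIndex, currentValue);
      currentValue = currentIndex % 2 === 0
        ? hash(currentValue, this.get(level, currentIndex + 1))
        : hash(this.get(level, currentIndex - 1), currentValue);
      currentIndex = Math.floor(currentIndex / 2);
    }
    this.nodes.set("0_0", currentValue);
  }
  getRoot(): MerkleNodeValue {
    return this.get(0, 0);
  }
}

// Reference implementation: rehash the full leaf array level by level (the slow way)
function bruteForceRoot(leaves: MerkleNodeValue[]): MerkleNodeValue {
  let level = leaves;
  while (level.length > 1) {
    const next: MerkleNodeValue[] = [];
    for (let i = 0; i < level.length; i += 2) {
      next.push(hash(level[i], level[i + 1]));
    }
    level = next;
  }
  return level[0];
}

function example2Sketch(): boolean {
  // single-digit leaf values padded to 32 bytes of hex
  const leaves = [1, 3, 3, 7, 4, 2, 0, 6].map((n) => "0".repeat(63) + n.toString(16));
  const tree = new SparseTree(3);
  leaves.forEach((leaf, index) => tree.setLeaf(index, leaf));
  console.log("[example2 sketch] the root is: " + tree.getRoot());
  // the sparse tree should agree with the brute-force computation
  return tree.getRoot() === bruteForceRoot(leaves);
}
example2Sketch();
```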

Running this function yields the result:

[example2] the root is: 7e286a6721a66675ea033a4dcdec5abbdc7d3c81580e2d6ded7433ed113b7737

We can see if we got everything right by comparing it with the merkle root we calculated from a tree with the same leaves in our previous installment:

The root matches!

A perfect match 🎉! Now we can make our tree even more useful by returning a delta merkle proof whenever we call setLeaf.

Recalling our previous definition of delta merkle proofs, we will need the following data:

  • index of the leaf (index)
  • the siblings of the leaf’s merkle path (siblings)
  • the root before the leaf was changed (oldRoot)
  • the value of the leaf before it was changed (oldValue)
  • the root after the leaf is changed (newRoot)
  • the new value of the leaf (newValue)

type MerkleNodeValue = string;
interface IDeltaMerkleProof {
  index: number;
  siblings: MerkleNodeValue[];
  oldRoot: MerkleNodeValue;
  oldValue: MerkleNodeValue;
  newRoot: MerkleNodeValue;
  newValue: MerkleNodeValue;
}

To do this, we can modify the start of our setLeaf function to record the value of the oldRoot and oldValue before we make our update + add a siblings array to keep track of the siblings of the leaf’s merkle path:

...
setLeaf(index: number, value: MerkleNodeValue): IDeltaMerkleProof{
  // get the old root and old value for the delta merkle proof
  const oldRoot = this.nodeStore.get(0, 0);
  const oldValue = this.nodeStore.get(this.height, index);
  // siblings array for delta merkle proof
  const siblings: MerkleNodeValue[] = [];
  ...

Then, we can modify the body of the loop where we hash the node on the merkle path with its sibling to also push the sibling’s value to our siblings array:

...
setLeaf(index: number, value: MerkleNodeValue): IDeltaMerkleProof{
  ...
  for(let level = this.height; level > 0; level--){
    // set the current node in the tree
    this.nodeStore.set(level, currentIndex, currentValue);

    if(currentIndex % 2 === 0){
      // if the current index is even, then it has a sibling on the right (same level, index = currentIndex+1)
      const rightSibling = this.nodeStore.get(level, currentIndex+1);
      currentValue = hash(currentValue, rightSibling);
      // add the right sibling to the siblings array
      siblings.push(rightSibling); // <---- HERE
    }else{
      // if the current index is odd, then it has a sibling on the left (same level, index = currentIndex-1)
      const leftSibling = this.nodeStore.get(level, currentIndex-1);
      currentValue = hash(leftSibling, currentValue);
      // add the left sibling to the siblings array
      siblings.push(leftSibling); // <---- HERE
    }
    // set current index to the index of the parent node
    currentIndex = Math.floor(currentIndex/2);
  }
  ...

Then, at the end of our setLeaf function, we just need to use our new variables to return a delta merkle proof:

setLeaf(index: number, value: MerkleNodeValue): IDeltaMerkleProof{
  // get the old root and old value for the delta merkle proof
  const oldRoot = this.nodeStore.get(0, 0);
  const oldValue = this.nodeStore.get(this.height, index);
  // siblings array for delta merkle proof
  const siblings: MerkleNodeValue[] = [];

  // start traversing the leaf's merkle path at the leaf node
  let currentIndex = index;
  let currentValue = value;
  // don't set the root (level = 0) in the loop, as it has no sibling
  for(let level = this.height; level > 0; level--){
    // set the current node in the tree
    this.nodeStore.set(level, currentIndex, currentValue);
    if(currentIndex % 2 === 0){
      // if the current index is even, then it has a sibling on the right (same level, index = currentIndex+1)
      const rightSibling = this.nodeStore.get(level, currentIndex+1);
      currentValue = hash(currentValue, rightSibling);
      // add the right sibling to the siblings array
      siblings.push(rightSibling);
    }else{
      // if the current index is odd, then it has a sibling on the left (same level, index = currentIndex-1)
      const leftSibling = this.nodeStore.get(level, currentIndex-1);
      currentValue = hash(leftSibling, currentValue);
      // add the left sibling to the siblings array
      siblings.push(leftSibling);
    }
    // set current index to the index of the parent node
    currentIndex = Math.floor(currentIndex/2);
  }
  // set the root node (level = 0, index = 0) to current value
  this.nodeStore.set(0, 0, currentValue);
  return { // <--- HERE
    index,
    siblings,
    oldRoot,
    oldValue,
    newValue: value,
    newRoot: currentValue,
  };
}

Putting everything together, we can copy over our verifyDeltaMerkleProof from the previous installment to check to make sure the delta merkle proofs are valid:
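The verifyDeltaMerkleProof function from the previous installment is not shown in this text. As a rough sketch of what it checks, the version below recomputes the root twice with the same index and siblings, once from oldValue and once from newValue. It assumes SHA-256 over the concatenated hex-decoded children as the hash, and rootFromProof is a helper name chosen for this example.

```typescript
import { createHash } from "node:crypto";

type MerkleNodeValue = string;

interface IDeltaMerkleProof {
  index: number;
  siblings: MerkleNodeValue[];
  oldRoot: MerkleNodeValue;
  oldValue: MerkleNodeValue;
  newRoot: MerkleNodeValue;
  newValue: MerkleNodeValue;
}

// Assumed hash: SHA-256 of the two hex-decoded children concatenated together
function hash(left: MerkleNodeValue, right: MerkleNodeValue): MerkleNodeValue {
  return createHash("sha256").update(left + right, "hex").digest("hex");
}

// Walk from the leaf up to the root, hashing with each sibling on the correct side
function rootFromProof(siblings: MerkleNodeValue[], index: number, value: MerkleNodeValue): MerkleNodeValue {
  let currentIndex = index;
  let currentValue = value;
  for (const sibling of siblings) {
    currentValue = currentIndex % 2 === 0 ? hash(currentValue, sibling) : hash(sibling, currentValue);
    currentIndex = Math.floor(currentIndex / 2);
  }
  return currentValue;
}

function verifyDeltaMerkleProof(proof: IDeltaMerkleProof): boolean {
  // the same index and siblings must connect oldValue to oldRoot and newValue to newRoot
  return (
    rootFromProof(proof.siblings, proof.index, proof.oldValue) === proof.oldRoot &&
    rootFromProof(proof.siblings, proof.index, proof.newValue) === proof.newRoot
  );
}

// Usage: setting leaf 0 of an empty height-3 tree to 1
const zeroLeaf = "00".repeat(32);
const oneLeaf = "00".repeat(31) + "01";
const emptySiblings = [
  zeroLeaf,
  "f5a5fd42d16a20302798ef6ed309979b43003d2320d9f0e8ea9831a92759fb4b",
  "db56114e00fdd4c1f85c892bf35ac9a89289aaecb1ebd0a96cde606a748b5d71",
];
const sampleDelta: IDeltaMerkleProof = {
  index: 0,
  siblings: emptySiblings,
  oldRoot: rootFromProof(emptySiblings, 0, zeroLeaf),
  oldValue: zeroLeaf,
  newRoot: rootFromProof(emptySiblings, 0, oneLeaf),
  newValue: oneLeaf,
};
console.log(verifyDeltaMerkleProof(sampleDelta)); // true
console.log(verifyDeltaMerkleProof({ ...sampleDelta, newValue: zeroLeaf })); // false
```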

Let’s run the javascript and see if it works:

[example3] delta merkle proof for index 0 is valid
[example3] delta merkle proof for index 1 is valid
[example3] delta merkle proof for index 2 is valid
[example3] delta merkle proof for index 3 is valid
[example3] delta merkle proof for index 4 is valid
[example3] delta merkle proof for index 5 is valid
[example3] delta merkle proof for index 6 is valid
[example3] delta merkle proof for index 7 is valid
[example3] the delta merkle proofs are:
[
{
"index": 0,
"siblings": [
"0000000000000000000000000000000000000000000000000000000000000000",
"f5a5fd42d16a20302798ef6ed309979b43003d2320d9f0e8ea9831a92759fb4b",
"db56114e00fdd4c1f85c892bf35ac9a89289aaecb1ebd0a96cde606a748b5d71"
],
"oldRoot": "c78009fdf07fc56a11f122370658a353aaa542ed63e44c4bc15ff4cd105ab33c",
"oldValue": "0000000000000000000000000000000000000000000000000000000000000000",
"newValue": "0000000000000000000000000000000000000000000000000000000000000001",
"newRoot": "f06e424318b067ae608de0ef0035e9f48a2658cc59e7f94f9f94600b2a36eac6"
},
... /* omitted for brevity, check codepen for full result */
]

Success! All of the delta merkle proofs are valid!

Chain Delta Merkle Proofs

Along with verifying delta merkle proofs, another important check that our zkVM will have to perform is known as a chain delta merkle proof verification.

A chain delta merkle proof verification ensures that a series of delta merkle proofs is sequential, meaning that you have provided all the proofs for the state transition from the oldRoot of the first delta merkle proof to the newRoot of the last delta merkle proof in the series.

We can perform this verification by asserting that, for each delta merkle proof in the series, the oldRoot of the current proof equals the newRoot of the previous proof:

...
for(let i=0;i<deltaMerkleProofs.length;i++){
  const deltaProof = deltaMerkleProofs[i];
  if(!verifyDeltaMerkleProof(deltaProof)){ //first verify the proof
    console.error("[example4] ERROR: delta merkle proof for index "+deltaProof.index+" is INVALID");
    throw new Error("invalid delta merkle proof");
  }else if(i>0 && deltaProof.oldRoot !== deltaMerkleProofs[i-1].newRoot){ // <------ [HERE]
    // the previous proof's new root should be the same as this proof's old root
    console.error(
      "[example4] ERROR: delta merkle proof for index "+deltaProof.index +
      " has a different old root than the previous delta merkle proof's new root"
    );
    throw new Error("delta merkle proof root sequence mismatch");
  }else{
    console.log("[example4] delta merkle proof for index "+deltaProof.index+" is valid");
  }
}
...
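The same check can also be packaged as a reusable function. The sketch below is one possible shape: findChainBreak is a name invented for this example, and it takes the single-proof verifier as a parameter so that the chaining logic can be exercised on its own. The roots "r0", "r1", … and the always-true verifier are placeholders, not real hashes or a real verifier.

```typescript
type MerkleNodeValue = string;

interface IDeltaMerkleProof {
  index: number;
  siblings: MerkleNodeValue[];
  oldRoot: MerkleNodeValue;
  oldValue: MerkleNodeValue;
  newRoot: MerkleNodeValue;
  newValue: MerkleNodeValue;
}

// Returns the position of the first proof that breaks the chain, or -1 if the chain is valid
function findChainBreak(
  proofs: IDeltaMerkleProof[],
  verifyProof: (proof: IDeltaMerkleProof) => boolean
): number {
  for (let i = 0; i < proofs.length; i++) {
    // each proof must be valid on its own
    if (!verifyProof(proofs[i])) return i;
    // and must start from the root the previous proof ended on
    if (i > 0 && proofs[i].oldRoot !== proofs[i - 1].newRoot) return i;
  }
  return -1;
}

// Placeholder proofs and a stub verifier, only to exercise the chaining logic
const stubProof = (oldRoot: string, newRoot: string): IDeltaMerkleProof => ({
  index: 0,
  siblings: [],
  oldRoot,
  oldValue: "",
  newRoot,
  newValue: "",
});
const alwaysValid = () => true;

console.log(findChainBreak([stubProof("r0", "r1"), stubProof("r1", "r2")], alwaysValid)); // -1
console.log(findChainBreak([stubProof("r0", "r1"), stubProof("rX", "r2")], alwaysValid)); // 1
```

In real use, the verifier passed in would be verifyDeltaMerkleProof.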

Let’s update our code to include a chain delta merkle proof verification and check to make sure our code is working properly:

Running example4, we get the result:

[example4] delta merkle proof for index 0 is valid
[example4] delta merkle proof for index 1 is valid
[example4] delta merkle proof for index 2 is valid
[example4] delta merkle proof for index 3 is valid
[example4] delta merkle proof for index 4 is valid
[example4] delta merkle proof for index 5 is valid
[example4] delta merkle proof for index 6 is valid
[example4] delta merkle proof for index 7 is valid
[example4] the delta merkle proofs are:
[
... /* omitted for brevity, check codepen for full result */
]

Great! No errors 🎉. We have proven that the changes we made are valid (delta merkle proof) and sequential (chain delta merkle proof).

Finishing up our ZMT Database

The final important function we will want to implement is a getLeaf function which returns the value of a leaf and an accompanying merkle proof of its inclusion within the tree. To do this, we just need to grab the value of the leaf, and then walk its merkle path to obtain the siblings and root using the same strategy we used for computing the delta merkle proof in setLeaf:

class ZeroMerkleTree {
  ...
  getLeaf(index: number): IMerkleProof {
    // siblings array for merkle proof
    const siblings: MerkleNodeValue[] = [];
    // get value for the merkle proof
    const value = this.nodeStore.get(this.height, index);

    // start traversing the leaf's merkle path at the leaf node
    let currentIndex = index;
    let currentValue = value;
    // stop before the root (level = 0), as it has no sibling
    for(let level = this.height; level > 0; level--){
      if(currentIndex % 2 === 0){
        // if the current index is even, then it has a sibling on the right (same level, index = currentIndex+1)
        const rightSibling = this.nodeStore.get(level, currentIndex+1);
        currentValue = hash(currentValue, rightSibling);
        // add the right sibling to the siblings array
        siblings.push(rightSibling);
      }else{
        // if the current index is odd, then it has a sibling on the left (same level, index = currentIndex-1)
        const leftSibling = this.nodeStore.get(level, currentIndex-1);
        currentValue = hash(leftSibling, currentValue);
        // add the left sibling to the siblings array
        siblings.push(leftSibling);
      }
      // set current index to the index of the parent node
      currentIndex = Math.floor(currentIndex/2);
    }
    // current value is the root
    const root = currentValue;
    return {
      root,
      siblings,
      index,
      value,
    };
  }
  ...
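getLeaf returns an IMerkleProof, which this installment uses but does not define. Judging from the fields getLeaf returns, a minimal sketch of the interface and of a matching verifyMerkleProof might look like the following. The hash function (SHA-256 over the concatenated hex-decoded children) is an assumption, and the previous installment’s actual code may differ in detail.

```typescript
import { createHash } from "node:crypto";

type MerkleNodeValue = string;

// Fields inferred from the object returned by getLeaf
interface IMerkleProof {
  root: MerkleNodeValue;
  siblings: MerkleNodeValue[];
  index: number;
  value: MerkleNodeValue;
}

// Assumed hash: SHA-256 of the two hex-decoded children concatenated together
function hash(left: MerkleNodeValue, right: MerkleNodeValue): MerkleNodeValue {
  return createHash("sha256").update(left + right, "hex").digest("hex");
}

function verifyMerkleProof(proof: IMerkleProof): boolean {
  let currentIndex = proof.index;
  let currentValue = proof.value;
  for (const sibling of proof.siblings) {
    // even index: sibling is on the right, odd index: sibling is on the left
    currentValue = currentIndex % 2 === 0 ? hash(currentValue, sibling) : hash(sibling, currentValue);
    currentIndex = Math.floor(currentIndex / 2);
  }
  // the proof is valid if walking up the merkle path reproduces the claimed root
  return currentValue === proof.root;
}

// Usage on a tiny two-leaf tree (height 1)
const leafA = "00".repeat(31) + "01";
const leafB = "00".repeat(31) + "03";
const twoLeafRoot = hash(leafA, leafB);
console.log(verifyMerkleProof({ root: twoLeafRoot, siblings: [leafB], index: 0, value: leafA })); // true
console.log(verifyMerkleProof({ root: twoLeafRoot, siblings: [leafA], index: 1, value: leafB })); // true
console.log(verifyMerkleProof({ root: twoLeafRoot, siblings: [leafA], index: 0, value: leafB })); // false
```

The last call fails because index 0 places the value on the left, so the hash inputs end up in the wrong order.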

We can now put everything together and use the verifyMerkleProof function we wrote in the previous installment to verify the merkle proofs generated by getLeaf (see the full code gist linked below):

/* see https://gist.github.com/cf/1b04459735c0b86dda56714205566c97 */
...
function example5(){
  const leavesToSet = [
    "0000000000000000000000000000000000000000000000000000000000000001", // 1
    "0000000000000000000000000000000000000000000000000000000000000003", // 3
    "0000000000000000000000000000000000000000000000000000000000000003", // 3
    "0000000000000000000000000000000000000000000000000000000000000007", // 7
    "0000000000000000000000000000000000000000000000000000000000000004", // 4
    "0000000000000000000000000000000000000000000000000000000000000002", // 2
    "0000000000000000000000000000000000000000000000000000000000000000", // 0
    "0000000000000000000000000000000000000000000000000000000000000006", // 6
  ];
  const tree = new ZeroMerkleTree(3);
  const deltaMerkleProofs = leavesToSet.map((leaf, index)=>tree.setLeaf(index, leaf));
  // verify the delta merkle proofs
  for(let i=0;i<deltaMerkleProofs.length;i++){
    const deltaProof = deltaMerkleProofs[i];
    if(!verifyDeltaMerkleProof(deltaProof)){
      console.error("[example5] ERROR: delta merkle proof for index "+deltaProof.index+" is INVALID");
      throw new Error("invalid delta merkle proof");
    }else if(i>0 && deltaProof.oldRoot !== deltaMerkleProofs[i-1].newRoot){
      // the previous proof's new root should be the same as this proof's old root
      console.error(
        "[example5] ERROR: delta merkle proof for index "+deltaProof.index +
        " has a different old root than the previous delta merkle proof's new root"
      );
      throw new Error("delta merkle proof root sequence mismatch");
    }else{
      console.log("[example5] delta merkle proof for index "+deltaProof.index+" is valid");
    }
  }
  // don't print the delta merkle proofs to avoid clutter
  //console.log("[example5] the delta merkle proofs are:\n"+JSON.stringify(deltaMerkleProofs, null, 2));
  // verify each leaf's merkle proof
  for(let i=0;i<leavesToSet.length;i++){
    const proof = tree.getLeaf(i);
    if(!verifyMerkleProof(proof)){
      console.error("[example5] ERROR: merkle proof for index "+proof.index+" is INVALID");
      throw new Error("invalid merkle proof");
    }else if(proof.value !== leavesToSet[i]){
      console.error("[example5] ERROR: merkle proof for index "+proof.index+" has the wrong value");
      throw new Error("merkle proof value mismatch");
    }else{
      console.log("[example5] merkle proof for index "+proof.index+" is valid");
    }
    console.log("merkle proof for index "+proof.index+": "+JSON.stringify(proof, null, 2));
  }
}
example5();

Running the code we get the result:

[example5] delta merkle proof for index 0 is valid
[example5] delta merkle proof for index 1 is valid
[example5] delta merkle proof for index 2 is valid
[example5] delta merkle proof for index 3 is valid
[example5] delta merkle proof for index 4 is valid
[example5] delta merkle proof for index 5 is valid
[example5] delta merkle proof for index 6 is valid
[example5] delta merkle proof for index 7 is valid
[example5] merkle proof for index 0 is valid
merkle proof for index 0: {
"root": "7e286a6721a66675ea033a4dcdec5abbdc7d3c81580e2d6ded7433ed113b7737",
"siblings": [
"0000000000000000000000000000000000000000000000000000000000000003",
"6b0e4bcd4368ba74e6a99ee69334c2593bcae1170d77048854d228664218c56b",
"81b1e323f0e91a785dfd155817e09949a7d66fe8fdc4f31f39530845e88ab63c"
],
"index": 0,
"value": "0000000000000000000000000000000000000000000000000000000000000001"
}
[example5] merkle proof for index 1 is valid
... /* excluded for brevity, check the codepen above for the full result */

Fantastic!

We have now written a merkle tree database which we can use with our zkVM 🎉🎉🎉

To show off our new super-powered merkle tree, let’s create a merkle tree of height 50, which has 1,125,899,906,842,624 leaves, and set values for the leaves at index 1337 and 999,999,999,999.

At 64 bytes of storage per leaf, our previous implementation would need 64 petabytes of RAM just to store the tree (not even modern supercomputers are capable of this), yet with our new ZeroMerkleTree implementation we can do this in the blink of an eye on a laptop and only consume roughly 6400 bytes:

// for the whole code check the gist at https://gist.github.com/cf/9df9b18cd5877fb4fc5c8b31a3acf77d
...
function example6(){
  const tree = new ZeroMerkleTree(50);
  const deltaA = tree.setLeaf(999999999999,"0000000000000000000000000000000000000000000000000000000000000008");
  const deltaB = tree.setLeaf(1337,"0000000000000000000000000000000000000000000000000000000000000007");
  const proofA = tree.getLeaf(999999999999);
  const proofB = tree.getLeaf(1337);
  console.log("verifyDeltaMerkleProof(deltaA): "+verifyDeltaMerkleProof(deltaA));
  console.log("verifyDeltaMerkleProof(deltaB): "+verifyDeltaMerkleProof(deltaB));
  console.log("deltaA.newRoot === deltaB.oldRoot: "+(deltaA.newRoot === deltaB.oldRoot));
  console.log("verifyMerkleProof(proofA): "+verifyMerkleProof(proofA));
  console.log("verifyMerkleProof(proofB): "+verifyMerkleProof(proofB));
  console.log("proofA: "+JSON.stringify(proofA, null, 2));
  console.log("proofB: "+JSON.stringify(proofB, null, 2));
}
example6();

Which prints the result:

verifyDeltaMerkleProof(deltaA): true
verifyDeltaMerkleProof(deltaB): true
deltaA.newRoot === deltaB.oldRoot: true
verifyMerkleProof(proofA): true
verifyMerkleProof(proofB): true
proofA: {
"root": "40db8b6edad868d911c8b9aea2692ee80b2e87ac407b8d1a5efe30419e843991",
"siblings": [
... /* omitted for brevity, check the code pen for the full result */
],
"index": 999999999999,
"value": "0000000000000000000000000000000000000000000000000000000000000008"
}
proofB: {
"root": "40db8b6edad868d911c8b9aea2692ee80b2e87ac407b8d1a5efe30419e843991",
"siblings": [
... /* omitted for brevity, check the code pen for the full result */
],
"index": 1337,
"value": "0000000000000000000000000000000000000000000000000000000000000007"
}
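As a rough sanity check on the storage figure, we can count how many distinct (level, index) entries the two setLeaf calls write, without hashing anything. The sketch below assumes 64 bytes per stored node value and ignores the size of the keys; countStoredNodes and pathKeys are names made up for this example.

```typescript
// Collect the "level_index" keys that setLeaf writes for one leaf (leaf level up to the root)
function pathKeys(height: number, leafIndex: number): string[] {
  const keys: string[] = [];
  let index = leafIndex;
  for (let level = height; level >= 0; level--) {
    keys.push(level + "_" + index);
    index = Math.floor(index / 2);
  }
  return keys;
}

// Count distinct stored nodes after setting the given leaves (shared ancestors count once)
function countStoredNodes(height: number, leafIndices: number[]): number {
  const stored = new Set<string>();
  for (const leafIndex of leafIndices) {
    for (const key of pathKeys(height, leafIndex)) {
      stored.add(key);
    }
  }
  return stored.size;
}

const storedNodes = countStoredNodes(50, [999999999999, 1337]);
console.log("stored nodes: " + storedNodes); // 91
console.log("bytes at 64 per node: " + storedNodes * 64); // 5824
```

Each leaf contributes 51 path nodes, and since both indices are below 2^40 the two paths share their top 11 nodes, giving 91 stored nodes, which is in the same ballpark as the roughly 6400 bytes quoted above.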

We have climbed the impossible mountain, and built a merkle tree ready for our zkVM, all thanks to ZeroHashes and our ZMT 💪

Append Only Merkle Trees

Our ZMT implementation is perfect for our zkVM use case, but when we are building on legacy blockchains (EVM/zkEVM), storage can be expensive to the point where even our ZMT will prove too costly to make sense for use cases like verifiable event logs.

We will touch on verifiable event logs in more detail in a future installment where we explore trustless cross chain bridges, but for now it is sufficient to know that these write-only merkle trees are critical for trustlessly bridging data & assets from layer 1 to layer 2.

The Verifiable Event Log

The basic theory behind a verifiable event log is that we want to allow a smart contract to record events in the leaves of a merkle tree. Whenever a new event occurs, we append a leaf to the merkle tree (set the next empty leaf in the merkle tree to the hash of the event) and calculate the updated root. You can then process these events off-chain using zero knowledge proofs or a layer 2, and submit proofs which prove you have synced all the events up to the latest event log root, thus providing trustless data transfer from layer 1 to layer 2.

The most important property of this use case is that the smart contract only needs to write leaves to the merkle tree and never has to read back the previous event leaves (we only really care about reading the latest root).

With our understanding of the zero hashes and some cleverness, it turns out that we will be able to build an append-only merkle tree with n leaves which only needs to store O(log(n)) hashes on chain!

As we did before, let’s get started by visualizing the changes in the merkle tree when we append leaves and see if we can find any patterns.

In the animation below, we visualize the delta merkle proof results from appending a leaf where:

  • the purple node represents the leaf being appended
  • the blue nodes represent the leaf’s merkle path
  • the red nodes represent the siblings

Recall that our goal is to somehow compute the new merkle root while storing the minimum amount of data possible. We also know from our previous installment that the information required to compute a merkle tree’s root is known as a merkle proof and contains the following information:

  • index (index of the leaf)
  • value (value of the leaf)
  • siblings (siblings of the leaf’s merkle path)

In the case of our append only merkle tree, the index and value will be passed to the function, so the only thing we would be missing for our calculation is a method for calculating the siblings for the next merkle proof using the data stored in our contract. If we used the ZMT implementation from the previous section, we would have to store all the nodes in the tree on-chain, which is undesirable.

One way we might go about tackling the problem is storing the previous leaf’s merkle proof, and somehow writing a function which computes the new siblings from the previous leaf’s merkle proof.

If we could do that, we could accomplish our goal of O(log(n)) storage by repeating the procedure below each time we append a leaf to the tree:

  1. Read the previous merkle proof from the smart contract storage, and increment the index by 1
  2. Somehow calculate the new siblings for the leaf we are appending using the zero hashes and the previous merkle proof
  3. Compute the new merkle root using the new siblings and the value we are appending, and then overwrite the old merkle proof with the newly appended leaf’s merkle proof to the smart contract storage

This of course begs the question: is it possible to compute the siblings of the next leaf in our append only merkle tree using only the zero hashes and previous merkle proof?

Let’s examine the animation once again, but this time let’s pay attention to how the siblings change as each leaf is appended to the tree. We will still color all the sibling nodes red, but this time we will make the sibling’s text yellow if it was not a sibling for the previous node’s merkle proof:

Starting to see the pattern? The key to cracking this puzzle is remembering that the siblings are just the adjacent nodes to the leaf’s merkle path, so when transitioning from leaf N to leaf N+1:

  • On the levels where the merkle paths of leaf N and leaf N+1 are the same, you can reuse the sibling from leaf N’s merkle proof in leaf N+1’s merkle proof
  • On the levels where the merkle paths of leaf N and leaf N+1 are different, you need to somehow compute what the new sibling is.

Great! So for the levels where the merkle path doesn’t change, we know to just reuse the sibling from the previous leaf’s proof, but what about the second case?

Let’s examine some more append transitions to get some intuition:

Notice that, because it is an append-only merkle tree, our leaf index is always increasing, and therefore our path is always moving to the right.

Let’s write down what we know about the relationship between Leaf[n] and Leaf[n+1]’s merkle paths/siblings on a tree with height H and then try to deduce a formula from the facts we know:

Sorry for the image embed, medium doesn’t support LaTeX so I can’t inline this
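The image with these facts is not reproduced in this text. As a sketch, the rule that the appendLeaf implementation in this section applies can be written as follows. The symbols are notation assumed for this note rather than taken from the original figure: ℓ counts levels above the leaves, p_ℓ(n) is the index of Leaf[n]’s merkle path node at that level, S_ℓ(n) is the sibling of that node, P_ℓ(n) is the value of that node once Leaf[n] has been set, and Z_ℓ is the zero hash.

```latex
\[
p_\ell(n) = \left\lfloor \frac{n}{2^{\ell}} \right\rfloor, \qquad 0 \le \ell < H
\]
\[
S_\ell(n+1) =
\begin{cases}
S_\ell(n) & \text{if } p_\ell(n+1) = p_\ell(n) \\[4pt]
Z_\ell & \text{if } p_\ell(n+1) \neq p_\ell(n) \text{ and } p_\ell(n+1) \text{ is even} \\[4pt]
P_\ell(n) & \text{if } p_\ell(n+1) \neq p_\ell(n) \text{ and } p_\ell(n+1) \text{ is odd}
\end{cases}
\]
```

The second case holds because a right-hand sibling only covers leaves that have not been appended yet, and the third because a left-hand sibling of the new path node is exactly the previous leaf’s path node on that level.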

Aha! So the only information we will need to generate a merkle proof for Leaf[n+1] is:

  • the zero hashes for the tree
  • Leaf[n]’s siblings/value
  • Leaf[n]’s merkle path (we can compute this using the siblings and value!)
  • Leaf[n+1]’s value

This is very cool, it looks like all we need to store when building our append only merkle tree is the latest leaf’s merkle proof!

Let’s write our implementation based on the facts we wrote down above:

class AppendOnlyMerkleTree {
  height: number;
  lastProof: IMerkleProof;
  zeroHashes: MerkleNodeValue[];
  constructor(height: number){
    this.height = height;
    this.zeroHashes = computeZeroHashes(height);
    // create a dummy proof of all zero hashes for initialization (before we append any leaves, we know all the siblings will be zero hashes because it is an empty tree)
    this.lastProof = {
      root: this.zeroHashes[this.height],
      siblings: this.zeroHashes.slice(0, this.height),
      index: -1,
      // the leaf value of an empty tree is the leaf-level zero hash (it is never read before the first append)
      value: this.zeroHashes[0],
    };
  }
  appendLeaf(leafValue: string): IDeltaMerkleProof {
    const oldMerklePath = computeMerklePathFromProof(this.lastProof.siblings, this.lastProof.index, this.lastProof.value);
    // get the old root and old value for the delta merkle proof
    const oldRoot = this.lastProof.root;
    // the old value will always be empty, that's why it's an append only tree :P
    const oldValue = "0000000000000000000000000000000000000000000000000000000000000000";
    const prevIndex = this.lastProof.index;
    // append only tree = new index is always the previous index + 1
    const newIndex = prevIndex+1;
    // keep track of the old siblings so we can use them for our delta merkle proof
    const oldSiblings = this.lastProof.siblings;
    const siblings: MerkleNodeValue[] = [];
    let multiplier = 1;
    // note: here level counts upwards from the leaves (level 0 = leaf level)
    for(let level=0;level<this.height;level++){
      // get the index of the previous leaf's merkle path node on the current level
      const prevLevelIndex = Math.floor(prevIndex/multiplier);
      // get the index of the new leaf's merkle path node on the current level
      const newLevelIndex = Math.floor(newIndex/multiplier);

      if(newLevelIndex===prevLevelIndex){
        // if the merkle path node index on this level DID NOT change, we can reuse the old sibling
        siblings.push(oldSiblings[level]);
      }else{
        // if the merkle path node index on this level DID change, we need to check if the new merkle path node index is a left or right hand node
        if(newLevelIndex%2===0){
          // if the new merkle path node index is even, the new merkle path node is a left hand node,
          // so merkle path node's sibling is a right hand node,
          // therefore our sibling has an index greater than our merkle path node,
          // so the sibling must be a zero hash
          // QED
          siblings.push(this.zeroHashes[level]);
        }else{
          // if the new merkle path node is odd, then its sibling has an index one less than it, so its sibling must be the previous merkle path node on this level
          siblings.push(oldMerklePath[level]);
        }
      }
      multiplier = multiplier * 2;
    }
    const newRoot = computeMerkleRootFromProof(siblings, newIndex, leafValue);
    this.lastProof = {
      root: newRoot,
      siblings: siblings,
      index: newIndex,
      value: leafValue,
    };
    return {
      index: this.lastProof.index,
      siblings,
      oldRoot,
      oldValue,
      newRoot,
      newValue: leafValue,
    };
  }
}

Looks good! Let’s test it with our existing verifyDeltaMerkleProof function to make sure it is working correctly:

When we run example7, we get the output:

verifyDeltaMerkleProof(deltaA): true
verifyDeltaMerkleProof(deltaB): true
deltaA.newRoot === deltaB.oldRoot: true
deltaA: {
"index": 0,
"siblings": [
... /* omitted for brevity, check the code pen for the full result */
],
"oldRoot": "e833d7a67160e68bf4c9044a53077df2727ad00cf36f4949c7b681a912140cbb",
"oldValue": "0000000000000000000000000000000000000000000000000000000000000000",
"newRoot": "bfe0338f3c07c1ff64514ccde5d0e4535b88c9093454b29de1590414c721abb1",
"newValue": "0000000000000000000000000000000000000000000000000000000000000008"
}
deltaB: {
"index": 1,
"siblings": [
... /* omitted for brevity, check the code pen for the full result */
],
"oldRoot": "bfe0338f3c07c1ff64514ccde5d0e4535b88c9093454b29de1590414c721abb1",
"oldValue": "0000000000000000000000000000000000000000000000000000000000000000",
"newRoot": "3b6c4c5cf467972101c5236a32eb2f5e23c66fab942352d1e7003660f83f66b2",
"newValue": "0000000000000000000000000000000000000000000000000000000000000007"
}
verifyDeltaMerkleProof(delta[0]): true
verifyDeltaMerkleProof(delta[1]): true
verifyDeltaMerkleProof(delta[2]): true
verifyDeltaMerkleProof(delta[3]): true
verifyDeltaMerkleProof(delta[4]): true
verifyDeltaMerkleProof(delta[5]): true
verifyDeltaMerkleProof(delta[6]): true
verifyDeltaMerkleProof(delta[7]): true
verifyDeltaMerkleProof(delta[8]): true
verifyDeltaMerkleProof(delta[9]): true
verifyDeltaMerkleProof(delta[10]): true
verifyDeltaMerkleProof(delta[11]): true
verifyDeltaMerkleProof(delta[12]): true
verifyDeltaMerkleProof(delta[13]): true
verifyDeltaMerkleProof(delta[14]): true
verifyDeltaMerkleProof(delta[15]): true
verifyDeltaMerkleProof(delta[16]): true
verifyDeltaMerkleProof(delta[17]): true
verifyDeltaMerkleProof(delta[18]): true
verifyDeltaMerkleProof(delta[19]): true
verifyDeltaMerkleProof(delta[20]): true
verifyDeltaMerkleProof(delta[21]): true
verifyDeltaMerkleProof(delta[22]): true
verifyDeltaMerkleProof(delta[23]): true
verifyDeltaMerkleProof(delta[24]): true
verifyDeltaMerkleProof(delta[25]): true
verifyDeltaMerkleProof(delta[26]): true
verifyDeltaMerkleProof(delta[27]): true
verifyDeltaMerkleProof(delta[28]): true
verifyDeltaMerkleProof(delta[29]): true
verifyDeltaMerkleProof(delta[30]): true
verifyDeltaMerkleProof(delta[31]): true
verifyDeltaMerkleProof(delta[32]): true
verifyDeltaMerkleProof(delta[33]): true
verifyDeltaMerkleProof(delta[34]): true
verifyDeltaMerkleProof(delta[35]): true
verifyDeltaMerkleProof(delta[36]): true
verifyDeltaMerkleProof(delta[37]): true
verifyDeltaMerkleProof(delta[38]): true
verifyDeltaMerkleProof(delta[39]): true
verifyDeltaMerkleProof(delta[40]): true
verifyDeltaMerkleProof(delta[41]): true
verifyDeltaMerkleProof(delta[42]): true
verifyDeltaMerkleProof(delta[43]): true
verifyDeltaMerkleProof(delta[44]): true
verifyDeltaMerkleProof(delta[45]): true
verifyDeltaMerkleProof(delta[46]): true
verifyDeltaMerkleProof(delta[47]): true
verifyDeltaMerkleProof(delta[48]): true
verifyDeltaMerkleProof(delta[49]): true

Success!

Since our class stores only a single merkle proof (plus the precomputed zero hashes), we only need O(log(n)) storage for our append-only merkle tree 🎉
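To build some confidence in the storage claim, here is a minimal, self-contained sketch that compares a compact append-only tree against a naive tree that rehashes every leaf. It is an illustration under assumptions, not the tutorial's class: toyHash is a non-cryptographic stand-in for the real hash function, and CompactAppendTree, pathFromProof, naiveRoot and runAppendDemo are hypothetical names introduced only for this example.

```typescript
// Toy stand-in for the tutorial's real hash function (an assumption for this sketch).
// It is NOT cryptographically secure; it only keeps the example dependency-free.
function toyHash(left: string, right: string): string {
  const input = left + "|" + right;
  let h = 7;
  for (let i = 0; i < input.length; i++) {
    h = (h * 31 + input.charCodeAt(i)) % 1000000007;
  }
  return h.toString(16);
}

const ZERO_LEAF = "0";

// zeroHashes[i] is the root of an all-zero sub-tree with i levels below it
function computeZeroHashes(height: number): string[] {
  const zeroHashes = [ZERO_LEAF];
  for (let i = 1; i <= height; i++) {
    zeroHashes.push(toyHash(zeroHashes[i - 1], zeroHashes[i - 1]));
  }
  return zeroHashes;
}

// Nodes on the merkle path, from the leaf (level 0) up to the root (last element)
function pathFromProof(siblings: string[], index: number, value: string): string[] {
  const path = [value];
  let current = value;
  let idx = index;
  for (const sibling of siblings) {
    current = idx % 2 === 0 ? toyHash(current, sibling) : toyHash(sibling, current);
    path.push(current);
    idx = Math.floor(idx / 2);
  }
  return path;
}

// Stores only the last proof (height siblings) plus the zero hashes
class CompactAppendTree {
  private zeroHashes: string[];
  private siblings: string[];
  private index = -1;
  private value = ZERO_LEAF;
  root: string;

  constructor(private height: number) {
    this.zeroHashes = computeZeroHashes(height);
    this.siblings = this.zeroHashes.slice(0, height);
    this.root = this.zeroHashes[height];
  }

  append(leaf: string): string {
    const oldPath = pathFromProof(this.siblings, this.index, this.value);
    const newIndex = this.index + 1;
    const newSiblings: string[] = [];
    let multiplier = 1;
    for (let level = 0; level < this.height; level++) {
      const prevLevelIndex = Math.floor(this.index / multiplier);
      const newLevelIndex = Math.floor(newIndex / multiplier);
      if (newLevelIndex === prevLevelIndex) {
        newSiblings.push(this.siblings[level]); // path node unchanged: reuse sibling
      } else if (newLevelIndex % 2 === 0) {
        newSiblings.push(this.zeroHashes[level]); // sibling is to the right: still empty
      } else {
        newSiblings.push(oldPath[level]); // sibling is the previous path node
      }
      multiplier *= 2;
    }
    this.siblings = newSiblings;
    this.index = newIndex;
    this.value = leaf;
    this.root = pathFromProof(newSiblings, newIndex, leaf)[this.height];
    return this.root;
  }
}

// Naive reference: rebuild the full tree of 2^height leaves and rehash everything
function naiveRoot(leaves: string[], height: number): string {
  let nodes: string[] = [];
  for (let i = 0; i < (1 << height); i++) {
    nodes.push(i < leaves.length ? leaves[i] : ZERO_LEAF);
  }
  while (nodes.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < nodes.length; i += 2) {
      next.push(toyHash(nodes[i], nodes[i + 1]));
    }
    nodes = next;
  }
  return nodes[0];
}

// Appends `count` leaves (count must be <= 2^height) and checks every root
function runAppendDemo(height: number, count: number): boolean {
  const tree = new CompactAppendTree(height);
  const appended: string[] = [];
  for (let i = 0; i < count; i++) {
    appended.push("leaf" + i);
    if (tree.append("leaf" + i) !== naiveRoot(appended, height)) return false;
  }
  return true;
}

console.log("compact roots match naive roots:", runAppendDemo(4, 16));
```

The compact tree never holds more than one sibling per level, while the naive reference has to materialize all 2^height leaves just to produce the same root.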

Spiderman Proofs

Sometimes we want to update whole sections of a tree:

Using our ZMT implementation, we can already do this by updating each leaf, one-by-one, but perhaps there is a better way to go about updating batches of leaves.

Let’s first try the tree highlighted in red on the left:

We want to replace the top sub tree with the bottom tree

Notice that the sub tree with root N(1,0) in our big tree looks the same as our updated smaller tree on the bottom, except with different values.

If we also look carefully at the big tree on top, the nodes above the sub-tree we want to modify form a tiny height = 1 tree of their own:

Great! We already know how to update the leaves of a merkle tree, so let’s start building a delta merkle proof from the information we know.

  • We know that the node we want to update is N(1,0), the left-hand node on its level, so the index of the delta merkle proof will be 0.
  • We know that the siblings of the delta merkle proof will be [N(1,1)] because N(1,0) is the only node on the merkle path, and its sibling is N(1,1)
  • We know that the old root of the delta merkle proof will be our old N(0,0)
  • We know that our old value is the value of N(1,0) because that’s the “leaf” we want to update

Now all we have to do is update the leaf, and apply the rules for delta merkle proofs that we learned in the previous installment:

Fantastic! We now know how to efficiently replace sections of our merkle tree and generate delta merkle proofs to prove the validity of our changes.

For the rest of the sub-trees we want to replace, we can just update them in succession using the same rules:

For now, we won’t write the code in TypeScript for this operation, as we will only need to write this logic in our arithmetic circuit for reasons we will explore in a future installment.
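That said, the bookkeeping is small enough to sketch for intuition. The following is only an illustration under assumptions, not the circuit logic from the upcoming installment: toyHash is a non-cryptographic stand-in for the real hash function, and rootOf, rootFromProof, replaceSubTree, SubTreeDelta and runSubTreeDemo are hypothetical names. It uses a height = 3 tree and treats N(1,0) as the "leaf" of the height = 1 tree formed by the nodes above it, so the proof has index 0 and a single sibling, N(1,1).

```typescript
// Toy stand-in for the real hash function (an assumption; NOT cryptographically secure)
function toyHash(left: string, right: string): string {
  const input = left + "|" + right;
  let h = 7;
  for (let i = 0; i < input.length; i++) {
    h = (h * 31 + input.charCodeAt(i)) % 1000000007;
  }
  return h.toString(16);
}

// Root of a complete sub-tree over `leaves` (length must be a power of two)
function rootOf(leaves: string[]): string {
  let nodes = leaves.slice();
  while (nodes.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < nodes.length; i += 2) {
      next.push(toyHash(nodes[i], nodes[i + 1]));
    }
    nodes = next;
  }
  return nodes[0];
}

// Recompute a root from a value, its index on its level, and its siblings
function rootFromProof(siblings: string[], index: number, value: string): string {
  let current = value;
  let idx = index;
  for (const sibling of siblings) {
    current = idx % 2 === 0 ? toyHash(current, sibling) : toyHash(sibling, current);
    idx = Math.floor(idx / 2);
  }
  return current;
}

interface SubTreeDelta {
  index: number;
  siblings: string[];
  oldRoot: string;
  oldValue: string;
  newRoot: string;
  newValue: string;
}

// Same shape as a leaf delta merkle proof, but the "leaf" is a sub-tree root
function replaceSubTree(
  siblings: string[],
  index: number,
  oldSubRoot: string,
  newSubRoot: string
): SubTreeDelta {
  return {
    index,
    siblings,
    oldRoot: rootFromProof(siblings, index, oldSubRoot),
    oldValue: oldSubRoot,
    newRoot: rootFromProof(siblings, index, newSubRoot),
    newValue: newSubRoot,
  };
}

function runSubTreeDemo(): { delta: SubTreeDelta; oldRootOk: boolean; newRootOk: boolean } {
  const oldLeaves = ["a", "b", "c", "d", "e", "f", "g", "h"];
  const newLeftLeaves = ["w", "x", "y", "z"]; // replaces leaves 0..3, i.e. N(1,0)
  const siblings = [rootOf(oldLeaves.slice(4))]; // N(1,1)
  const delta = replaceSubTree(
    siblings,
    0,
    rootOf(oldLeaves.slice(0, 4)),
    rootOf(newLeftLeaves)
  );
  // Cross-check both roots against a full rehash of all eight leaves
  const expectedNewRoot = rootOf(newLeftLeaves.concat(oldLeaves.slice(4)));
  return {
    delta,
    oldRootOk: delta.oldRoot === rootOf(oldLeaves),
    newRootOk: delta.newRoot === expectedNewRoot,
  };
}

const subTreeDemo = runSubTreeDemo();
console.log("old root matches:", subTreeDemo.oldRootOk);
console.log("new root matches:", subTreeDemo.newRootOk);
```

Four leaves change, but the proof still carries only one sibling, because everything below N(1,0) is summarized by its hash.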

Conclusion

Great work if you have made it this far! You now know everything about merkle trees that we will need to construct our zkVM.

In our next installment, we will write our first zero knowledge circuits using plonky2, and begin implementing our zkVM!

Thanks to our sponsors OAS & QED. You can follow OAS on twitter @OASNetwork.

About the author: follow @cmpeq on twitter or check out my about page.

Carter Feldman

reformed hacker, security/zk researcher and founder of the qed protocol