# Ethereum 2.0 Deposit Merkle Tree

--

# Explainer & Practical Implementation Guide

## What is a sparse Merkle tree?

Sparse Merkle tree is a strategy of calculating a Merkle root in a reduced memory space. The computation is constrained but by a clever algorithm that allows the root calculation to be a single contiguous calculation. Overall computation necessary for root calculation is increased but, it is an incremental root calculation distributed across multiple `deposit`

method invocations.

For a more academic explanation of a sparse Merkle tree read the Efficient Sparse Merkle Trees white paper.

## Why a sparse Merkle tree for Ethereum 2.0 deposits?

Memory is reduced by using a single array with a size the same as the tree order, in the case of the Ethereum 2.0 deposit tree it is a 32 element array. A traditional Merkle tree calculation of size 2³² SHA2 hashes would require `360,777,252,864`

gas far exceeding the current gas limit of `8,000,000`

for a single block (storage costs not included). Calculations demonstrated with the following values below:¹

`-- SHA256BASE 60 Base cost of SHA256.`

-- SHA256WORD 12 Cost of SHA256 per word.

**²³²(60+12+12)=360,777,252,864**

To be fair the previous calculation is a trivial implementation that doesn’t incorporate the efficiencies of empty leaves. Regardless, each depositor would need to pay gas cost that grows exponentially at each increment of the total deposit count.

By calculating the root incrementally as described in the previous section the sparse tree algorithm allows for the gas costs to be distributed equitably among the depositors. Clean storage is only requested once at contract deployment and does not grow as the size of the tree increases.

## How does a sparse Merkle tree work?

A sparse Merkle tree relies on two key principles. The first being **zero hashes **and the second is that leafs are **inserted in numeric order by index**. Below is a diagram of zero hash tree.

## Zero Hashes

`zerohashes[0]='0x000...'=0x0`

zerohashes[1]='0xf5a...'=H(0x0,0x0)

zerohashes[2]='0xdb5...'=H(H(0x0,0x0),H(0x0,0x0))

zerohashes[3]='0xc78...'=H(H(H(0x0,0x0),H(0x0,0x0)),H(H(0x0,0x0),H(0x0,0x0)))

Zero hashes are the default value null value in the tree at a particular level. In this particular instance (an empty tree) the root is calculated as the hash of `zerohashes[3]`

.

root =H(H(H(H(0x0,0x0),H(0x0,0x0)),H(H(0x0,0x0),H(0x0,0x0))),

H(H(H(0x0,0x0),H(0x0,0x0)),H(H(0x0,0x0),H(0x0,0x0))))root =H(zerohashes[3],zerohashes[3])

Below is an example of inserting out first leaf. For example, if a leaf of value `0xA`

were to be inserted into index `0`

the proof path labeled in pink would be the only values recalculated.

root =H(H(H(0xA,0x0),H(0xA,0x0)),H(H(0xA,0x0),H(0xA,0x0))),

H(H(H(0x0,0x0),H(0x0,0x0)),H(H(0x0,0x0),H(0x0,0x0))))root =H(H(H(0xA,0x0),H(0xA,0x0)),H(H(0xA,0x0),H(0xA,0x0)), zerohashes[3])

The zero hashes are used to rebalance the tree where `n%2!=0`

at any particular order. An observation to make is that the zero hash values are standing in to allow a leaf or node to be hashed when there isn’t a value to pair with.

Next, if the an example of how the `validator_registration.v.py`

contract from the Ethereum 2.0 specification generates zero hashes.

The implementation of `__init__`

allows only **32** elements to be stored in the array `zerohashes`

. Notice from the example above that zero hashes remain the same across nodes comprised of zero hashes. This is the first part of the strategy for reducing memory with a sparse Merkle tree.

## Insert By Index

As mentioned previously, the second principle to reducing memory size in a sparse Merkle tree is to insert leaves in numeric order of index.

In the `validator_registration.v.py`

deposit contract the `zerohashes[]`

array is then copied over to another array named `branches[]`

of the same size. The `branches[]`

array is data structure that stores a single proof path for the sparse Merkle tree of the most recently inserted leaf. Each time a new leaf is inserted into the tree `branches[]`

array is updated to reflect the proof path of that leaf.

The first step when the `deposit`

method of the contract is called is to calculate `i`

. Example below:

The branch index `i`

will be used to calculate the depth of the tree to be hashed. The deposit parameters are hashed into a variable named `value`

and inserted into the `branch[]`

array.

In this previous gist it can be seen that the `value`

is hashed with `branch[j]`

and the final calculated value is inserted at `branch[i]`

. Using this insertion algorithm allows `branches[]`

to maintain a proof path of the most recent insertion.

Using the `get_deposit_root`

method the contract is able to calculate a `root`

from the `branches[]`

array once the deposit threshold has been met. The `root`

calculation will match the standard calculation described in the implementation section.

To learn more about how the sparse Merkle tree is implemented in the deposit contract read the formal verification of the deposit contract.

## Client Side Merkle Proof Path Calculation

Since the client needs to calculate a proof path for each individual `deposit`

for the root found in the `Eth2Genesis`

event, calculation on the client side is not computed as a sparse Merkle tree. There are several approaches that can be used to calculate a Merkle tree proof path; trading off between the coefficients of memory, computation, and storage.

Proof calculation must satisfy the `verify_merkle_branch`

method documented in the Ethereum 2.0 specification repository. `proof`

the proof path, an array of values that will be hashed together with the value`leaf`

to prove the inclusion of the `leaf`

in the `Eth2Genesis`

root.

Artemis’ implementation of the same `verify_merkle_branch`

method can be seen below

Below you can find the implementation of the `MerkleTree`

class from the Artemis client, afterwards there will be a break down of each of the methods.

The constructor method is invoked when an instance of the `MerkleTree`

is created. The parameter`height`

provided is used to create an array for each level of the tree. Additionally, the `zeroHashes`

array is created.

The `generateZeroHashes`

method is used to fill the `zeroHashes`

array. The `generateZeroHashes`

method works identical to the method `__init__`

in the `validator_registration.v.py`

deposit contract.

Below is an example of the `zeroHashes`

array when implemented properly for a Merkle tree with a height of 32.

`[ '0x0000000000000000000000000000000000000000000000000000000000000000',`

'0xf5a5fd42d16a20302798ef6ed309979b43003d2320d9f0e8ea9831a92759fb4b',

'0xdb56114e00fdd4c1f85c892bf35ac9a89289aaecb1ebd0a96cde606a748b5d71',

'0xc78009fdf07fc56a11f122370658a353aaa542ed63e44c4bc15ff4cd105ab33c',

'0x536d98837f2dd165a55d5eeae91485954472d56f246df256bf3cae19352a123c',

'0x9efde052aa15429fae05bad4d0b1d7c64da64d03d7a1854a588c2cb8430c0d30',

'0xd88ddfeed400a8755596b21942c1497e114c302e6118290f91e6772976041fa1',

'0x87eb0ddba57e35f6d286673802a4af5975e22506c7cf4c64bb6be5ee11527f2c',

'0x26846476fd5fc54a5d43385167c95144f2643f533cc85bb9d16b782f8d7db193',

'0x506d86582d252405b840018792cad2bf1259f1ef5aa5f887e13cb2f0094f51e1',

'0xffff0ad7e659772f9534c195c815efc4014ef1e1daed4404c06385d11192e92b',

'0x6cf04127db05441cd833107a52be852868890e4317e6a02ab47683aa75964220',

'0xb7d05f875f140027ef5118a2247bbb84ce8f2f0f1123623085daf7960c329f5f',

'0xdf6af5f5bbdb6be9ef8aa618e4bf8073960867171e29676f8b284dea6a08a85e',

'0xb58d900f5e182e3c50ef74969ea16c7726c549757cc23523c369587da7293784',

'0xd49a7502ffcfb0340b1d7885688500ca308161a7f96b62df9d083b71fcc8f2bb',

'0x8fe6b1689256c0d385f42f5bbe2027a22c1996e110ba97c171d3e5948de92beb',

'0x8d0d63c39ebade8509e0ae3c9c3876fb5fa112be18f905ecacfecb92057603ab',

'0x95eec8b2e541cad4e91de38385f2e046619f54496c2382cb6cacd5b98c26f5a4',

'0xf893e908917775b62bff23294dbbe3a1cd8e6cc1c35b4801887b646a6f81f17f',

'0xcddba7b592e3133393c16194fac7431abf2f5485ed711db282183c819e08ebaa',

'0x8a8d7fe3af8caa085a7639a832001457dfb9128a8061142ad0335629ff23ff9c',

'0xfeb3c337d7a51a6fbf00b9e34c52e1c9195c969bd4e7a0bfd51d5c5bed9c1167',

'0xe71f0aa83cc32edfbefa9f4d3e0174ca85182eec9f3a09f6a6c0df6377a510d7',

'0x31206fa80a50bb6abe29085058f16212212a60eec8f049fecb92d8c8e0a84bc0',

'0x21352bfecbeddde993839f614c3dac0a3ee37543f9b412b16199dc158e23b544',

'0x619e312724bb6d7c3153ed9de791d764a366b389af13c58bf8a8d90481a46765',

'0x7cdd2986268250628d0c10e385c58c6191e6fbe05191bcc04f133f2cea72c1c4',

'0x848930bd7ba8cac54661072113fb278869e07bb8587f91392933374d017bcbe1',

'0x8869ff2c22b28cc10510d9853292803328be4fb0e80495e8bb8d271f5b889636',

'0xb5fe28e79f1b850f8658246ce9b6a1e7b49fc06db7143e8fe0b4f2b0c5523a5c',

'0x985e929f70af28d0bdd1a90a808f977f597c7c778c489e98d3bd8910d31ac0f7' ]

Next, the `add`

method will be used to add new leaves to the `MerkleTree`

. Each time a new leaf is added the `MerkleTree`

instance is marked as `dirty`

. Doing so allows the instance to know when it is necessary to recompute branch values.

Below is an implementation used to calculate all the branch and root values. The algorithm is iterating through each level of the tree. Each iteration has an array`child`

(*values to be hashed)* and an array`parent`

(*place to store the hashed values). During the *last iteration the`parent`

array is the root of the `MerkleTree`

. If the `child`

level is unbalanced the statement `Bytes32 rightNode = (j + 1 < child.size()) ? child.get(j + 1) : zeroHashes.get(i)`

replaces the `rightNode`

to be replaced with an element from `zeroHashes`

of the same order. The index to store the resulting hash in `parent`

is calculated using `j/2`

.

Once the branches have been calculated the value `dirty`

is set to `false`

to indicated the `MerkleTree`

instance has an up to date calculation of the branch values.

Now to generate the proof path the method `getProofTreeByIndex`

is used to iterate through the `tree`

object and get the proof values pertinent to the deposit index provided.The algorithm employed iterates through each order of the `tree`

. The `index`

value is non-final and is divided by two at each iteration. Initially if the value of `index`

is odd `index`

is decremented, otherwise increment using the statement `index = index % 2 == 1 ? index — 1 : index + 1`

.

A proof path of the same size as the `MerkleTree`

`height`

will be created and returned as the array `proof`

.