Semi-Stateless Initial Sync Experiment

Feb 12, 2020

Raw data and scripts for the experiment: https://github.com/mandrigin/ethereum-mainnet-resolver-witness-stats

***

One possible way to speed up the initial sync is to use block witnesses to pre-build the cache trie and so avoid slow state access. That costs additional disk space and network bandwidth, but it could significantly speed up the sync process.

How will that work? Basically, to run each block, we need some data in a Merkle trie. Before the block execution we already have some nodes there, but it might not be enough to run the block. Normally, this data will be taken from the state db and then added to the trie while the transactions are being executed. That might be slow because of the disk access/db lookups.

So we have three types of flow here:

1) Normal flow (as currently in Ethereum nodes)

Before the block B is executed, we have a trie T1.
As we execute the block, we add missing bits and pieces to the trie, making it T1', T1'', etc. Every time we miss some information, we look it up from the database (slow).
After the block B is executed we have the trie T2 that has all the account states to run the block B.
We keep T2 for the future resolutions.

2) Stateless flow

Before the block B is executed, we don’t have a trie and we have a witness W to reconstruct the trie required to run this block.
We execute W and get the trie T2.
We execute the block B on T2, no DB lookups needed.
We throw away T2 after the block is executed.

3) Semi-stateless flow (this experiment)

Before the block B is executed, we have a trie T1 and witnesses W1, W2, … enough to convert T1 into T2.
We execute W1, W2, etc on T1, and get the trie T2. No DB lookups needed.
We execute the block B on T2, no DB lookups needed.
We keep T2 for the future resolutions.

On initial sync, the semi-stateless flow (3) potentially gives most of the benefits† of the stateless flow (2), but should require less data to sync because it reuses the trie cache.

† parallel block execution will be limited to some extent with semi-stateless approach.
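To make the difference between the three flows concrete, here is a toy model in Python. It is only a sketch under strong assumptions: the "trie" is a plain dictionary, a "witness" is just the set of entries a block touches, and the names `CountingDB`, `normal_flow`, `stateless_flow` and `semi_stateless_flow` are invented for this illustration and do not come from turbo-geth.

```python
class CountingDB:
    """Stands in for the on-disk state db and counts (slow) lookups."""

    def __init__(self, state):
        self.state = dict(state)
        self.lookups = 0

    def get(self, key):
        self.lookups += 1
        return self.state[key]


def normal_flow(blocks, db):
    """Flow 1: resolve missing entries from the db while executing; keep the cache."""
    cache = {}
    for keys in blocks:
        for key in keys:
            if key not in cache:
                cache[key] = db.get(key)
    return db.lookups


def stateless_flow(blocks, state):
    """Flow 2: every block ships a full witness; the trie is thrown away afterwards."""
    shipped = 0
    for keys in blocks:
        witness = {k: state[k] for k in set(keys)}
        trie = dict(witness)  # "execute W" to get T2
        for key in keys:
            trie[key]  # no db lookup needed
        shipped += len(witness)
    return shipped


def semi_stateless_flow(blocks, state):
    """Flow 3: ship only what the cached trie is missing; keep the trie."""
    cache = {}
    shipped = 0
    for keys in blocks:
        witness = {k: state[k] for k in set(keys) if k not in cache}
        cache.update(witness)  # "execute W1, W2, ..." on T1 to get T2
        for key in keys:
            cache[key]  # no db lookup needed
        shipped += len(witness)
    return shipped
```

With three blocks touching `["a", "b"]`, `["b", "c"]` and `["a", "d"]`, the normal flow does 4 db lookups, the stateless flow ships 6 witness entries, and the semi-stateless flow ships 4 entries with no db lookups. The cache here is unbounded; once nodes are evicted, the partial witnesses have to carry them again.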

So, to test the Semi-Stateless approach, we need to measure 2 things:

how much additional space/bandwidth is required for this approach; is it any better than the fully stateless approach?
how much faster does it make the initial sync.

In this article, we will focus on the disk space.

Setting up the experiment

Max size of the trie (Merkle trie): 1.000.000 nodes. When the number of nodes exceeds this value, the least recently used (LRU) nodes are evicted to free up memory. This keeps RAM usage under control.
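The eviction policy can be illustrated with a minimal LRU sketch in Python. This is not how turbo-geth prunes its trie (a real trie has to replace evicted subtrees with their hashes); the class name `BoundedNodeCache` and its methods are hypothetical and only show the "evict the least recently used node once the limit is exceeded" idea.

```python
from collections import OrderedDict


class BoundedNodeCache:
    """Keeps at most max_nodes entries, evicting the least recently used first."""

    def __init__(self, max_nodes):
        self.max_nodes = max_nodes
        self._nodes = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        """Return the node and mark it as recently used, or None if absent."""
        if key not in self._nodes:
            return None
        self._nodes.move_to_end(key)
        return self._nodes[key]

    def put(self, key, node):
        """Insert a node and return the keys evicted to stay within the limit."""
        self._nodes[key] = node
        self._nodes.move_to_end(key)
        evicted = []
        while len(self._nodes) > self.max_nodes:
            old_key, _ = self._nodes.popitem(last=False)
            evicted.append(old_key)
        return evicted

    def __len__(self):
        return len(self._nodes)
```

In the semi-stateless flow, every evicted node that a later block needs again has to come back through a witness, which is why the trie size limit directly affects witness sizes.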
The partial witnesses are stored in a db (our fork of boltdb). Each entry has the following structure:

key: [12]byte // block number + max number of nodes in the trie
value: []byte // witnesses, serialized as described in this doc
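As an illustration of such a 12-byte key, here is a Python sketch. The exact layout is an assumption: the article only says the key holds the block number plus the max number of trie nodes, so the split into an 8-byte block number followed by a 4-byte trie size limit, both big-endian, is a guess, and `make_key` and `parse_key` are hypothetical helpers.

```python
import struct

# ASSUMPTION: 8-byte block number + 4-byte trie size limit, big-endian, no padding.
KEY_FORMAT = ">QI"


def make_key(block_number, max_trie_nodes):
    """Pack the two fields into a 12-byte key."""
    return struct.pack(KEY_FORMAT, block_number, max_trie_nodes)


def parse_key(key):
    """Unpack a 12-byte key back into (block_number, max_trie_nodes)."""
    if len(key) != struct.calcsize(KEY_FORMAT):
        raise ValueError("expected a 12-byte key")
    return struct.unpack(KEY_FORMAT, key)
```

A big-endian encoding is a natural choice here because bolt keeps keys in byte-sorted order, so keys for consecutive blocks end up next to each other and can be read sequentially during sync.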

We don’t store the contract code in the witnesses (that is a limitation of the current architecture).

How the data was collected (required a synced turbo-geth).

(in the turbo-geth repository)
make state
./build/bin/state stateless \
     --chaindata ~/nvme1/mainnet/mainnet/geth/chaindata \
     --statefile semi_stateless.statefile \
     --snapshotInterval 1000000 \
     --snapshotFrom 10000000 \
     --statsfile new_witness.stats.compressed.2.csv \
     --witnessDbFile semi_stateless_witnesses.db \
     --statelessResolver \
     --triesize 1000000

Experiment Results

Total Storage

The witnesses DB (bolt db) needed to sync 6.169.246 blocks from scratch takes 99 GB.

Quantile analysis

python quantile-analysis.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv

mean   0.038 MB
median 0.028 MB
p90    0.085 MB
p95    0.102 MB
p99    0.146 MB
max    2.350 MB
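For readers who want to reproduce this kind of summary on their own numbers, here is a minimal Python sketch. It is not the `quantile-analysis.py` script from the repository: the function name `summarize` is hypothetical, it takes an already-parsed list of per-block witness sizes, and it uses the nearest-rank percentile definition, which may differ from what the original script does.

```python
import statistics


def summarize(sizes):
    """Return mean, median, p90, p95, p99 and max of per-block witness sizes."""
    ordered = sorted(sizes)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank method: smallest value with at least p% of the data at or below it.
        rank = max(1, (p * n + 99) // 100)  # integer ceil(p * n / 100)
        return ordered[rank - 1]

    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p90": percentile(90),
        "p95": percentile(95),
        "p99": percentile(99),
        "max": ordered[-1],
    }
```

A large gap between p99 and max, as in the table above, indicates that a small number of blocks have much bigger witnesses than the rest.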

Full Data

python absolute_values_plot.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv

Witness sizes for blocks from 1 to 6.100.000, capped at 1.0 MB. Sliding avg 1024.
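The smoothing used for these charts can be sketched as a sliding average over the last 1024 blocks. The Python code below is an illustration, not the repository's `absolute_values_plot.py`: the function name `sliding_average` is hypothetical, and applying the cap to each value before averaging is an assumption (the original script may cap only the plotted values).

```python
from collections import deque


def sliding_average(values, window=1024, cap=None):
    """Average each value with up to window - 1 preceding values, optionally capped."""
    averaged = []
    recent = deque()
    total = 0.0
    for value in values:
        if cap is not None:
            value = min(value, cap)  # ASSUMPTION: cap applied before averaging
        recent.append(value)
        total += value
        if len(recent) > window:
            total -= recent.popleft()
        averaged.append(total / len(recent))
    return averaged
```

For example, `sliding_average(sizes_mb, window=1024, cap=1.0)` would smooth a list of per-block sizes given in MB, where `sizes_mb` is whatever list was read from the stats file.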

Normalized Data (after DDoSes)

python absolute_values_plot.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv 3000000

Witness sizes after the DDoS attacks, sliding avg 1024.

DDoS Zoom In

python ddos_zoom.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv

Zoomed-in section showing the influence of the DDoS attacks on witness sizes (raw data).

We can see that, due to the DDoS attacks around blocks 2.3M-2.5M and 2.65M-2.75M, the witnesses are significantly bigger.

Full vs Semi

python full_vs_semi.py cache_1_000_000/semi_stateless_witnesses.db.stats.1.csv

Full witness sizes are adjusted for the missing contract code component.

As we see from this chart, using the semi-stateless approach saves quite a lot of data if we compare it to the full stateless approach.

Conclusion

Having a stateless resolver adds around 0.04 MB of additional information per block (the mean in the quantile analysis is 0.038 MB) that needs to be transferred/stored. That is significantly less data than having a witness per block, even when we adjust for code (you can see some charts in my previous post).

If the performance is good, this can be a good mode for the initial sync: it speeds the sync up while requiring less data than a fully stateless approach.