Looking back at the Ethereum 1x workshop 26–28.01.2019 (part 5)

This is a continuation of part 1, part 2, part 3, and part 4.

Problems with large (and growing) state

Failing snapshot sync

Described in part 1

Duration of snapshot sync

Described in part 1

Slower block sealing

Described in part 2

Slower processing of transactions reading from the state

Described in part 3

Block gas limit increase and the State fees (formerly known as State rent) share initial steps

Described in part 4

Stateless contract pattern is discouraged by the current gas schedule

The idea of stateless contracts was described in various places, including the first version of the State Rent proposal. I am extracting some pictures from there:

General idea of stateless contracts — storage off-chain, merkle roots on-chain

The main idea of a stateless contract is to have some fixed number of storage slots inside the contract (on-chain), and a potentially unbounded number of off-chain storage slots. The on-chain storage slots would store commitments (usually roots of Merkle trees) to the off-chain slots.

Transactions now need to have 2 extra types of elements — input proofs and output proofs

Whenever a transaction wishes to reference an off-chain storage slot, it would need to provide an “input proof”, which is usually the value referenced (blue rectangle in the picture above), together with the so-called “sibling” hashes leading up to the root of the Merkle tree. The stateless contract would verify the proof before working with the referenced value. If a transaction wishes to modify the value of an off-chain storage slot (changing the value from blue to orange in the picture above), it would need to provide an “output proof”, which is similar to the “input proof”, but constructed for the Merkle tree that appears after the modification.
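The input and output proofs described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: SHA-256 stands in for whatever hash function a real contract would use, the tree is a plain binary Merkle tree over 8 slots, and the helper names (`build_tree`, `make_proof`, `root_from_proof`) are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the contract's hash function.
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Return all levels of a binary Merkle tree, leaf hashes first, root last."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def make_proof(levels, index):
    """Collect the sibling hashes on the path from leaf `index` up to the root."""
    siblings = []
    for level in levels[:-1]:
        siblings.append(level[index ^ 1])  # the neighbour sharing the same parent
        index //= 2
    return siblings

def root_from_proof(value, index, siblings):
    """Recompute the root implied by a value, its position and its siblings."""
    node = h(value)
    for sibling in siblings:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node

# Off-chain storage: 8 slots (a power of two keeps the sketch simple).
slots = [bytes([i]) * 32 for i in range(8)]
onchain_root = build_tree(slots)[-1][0]  # the only thing kept on-chain

# Input proof: the referenced value plus its sibling hashes.
index = 5
siblings = make_proof(build_tree(slots), index)
input_ok = root_from_proof(slots[index], index, siblings) == onchain_root

# Modification: the same siblings, combined with the new value, give the new root.
new_value = b"\xff" * 32
new_root = root_from_proof(new_value, index, siblings)
slots[index] = new_value
output_ok = new_root == build_tree(slots)[-1][0]

print(input_ok, output_ok)
```

In this sketch the contract never sees the other seven slots; it only hashes the supplied value up through the three sibling hashes and compares the result with the stored root.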

Proof contention

If most transactions only ever reference the values of the off-chain storage slots, the stateless contracts work reasonably well. However, if there is a fair amount of modification happening, the stateless contracts can experience what I call “proof contention”. If multiple users are trying to construct transactions modifying some storage slots (not necessarily the same slots, but slots that share a Merkle tree), some of them might need to recalculate and resubmit their transactions, perhaps many times. This is because whenever the Merkle root changes, it invalidates the proofs that were produced for the old root.
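Proof contention can be reproduced in a few lines. The sketch below is a toy model, not a real contract: `StatelessContract` is a hypothetical Python class holding a single Merkle root, SHA-256 is an assumed stand-in hash, and “Alice” and “Bob” are two users who both build proofs against the same root but touch different slots.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the contract's hash function.
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def make_proof(leaves, index):
    siblings = []
    for level in build_tree(leaves)[:-1]:
        siblings.append(level[index ^ 1])
        index //= 2
    return siblings

def root_from_proof(value, index, siblings):
    node = h(value)
    for sibling in siblings:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node

class StatelessContract:
    """Toy model: the contract stores only the Merkle root."""
    def __init__(self, root):
        self.root = root

    def update(self, index, old_value, new_value, siblings):
        # Reject the transaction if the proof does not match the current root.
        if root_from_proof(old_value, index, siblings) != self.root:
            return False
        self.root = root_from_proof(new_value, index, siblings)
        return True

slots = [bytes([i]) * 32 for i in range(8)]
contract = StatelessContract(build_tree(slots)[-1][0])

# Both users build their proofs against the same (old) root.
alice_proof = make_proof(slots, 1)
bob_proof = make_proof(slots, 6)

# Alice's transaction is included first and changes the root.
alice_ok = contract.update(1, slots[1], b"\xaa" * 32, alice_proof)
slots[1] = b"\xaa" * 32  # off-chain data follows the on-chain root

# Bob modifies a different slot, but his proof was built for the old root.
bob_stale_ok = contract.update(6, slots[6], b"\xbb" * 32, bob_proof)

# Bob has to rebuild his proof against the new state and resubmit.
bob_retry_ok = contract.update(6, slots[6], b"\xbb" * 32, make_proof(slots, 6))

print(alice_ok, bob_stale_ok, bob_retry_ok)
```

Bob's first attempt fails even though he never touched Alice's slot, because one of his sibling hashes covers the half of the tree that Alice modified.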

The problem of proof contention would not arise in use cases where there is only one entity modifying the off-chain storage (perhaps on other users’ behalf). One example would be Plasma, where the modifying entity is the Plasma operator.

The need for a sub-protocol for delivery of the off-chain data belonging to a stateless contract

Another interesting challenge for stateless contracts is the need for a sub-protocol that delivers the off-chain data to new users. Imagine you would like to start interacting with a stateless contract that has existed for some time and has a large state. You, as a user, would either rely on an intermediary (like a Plasma operator) to have that state and produce the necessary proofs for you, or you would need to use some sub-protocol (which is currently not part of the Ethereum protocol) to download the off-chain state first. The latter might be your only option if you think you might need this data to correctly execute your Plasma exit (something you do when you do not trust the Plasma operator anymore and want to withdraw your assets).

Let’s pretend for a minute that these two challenges (proof contention and the need for a sub-protocol) are resolved to your satisfaction in some stateless contract, which is interesting to you. What are the gas costs of the interaction with such a stateless contract?

Whenever a transaction passes input data into the EVM, the sender is charged 68 gas per byte, or 2176 gas per 32-byte word. As Remco from the 0x team showed us during the workshop, if we assume an off-chain state which has 2³² storage slots, and Merkle proofs that are 32 words long, transmitting one proof would cost around 70k gas. Verification of the proof would take about 2k gas. If your transaction references an off-chain value and also modifies the same slot, you can reuse the same proof (because the siblings do not change). Contrast this with modifying a word in contract storage, which normally costs 5k gas. If the off-chain state is not that large, you would pay less. For example, if we put the entire Ethereum state (around 200 million items) into off-chain slots, it would take around 2²⁸ slots, and the cost of transaction data would still be 61k gas. For 1 million items, it would be 43k gas.
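The figures above follow from simple arithmetic, which can be checked with a short Python sketch. It assumes, as the text does, that every byte of the proof is charged at 68 gas and that a proof contains one 32-byte sibling hash per level of a binary Merkle tree; the function names are hypothetical.

```python
GAS_PER_BYTE = 68                          # transaction data cost quoted in the text
WORD_BYTES = 32
GAS_PER_WORD = GAS_PER_BYTE * WORD_BYTES   # 2176 gas per 32-byte word

def proof_words(num_slots: int) -> int:
    """Depth of a binary Merkle tree with room for `num_slots` leaves."""
    return (num_slots - 1).bit_length()

def proof_data_gas(num_slots: int) -> int:
    """Gas charged for the transaction data of one Merkle proof."""
    return proof_words(num_slots) * GAS_PER_WORD

costs = {n: proof_data_gas(n) for n in (2**32, 200_000_000, 1_000_000)}
for n, gas in costs.items():
    print(f"{n:>13,} slots: {proof_words(n)} words, {gas:,} gas")
```

This gives 69,632 gas for 2³² slots, 60,928 gas for 200 million items and 43,520 gas for 1 million items, matching the rounded figures above.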

During the workshop, Remco proposed to:

  1. Cut the cost of transaction data from 2176 gas to 200 gas per 32-byte word. That would be around 7 or 8 gas per byte.
  2. Introduce a block size limit of 256 kB.
  3. More “aggressively” prune the transaction data.
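To get a feel for what items (1) and (2) would mean for stateless contracts, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are not part of the proposal itself: 1 kB is taken as 1024 bytes, a proof is one 32-byte word per tree level, and all other transaction overhead is ignored.

```python
WORD_BYTES = 32
PROPOSED_GAS_PER_WORD = 200      # item 1 of the proposal
STORAGE_WRITE_GAS = 5000         # on-chain storage modification cost quoted earlier
BLOCK_SIZE_LIMIT = 256 * 1024    # item 2, assuming 1 kB = 1024 bytes

# Proof lengths (in words) for the three state sizes discussed earlier.
depths = {"2^32 slots": 32, "2^28 slots": 28, "2^20 slots": 20}

proposed = {name: words * PROPOSED_GAS_PER_WORD for name, words in depths.items()}
for name, gas in proposed.items():
    print(f"{name}: {gas:,} gas for proof data "
          f"(on-chain storage write: {STORAGE_WRITE_GAS:,} gas)")

# Upper bound on 32-word proofs per block, ignoring all other transaction data.
proofs_per_block = BLOCK_SIZE_LIMIT // (32 * WORD_BYTES)
print(proofs_per_block)
```

Under these assumptions a 32-word proof would cost 6,400 gas in transaction data, which is in the same range as the 5k gas for an on-chain storage modification, and at most 256 such proofs would fit into a 256 kB block.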

If we assume that most nodes joining the network are utilising the snapshot synchronisation mechanisms rather than downloading the block bodies and executing all transactions in them, the proposal (1) looks potentially reasonable. Obviously, we need to look into the potential DoS attacks this could enable.

The block size limit is probably based on the assumption that cheaper transaction data would cause the block bodies to become larger, which would in turn use more of the nodes’ bandwidth and slow down block propagation. These effects need to be studied by means of emulation and simulation. Perhaps there is a way to tweak the current fork choice rule (inspired by Phantom, for example) to cope with the extra bandwidth requirements without introducing a block size limit.

More aggressive pruning of the transaction data (in other words, of the downloaded block bodies) would be required, again, because the block bodies might get larger. This will be expanded on in the next part of this blog post series.

eWASM interpreters could be a sensible first change even though gas cost might not be practical in the beginning

Here I would like to simply reference the recent blog post by Guillaume Ballet, which goes into some detail about what was presented at the workshop. It will be followed by more posts and benchmarks that will inform us about the progress of eWASM within Ethereum 1x, and about the possible first steps towards the rollout of eWASM onto the existing Ethereum mainnet.

Chain pruning will become more relevant as we start constraining the state growth

To appear in part 6 or later

Ethereum protocol changes do not need to take a year to be prepared

To appear in part 6 or later