Looking back at the Ethereum 1x workshop 26–28.01.2019 (final part 6)

This is the continuation of part 1, part 2, part 3, part 4, and part 5, and the final part of this series.

Problems with large (and growing) state

Failing snapshot sync

Described in part 1

Duration of snapshot sync

Described in part 1

Slower block sealing

Described in part 2

Slower processing of transactions reading from the state

Described in part 3

Block gas limit increase and the State fees (formerly known as State rent) share initial steps

Described in part 4

Stateless contract pattern is discouraged by the current gas schedule

Described in part 5

eWASM interpreters could be a sensible first change even though gas cost might not be practical in the beginning

Described in part 5

Chain pruning will become more relevant as we start constraining the state growth

Chain pruning deals with the data that Ethereum nodes keep on disk, other than the current state and the state history.

Data transformations that happen during block execution: state + transactions => new state + receipts

On the diagram above, we see the main data transformations that happen during block execution in Ethereum. Usually, the transactions taken from the block bodies are executed in the context of the state, which results in a new, modified state and also generates receipts (those receipts contain logs). This corresponds to the Yellow Paper too; the relevant extract is shown at the end of this section.
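The transformation in the diagram can be sketched in a few lines of Python. This is a toy model, not how any client implements it: the state is assumed to be a plain dictionary of balances, a transaction is a simple value transfer, and the names `Transaction`, `Receipt`, `apply_transaction` and `execute_block` are hypothetical. Gas, nonces and contract code are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transaction:
    # Hypothetical, simplified transaction: a plain value transfer.
    sender: str
    recipient: str
    value: int


@dataclass
class Receipt:
    # Hypothetical, simplified receipt: a status flag plus event logs.
    success: bool
    logs: list = field(default_factory=list)


def apply_transaction(state, tx):
    """Return (new_state, receipt) without mutating the input state."""
    new_state = dict(state)
    if new_state.get(tx.sender, 0) < tx.value:
        # Toy rule: an unaffordable transfer leaves the state unchanged.
        return new_state, Receipt(success=False)
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.value
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.value
    log = f"Transfer {tx.sender}->{tx.recipient}: {tx.value}"
    return new_state, Receipt(success=True, logs=[log])


def execute_block(state, transactions):
    """state + transactions => new state + receipts."""
    receipts = []
    for tx in transactions:
        state, receipt = apply_transaction(state, tx)
        receipts.append(receipt)
    return state, receipts
```

In this picture, chain pruning concerns the inputs (`transactions`, which live in block bodies) and one of the outputs (`receipts`), not the `state` dictionary itself.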

Chain pruning is not about the “current state” (the subject of the previous parts) or “historical states” (these are already pruned or written only sparsely), but about block headers, block bodies (with transactions in them), and receipts (with event logs in them). Péter Szilágyi wrote the first chain pruning document back in November 2018, which is here. At the workshop, Fredrik Harrysson also gave a presentation on the topic, in which he highlighted that a better understanding of the actual requirements and possible tradeoffs is needed.

Depending on what an Ethereum node is used for, it may have different requirements for what data needs to be kept around and for how long. If the required data are missing, there are some operations in the eth/63 protocol to fetch them from the peers:

  1. For headers, [Get]BlockHeaders
  2. For block bodies (with transactions in them), [Get]BlockBodies
  3. For current and historical state, [Get]NodeData, though this does not work in Turbo-Geth and is likely to be extended once more advanced sync protocols are developed. Also, [Get]ContractCode.
  4. For receipts, [Get]Receipts
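As a rough illustration of the list above, a node could decide which requests to send by comparing what it needs with what it has kept. The request names come from the list; the data-kind labels and the helper `requests_for_missing` are hypothetical, and no real client API is implied.

```python
# Request messages named in the list above, keyed by a hypothetical
# label for each kind of data.
REQUESTS = {
    "headers": "GetBlockHeaders",
    "bodies": "GetBlockBodies",
    "state": "GetNodeData",
    "receipts": "GetReceipts",
}


def requests_for_missing(needed, kept_locally):
    """Return the request messages for data that is needed but not kept."""
    return [
        message
        for kind, message in REQUESTS.items()
        if kind in needed and kind not in kept_locally
    ]
```

For example, a node that keeps only headers but needs headers, bodies and receipts would have to send `GetBlockBodies` and `GetReceipts` to its peers.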

However, as was also noted in Fredrik’s presentation, if peers made very diverse decisions about what data to keep around and for how long, it might take a while to find a peer that has just the information you are missing. The hope lies in the new discovery protocols, like “Discovery 5”, which has been in development for a few years. Regarding discovery, Antoine Toulme gave an impromptu presentation and explained some of the challenges. For example, if discovery allows you to be very “picky” about your peers, there is a risk of creating “islands” in the network, around the peers that possess certain rare information.
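To get a feel for why diverse retention policies make the search slow, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that peers are sampled independently and that a fixed fraction of them keeps the data you need; real discovery is not uniform random sampling, and both function names are hypothetical.

```python
def expected_peers_to_try(fraction_with_data):
    """Mean number of peers contacted until one has the data.

    Under the independence assumption this is the mean of a geometric
    distribution, 1 / p.
    """
    if not 0 < fraction_with_data <= 1:
        raise ValueError("fraction_with_data must be in (0, 1]")
    return 1 / fraction_with_data


def chance_of_hit(fraction_with_data, peers_tried):
    """Probability that at least one of the peers tried has the data."""
    return 1 - (1 - fraction_with_data) ** peers_tried
```

Under these assumptions, if only one peer in a hundred keeps some old receipts, a node would contact about a hundred peers on average before finding them, which is where a more selective discovery protocol would help.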

As discussed in part 4 of this series, increased use of the Stateless Contract pattern would increase the size of transactions and therefore accelerate the growth of the “block bodies” component of the chain.

State transition function from the Yellow paper
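For reference, the top-level definitions in the Yellow Paper take roughly the following form (the figure may show a different extract). Here σ is the state, T a transaction, B a block, Υ the transaction-level state transition function, Π the block-level one, and Ω the block-finalisation function:

```latex
\begin{align}
\boldsymbol{\sigma}_{t+1} &\equiv \Upsilon(\boldsymbol{\sigma}_t, T) \\
\boldsymbol{\sigma}_{t+1} &\equiv \Pi(\boldsymbol{\sigma}_t, B) \\
B &\equiv (\ldots, (T_0, T_1, \ldots), \ldots) \\
\Pi(\boldsymbol{\sigma}, B) &\equiv \Omega(B, \Upsilon(\Upsilon(\boldsymbol{\sigma}, T_0), T_1) \ldots)
\end{align}
```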

Ethereum protocol changes do not need to take a year to be prepared

We had a brief retrospective on why the changes in the latest Ethereum upgrades, Byzantium and Constantinople, seemingly took a long time to prepare. Since we did not have many core developers present, this retrospective definitely lacked some useful perspective, but I will set out here what came out of it. A lot of what is written below is my personal perspective too.

One apparent reason for “low change bandwidth” is that there was no reason to rush. Once the changes were specified and the decision was made to include them, implementing all of Constantinople probably took a couple of weeks, no more. Everyone knew approximately when the difficulty bomb would start slowing the blocks down. The other planned changes did not have a fixed deadline and were not, strictly speaking, essential; they were more like “nice to haves”.

The second apparent reason was that, in Byzantium, a lot of time and effort was spent on precompiles. Fuzz tests found lots of bugs.

The third apparent reason was that review of the changes happened very slowly. Often, the people present on the Core Dev Calls were not prepared to discuss the proposed changes, so the discussion had to be postponed (each time by two more weeks).

So what can be done to increase the “bandwidth” for Ethereum 1.x changes? Here are my ideas (I would be curious to read more):

  1. Only essential changes (those that need to be done ASAP based on reasonable projections, or those on the critical path to the former) should be proposed.
  2. Specific precompiles are to be handled generically within the eWASM working group (meta-features vs point-features).
  3. Even if we continue to use Core Dev Calls as a change review venue, most discussions should happen prior to the call. To make this work better, for each change we should appoint a reviewer (or a couple) who will either review the changes themselves or prod other reviewers to do so. It is a good idea to start growing the number of change reviewers, including people who are not currently on the calls but can be invited.