Looking back at the Ethereum 1x workshop 26–28.01.2019 (part 2)

This is a continuation of part 1

Problems with large (and growing) state

Failing snapshot sync

Described in part 1

Duration of snapshot sync

Described in part 1

Slower block sealing

For the purpose of this post, I have refreshed the data about the occupancy of the account trie (and now also the storage tries) with a very recent block, 7'173'251. I am glad to say that calculating this with the Turbo-Geth database is now quite efficient: 4 minutes for the account trie and 18 minutes for the storage tries.

Number of accounts: 52'799'322, number of trie nodes: 72'601'839
Number of storage items: 165'287'953, number of trie nodes: 222'821'353

The second set of figures aggregates the storage tries of all existing contracts, which is why it does not have as predictable a structure as the first one, which represents the single trie of accounts (both contract and non-contract accounts).

The average depth of a node in the account trie is 7.89. Given that the levels 0 to 4 (or even 0 to 5) of the trie can be permanently cached in memory, we can say that the average access depth is 3.89 or 2.89. The average depth of a node in the storage tries is 5.38. It can be impractical to permanently cache even the first level of the storage tries in memory, so we assume that caching will not have a big effect on the depth there.
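As a small illustration of this arithmetic, here is a sketch in Go (a hypothetical helper written for this post, not Turbo-Geth code) that turns an average node depth and the deepest permanently cached trie level into the remaining average number of database reads per access:

package main

import "fmt"

// accessDepth follows the arithmetic in the text: if all trie levels up to and
// including deepestCachedLevel are pinned in memory, a node at average depth
// avgDepth still requires roughly avgDepth-deepestCachedLevel database reads.
// Hypothetical helper for illustration, not Turbo-Geth code.
func accessDepth(avgDepth float64, deepestCachedLevel int) float64 {
    if d := avgDepth - float64(deepestCachedLevel); d > 0 {
        return d
    }
    return 0
}

func main() {
    fmt.Printf("account trie, levels 0-4 cached: %.2f reads\n", accessDepth(7.89, 4))
    fmt.Printf("account trie, levels 0-5 cached: %.2f reads\n", accessDepth(7.89, 5))
    fmt.Printf("storage tries, no caching:       %.2f reads\n", accessDepth(5.38, 0))
}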

Now, if transactions in a block modify the state, then at the end of the block the new root hash of the state needs to be computed. Each modification requires recomputing the hash on every level of the trie above it. In turn, recomputing a hash requires fetching the “sibling” hashes. In the picture below, the recalculated hashes are shown in green, and the “sibling” hashes in grey:

It is clear that the deeper the modified entries sit in the state trie, the more sibling hashes need to be fetched. Each row of 16 hashes is one so-called branch node, and it can be fetched with 1 random database read in most Ethereum clients. Therefore, we can say that the number of database reads is currently on the order of 3 (2.89 rounded) or 4 (3.89 rounded) for one account modification, and 5 or 6 (5.38 rounded) for one storage item modification. These numbers will keep growing with the size of the state, though not linearly (rather logarithmically). A powerful adversary could also push some of these numbers higher for certain types of transactions.
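To make the mechanics concrete, the following Go sketch walks the path of one modified leaf back up to the root, fetching one branch node per level. The types, the fake database and the simplified hashing (sha256 standing in for keccak-256 over the RLP-encoded node) are stand-ins for illustration, not code from any real client:

package main

import (
    "crypto/sha256"
    "fmt"
)

// Hypothetical types; real clients have their own trie node encodings.
type hash [32]byte

type branchNode struct {
    children [16]hash // child hashes; 15 of them are the grey "sibling" hashes
}

// fakeDB stands in for the state database; every readBranch is one random read.
type fakeDB struct{ reads int }

func (d *fakeDB) readBranch(path []byte) branchNode {
    d.reads++
    return branchNode{} // the contents do not matter for counting reads
}

// hashBranch stands in for keccak-256 over the RLP-encoded branch node.
func hashBranch(b branchNode) hash {
    var buf []byte
    for _, c := range b.children {
        buf = append(buf, c[:]...)
    }
    return hash(sha256.Sum256(buf))
}

// recomputeRoot re-hashes the path of one modified leaf: at every level the
// branch node is fetched so its sibling hashes can be combined with the
// updated child hash. The returned count is the number of branch-node reads.
func recomputeRoot(d *fakeDB, path []byte, newLeafHash hash) (hash, int) {
    before := d.reads
    h := newLeafHash
    for depth := len(path) - 1; depth >= 0; depth-- {
        branch := d.readBranch(path[:depth]) // siblings at this level
        branch.children[path[depth]] = h
        h = hashBranch(branch)
    }
    return h, d.reads - before
}

func main() {
    db := &fakeDB{}
    path := []byte{0xa, 0x3, 0xf, 0x1, 0x0, 0xc, 0x7, 0x2} // 8 nibbles, i.e. a leaf at depth 8
    _, reads := recomputeRoot(db, path, hash{})
    fmt.Println("branch-node reads for one naked modification:", reads) // prints 8
}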

Most SSTORE operations that happen currently operate on items that have already been accessed by another SSTORE or SLOAD within the same block, so no extra fetching is required in such cases. This can be explained by the way the Solidity compiler deals with types such as address or bool, which do not take up an entire 32-byte word. When Solidity code updates a value of such a type, the compiler generates code that first loads the existing word with SLOAD, clears only the bits belonging to the type while preserving the rest, ORs in the new value, and writes the word back with SSTORE; the SSTORE therefore always follows an SLOAD of the same item. For example, this code:

pragma solidity ^0.5.0;

contract Bool {
    bool b;

    constructor() public {
        b = true;
    }
}

generates this snippet for b=true:

/* "Bool.sol":79:87  b = true */
dup1
sload
not(0xff)
and
/* "Bool.sol":83:87 true */
0x1
/* "Bool.sol":79:87 b = true */
or
swap1
sstore

I am currently running an analysis that retraces all transactions and counts how many SSTOREs are “naked” (modifying items not previously accessed within the same block), as well as the total number of SSTOREs.
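The counting logic of that analysis can be sketched roughly as follows. The storageKey type and the onSload/onSstore hooks are hypothetical; in a real run they would be fed from an EVM tracer while re-executing each block:

package main

import "fmt"

// Within one block, keep the set of (contract, slot) pairs already touched by
// SLOAD or SSTORE; an SSTORE is "naked" if its pair has not been seen yet.
type storageKey struct {
    contract [20]byte
    slot     [32]byte
}

type blockCounter struct {
    touched      map[storageKey]bool
    sstores      int // total SSTOREs in the block
    nakedSstores int // SSTOREs whose item was not accessed earlier in the block
}

func newBlockCounter() *blockCounter {
    return &blockCounter{touched: map[storageKey]bool{}}
}

func (c *blockCounter) onSload(k storageKey) {
    c.touched[k] = true
}

func (c *blockCounter) onSstore(k storageKey) {
    c.sstores++
    if !c.touched[k] {
        c.nakedSstores++
    }
    c.touched[k] = true
}

func main() {
    c := newBlockCounter()
    k := storageKey{} // same contract and slot for both operations
    c.onSload(k)      // Solidity's read-modify-write pattern: SLOAD first...
    c.onSstore(k)     // ...so this SSTORE is not "naked"
    fmt.Printf("SSTOREs: %d, naked: %d\n", c.sstores, c.nakedSstores)
}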

Even though most SSTOREs currently access items that do not need extra fetching at the end of the block, this can change: for example, the Solidity compiler could learn to optimise that read-modify-write behaviour away.

In conclusion, we can say that the number of database reads necessary for block sealing is proportional to N1*4 for account modifications (where N1 is the number of “naked” successful account modifications in the block), and to N2*6 for storage item modifications (where N2 is the number of “naked” successful SSTORE operations in the block). Each database read likely takes time proportional to log(Sdb), where Sdb is the size of the state database. In total, the complexity of block sealing due to the state size (and, indirectly, the size of the state database) is

4*N1*log(Sdb) for account modifications
6*N2*log(Sdb) for storage item modifications
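For a rough sense of scale, here is a back-of-the-envelope calculation in Go. The block counts N1 and N2 and the per-read latency are illustrative assumptions, not results of the analysis above, and the log(Sdb) factor is folded into the assumed latency of a single random read:

package main

import "fmt"

// Back-of-the-envelope application of the formulas above. All numbers here
// are assumptions for illustration, not measurements.
func main() {
    const (
        n1            = 100 // "naked" account modifications in a block (assumed)
        n2            = 200 // "naked" SSTOREs in a block (assumed)
        readsPerAcc   = 4   // branch-node reads per account modification
        readsPerStor  = 6   // branch-node reads per storage item modification
        readLatencyUs = 100 // microseconds per random database read (assumed)
    )
    reads := n1*readsPerAcc + n2*readsPerStor
    fmt.Printf("%d reads, roughly %.0f ms spent on them\n",
        reads, float64(reads*readLatencyUs)/1000.0)
}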

What happens when block sealing is slow?

  1. Block sealing needs to be performed both by the miner who creates the block and by any miner who wishes to mine on top of that block. The longer it takes, the more of an advantage is given to the creator of the block. It is a kind of inadvertent selfish mining (please correct me if I am using the terminology too liberally here).
  2. If verification of blocks takes a bigger and bigger fraction of the inter-block time (currently it is around 300 ms / 15 s = 2%), Ethereum nodes will become less and less able to keep up with the tip of the chain.
  3. If it takes a long time to seal a block, more mining pools will adopt the strategy of “mission blocks”, described here (look for “F2Pool explains its empty blocks”). In short, when a miner receives a new block from outside, they quickly construct an empty block (a.k.a. a “mission block”) and start mining it, while at the same time trying to construct a full block. If they find the correct nonce before the full block is constructed, they simply go ahead and publish the empty block. If not, they switch to mining the full block. As the sealing time grows, it makes less and less sense NOT to use this strategy.

Further investigations:

  1. Collect and publish data on “naked” SSTOREs.
  2. Measure the disparity of block sealing time in mining mode and in verification mode.
  3. Rerun the benchmarks used to calculate the gas cost of SLOAD and SSTORE, see how much underpriced they are, and possibly improve the benchmarks.
  4. Analyse potential for concurrent execution of transactions to amortise the latency of state access.

Possible mitigations:

  1. Optimise block sealing for miners, if the disparity between block sealing in mining mode and in verification mode is significant.
  2. Separate the current state from the historical state in the state database, to reduce log(Sdb); this is already done in Turbo-Geth.
  3. Pre-warm the trie cache based on the transaction pool data (see the sketch after this list).
  4. Raise the gas cost of SLOAD (this will not lead to hoarding as raising the cost of SSTORE would).
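As a sketch of what mitigation 3 could look like (the txPool and stateReader interfaces are hypothetical, not an existing client API): before the next block arrives, touch the accounts that pending transactions are likely to access, so that the branch nodes on their trie paths are already cached by the time a block has to be sealed or verified.

package main

import "fmt"

// Hypothetical interfaces for illustration only.
type address [20]byte

type tx struct {
    from address
    to   *address // nil for contract creation
}

type txPool interface {
    pending() []tx
}

type stateReader interface {
    readAccount(a address) // side effect: trie nodes on the account's path get cached
}

// prewarm reads every distinct sender and recipient of pending transactions once.
func prewarm(pool txPool, state stateReader) int {
    touched := map[address]bool{}
    for _, t := range pool.pending() {
        if !touched[t.from] {
            state.readAccount(t.from)
            touched[t.from] = true
        }
        if t.to != nil && !touched[*t.to] {
            state.readAccount(*t.to)
            touched[*t.to] = true
        }
    }
    return len(touched)
}

// Fake implementations so the sketch runs on its own.
type fakePool struct{ txs []tx }

func (p fakePool) pending() []tx { return p.txs }

type fakeState struct{ reads int }

func (s *fakeState) readAccount(a address) { s.reads++ }

func main() {
    to := address{2}
    pool := fakePool{txs: []tx{{from: address{1}, to: &to}, {from: address{1}, to: nil}}}
    state := &fakeState{}
    fmt.Println("accounts pre-warmed:", prewarm(pool, state), "reads:", state.reads)
}

A real implementation would presumably also try to pre-warm the storage slots that pending transactions are expected to touch, which is harder to predict without actually executing them.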

Solutions:

Reduction of the state size, or at least of the rate of its growth, should help limit the problem and allow client software optimisations (such as advanced sync mechanisms) to catch up.

Slower processing of transactions reading from the state

To appear in part 3 or later

Block gas limit increase and the State fees (formerly known as State rent) share initial steps

To appear in part 3 or later

Stateless contract pattern is discouraged by the current gas schedule

To appear in part 3 or later

eWASM interpreters could be a sensible first change even though gas cost might not be practical in the beginning

To appear in part 3 or later

Chain pruning will become more relevant as we start constraining the state growth

To appear in part 3 or later

Ethereum protocol changes do not need to take a year to be prepared

To appear in part 3 or later