Ethereum Classic Post-Fork Turbulence

IA
Jun 4, 2020

Block 10_500_839 ushered in another new era for Ethereum Classic, enabling the Phoenix upgrade on mainnet. With the acceptance and implementation of ECIP1088, the network finally began a chapter of EVM protocol parity with its younger sister, Ethereum.

Protocol Upgrade

ECIP1088 bundled six distinct improvement proposals that introduced a new opcode, added a precompiled contract, and modified gas costs for several existing opcodes. For those watching the fork in real time on Sunday (or Monday, depending on time zone) at 23:58 UTC, the transition appeared relatively stable. Within 20 minutes, contracts were deployed that validated the behavior of the new and modified EVM.

Six hours after the fork was declared a success on the community call, and after a great many of us had gone off to rest, it became apparent that the formerly glass-like surface of ETC would not last. The network fragmented as providers formed cliques, each reluctant to treat the others as compatible. Core-Geth and Multi-Geth formed one clique, OpenEthereum another, and Hyperledger Besu a third.

The network avoided complete partitioning thanks to providers that had not upgraded to the eth/64 protocol, and to peers that maintained existing relationships without resorting to the discovery protocol, which precedes block exchange in peer-to-peer negotiations. The hashrate dropped slightly but stayed within a normal range, and block uncle rates were not noticeably impacted.

Fig. 1: Blocks Uncle Rate and Network Hashrate. Times shown in local time UTC-5, fork local time was 18:58 UTC-5.

Fragmentation Analysis

This fragmentation was caused by an incompatibility in the eth/64 protocol as implemented by these providers: Core-Geth and Multi-Geth sent 9007bfcc as their forkid hash, while OpenEthereum and Besu sent other values. Core-Geth and Multi-Geth clients meeting the others were sent incorrect forkids, which gave them cause to treat those peers as incompatible and made them reluctant to escalate the peering relationship to an exchange of block data.

EIP 2364 “eth/64: Forkid-extended protocol handshake” and its related EIP 2124 “Fork identifier for chain compatibility checks” describe the Ethereum peering protocol upgrade eth/64, which enables nodes to discover relevant and compatible peers more efficiently. The forkid value communicates an identifier derived from and representing the provider's ultimate chain progress and upgrade configuration, which we can understand as a set of configured fork numbers and the local chain head.
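
As a rough illustration of that derivation, the Rust sketch below follows the scheme described in EIP 2124: the hash is a CRC32 checksum over the genesis hash followed by every fork block number the head has already passed (each encoded as a big-endian 64-bit integer), and the next scheduled fork, if any, is announced alongside it. The genesis bytes and fork numbers are placeholders, the names `ForkId`, `fork_id`, and `crc32_update` are invented for this example, and none of this is taken from a client's source.

```rust
/// FORK_HASH / FORK_NEXT pair in the spirit of EIP 2124 (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ForkId {
    hash: u32, // checksum of genesis hash plus passed fork blocks
    next: u64, // next scheduled fork block, 0 if none is known
}

/// Bitwise CRC32 (IEEE, reflected polynomial 0xEDB88320). Passing a previous
/// checksum as `crc` extends it with more bytes; pass 0 to start fresh.
fn crc32_update(crc: u32, data: &[u8]) -> u32 {
    let mut c = !crc;
    for &byte in data {
        c ^= u32::from(byte);
        for _ in 0..8 {
            c = if c & 1 != 0 { (c >> 1) ^ 0xEDB8_8320 } else { c >> 1 };
        }
    }
    !c
}

/// `forks` is assumed to be sorted ascending, deduplicated, and free of block 0.
fn fork_id(genesis_hash: &[u8; 32], forks: &[u64], head: u64) -> ForkId {
    let mut hash = crc32_update(0, genesis_hash);
    for &fork in forks {
        if fork > head {
            // First fork the head has not reached yet: announce it and stop.
            return ForkId { hash, next: fork };
        }
        // Fork already passed: fold its block number into the checksum.
        hash = crc32_update(hash, &fork.to_be_bytes());
    }
    ForkId { hash, next: 0 }
}

fn main() {
    let genesis = [0x11u8; 32]; // placeholder, not a real genesis hash
    let forks = [1_150_000u64, 1_920_000]; // placeholder fork schedule
    for &head in [0u64, 1_149_999, 1_150_000, 2_000_000].iter() {
        let id = fork_id(&genesis, &forks, head);
        println!("head {:>9}: hash {:08x}, next {}", head, id.hash, id.next);
    }
}
```

Because every configured fork block feeds the checksum, two clients that disagree about even one entry in the fork list will announce different hashes once their heads pass that block.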

Providers broadcasting forkids representing incompatible configurations, e.g., different chains, forks, and sync progresses, can be dropped by a discovering or discovered provider as incompatible, and in doing so, spare the provider the time and energy of attempting further peer-to-peer block negotiations with that ineligible peer.
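
The compatibility check itself can be sketched as follows. This is a simplified reading of the validation rules in EIP 2124, not any client's implementation: a remote forkid is accepted when its hash matches our current state, when it matches an earlier state of ours and announces the fork we know comes next, or when it matches a state we expect to reach later. The `Verdict` enum, the `validate` function, and all numeric values are assumptions made for this example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ForkId {
    hash: u32,
    next: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Compatible,
    RemoteStale,
    LocalIncompatibleOrStale,
}

/// Bitwise CRC32 (IEEE); `crc` is a previous checksum to extend, or 0.
fn crc32_update(crc: u32, data: &[u8]) -> u32 {
    let mut c = !crc;
    for &byte in data {
        c ^= u32::from(byte);
        for _ in 0..8 {
            c = if c & 1 != 0 { (c >> 1) ^ 0xEDB8_8320 } else { c >> 1 };
        }
    }
    !c
}

/// `forks` is assumed sorted ascending, deduplicated, and free of block 0.
fn validate(genesis: &[u8; 32], forks: &[u64], head: u64, remote: ForkId) -> Verdict {
    // sums[i] is the checksum after applying the first i forks.
    let mut hash = crc32_update(0, genesis);
    let mut sums = vec![hash];
    for &fork in forks {
        hash = crc32_update(hash, &fork.to_be_bytes());
        sums.push(hash);
    }
    // Number of forks our head has already passed.
    let i = forks.iter().take_while(|&&f| head >= f).count();

    // Rule 1: same state, unless the remote announces a fork we passed unaware.
    if sums[i] == remote.hash {
        if remote.next > 0 && head >= remote.next {
            return Verdict::LocalIncompatibleOrStale;
        }
        return Verdict::Compatible;
    }
    // Rule 2: remote is behind us; it must know about the fork that comes next.
    if let Some(j) = sums[..i].iter().position(|&s| s == remote.hash) {
        return if forks[j] == remote.next {
            Verdict::Compatible
        } else {
            Verdict::RemoteStale
        };
    }
    // Rule 3: remote is ahead of us on a schedule we also know.
    if sums[i + 1..].iter().any(|&s| s == remote.hash) {
        return Verdict::Compatible;
    }
    // Rule 4: no relationship between the two fork histories.
    Verdict::LocalIncompatibleOrStale
}

fn main() {
    let genesis = [0x11u8; 32]; // placeholder genesis hash
    let forks = [100u64, 200]; // placeholder fork schedule
    let g = crc32_update(0, &genesis);
    let g_100 = crc32_update(g, &100u64.to_be_bytes());
    let g_100_200 = crc32_update(g_100, &200u64.to_be_bytes());
    let cases = [
        ("same forks, fully synced", ForkId { hash: g_100_200, next: 0 }),
        ("still syncing, knows fork 100", ForkId { hash: g, next: 100 }),
        ("unaware of fork 100", ForkId { hash: g, next: 0 }),
        ("announces a fork we passed unaware", ForkId { hash: g_100_200, next: 220 }),
    ];
    for (label, remote) in cases.iter() {
        println!("{}: {:?}", label, validate(&genesis, &forks, 250, *remote));
    }
}
```

A peer whose hash was computed from a different fork list falls through to the last rule, which is the situation the mismatched clients found themselves in.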

The OpenEthereum client was patched and a release published Monday at 17:41 UTC, circa 18 hours after the fork. The Besu client was patched and a release published Tuesday at 23:17 UTC, circa 47 hours after the fork. The network has since stabilized.

The OpenEthereum bug was limited to the v3.0.0 release, which enabled eth/64 as the default. Prior releases using eth/63, including Parity Ethereum v2.7.2, were unaffected.

Why wasn’t this caught on the testnets?

Insensitive Chain Configurations: The forkid values are derivations of chain configuration, and are functions of firstly a set of fork block numbers, and secondly a chain height. Mordor and Kotti testnets use chain configurations that are not directly comparable to Ethereum Classic. Mordor is configured with the Ethereum Classic protocol upgrades through Byzantium as the genesis configuration, causing it to have only seen two forks in its lifetime rather than Classic's now seven forks. Kotti has a similarly abbreviated and incomparable configuration.

Non-v3.0.0 clients: Any OpenEthereum clients on the testnets that were not upgraded to the latest v3.0.0 within the window of three weeks prior to the mainnet hardfork were unaffected. As with the mainnet, the issue would have presented itself prominently only if a critical mass of clients at v3.0.0 had been reached, which does not appear to have been the case on either net.

For both networks, both explanations were at play. The feature itself being a soft edge case, it is difficult to disentangle them.

On Mordor: Incongruous fork sets allowed OpenEthereum’s algorithm to produce correct results with the limited testnet fork set, but incorrect ones in production. Specifically, the OpenEthereum bug was caused by a failure to install the following forks as eligible for the forkid parameter set:

ethash.params.difficulty_hardfork_transition,
ethash.params.bomb_defuse_transition,
ethash.params.eip100b_transition,
ethash.params.ecip1010_pause_transition,
ethash.params.ecip1010_continue_transition,
ethash.params.ecip1017_era_rounds,
ethash.params.expip2_transition,

openethereum#11747/files#diff-9a0d4c3f753d90ed4b2dd57d597ee2c1R431-R439

While the ETC mainnet has transitioned through ECIP1010, ECIP1017, and Difficulty Bomb Disposal (ECIP1041), Mordor and Kotti squashed these transitions to the genesis block. Ergo, the fix — including these forks to be forkid-eligible — does not modify client behavior on Mordor.
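
A small sketch can make this insensitivity concrete. It assumes, in line with the reasoning above, that transitions configured at block 0 belong to the genesis ruleset and do not enter the forkid fork list. The `ChainSpec` struct is a hypothetical, heavily trimmed stand-in for a real chain specification. Apart from 10_500_839, 301_243, and 999_983, which appear in this article, the block numbers are placeholders.

```rust
use std::collections::BTreeSet;

/// Hypothetical, trimmed-down chain specification.
struct ChainSpec {
    genesis_hash: [u8; 32],
    general_forks: Vec<u64>, // transitions the pre-fix code already collected
    ethash_forks: Vec<u64>,  // Ethash-specific transitions the bug omitted
}

/// Bitwise CRC32 (IEEE); `crc` is a previous checksum to extend, or 0.
fn crc32_update(crc: u32, data: &[u8]) -> u32 {
    let mut c = !crc;
    for &byte in data {
        c ^= u32::from(byte);
        for _ in 0..8 {
            c = if c & 1 != 0 { (c >> 1) ^ 0xEDB8_8320 } else { c >> 1 };
        }
    }
    !c
}

/// Sorted, deduplicated fork list; `include_ethash = false` mimics the bug.
fn gather_forks(spec: &ChainSpec, include_ethash: bool) -> Vec<u64> {
    let mut set: BTreeSet<u64> = spec.general_forks.iter().copied().collect();
    if include_ethash {
        set.extend(spec.ethash_forks.iter().copied());
    }
    set.remove(&0); // assumption: genesis-level transitions do not count
    set.into_iter().collect()
}

/// Checksum over the genesis hash and every fork at or below `head`.
fn fork_hash(spec: &ChainSpec, forks: &[u64], head: u64) -> u32 {
    forks
        .iter()
        .filter(|&&f| f <= head)
        .fold(crc32_update(0, &spec.genesis_hash), |h, f| {
            crc32_update(h, &f.to_be_bytes())
        })
}

fn mainnet_like() -> ChainSpec {
    ChainSpec {
        genesis_hash: [0xAA; 32],                  // placeholder
        general_forks: vec![1_150_000, 10_500_839], // first value is a placeholder
        ethash_forks: vec![3_000_000, 5_000_000],   // placeholders
    }
}

fn mordor_like() -> ChainSpec {
    ChainSpec {
        genesis_hash: [0xBB; 32], // placeholder
        general_forks: vec![301_243, 999_983],
        ethash_forks: vec![0, 0], // squashed into the genesis block
    }
}

fn main() {
    for (name, spec, head) in [
        ("mainnet-like", mainnet_like(), 10_600_000u64),
        ("mordor-like", mordor_like(), 1_000_000),
    ]
    .iter()
    {
        let buggy = gather_forks(spec, false);
        let fixed = gather_forks(spec, true);
        println!("{}: buggy forks {:?}, fixed forks {:?}", name, buggy, fixed);
        println!(
            "  buggy hash {:08x}, fixed hash {:08x}",
            fork_hash(spec, &buggy, *head),
            fork_hash(spec, &fixed, *head)
        );
    }
}
```

On the Mordor-like configuration the buggy and fixed code paths collect the same fork list and therefore the same hash, so a testnet of this shape cannot reveal the omission. On the mainnet-like configuration the two lists differ.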

On Kotti: Like Mordor, Kotti’s configuration is insensitive to the mainnet fix. However, unlike Mordor, Kotti meets the spec neither before nor after the fix. It was “broken,” and its state, being insensitive to the fix, is unchanged.

Differing from Mordor testnet and Ethereum Classic mainnet, which use the Ethash Proof-of-Work algorithm, Kotti uses a Proof-of-Authority consensus algorithm Clique, congruent to the Ethereum Foundation’s Goerli testnet.

The relevant incorrectly-omitted forks for forkid parameterization noted above are specific to Ethash, rendering the Kotti configuration insensitive to their absence (or presence) in the general case.

Why wasn’t OpenEthereum v3.0.0 broken before the fork?

It was broken. The v3.0.0 release was published about 20 days before the mainnet fork. Clients that upgraded before the fork may have noticed peering issues.

However, one reason we observed this issue on mainnet only after the fork was that a significant share of Classic Geth clients, which had previously served as an eth/63 bridge, broke away. The Classic Geth client reached end-of-life in January of this year and did not implement the Phoenix upgrade.

Until a critical number of clients were upgraded to the affected version and as long as Classic Geth was part of the peer list, it may be reasoned that the eth/63 protocol would have been pervasive enough to hide the eth/64 issue by causing the v3.0.0 client to use eth/63 with the majority of peers rather than its faulty eth/64.

Isn’t the ForkID protocol backwards compatible?

It is. The protocol is opt-in. But for clients that do opt in, the protocol is designed to differentiate incompatible nodes, so if a provider opts in but provides an incompatible value, it will be treated as an incompatible node.

Older nodes which have not opted in can serve as a bridge between these mismatched newer nodes by restricting discovery protocols to an earlier and still-supported version, e.g., eth/63, without the forkid component.
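
The bridging effect can be pictured with a toy model of version negotiation. The sketch assumes that two peers settle on the highest eth protocol version they both support and that the forkid comparison only applies from eth/64 upward. It also reduces the comparison to plain equality of hashes, which is simpler than the real validation rules. The `Peer` type and its functions are invented for this example. 0x9007bfcc is the hash quoted earlier in this article, and 0xdeadbeef is a placeholder for a mismatched value.

```rust
/// Toy model of a peer: a name, supported eth versions, announced fork hash.
#[derive(Debug, Clone)]
struct Peer {
    name: &'static str,
    versions: Vec<u8>,
    fork_hash: u32,
}

/// Highest eth protocol version both peers support, if any.
fn negotiate(a: &Peer, b: &Peer) -> Option<u8> {
    a.versions
        .iter()
        .copied()
        .filter(|v| b.versions.contains(v))
        .max()
}

/// Simplified rule: eth/64 and later compare fork hashes, eth/63 does not.
fn can_exchange_blocks(a: &Peer, b: &Peer) -> bool {
    match negotiate(a, b) {
        None => false,
        Some(v) if v >= 64 => a.fork_hash == b.fork_hash,
        Some(_) => true,
    }
}

fn main() {
    let core_geth = Peer { name: "Core-Geth", versions: vec![63, 64], fork_hash: 0x9007_bfcc };
    // Placeholder hash standing in for the mismatched value sent by v3.0.0.
    let open_ethereum = Peer { name: "OpenEthereum v3.0.0", versions: vec![63, 64], fork_hash: 0xdead_beef };
    // An eth/63-only peer never sends a forkid, so its hash field is unused.
    let legacy = Peer { name: "eth/63-only peer", versions: vec![63], fork_hash: 0 };

    let pairs = [
        (&core_geth, &open_ethereum),
        (&core_geth, &legacy),
        (&open_ethereum, &legacy),
    ];
    for (a, b) in pairs.iter() {
        println!(
            "{} <-> {}: version {:?}, blocks exchanged: {}",
            a.name,
            b.name,
            negotiate(a, b),
            can_exchange_blocks(a, b)
        );
    }
}
```

In this model the two eth/64 peers reject each other, yet both can still trade blocks with the eth/63-only peer, which relays the chain between them. Once such legacy peers leave the network, the mismatch is exposed.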

Ongoing challenges for protocol engineering

The discrepancy we saw on Ethereum Classic exemplifies the challenge and importance of coordinating not only fork but also non-fork protocol upgrades across a peer-to-peer and distributed, only-marginally governed network. Cross-client testing is one of the major challenges developers and coordinators face, and soft edge cases like this are arguably the trickiest, since they won’t be pointed to in a prominent countdown-style meta-fork specification. They won’t have standardized tests, and they will fall in the no-man’s land of testing responsibility, into the lap of the developers responsible for the entire network.

From this turbulent ride we hope we can gain the rare wisdom of motivation to open collaboration, constructive questions and creative experiments, and, like our blockchain, append-only this experience in the shape of growth.

Resources

- https://eips.ethereum.org/EIPS/eip-2124 (ForkID algorithm)
- https://eips.ethereum.org/EIPS/eip-2364 (ForkID protocol)
- https://ecips.ethereumclassic.org/ECIPs/ecip-1082 (ForkID caveats for ETC)
- https://ecips.ethereumclassic.org/ECIPs/ecip-1088 (“Phoenix” Meta fork specification)
- https://ecips.ethereumclassic.org/ECIPs/ecip-1010 (Difficulty Bomb Delay)
- https://ecips.ethereumclassic.org/ECIPs/ecip-1017 (Monetary Policy)
- https://ecips.ethereumclassic.org/ECIPs/ecip-1041 (Difficulty Bomb Disposal)
- https://github.com/eth-classic/mordor (Mordor chain-spec)
- https://github.com/goerli/testnet (Kotti chain-spec)

Appendix

To test the hypothesis presented above under “Why wasn’t this caught on the testnets?”, we can run a very simple test for Mordor, as follows.

The following diff reverts the bug fix and installs a test for the correct expected forkid parameters for Mordor. The passing test demonstrates that the fix was ineffectual, and that the testnet configuration was not sensitive to the forkid bug.

> git --no-pager diff
diff --git a/ethcore/spec/src/spec.rs b/ethcore/spec/src/spec.rs
index b65f8c5dd..aef91c227 100644
--- a/ethcore/spec/src/spec.rs
+++ b/ethcore/spec/src/spec.rs
@@ -430,13 +430,13 @@ impl Spec {
             for block in &[
                 ethash.params.homestead_transition,
                 ethash.params.dao_hardfork_transition,
-                ethash.params.difficulty_hardfork_transition,
-                ethash.params.bomb_defuse_transition,
-                ethash.params.eip100b_transition,
-                ethash.params.ecip1010_pause_transition,
-                ethash.params.ecip1010_continue_transition,
-                ethash.params.ecip1017_era_rounds,
-                ethash.params.expip2_transition,
+                // ethash.params.difficulty_hardfork_transition,
+                // ethash.params.bomb_defuse_transition,
+                // ethash.params.eip100b_transition,
+                // ethash.params.ecip1010_pause_transition,
+                // ethash.params.ecip1010_continue_transition,
+                // ethash.params.ecip1017_era_rounds,
+                // ethash.params.expip2_transition,
             ] {
                 if let Some(block) = *block {
                     hard_forks.insert(block.into());
diff --git a/ethcore/sync/src/chain/fork_filter.rs b/ethcore/sync/src/chain/fork_filter.rs
index 63ed22cb4..e12bd4590 100644
--- a/ethcore/sync/src/chain/fork_filter.rs
+++ b/ethcore/sync/src/chain/fork_filter.rs
@@ -157,4 +157,14 @@ mod tests {
             ],
         )
     }
+
+    fn mordor_spec() {
+        test_spec(
+            || spec::new_mordor(&String::new()),
+            vec![
+                301_243,
+                999_983,
+            ],
+        )
+    }
 }

> cargo test --package ethcore-sync --lib chain::fork_filter::tests::mordor_spec -- --exact
    Finished test [unoptimized + debuginfo] target(s) in 0.22s
     Running target/debug/deps/ethcore_sync-f867f27d25255cb1

running 1 test
test chain::fork_filter::tests::mordor_spec ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 81 filtered out
