ETC’s Chainsplit: What Went Wrong

Austin Roberts
Published in Rivet Magazine
Aug 2, 2020

At 2:00 this morning, Rivet’s alarms started going off. This time the problem wasn’t with Ethereum, but with Ethereum Classic. Ethereum Classic is hardly our bread and butter; it’s a smaller chain with less traffic. But if Rivet supports a chain, we support that chain, even at 2:00 AM.

At first, it appeared that the “problem” was simply that there weren’t many blocks being mined. That’s not terribly unusual. The hashing power is not especially high, and one big miner going offline can decrease the block frequency significantly. But when we checked Blockscout, it looked like we were a few hundred blocks behind, even though we were still processing new blocks and the block timestamps were recent. Something was really out of whack.

Looking at server logs:

WARN [08-01|06:53:58.627] Large chain reorg detected number=10904146 hash=3f2023…c7d22a drop=3693 dropfrom=e900d8…8c99d0 add=3291 addfrom=caa90c…bddcf7

Oof.

A chain reorg of over 3,000 blocks is a bad sign. Really bad.
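For anyone who wants to pull the numbers out of a warning like this, here is a minimal Python sketch. It assumes the line is a plain series of key=value fields, and it assumes that number is the last block shared by both branches, which is consistent with the divergent blocks starting at 10,904,147. The function name parse_reorg is made up for this example.

```python
import re

# The warning line from our logs (the client abbreviates hashes with "…").
LOG_LINE = (
    "WARN [08-01|06:53:58.627] Large chain reorg detected "
    "number=10904146 hash=3f2023…c7d22a drop=3693 dropfrom=e900d8…8c99d0 "
    "add=3291 addfrom=caa90c…bddcf7"
)

def parse_reorg(line):
    """Return the key=value fields of a reorg warning as a dict of strings."""
    return dict(re.findall(r"(\w+)=(\S+)", line))

fields = parse_reorg(LOG_LINE)
fork_point = int(fields["number"])  # assumed: last block common to both branches
dropped = int(fields["drop"])       # blocks removed from our view of the chain
added = int(fields["add"])          # blocks adopted in their place

print(f"fork point: {fork_point}")
print(f"dropped {dropped} blocks, added {added} blocks")
print(f"difference in branch length: {dropped - added}")
```

Read this way, the branch our node abandoned was 402 blocks longer than the branch it adopted.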

Checking the peers for our ETC nodes, we found that all of our peers were running either ETC Labs’ core-geth or the MultiGeth branch. We had zero peers running OpenEthereum or OpenETC, which is atypical. From past conversations with Igor from POA (who run Blockscout), I knew that Blockscout relied mostly on OpenEthereum nodes. Similarly, the ETC Cooperative’s endpoint at www.ethercluster.com/etc was running on OpenEthereum nodes, and it matched Blockscout. The chain split reflected a difference between clients, which is bad.
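A rough sketch of that kind of peer check, in Python. It assumes a peer list shaped like the result of Geth’s admin_peers call, where each entry has a name string that begins with the client identifier followed by a slash. The sample entries are hypothetical placeholders, not the peers we actually saw, and the helper names are invented for this example.

```python
from collections import Counter

def client_family(peer_name):
    """Return the client identifier: everything before the first '/' in the name."""
    return peer_name.split("/", 1)[0]

def tally_clients(peers):
    """Count peers per client, given a list of dicts that each carry a 'name' key."""
    return Counter(client_family(peer["name"]) for peer in peers)

# Hypothetical peer list for illustration only; versions and platforms are placeholders.
sample_peers = [
    {"name": "CoreGeth/v0.0.0-example/linux-amd64/go0.0"},
    {"name": "CoreGeth/v0.0.0-example/linux-amd64/go0.0"},
    {"name": "MultiGeth/v0.0.0-example/linux-amd64/go0.0"},
]

counts = tally_clients(sample_peers)
for client, n in counts.most_common():
    print(f"{client}: {n}")

# A Counter returns 0 for clients that never appear in the list.
print("OpenEthereum peers:", counts["OpenEthereum"])
```

A tally with no OpenEthereum entries at all is the kind of result that made us suspect the split ran along client lines.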

So at this point it’s 3:00 AM. The problem clearly isn’t with our servers — and it’s someone else’s job to figure out what’s causing the chain split, so we should just go back to bed and await instruction, right? It was tempting, but when Rivet supports a network, we really support the network.

We found our way to the ETC Discord to see what was going on, and discovered that we had already figured out more than the people in the channel had. They hadn’t yet worked out the magnitude of the chain split, or that it seemed to be divided between clients.

Continuing our investigation, we found that the chain running on Geth clients had some interesting properties:

  • For a bit over 3,000 blocks starting at 10,904,147, a single miner mined every single block.
  • Within those 3,000 blocks, only five transactions were mined.
  • Within those 3,000 blocks, there were many blocks that had uncles mined by the same miner as their parent blocks.
  • They appeared to all have been dumped on the network at once; it didn’t look like this had been a competing chain for the ~12 hours of mining.
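Observations like these can be reproduced with a few lines of Python. The sketch below assumes each block is a dict shaped like the result of the standard eth_getBlockByNumber JSON-RPC call, with miner, transactions, and uncles fields. Fetching the blocks is left out, the sample data is synthetic, and the function name summarize is invented for this example. Checking who mined each uncle would need a further lookup that is not shown.

```python
from collections import Counter

def summarize(blocks):
    """Summarize miner diversity, transaction count, and uncle usage for a block range."""
    miners = Counter(block["miner"] for block in blocks)
    return {
        "blocks": len(blocks),
        "distinct_miners": len(miners),
        "transactions": sum(len(block["transactions"]) for block in blocks),
        "blocks_with_uncles": sum(1 for block in blocks if block["uncles"]),
    }

# Synthetic blocks for illustration only: one miner, almost no transactions.
sample_blocks = [
    {"miner": "0xaaaa", "transactions": [], "uncles": []},
    {"miner": "0xaaaa", "transactions": ["0xtx1"], "uncles": ["0xuncle1"]},
    {"miner": "0xaaaa", "transactions": [], "uncles": []},
    {"miner": "0xaaaa", "transactions": [], "uncles": ["0xuncle2"]},
]

summary = summarize(sample_blocks)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A range where distinct_miners is 1 and transactions is close to zero has the shape described in the list above.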

Meanwhile the chain running on OpenEthereum clients had a diversity of miners and several hundred transactions.

The strangest part about the disparity between the two chains was that the Geth chain had a higher total difficulty, while the OpenEthereum chain was longer. This is a rather unusual combination; usually the longest chain is also the one with the highest total difficulty.
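To see how length and total difficulty can disagree, here is a toy Python sketch. Each chain is modeled as a list of per-block difficulties, and total difficulty is their sum. The numbers are invented for illustration, and the two selection functions are only there to contrast the criteria; they are not a claim about how either client implements fork choice.

```python
def total_difficulty(chain):
    """Total difficulty of a chain modeled as a list of per-block difficulties."""
    return sum(chain)

def pick_by_total_difficulty(a, b):
    """Prefer the chain with more accumulated difficulty (ties go to the first)."""
    return a if total_difficulty(a) >= total_difficulty(b) else b

def pick_by_length(a, b):
    """Prefer the chain with more blocks (ties go to the first)."""
    return a if len(a) >= len(b) else b

# Invented numbers: a short chain of hard blocks and a longer chain of easier ones.
short_heavy = [120, 120, 120]
long_light = [80, 80, 80, 80]

print("total difficulty:", total_difficulty(short_heavy), "vs", total_difficulty(long_light))
print("length:", len(short_heavy), "vs", len(long_light))
print("heaviest-chain rule picks short_heavy:",
      pick_by_total_difficulty(short_heavy, long_light) is short_heavy)
print("longest-chain rule picks long_light:",
      pick_by_length(short_heavy, long_light) is long_light)
```

In this toy case the two criteria pick different winners, which is the situation we were looking at.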

So by the rules of the blockchain, which favor the chain with the highest total difficulty, Geth had the right chain, but OpenEthereum’s chain was the one that most of the network had been following for the past 14 hours. It was possible to get Geth to align with the OpenEthereum chain with the flag:

--whitelist=10904147=0xde7660f6c700d376dce96823e31073fa74448bfd9e4731f86d00139a0722aaae

At the time, there wasn’t a clear way to get OpenEthereum onto Geth’s chain. Since it wasn’t clear that the clients on different chains would ever reconcile naturally, we started encouraging Geth users to include that flag. But around the same time, some additional miners came online on the Geth chain, causing that chain to overtake the OpenEthereum chain in both difficulty and length. At that point, very few miners were still mining the OpenEthereum chain, and the Geth chain became the dominant chain. As our main interest was making sure there was a clear canonical chain, we were happy with this, and stopped pushing the whitelist to put people on the other chain.

Our team circled around with Yaz from the ETC Cooperative to develop the Chain Split Diagnosis document, which was the first detailed analysis made available to the larger community.

There were a bunch of things that went wrong:

  • First: Ethereum Classic’s hashing power is low enough to be subject to a 51% attack. That’s not necessarily fatal for the chain, but people should be very careful about what they entrust to the chain.
  • Second: The 51% attack won, and 12 hours of transactions were reverted. There was a moment when it would have been possible for miners to reject the 3,000 empty blocks and continue mining on their old chain. It wouldn’t have met the definition of “correct” according to the consensus layer, but if miners had stayed focused on the original chain they may have been able to make it the heavier chain.
  • Third: The split between Geth and OpenEthereum is very troubling. It’s being largely dismissed by the ETC community because OpenEthereum has dropped support for ETC, but as Rivet actively participates in both the ETC and ETH communities, this is worrying. While a 51% attack on ETH would be far more expensive because of the hashing power involved, having Geth and OpenEthereum end up on opposite sides of a chain split could still be a real problem. We need to diagnose why OpenEthereum went with the longer chain instead of the heavier chain, and make sure it’s not something that could happen on the Ethereum mainnet that OpenEthereum still supports.

So while ETC’s problem is a lack of hashing power, the client differences that led to a chain split instead of a clean (but large) reorg may carry over to the larger chain.
