How we deal with chain reorganization at EthVigil

Anomit Ghosh
Jun 15, 2018 · 6 min read

After the previous post's 1000 ft view of the approach to enabling blockchain applications, we will now touch on an aspect of the Ethereum network (and quite possibly of other similar singleton state machine based blockchain implementations relying on proof-of-work) that rarely gets discussed from the perspective of real-world use cases: chain reorganization. We believe any business worth its salt would do well to be aware of its implications and to plan and deploy mitigating measures against it.

What exactly is a chain reorganization in Ethereum?

  • The client software (Geth/Parity etc.) running as the node participates in the consensus algorithm unique to Ethereum: Greedy Heaviest Observed Sub Tree (GHOST) protocol.
  • It accepts a certain chain of blocks as the “truth” at a given point of time, after participating in the consensus with the peer nodes it communicates with.
  • The currently accepted truth is the above chain among the peers that have synchronized so far.
  • It so happens that a few of the peers discover, from a different set of peers, a different version of the chain that complies with the GHOST protocol and makes a stronger case for the truth.
There is a change at block number 623. The hash has changed and so have the subsequent children blocks.
  • Now the nodes are faced with a fork in the version of the truth. This is purely a high-level view; please note that the actual codebases of the node software don’t maintain a fork’s data structure exactly like this.
  • After further synchronizing among the other peers who accepted the green version as the earlier truth, consensus is reached that since the red fork has _more computation_ done on it [2], it should be accepted as the current canonical truth.
  • And then there was peace. Or was there? Consider the following:
    • Your dApp code naively pulled transactions through web3 libraries (or lower-level JSON-RPC) from the outdated work on block #623 with block hash 0xblockhash4 before the fork was accepted.
    • It found certain emitted events integral to your business flow: for example, newShareHodlerAdd(address,uint32), shareHodlerApproved(address,bool).
    • It proceeded to extract the corresponding event logs and add them to a centralized, secure database to finalize the list of approved shareholders, along with references to the transaction hash that serves as proof of the finalization existing on the blockchain.
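
The fork-choice step in the walkthrough above can be sketched in a few lines. This is a toy model and not how Geth or Parity actually store or compare forks: Block, total_difficulty and choose_canonical are hypothetical names, the red-fork hashes and all difficulty values are made up, and "more computation" is approximated by summing per-block difficulty.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    number: int
    block_hash: str
    difficulty: int  # made-up stand-in for the work spent mining this block


def total_difficulty(chain: List[Block]) -> int:
    """Sum of per-block difficulty: a proxy for the computation done on a fork."""
    return sum(block.difficulty for block in chain)


def choose_canonical(current: List[Block], candidate: List[Block]) -> List[Block]:
    """Keep the current fork unless the candidate has strictly more total work."""
    if total_difficulty(candidate) > total_difficulty(current):
        return candidate
    return current


# Both forks share history up to block 622 and diverge at block 623.
# Only the diverging blocks are modelled here.
green = [
    Block(623, "0xblockhash4", 100),
    Block(624, "0xblockhash5", 100),
]
red = [
    Block(623, "0xblockhash4_red", 110),  # hypothetical hashes for the red fork
    Block(624, "0xblockhash5_red", 105),
    Block(625, "0xblockhash6_red", 100),
]

canonical = choose_canonical(green, red)
print([block.block_hash for block in canonical])
```

In this sketch the red fork wins because its summed difficulty is higher, and the blocks with hashes 0xblockhash4 and 0xblockhash5 drop out of the canonical chain.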

Once a block or blocks (those with hashes 0xblockhash4 and 0xblockhash5 in this example) are dropped, the Ethereum State Machine reverts the transactions that were applied so far from those blocks. The work specified in those transactions may or may not exist any more in the newly accepted version of truth. If it does not, the dApp is left with a state inconsistent with the globally accepted version of the Ethereum State Machine and, consequently, with the final chain of blocks.
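
One way a dApp backend could notice this inconsistency is to store the block hash next to every event it records and later compare it with the canonical hash at the same block number. The following is a minimal sketch under that assumption, not a description of EthVigil's internals: StoredEvent and split_by_canonical are hypothetical names, the canonical hashes are a plain dictionary standing in for a lookup against your own node, and the transaction hashes and 0xblockhash3 are placeholders.

```python
from typing import Dict, List, NamedTuple, Tuple


class StoredEvent(NamedTuple):
    name: str
    block_number: int
    block_hash: str  # hash of the block the event was read from
    tx_hash: str


def split_by_canonical(
    events: List[StoredEvent], canonical_hashes: Dict[int, str]
) -> Tuple[List[StoredEvent], List[StoredEvent]]:
    """Return (still_valid, orphaned) by comparing stored and canonical block hashes."""
    still_valid: List[StoredEvent] = []
    orphaned: List[StoredEvent] = []
    for event in events:
        if canonical_hashes.get(event.block_number) == event.block_hash:
            still_valid.append(event)
        else:
            orphaned.append(event)
    return still_valid, orphaned


# What the dApp wrote to its database before the fork was accepted.
stored = [
    StoredEvent("newShareHodlerAdd", 622, "0xblockhash3", "0xtx1"),
    StoredEvent("shareHodlerApproved", 623, "0xblockhash4", "0xtx2"),
]

# Canonical hashes after the reorg (placeholder values; in practice these
# would be fetched from a node you trust).
canonical_after_reorg = {622: "0xblockhash3", 623: "0xblockhash4_red"}

valid, orphaned = split_by_canonical(stored, canonical_after_reorg)
print("valid:", [event.name for event in valid])
print("orphaned:", [event.name for event in orphaned])
```

An orphaned record is not necessarily wrong forever: the same transaction may be mined again on the new fork, so such records are better re-checked than deleted outright.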

PHEW!

What do we learn from this?

  • At present, block finality is probabilistic in Ethereum. To quote from the linked article[2] in the footnotes by Gavin Wood on chain reorganization in Ethereum,

Because it takes time for the blocks to percolate through the network, it’s easy for different parts of the network to have a different final block (or two, or perhaps even three) in normal operation since the miners often come up with them at roughly the same time. This is what we might call ephemeral forking.

  • This means that a serious business case on the blockchain has to put in the work for either (a) redundancy in data access: maintaining caches, listening for transaction retrieval failures, double/triple/n-tuple checking against self-maintained nodes, and taking retrospective measures when inconsistent data is found; or (b) running on a safe block delay of, for example, 12 blocks, so that the application has the least chance of encountering a reorg at such a depth.

Approach (a) would greatly increase complexity and pull development effort away from the real business details, while (b) would be undesirable in cases where an experience as close to real time as possible is of utmost importance.
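
Approach (b) is simple enough to sketch. The snippet below assumes the application tracks the current head block number itself; SAFE_BLOCK_DELAY, highest_safe_block and ready_to_process are hypothetical names, and 12 is only the example depth mentioned above.

```python
from typing import Iterable, List

SAFE_BLOCK_DELAY = 12  # example depth from the text; tune per application


def highest_safe_block(head_number: int, delay: int = SAFE_BLOCK_DELAY) -> int:
    """Highest block number that already has `delay` blocks built on top of it."""
    return head_number - delay


def ready_to_process(
    pending_blocks: Iterable[int], head_number: int, delay: int = SAFE_BLOCK_DELAY
) -> List[int]:
    """Filter pending block numbers down to those deep enough to be treated as settled."""
    cutoff = highest_safe_block(head_number, delay)
    return [number for number in pending_blocks if number <= cutoff]


head = 635
pending = [620, 623, 624, 630]
print(ready_to_process(pending, head))
```

The cost is latency: assuming an average block time of roughly 15 seconds, a delay of 12 blocks keeps the application about three minutes behind the head of the chain.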

EthVigil is a State Machine Gateway to Ethereum, and not just a mere relay

To recap the needs that EthVigil fulfills:

  • You work with familiar REST API endpoints that are generated according to a smart contract’s interface: GETs for reads from the blockchain, POSTs for writes to it.
  • The data that you fetch, the events you subscribe to, the webhooks you integrate are all handled through our transparent layer.
  • The fault tolerance, redundancy and antifragility we promise for dApps built on us have been through several iterations now. We don’t merely cache or do basic validity and integrity checks on transactional data; we have a state machine itself working as the API gateway.
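
To make the GET/POST split concrete, here is a hedged sketch of how endpoints could be derived from a contract's ABI. It is not EthVigil's actual generator: routes_from_abi, the URL layout, the contract address and the function names isApproved and approveShareHodler are all hypothetical. The only thing it relies on is the standard ABI JSON fields type, name and stateMutability.

```python
from typing import Dict, List

READ_ONLY = {"view", "pure"}  # ABI stateMutability values that do not change state


def routes_from_abi(abi: List[dict], contract_address: str) -> Dict[str, str]:
    """Map each contract function to 'METHOD /path' (the path layout is made up)."""
    routes: Dict[str, str] = {}
    for entry in abi:
        if entry.get("type") != "function":
            continue  # events, constructors and fallbacks get no endpoint here
        method = "GET" if entry.get("stateMutability") in READ_ONLY else "POST"
        routes[entry["name"]] = f"{method} /contract/{contract_address}/{entry['name']}"
    return routes


# A trimmed, hypothetical ABI for a shareholder registry contract.
example_abi = [
    {"type": "function", "name": "isApproved", "stateMutability": "view"},
    {"type": "function", "name": "approveShareHodler", "stateMutability": "nonpayable"},
    {"type": "event", "name": "shareHodlerApproved"},
]

for name, route in routes_from_abi(example_abi, "0xcontract").items():
    print(name, "->", route)
```

Overloaded functions share a name and would collide in this mapping; a real generator would need to disambiguate them, for example by argument types.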

The only honest way to provide a reliable layer for businesses and real life, critical use cases to be on the blockchain was to respect the design principles of blockchain, specifically Ethereum in this case.

This is a rough architecture of what goes on behind the scenes in the core of EthVigil API gateway. This is under refinement and improvement quite literally every day as we encounter new edge cases and feedback from testing our service on live data round the clock.


It has been quite a while since we posted material on our engineering blog.

But rest assured, we have been insanely hard at work pulling off functionally relevant features for applications on blockchain that serious consumers and businesses can rely on —

  • support for stateless contracts
  • one-off transaction monitoring
  • integrations like webhooks/email that can be extended for further use cases like Slack notifications, Zapier integrations combined with event and transaction monitoring/data parsing on contracts/externally owned accounts
  • integrations for notifying chain reorgs, snapshot of historical data during a fork, dropped transactions, inadvertently sent out event logs, duplicate event logs

…and many more. We will cover these features in quite a few posts coming up in the following months as we showcase applications powered by EthVigil accompanied by extensive documentation and video walkthroughs.

No hype. No BS. Only code that works.

Sign up to get early access to our product.


References, further reading and more knowledge to dig into:

[1]

[2]

[3] Ethereum StackExchange:

[4] Proof-of-Work in Ethereum is not to be confused with the actual consensus algorithm, which is GHOST

[5] From the OpenZeppelin blog, why do we still need a traditional server-client model to power dApps:

Now, a critical question that you need to answer is why do you need a server at all
[…]
there are still many uses for a server backing your app. First and foremost, on-chain code cannot directly work with off-chain services. This means that if you want to integrate with third party services, inject external data like USD/ETH rates, or perform basic actions such as sending an email, you need a server to take care of this.

The server can also act as a cache or indexing engine for your smart contracts. While the ultimate source of truth is the blockchain, your clients may rely on the server for search capabilities, and validate the returned data on-chain.

BlockVigil

The API Gateway for Blockchain

Written by Anomit Ghosh

CTO, Co-Founder and Blockchain Engineering at BlockVigil. Athlete. Artist. Polymath. Autodidact.
