Don’t Get Forked: Best Practices for Handling Constantinople and Ethereum Client Upgrades

James Chen
Alchemy
Jan 11, 2019

Software updates, whether they plug security flaws, fix bugs, or add new features, are generally a good thing for the applications we use every day, and we don’t think twice about applying them. But updates to core pieces of infrastructure can be scary: they may introduce breaking changes or unexpected behavior for dependent applications, and in other cases failing to update on time is what causes the breakage. This issue affects entities small and large, and the blockchain is no exception.

At Alchemy, we provide reliable, scalable, and fast Ethereum infrastructure-as-a-service for our customers so they don’t have to struggle with maintaining nodes and can focus on building their applications. In the course of our work, we’ve learned many lessons (sometimes the hard way!) on how to best handle network forks and client updates. With the upcoming Constantinople hard fork, we thought this would be a great opportunity to share some of our best practices.

This is the first of many blog posts that we’ll be writing on running Ethereum infrastructure in a reliable, performant, and scalable manner.

Constantinople

Constantinople is Ethereum’s next system-wide upgrade. It will implement five EIPs and is part of a larger roadmap towards Ethereum 2.0. There are plenty of posts out there that discuss the specific changes, but the main thing to note is that this is a non-contentious hard fork: the community is expected to adopt the new chain, making tokens and other state on non-updated nodes worthless.

How does this affect you?

For dapp developers, the key takeaway is that you’ll need to update your node to a Constantinople-compatible version, or it will be incompatible with the rest of the network past block 7,080,000. Major Ethereum clients such as geth and parity typically release a stable, compatible version well in advance of the estimated fork date. If you manage your own Ethereum node, you need to update your client to a version that is compatible with the hard fork: 2.1.10-stable or higher for parity, and 1.8.20 or higher for geth.
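As a rough illustration, the version check above could be automated with a minimal Python sketch like the one below. It assumes you have already fetched the node’s `web3_clientVersion` string over JSON-RPC; the function names and the sample version strings in the usage note are our own illustrations, and the comparison looks only at the numeric version, ignoring release tracks such as stable and beta.

```python
import re

FORK_BLOCK = 7_080_000  # Constantinople activation block cited in this post

# Minimum versions cited in this post, keyed by a lowercase client-name prefix.
MIN_VERSIONS = {"geth": (1, 8, 20), "parity": (2, 1, 10)}


def parse_client_version(client_version):
    """Split a web3_clientVersion string into (client_name, (major, minor, patch))."""
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", client_version)
    if match is None:
        raise ValueError(f"no version found in {client_version!r}")
    # The client name is assumed to be the text before the first slash.
    name = client_version.split("/", 1)[0].lower()
    return name, tuple(int(part) for part in match.groups())


def is_fork_ready(client_version):
    """Return True if the numeric version meets the minimum for its client."""
    name, version = parse_client_version(client_version)
    for prefix, minimum in MIN_VERSIONS.items():
        if name.startswith(prefix):
            return version >= minimum
    raise ValueError(f"unknown client {name!r}")


def blocks_until_fork(current_block):
    """Number of blocks left before the fork block (0 once it has passed)."""
    return max(FORK_BLOCK - current_block, 0)
```

For example, `is_fork_ready("Geth/v1.8.19-stable/linux-amd64/go1.11.2")` returns `False`, which tells you that node still needs an update. Running a check like this against every node in your cluster before the fork block is a cheap safeguard, but it is not a substitute for reading the client release notes.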

Hard fork updates should be few and far between, but Ethereum client updates can be much more common. If you are running a service dependent on the chain, keeping your client up to date can be beneficial to fix security flaws or leverage new functionality. However, you must be careful that you don’t introduce unexpected issues into your application.

Best Practices

If you are running a service that requires regular interaction with the Ethereum blockchain, a couple of core best practices will save you from potentially catastrophic bugs when you update your nodes. We’ll go over some real-life scenarios of updates gone wrong that were the impetus for this blog post.

Running Multiple Nodes

We strongly recommend running multiple nodes as a general best practice for maintaining in-house Ethereum infrastructure because of the many benefits, including increased reliability and scalability. However, this can introduce consistency and idempotency issues into your system, which we will tackle in a subsequent blog post.

When it comes to updates, running multiple nodes provides a few additional benefits:

  1. Backups in case an update goes wrong and renders a node unrecoverable
  2. High availability and no downtime while updating nodes in a rolling fashion
  3. Regression testing that compares updated nodes to original nodes to ensure expected behavior
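The rolling update described in this list can be sketched roughly as follows. This is a simplified illustration and not our production tooling: `update_node` and `is_healthy` are hypothetical callables you would supply, for example a deploy script and a check that the node is syncing and answering requests correctly.

```python
def rolling_update(nodes, update_node, is_healthy):
    """Update nodes one at a time, canary first, stopping at the first failure.

    `nodes` is an ordered list with the canary node first. Nodes after a failed
    update are left untouched so they can keep serving traffic as fallbacks.
    """
    updated = []
    for index, node in enumerate(nodes):
        update_node(node)
        if not is_healthy(node):
            # Abort the rollout; the remaining nodes stay on the old version.
            return {
                "updated": updated,
                "failed": node,
                "untouched": list(nodes[index + 1:]),
            }
        updated.append(node)
    return {"updated": updated, "failed": None, "untouched": []}
```

If the result contains a `failed` node, traffic can stay on the `untouched` nodes while you investigate, which is exactly the fallback that a single-node setup lacks.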

In the scenario of an unrecoverable node, a new full archive node can take up to two weeks to fully sync. Without a backup, this is an unacceptable amount of downtime.

Example 1: Buggy DB Migration

One issue we saw when upgrading parity from 1.X.X to 2.X.X was database corruption. The update included an internal DB migration that incorrectly converted block numbers in the bloom filter, which resulted in empty results for eth_getLogs requests on historical blocks. The loss of data was irreversible, so the node could not be salvaged. We were fortunate enough to catch it through internal testing on a canary node, fall back to our non-updated nodes to keep our systems up, and wait until the parity team resolved the issue in 2.2.3-beta.

Rigorous Regression Testing

Rigorous testing is always strongly recommended for any changes to your system. In that vein, when updating your nodes, you’ll want to test to ensure that your application and infrastructure continue to work as expected. Doing otherwise would potentially expose your users and your systems to outages. With multiple running nodes, the best way to guarantee this is testing your previous stable nodes against an updated canary node. If any breaking changes are detected, ensure your systems can safely handle them before updating the rest of your cluster.

As for the regression tests, we go straight to the source by replaying production requests against our nodes. This can be done manually for one-off checks, but because we support so many projects, we’ve built an extensive automated testing framework that samples recent live requests, replays them on different client versions, and compares the results. This allows us to verify that our code can safely interact with updated nodes before shipping to production. This framework can be leveraged for regression tests on internal code changes as well.
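To make the idea concrete, here is a minimal sketch of the replay-and-compare step. It is not our actual framework. `send_stable` and `send_canary` are hypothetical callables that post a JSON-RPC request to a stable node and to the updated canary node and return the decoded response; how you sample requests and reach your nodes is left open.

```python
import json


def canonical(response):
    """Serialize a response with sorted keys so that key order does not matter."""
    return json.dumps(response, sort_keys=True)


def replay_and_compare(requests, send_stable, send_canary):
    """Replay each request on both nodes and collect the ones whose responses differ."""
    mismatches = []
    for request in requests:
        stable = send_stable(request)
        canary = send_canary(request)
        if canonical(stable) != canonical(canary):
            mismatches.append(
                {"request": request, "stable": stable, "canary": canary}
            )
    return mismatches
```

An empty list means the canary answered every sampled request the same way as the stable node. One caveat: requests that reference the latest block can differ simply because the two nodes are at slightly different heights, so it helps to pin sampled requests to a specific block number before replaying them.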

Example 2: Response Output Format Change

We came across this interesting issue when a newly updated client returned a different response on an EVM execution error. Parity 1.11.1-beta returned a response that looked like this:

{"jsonrpc":"2.0","error":{"code":-32015,"message":"VM execution error.","data":"Reverted 0x"},"id":1}

Instead of responses from previous versions that looked like this:

{"jsonrpc":"2.0","id":1,"result":"0x"}

The response format change is actually beneficial, as it bubbles up more execution details to the caller rather than failing silently, but it could cause issues if calling code is not set up to handle it. In this scenario, comparing response outputs caught the change. We simply updated our systems and alerted our customers so that they were not alarmed by the new response and could update their services to be compatible.
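On the client side, code that tolerates both response shapes might look something like the sketch below. The function name and the returned tuple are our own illustration, and the error code is simply the one shown in the Parity 1.11.1-beta response above. Other clients and versions may report execution errors differently.

```python
VM_EXECUTION_ERROR = -32015  # error code from the Parity 1.11.1-beta response above


def interpret_call_response(response):
    """Return (reverted, data) for either response shape.

    The newer shape reports a revert as a JSON-RPC error. The older shape
    returns a plain result, so a revert cannot be distinguished from an
    empty return value and is reported as not reverted.
    """
    error = response.get("error")
    if error is not None:
        if error.get("code") == VM_EXECUTION_ERROR:
            return True, error.get("data")
        # Any other JSON-RPC error is unexpected here, so surface it.
        raise RuntimeError(f"JSON-RPC error: {error}")
    return False, response["result"]
```

With the two responses above, the newer shape yields `(True, "Reverted 0x")` and the older shape yields `(False, "0x")`, so callers can branch on the first element without caring which client version answered.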

Build On!

We hope that this has been helpful, and that these best practices will make node updates painless for you in the future. They’ve helped us ensure that updates go smoothly both for our own systems and for the customers that rely on us. We know how difficult the intricacies of maintaining and scaling nodes can be, and we want to share what we’ve learned over the years.

If you are working on an Ethereum project and don’t want to deal with the overhead of maintaining nodes, talk to us! Alchemy provides the fastest, most scalable, and most reliable Ethereum infrastructure as a service so that you can focus on building your product. Under the hood, Alchemy has built revolutionary new infrastructure that already powers top blockchain projects like Augur, top blockchain companies like CryptoKitties, and top hedge funds (managing more than $3 billion). Find out more at alchemyapi.io.

Sign up for a free account. Check out our documentation. Visit our website. For the latest news, follow us on Twitter.
