Trains, Planes & Network Upgrades
A regular release cadence for Ethereum 1.x
This post is a rough transcript of the talk that Danno Ferrin and I gave at Devcon V. It discusses some of the EIP process improvements suggested by the community over the past year and combines them into a single framework for how we can do Ethereum Network Upgrades more smoothly.
We’ve named it the Train Station Model.
We propose a new way of coordinating Ethereum Network Upgrades, based on several suggestions by the community over the past year. Specifically, we think that:
- Upgrades should happen twice per year, to provide predictability to the community;
- Upgrades should only contain EIPs that are ready to ship, and anything still in progress should be moved to the next upgrade rather than delaying the current one;
- EIPs should be developed by Working Groups in an independent way and should only be considered for inclusion in an upgrade once they are done;
- EIPs & Working Groups should have a Champion who is the main point of contact for the community regarding that EIP.
If you’d like to discuss this proposal further, please head to the EthMagicians thread 🧙🏻‍♂️
Let us start with a little bit of “Ethereum Network Upgrade history”. The first handful of network upgrades, Frontier, Homestead, Byzantium and Constantinople, were somewhat like 1950s family vacations: mom and dad would pack the car, we’d hop in the back, maybe bring our favorite toy along, and they would get us to our destination safely.
This is a simplified retelling, but, for those upgrades, core developers wrote most of the EIPs and the Ethereum community was small enough that they were socialized to the right people somewhat organically and forks happened smoothly.
… except when they didn’t! A few times, we did have to fight fires. On those occasions, it was all hands on deck. Be it the Shanghai Attacks, the DAO hack, or the last-minute vulnerability discovered in Constantinople, the community always came together to propose changes, implement them and get users to upgrade their nodes in a timely manner.
And then came Istanbul. By that point, we had a good grasp of our process and decided to plan things more thoroughly. We set ourselves some deadlines for each stage of the upgrade process, from the EIP submissions all the way to the mainnet upgrade. Before we knew it, we were looking at an Ethereum-waterfall process!
As we know, waterfall doesn’t work for software development. But we went with it anyway for Istanbul.
We had planned to have a kickoff in January, take several months to review EIPs and have a deadline for Acceptance mid-May, then take two months for client implementations, which would bring us to mid-July. With all this done, we could have our testnet upgrades around mid-August, and go live on mainnet mid-October… two weeks before we thought Devcon would happen.
So, how many of these deadlines did we actually hit? One. The kickoff.
A lot of things went wrong in the process. One of the more significant ones is that, because the Ethereum community had grown substantially since the last upgrade, when it came time to start reviewing EIPs for Istanbul, instead of having only a handful of proposals to go over, the core developers had to wrap their heads around over 30 of them!
This drastically slowed down the process. Not only were there many EIPs, but they were in completely different stages of development (some weren’t merged while others had running testnets) and there were several co-dependent or competing ones.
By the time we had our final list of EIPs for Istanbul, it was the middle of the summer, which was when we should have had our client implementations ready.
As this was happening, a lot of people realized that this process was far from optimal. There were several suggestions about how we could make Berlin, the upgrade after Istanbul, run much more smoothly.
We’ll now go over some of these suggestions and then combine them into what we’ve called the “Train Station Model”.
Eth 1.x As an Attempt to Change the “Process”
The first person to try and address the process issues during Istanbul was Alexey Akhunov. He wrote a blog post describing how we could form working groups and use ReTestEth as a way to grow the number of people who could contribute protocol improvements to Ethereum, while also reducing the burden on existing core developers.
In the “pre-1.x” process, submitted EIPs would be implemented by client developers from mostly Geth, Parity and Aleth. As this happened, the EIP would be collaboratively refined by these teams. Once clients had agreed on a final specification for the EIP, reference tests would be generated. Aleth was the only client that could do so, so any EIP would have to be implemented in Aleth in order to generate these tests. The EF’s testing team would then run the tests and write the consensus tests against which all clients would run.
This process has several bottlenecks: major client teams for implementations, Aleth to generate reference tests, and the EF’s testing team to write consensus tests.
As an alternative to this approach, Alexey proposed the idea of having working groups. These groups would be composed of people who have a common desire to improve a part of Ethereum. They could work on an EIP from its earliest stage and help refine it to a point where its specification is more or less final. This way, client developers would be working with an EIP that is fairly advanced, likely has at least an initial implementation and whose open questions have mostly been answered.
Along with this, ReTestEth, a new testing tool developed by the EF’s testing team, would enable working groups to generate their own reference tests for their EIP via any client that supports it (currently Geth, Aleth and Hyperledger Besu).
This Working Groups framework would therefore not only make the EIP refinement process more decentralized, but would also reduce the bottleneck on the testing side.
During the lead-up to Istanbul, it was not uncommon to have AllCoreDevs calls where no one was present to speak about a specific EIP. This not only slowed down the development of that EIP, but in cases where there were dependencies or competing EIPs, it would slow down the process for the entire upgrade.
One simple idea to address this, proposed by Alex Beregszaszi during the ETH 1.x Berlin meetings, was to require EIPs to have a single Champion.
The Champion’s role is to be the point of contact for everything related to the EIP. They act as a coordinator for the EIP, someone who is accountable for making sure things are moving forward and that the appropriate people are looped into the right discussions. They aren’t in charge of doing all of the work, and don’t necessarily have to be implementers, but they should be the go-to person for that EIP and commit time to socializing it within the community.
Another idea, proposed recently by Martin Holst Swende from the EF, is to modify our network upgrade process to make it more EIP-centric.
Instead of focusing on the upgrades themselves and ensuring that all EIPs move at the same pace throughout the phases of the upgrade, we should focus on getting EIPs into a shippable state and then, and only then, schedule EIPs for upgrades.
If someone wanted to get an EIP from a draft to mainnet, here is how they would go about it under this framework (quotes taken from the original proposal, with some minor edits for readability):
Step 1: Get ACD blessing
Presuming that an EIP exists (step 0), the Allcoredevs would officially decide on whether the EIP is “Initially Accepted”.
“Initially Accepted” (or, ‘blessed’) means that ACD, representing the major clients and ecosystem stakeholders etc. are positive towards the EIP, would accept (well written) PRs to include the EIP into the codebase, so that it could be toggled on for testing…but not with an actual block number for activation
This “Initially Accepted” status would also be a useful signalling mechanism for organizations, like the EF, ConsenSys or even MolochDAO, that fund the teams working on protocol upgrades. Funding could be split into stages (i.e. pre and post “Initial Acceptance”) to ensure that most funds are spent on initiatives that have a high likelihood of going live on mainnet.
Step 2: Implementations
Once ACD has given the go-ahead, developers and/or EIP-authors implement it and make PRs against the clients.
If implementations are merged into major clients, this milestone is complete.
Step 3: Test cases
Since the feature is now ‘activateable’ within the clients, it’s now possible to produce cross-client testcases for the feature.
The testcases should contain happy-path tests, and quirk/edgecase tests.
This step should be performed in conjunction with many people that have in-depth knowledge not only about the EIP, but also the EVM in general, to get maximum coverage on what should be considered edgecases.
At this point, a security review should be done, and the review items should be fed back into the EIP, under “Security Considerations”. The review should also focus on finding edgecases for testing.
This milestone is completed once the test-team considers the tests to be complete.
Step 4: ACD final acceptance.
At this point, the ACD can again take the EIP up for discussion and evaluate the EIP, the implementations, side-effects, testcases etc.
If everything appears to be in order, the ACD can simply decide when to activate the EIP.
“Yes, let’s activate this EIP on testnet in one month (at block X), and on Mainnet two months from now (at block Y)”. All clients will contain the upgrade at next release, within one week from now, and also functionality to postpone the EIP via command-line flag.
If it should happen that multiple EIPs reach Step 4 simultaneously, the ACD might decide to roll two or three out simultaneously — unless there are concerns that the EIPs might have internal dependencies/couplings which might interfere (or cause additional edgecases for testing).
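The four steps above can be read as a small state machine. Here is a minimal sketch of that reading in Python. The `Stage` names and the `advance` and `can_schedule` helpers are our own illustrative labels, not terms from the original proposal or from any client codebase.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Milestones of the EIP-centric process (names are illustrative)."""
    DRAFT = 0                # Step 0: an EIP exists
    INITIALLY_ACCEPTED = 1   # Step 1: ACD blessing
    IMPLEMENTED = 2          # Step 2: implementations merged into major clients
    TESTED = 3               # Step 3: test team considers the tests complete
    FINAL_ACCEPTED = 4       # Step 4: ACD final acceptance


def advance(stage: Stage) -> Stage:
    """Move an EIP to the next milestone; milestones cannot be skipped."""
    if stage is Stage.FINAL_ACCEPTED:
        raise ValueError("EIP has already reached final acceptance")
    return Stage(stage + 1)


def can_schedule(stage: Stage) -> bool:
    """Only an EIP that has cleared Step 4 is given activation blocks."""
    return stage is Stage.FINAL_ACCEPTED
```

The point of the sketch is that activation blocks are only discussed at the very end: an EIP sitting at `TESTED` is toggleable in clients and covered by tests, but still has no block number attached to it.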
To represent this new process visually, James Hancock put together a graphic showing how an EIP would move through each of the stages mentioned above to make its way onto mainnet.
Using this framework would allow each EIP to move at its own pace, would reduce the amount of time spent arguing about the time and scope of upgrades, and would relieve some of the pressure on the testing team by providing a more predictable flow of EIPs to test instead of a single burst of EIPs when the upgrade deadline comes along.
One more idea proposed to help smooth the Ethereum Network Upgrade process was EIP-1872, authored by Danno Ferrin. As more and more companies rely on Ethereum as a core part of their infrastructure, we should aim to make network upgrades happen more predictably.
The EIP proposes to adopt a default schedule around network upgrades, similar to what Microsoft did with Patch Tuesday. This way, people running Ethereum nodes can know with high probability when they should be monitoring for potential upgrades.
In short, the EIP proposes to:
- Have mainnet upgrades happen by default on the 3rd Wednesday of a month, preferably in January, April, July or October, to dodge most major US and European holidays.
- Push delayed upgrades back to the next 3rd Wednesday of a month (as happened with Constantinople)
- Aim to have upgrades happen on mainnet every 6 months for now
- Continue fighting fires whenever and wherever they occur
This EIP would not lock us in to these dates if something extraordinary happens, but would provide strong defaults that can help reduce uncertainty about when the next upgrade is coming (6 months from now) and thus also the time spent by core developers trying to coordinate exactly when an upgrade should happen.
It also offers a “middle ground” between our current approach and the “one EIP, one upgrade” approach proposed by EIP-Centric Forking.
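To make the default schedule concrete, here is a rough sketch of the date arithmetic in Python. The function names and the `preferred_only` flag are ours, chosen for illustration; the EIP itself does not prescribe any code.

```python
import calendar
from datetime import date

PREFERRED_MONTHS = (1, 4, 7, 10)  # January, April, July, October


def third_wednesday(year: int, month: int) -> date:
    """Return the third Wednesday of the given month."""
    first_weekday = date(year, month, 1).weekday()  # Monday == 0
    # Days from the 1st of the month to its first Wednesday.
    offset = (calendar.WEDNESDAY - first_weekday) % 7
    return date(year, month, 1 + offset + 14)


def next_default_date(after: date, preferred_only: bool = True) -> date:
    """First default upgrade date strictly after `after`.

    With preferred_only=False this gives the fallback date for a
    delayed upgrade: the next 3rd Wednesday of any month.
    """
    year, month = after.year, after.month
    while True:
        candidate = third_wednesday(year, month)
        if candidate > after and (not preferred_only or month in PREFERRED_MONTHS):
            return candidate
        month += 1
        if month > 12:
            year, month = year + 1, 1
```

For example, `third_wednesday(2020, 1)` is 15 January 2020. If an upgrade planned for that day slipped, `next_default_date(date(2020, 1, 15), preferred_only=False)` would give 19 February 2020 as the fallback, while the next preferred-month slot would be 15 April 2020.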
The Train Station Model
Combining various parts of the proposals detailed above, we arrive at a process that works more like a train station than like an airport, which is what our current process resembles.
Today, our upgrades resemble how buying a plane ticket works: the date and time are fixed, and no matter what, we’ll try and get everyone on the plane, especially if they’ve checked luggage, even if that causes delays.
We believe that moving to a model where, instead, people show up to the train station luggage in hand, ready to go, and depart on the next available train will help make network upgrades much smoother than they are today.
Specifically, there are four main components to this Train Station Model.
The first is that EIPs should progress independently. Working groups can work towards moving their EIP forward, and only once it is in a state where it is ready to go live is it scheduled for a network upgrade.
The second is that EIPs & Working Groups need champions. A champion should act as the representative and default point of contact for an EIP. They are in charge of representing it on AllCoreDevs and other forums, and may or may not be the actual implementers.
The third is that what is done is what ships. Whichever EIPs are ready to go live when an upgrade comes around are the EIPs that are part of the upgrade. Anything still in progress simply isn’t scheduled for the upgrade. Similarly, if a last-minute issue comes up with an EIP, we move it over to the next upgrade rather than delaying the entire upgrade.
The fourth and last component is that upgrades happen twice per year. By aiming for 6 months between upgrades, we can reduce uncertainty and delays for teams working on EIPs. This way, if something isn’t included in a specific upgrade, a commitment is already made to have a subsequent one a few months after. While the upgrade date is picked well in advance, specific blocks for testnets are picked 8–10 weeks prior to the mainnet upgrade date, and the mainnet block is picked 4–6 weeks prior to the upgrade.
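As a rough illustration of that timeline, the sketch below works backwards from a mainnet upgrade date to the windows in which the testnet and mainnet blocks would be picked. The function name and the dictionary keys are hypothetical, and the example date is only an assumption used to show the arithmetic.

```python
from datetime import date, timedelta


def block_selection_windows(mainnet_upgrade: date) -> dict[str, tuple[date, date]]:
    """Date ranges (earliest, latest) in which activation blocks are picked."""

    def window(earliest_weeks: int, latest_weeks: int) -> tuple[date, date]:
        return (
            mainnet_upgrade - timedelta(weeks=earliest_weeks),
            mainnet_upgrade - timedelta(weeks=latest_weeks),
        )

    return {
        "testnet_blocks": window(10, 8),  # 8-10 weeks before mainnet
        "mainnet_block": window(6, 4),    # 4-6 weeks before mainnet
    }
```

For a hypothetical mainnet upgrade on 15 April 2020, testnet blocks would be picked between 5 and 19 February, and the mainnet block between 4 and 18 March.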
… and there you have it! We hope that this model can provide more efficient, predictable and frequent network upgrades to Ethereum.
Special thanks to Alexey Akhunov, Alex Beregszaszi, María Paula Fernández, Boris Mann, Martin Holst Swende, James Hancock, and all the others who have contributed suggestions about how we can improve the way we upgrade Ethereum.