Cosmos Hub 3 Upgrade Post-Mortem & Takeaways

Leopold Joy
Stake Capital
4 min read · Sep 26, 2019

All validators had prepared for a standard upgrade of the Cosmos Hub from cosmoshub-2 to cosmoshub-3 on September 24. This Cosmos SDK upgrade was meant to enable creating and voting on governance proposals that modify on-chain parameters without halting or forking the network, as well as spending from the community fund.

Following the acceptance of Proposal A to move forward with the high-level changes, Proposals B and C were both rejected by the community for a variety of reasons (read more about it here). Proposal D, however, succeeded and was approved almost unanimously by the community.

In the minutes leading up to the chain halt, Yelong of the IRISnet validator raised a concern about the value that the gaiad migrate command writes to the staking.validators[].consensus_pubkey field in the resulting genesis.json file. However, these concerns were not initially addressed in the validators chat, and the launch proceeded.

At around 2pm CET, the Cosmos Hub 2 chain stopped. As validators began to upgrade, it became clear that there was a critical problem. Yelong’s concerns were spot on: the Hub upgrade procedure produced a genesis file with incorrectly formatted conspubkeys.
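A check along these lines, run on the migrated genesis file before the halt, might have surfaced the problem earlier. This is only a minimal sketch: the post does not say what the malformed values looked like, so the code simply flags any consensus_pubkey that is not a string starting with an assumed Bech32 prefix. The `cosmosvalconspub` prefix, the `app_state.staking` location, and both function names are assumptions made for illustration, not part of the official upgrade procedure.

```python
import json

# Assumption: validator consensus keys are expected as Bech32 strings with this
# human-readable prefix. Adjust it to whatever the target gaiad version expects.
EXPECTED_PREFIX = "cosmosvalconspub"


def find_suspect_conspubkeys(genesis, expected_prefix=EXPECTED_PREFIX):
    """Return (index, value) pairs whose consensus_pubkey does not look like
    a Bech32 string with the expected prefix."""
    # Assumption: the staking module state lives under app_state.staking.
    staking = genesis.get("app_state", {}).get("staking", {})
    validators = staking.get("validators") or []
    suspects = []
    for index, validator in enumerate(validators):
        key = validator.get("consensus_pubkey")
        # Bech32 separates the prefix from the data part with the character "1".
        if not (isinstance(key, str) and key.startswith(expected_prefix + "1")):
            suspects.append((index, key))
    return suspects


def check_genesis_file(path):
    """Load a genesis.json file and return its suspect consensus keys."""
    with open(path) as handle:
        return find_suspect_conspubkeys(json.load(handle))
```

An empty result from `check_genesis_file("genesis.json")` would only mean that every key has the assumed shape. It does not validate the Bech32 checksum or confirm that the key matches the validator's actual signing key.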

This roadblock made it impossible for validators to complete the Hub 3 upgrade. StakeWith.Us did eventually post an effective Docker workaround script that fixed the genesis formatting, but by that point the Hub 3 launch had already been postponed.

When the chain had still not started an hour after the halt, debate began over the appropriate course of action. As stated in the upgrade proposal itself, in the event of an upgrade failure, validators were to revert to the previous version of the software. However, some protested, and alternative courses of action were debated.

Additionally, a number of unknowns existed for validators reverting to the existing Hub 2 chain. Firstly, although the halt was set to occur at block 1933000, block production had continued until block 1933002. Would validators who had not set the halt-height field in their app.toml be able to start up the old chain from 1933002 without issue? And if, for example, a validator had double-signed after block 1933000, would it be fair to continue from block 1933002 and punish them, even though the governance proposal had stated the chain would end at block 1933000?
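To make the halt-height question concrete, here is a small sketch of how an operator might confirm what their node is configured to do and how far the chain overshot the planned halt. It assumes halt-height appears in app.toml as a plain `halt-height = <integer>` line and that a missing key or a value of 0 means no halt is configured. The helper names are hypothetical, and the regex is a stand-in for a real TOML parser.

```python
import re

# Matches an uncommented line such as: halt-height = 1933000
HALT_HEIGHT_LINE = re.compile(
    r"^\s*halt-height\s*=\s*(\d+)\s*(?:#.*)?$", re.MULTILINE
)


def read_halt_height(app_toml_text):
    """Return the configured halt-height, or 0 if the key is absent.

    Assumption: 0 is treated as "no halt configured".
    """
    match = HALT_HEIGHT_LINE.search(app_toml_text)
    return int(match.group(1)) if match else 0


def blocks_past_halt(planned_halt, last_block):
    """Number of blocks produced after the planned halt height."""
    return max(0, last_block - planned_halt)


# With the heights reported in this post, two blocks were produced
# after the planned halt.
overshoot = blocks_past_halt(1933000, 1933002)
```

A validator whose `read_halt_height` result is 0 would not have stopped at the planned height on its own, which is the group the first question above is concerned with.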

Secondly, there was confusion about exactly which version of gaiad was required to revert to Cosmos Hub 2, although it rapidly became clear that any v0.36+ version would suffice.

Eventually, enough validators came online and the chain started up from block 1933002 without issue, and without any double signing having occurred. A number of validators were offline and missing blocks at relaunch, but in time everyone came back online.

A few key takeaways from the failed Cosmos Hub 3 launch:

  • The Cosmos team handled the situation well, communicating clearly with the validator community.
  • A significant number of professional validators are still careless, even on the live Cosmos Hub, about snapshotting data prior to deletion. Once the relaunch of Cosmos Hub 2 was decided on, some validators struggled to get a copy of the Hub 2 data, having already cleared all of their chain data.
  • Interestingly, once the allotted window to launch Cosmos Hub 3 had elapsed, validators did not unanimously proceed to relaunch Cosmos Hub 2 as stated in the voted-upon governance proposal. Instead, many validators relied on “official” word from the core team.
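On the snapshotting point, the habit worth building is to make deletion impossible without a verified backup. The sketch below archives a data directory, checks that the archive exists and is non-empty, and only then removes the original. The function name is hypothetical, and the `~/.gaiad/data` path in the usage note is an assumption about a typical node layout. A real procedure would also stop the node first and confirm there is enough disk space.

```python
import shutil
from pathlib import Path


def snapshot_then_delete(data_dir, backup_dir):
    """Archive data_dir into backup_dir, verify the archive, then delete data_dir.

    Returns the path of the archive that was written.
    """
    data_dir = Path(data_dir).resolve()
    backup_dir = Path(backup_dir).resolve()
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Writes <backup_dir>/<data_dir name>.tar.gz containing the whole directory.
    archive = Path(
        shutil.make_archive(
            str(backup_dir / data_dir.name),
            "gztar",
            root_dir=str(data_dir.parent),
            base_dir=data_dir.name,
        )
    )

    # Refuse to delete anything unless the snapshot is actually on disk.
    if not archive.is_file() or archive.stat().st_size == 0:
        raise RuntimeError("snapshot missing or empty; refusing to delete data")

    shutil.rmtree(data_dir)
    return archive
```

A call such as `snapshot_then_delete(Path.home() / ".gaiad" / "data", "/backups/cosmoshub-2")` (both paths are placeholders) would leave a `data.tar.gz` that can be unpacked if the old chain has to be restarted.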

Focus on what you do best and delegate staking to Stake Capital.
