Amazonas Upgrade Guide

Ethan Frey
Regen Network
Sep 23, 2019

This is a guide for the second upgrade of the public Regen testnet, code-named Amazonas. If you took part in the first upgrade, the idea is very similar; we have smoothed out some rough edges, so this should be a nicer experience. If you are new to Regen, please check out the plans for the first upgrade and the feedback on how to improve future upgrades.

Photo by ViniLowRaw on Unsplash

If you remember the first upgrade, the major changes we’ve made are to use block height instead of block time, and to prevent the new binary from running too early. Just as the old binary refused to run once it hit the specified block, the new (v0.5.0) binary will now refuse to run before that block, which is a safer way to avoid upgrade problems. Furthermore, we’ve removed the optional callback to a bash script, as it did not work well in real production setups.

Instead of having xrnd call out to a script, we have built a supervisor daemon, cosmosd, which can launch any cosmos-sdk binary that uses the upgrade module. It transparently wraps the currently active version, much as nvm lets us switch the version of npm and node. However, it also watches the output of the wrapped daemon, and when the daemon requests an upgrade, cosmosd checks whether the named binary is installed and switches to it automatically. This should allow node operators to prepare the upgrade ahead of time, then sleep soundly while cosmosd handles the switching. There are full usage details in the repo, and we will explain it in more detail below.

Option One: Manual Upgrade

Before we get to the details of automation, we will first explain what you need to do manually, so the automation makes sense. Download and uncompress the new v0.5.0 release of xrnd from GitHub, or build it from source. The updated xrnd binary should be present on your validator machine well before the upgrade time. The upgrade is scheduled for block 1722050, which is approximately 13:00 UTC on Thursday, September 26.
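To get a feel for how far away the upgrade is from where your node is now, a rough estimate can be computed from the current height. This is only a minimal sketch: the function name is made up for this guide, and the block time is an assumption you have to measure on your own node (the 6 seconds used in the example is purely illustrative).

```shell
#!/bin/sh
# Rough ETA to the upgrade height. Pure arithmetic; nothing here talks to the node.
UPGRADE_HEIGHT=1722050

# seconds_until_upgrade CURRENT_HEIGHT BLOCK_TIME_SECONDS
# (hypothetical helper; BLOCK_TIME_SECONDS is an assumption, measure it yourself)
seconds_until_upgrade() {
  current="$1"
  block_time="$2"
  remaining=$((UPGRADE_HEIGHT - current))
  # Once the upgrade height has been reached there is nothing left to wait for.
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  echo $((remaining * block_time))
}

# Example: 1000 blocks before the upgrade, at an assumed 6 s per block
seconds_until_upgrade 1721050 6   # prints 6000
```

If your node exposes the Tendermint RPC, the current height is reported as latest_block_height by the /status endpoint. Since block times drift, treat the result as a hint and keep watching the height itself.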

When block 1722050 is created by consensus and passed to the xrnd state machine for processing, the current v0.4.1 binary everyone is running will detect that it is time for a governance-declared upgrade and automatically pause processing. That means the block will not be processed and consensus on 1722051 will not begin. The xrnd daemon will print out a message looking something like:

ERROR: UPGRADE "amazonas" NEEDED at height 1722050: https://gist.githubusercontent.com/... module=main

Once you see this message and note that the daemon is not producing any more blocks, stop the currently running v0.4.1 (“El Choco”) binary, copy the v0.5.0 (“Amazonas”) binary to the same location, and restart the service. (You did prepare the Amazonas binary on your validator ahead of time, right?) Upon restart, you should see a message something like:

INFO: Applying upgrade "amazonas" at height 1722050 module=main

This means you have successfully executed the upgrade migration and are now ready to continue the testnet. Your node will process block 1722050 and try to reach consensus on 1722051 at this point. However, consensus will not proceed until validators holding more than two-thirds of the voting power (roughly 67%) have completed the upgrade and reconnected to the p2p network. Do not be alarmed if no new blocks appear; just wait patiently for the rest of the network.
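The manual swap described above can be sketched as a small helper that keeps a backup of the old binary before installing the new one. The function name, the backup suffix, the install path and the service name are all assumptions for illustration; adapt them to your own setup.

```shell
#!/bin/sh
# swap_binary INSTALLED_PATH NEW_BINARY
# Hypothetical helper: backs up the installed binary next to itself,
# then copies the new binary over it. Run it only while the service is stopped.
swap_binary() {
  installed="$1"
  new="$2"
  if [ ! -x "$new" ]; then
    echo "new binary $new is missing or not executable" >&2
    return 1
  fi
  # -p keeps mode and timestamps, so the executable bit survives the copy
  cp -p "$installed" "$installed.el-choco.bak" || return 1
  cp -p "$new" "$installed"
}

# Intended order on the validator (paths and service name are assumptions):
#   sudo systemctl stop xrnd
#   swap_binary /usr/local/bin/xrnd "$HOME/amazonas/bin/xrnd"
#   /usr/local/bin/xrnd version    # should print 0.5.0
#   sudo systemctl start xrnd
```

Keeping the old binary around costs nothing and makes it easy to see afterwards exactly what was replaced.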

Run xrnd version to confirm you are running the correct version of the software, and never run two copies of xrnd at once. If you have any issues, validators and developers will be online in the Regen Network DVD Telegram channel and can help answer any questions.
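Both checks can be scripted. The following is a minimal sketch with hypothetical helper names; it assumes the output of xrnd version contains the bare version number (as the comments later in this guide suggest) and that pgrep from procps is available on your machine.

```shell
#!/bin/sh
# check_version BINARY EXPECTED
# Hypothetical helper: succeeds if the output of "BINARY version" contains EXPECTED.
check_version() {
  actual=$("$1" version 2>&1)
  case "$actual" in
    *"$2"*)
      echo "ok: $1 reports $actual"
      ;;
    *)
      echo "MISMATCH: $1 reports '$actual', expected $2" >&2
      return 1
      ;;
  esac
}

# running_copies NAME
# Hypothetical helper: prints how many processes are named exactly NAME.
running_copies() {
  pgrep -x "$1" | wc -l
}

# Usage on the validator (path is an assumption):
#   check_version /usr/local/bin/xrnd 0.5.0
#   running_copies xrnd    # anything above 1 means two copies are running
```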

Option Two: Automatic Upgrade with Cosmosd

Now to explain the upgrade manager. We made a cosmos upgrade manager called cosmosd to handle the switching step automatically for you. Assuming you set up the binaries properly ahead of time, it will automate the binary switch described above at the actual upgrade time. This is new code that has not been run in a live upgrade yet (except for one-node dev nets), so if you use this approach, we suggest you still be awake and watch the output on the validator machine when the time comes. Hopefully it works by itself, but if it fails, you can fall back to the manual upgrade. In that case, please give us feedback on what went wrong on your machine, so we can make it more robust for the future. We have only tested this code on our local setups, and operating conditions can vary quite a bit.

The first step is to take a look at the usage guide for cosmosd, which explains all environment variables and the directory structure. This is a great reference if any of the following doesn’t make sense. Second, we assume you run xrnd as a systemd service. This is not necessary, but we will refer to it below, so adjust as needed. Finally, we assume you currently have a running v0.4.1 “El Choco” node synced to the blockchain and want to do a horizontal migration to using cosmosd.

First step: set up the binary directory

You will need to create a folder to hold the various binaries. Let’s call this DAEMON_HOME and set it to $HOME/.cosmosd unless you have a better idea. You need to make various directories under this one and set some binaries there:

export DAEMON_HOME=$HOME/.cosmosd
mkdir -p $DAEMON_HOME/upgrade_manager
cd $DAEMON_HOME/upgrade_manager
mkdir -p genesis

You can then manually build the binaries and place them there: v0.4.0 in genesis, v0.4.1 in el-choco, v0.5.0 in amazonas. Or you can download prebuilt Linux binaries via curl. Here is how to get v0.4.1:

mkdir -p upgrades/el-choco/bin
cd upgrades/el-choco
curl -L -o bin/xrnd https://github.com/regen-network/regen-ledger/releases/download/v0.4.1/xrnd-v0.4.1
chmod +x bin/xrnd
./bin/xrnd version # this should print 0.4.1
cd ../..

And here v0.5.0:

mkdir -p upgrades/amazonas
cd upgrades/amazonas
curl -L -o regen.tar.xz https://github.com/regen-network/regen-ledger/releases/download/v0.5.0/regen-ledger-v0.5.0-linux-amd64.tar.xz
tar xJf regen.tar.xz # -J selects xz; the archive is .tar.xz, not gzip
./bin/xrnd version # this should print 0.5.0
cd ../..

We can leave genesis empty here. If you really start at the beginning of the chain, please place the proper binary there, and cosmosd will work great. However, we are assuming the node is already on el-choco, so we just set the current release to el-choco and cosmosd will continue from there:

# make sure to set the current link, if not starting from genesis
ln -s $DAEMON_HOME/upgrade_manager/upgrades/el-choco current
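Before handing control to cosmosd, it is worth confirming that the layout built above is complete. The helper below is a sketch with a made-up name; it only assumes the directory structure described in this guide (a current link plus upgrades/amazonas, each containing bin/xrnd).

```shell
#!/bin/sh
# check_layout DAEMON_HOME DAEMON_NAME
# Hypothetical helper: verifies that the binary cosmosd should run now (via the
# "current" link) and the binary for the amazonas upgrade both exist and are
# executable.
check_layout() {
  root="$1/upgrade_manager"
  name="$2"
  status=0
  for dir in current upgrades/amazonas; do
    # -x follows symlinks, so a dangling "current" link is reported as missing
    if [ -x "$root/$dir/bin/$name" ]; then
      echo "ok: $dir/bin/$name"
    else
      echo "missing or not executable: $dir/bin/$name" >&2
      status=1
    fi
  done
  return $status
}

# Usage:
#   check_layout "$HOME/.cosmosd" xrnd
```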

Add the cosmosd binary to run it:

# get a prebuilt copy of cosmosd, or build from source
curl -L -o cosmosd https://github.com/regen-network/cosmosd/releases/download/v0.1.0/cosmosd
chmod +x cosmosd

And then we make sure it works well with this directory structure:

export DAEMON_HOME=$HOME/.cosmosd
export DAEMON_NAME=xrnd
./cosmosd version
# should output 0.4.1

Note that cosmosd passes through all arguments and environment variables to the currently selected binary. It also prints out the stdout and stderr streams, so you can use it in place of the normal xrnd binary. Just make sure you tell it where --home is (so the xrnd binary can find its config). Once you have done this, you can modify the systemd service. I will assume it is called xrnd.

sudo systemctl stop xrnd
# open the service file in an editor
# add the following two lines to xrnd.service
Environment=DAEMON_HOME=<path you set up above>
Environment=DAEMON_NAME=xrnd
# change ExecStart
ExecStart=<path to>/cosmosd start
# reload systemd so it picks up the edited unit file, then start again
sudo systemctl daemon-reload
sudo systemctl start xrnd

Your xrnd.service file should look something like this at the end (please modify directories for your setup):

[Unit]
Description=Regen Xrnd
After=network-online.target
StartLimitIntervalSec=0
[Service]
User=root
Type=simple
ExecStart=/root/.cosmosd/upgrade_manager/cosmosd start
Environment=DAEMON_HOME=/root/.cosmosd/
Environment=DAEMON_NAME=xrnd
Restart=always
RestartSec=3
LimitNOFILE=4096
[Install]
WantedBy=multi-user.target

Check the log files. If you set everything up properly above, the node is now running the v0.4.1 el-choco release exactly as before, but managed by cosmosd. This means that when cosmosd sees the critical line:

ERROR: UPGRADE "amazonas" NEEDED at height 1722050: https://gist.githubusercontent.com/... module=main

it can automatically stop the process, update the current binary link to amazonas, and restart. This will trigger the migration and continue consensus several seconds later, all while you are slowly sipping your coffee.
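To make the switching step concrete, here is a rough shell sketch of the idea: pull the upgrade name out of the log line, check that a binary was prepared under that name, and repoint the current link. This is not how cosmosd itself is implemented; the function names are invented for this illustration, and the log format is assumed to match the sample line above.

```shell
#!/bin/sh
# upgrade_name_from_line LINE
# Hypothetical helper: prints the quoted upgrade name if LINE is an
# "UPGRADE ... NEEDED" message, and prints nothing otherwise.
upgrade_name_from_line() {
  printf '%s\n' "$1" |
    sed -n 's/.*UPGRADE "\([^"]*\)" NEEDED at height [0-9][0-9]*.*/\1/p'
}

# switch_current DAEMON_HOME DAEMON_NAME UPGRADE_NAME
# Hypothetical helper: repoints the "current" link, but only if a binary
# was prepared for that upgrade ahead of time.
switch_current() {
  root="$1/upgrade_manager"
  target="$root/upgrades/$3"
  if [ ! -x "$target/bin/$2" ]; then
    echo "no binary prepared for upgrade $3" >&2
    return 1
  fi
  # -n replaces the existing link instead of creating a new link inside
  # the directory it currently points to
  ln -sfn "$target" "$root/current"
}

# Usage, given a log line in $line:
#   name=$(upgrade_name_from_line "$line")
#   [ -n "$name" ] && switch_current "$HOME/.cosmosd" xrnd "$name"
```

Refusing to switch when the binary is missing is the important part: it is the reason the upgrade binary has to be in place before the upgrade height is reached.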

Feedback?

I hope you enjoy this new tool. I encourage many of you to try out cosmosd, and many others to do the manual change as well… diversity breeds resilience. A big thank you to everyone whose feedback shaped these new tools and safety checks. I’m looking forward to seeing a smoother testnet upgrade this time.

Keep an eye on the block height and see you Thursday for the Amazonas upgrade. As always, please join the Regen Network DVD telegram group for any development or validator related questions.
