Post-mortem: ETH Router Upgrade
The ERC-777 re-entrancy bug, trading halt, Router Upgrade, USDT rescue, return to solvency and operational, improvements
On 9th July 2021, a whitehat discovered a vulnerability in the THORChain router when dealing with ERC-777 tokens and worked with the team to rescue the funds back to the THORStarter deployer account. The Router was patched and deployed to testnet, but during testing an issue when dealing with non-standard ERC-20’s was found, specifically USDT. A recovery plan was posed to the community, which was to rescue the USDT back to the treasury (after a 24 hour notice) and the rest of the tokens migrated via the normal Router Upgrade path. Complications arose since simultaneously the Testnet was found to not be operational, so a new one had to be established. Ultimately the router was upgraded and trading resumed.
A simple bug, but not one contemplated at the time the Router was deployed or reviewed by the THORChain community for MCCN launch. An audit of the Router was scheduled to be performed by TrailOfBits in July, which will go ahead but now on Router V3.
The root cause is that ERC-777 allows hooks to be called on transfers. This is the same as sending ETH to a contract (a hook can be called). The hook called a secondary deposit into the router, which caused the funds to be partially “double-dipped” and the member credited more allowance than they should. The member called recursive deposits until their “allowance” was the full Router balance. They then transferred this out.
The fix was to add a simple re-entrancy guard.
A trading halt via mimir was then established since un-aware users could deposit XRUNE (to sell) and have them immediately stolen by copycats.
An obvious improvement to this is to have a chain-specific halts instead of global halts. This was obvious, since the entire network was not operational when it was just one chain that was problemetic.
This has now been prepared and due for release:
Resolve “ADD: chain-specific halt trading”
The THORChain core devs do not run any bonded mainnet or testnet nodes. They only run full-nodes which make services available to the community (such as thornode.thorchain.info).
Multichain Testnet operators use it to practice churning. There is a known issue that Nodes are slightly ETH insolvent (due to gas issues) and cannot unbond. This is known to cause friction for Node Operators (perhaps they need better tooling) causing some of them to give up and abandon their nodes instead of fixing the issue and leaving gracefully. Due to unfortunate timing, more than 1/3rd of the network was offline/abandoned and could not be used for testing.
This is unlikely to happen on mainnet — no sane operator would abandon a $3m+ bond.
Re-establishing a new testnet cost the team a day.
Whilst testing the Router, another issue came to fruition — the USDT asset could not be migrated since it was not compatible with the ERC-20 standard.
The fix was to remove the ERC-20 interface, encode it directly in the function and simply require
success — this skips looking for any returned data.
Since USDT was not compatible with the preferred migration path, there were three options posed to the community:
- Ragnarok all USDT LPs and migrate with no USDT
- Recover the USDT to a treasury address, migrate manually
- Recover the USDT to a smart contract, migrate trustlessly (anyone could call the subsequent deposit)
The community ultimately settled on (2) as the safest and fastest way to pull off the recovery since the intent was to re-establish swaps, and not a test of decentralisation.
Normally Router upgrades are designed to be trustless and autonomous. As soon as a new Router is added to the network, the nodes will churn towards it using existing churn logic. They pass the router into their call data, and the smart contract *itself* will migrate forward; see the code below.
However, this upgrade was urgent, and required additional steps, such as rescuing USDT and re-depositing USDT and XRUNE back in. As such, special logic was added to allow mimir to orchestrate the process. The exact 11 steps as below:
1. Mimir set `StopFundYggdrasil` to `1` , this will stop funding yggdrasil vaults
2. Mimir set `MimirRecallFund` to `1` , this thet network will ask yggdrasil to send all their fund back to asgard on ETH chain
3. Double check all yggdrasil vaults return their erc20 tokens , if all yggdrasil vault only has dust ETH left in it , then go to step 4)
4. Mimir set `MimirWithdrawUSDT` to `1` to extract USDT , this will trigger the USDT outbound, make sure the outbound has been send all USDT to designate address
5. Mimir set `MimirUpgradeContract` to `1` , this will trigger the network to upgrade contract to new one
6. Make sure all yggdrasil contract has been updated to the new contract address
7. Mimir set `ChurnInterval` back to 43200 so node churn can be kicked off
8. Once churn kicked off , make sure the new asgard has the new contract, and migrate is migrating fund to new contract
9. When churn finished , credit USDT and XRUNE back to new asgard, new contract using memo `noop`
10. Mimir set `StopFundYggdrasil` to 0 , the network will go ahead to fund yggdrasil vaults
11. Once ygg funded, Mimir set `HaltTrading` to 0 , the network will be available for swaps
Return to Operations
The team and community worked together to orchestrate this:
1) Sync the upgrade to tip
2) Release the upgrade and wait for 100% of nodes to adopt
3) Coordinate the Router Migration
4) Re-establish trading
This was finally finished on 14 July — 5 days after the halt.
The list of improvements:
- RouterV2 has a re-entrancy guard, compatible with ERC-20s that are non-standard, and has a new function requested by the community to expire deposits
- THORChain community has validated the Router Upgrade logic and will be prepared to more readily migrate to Router V3 (with more advanced logic).
- THORNode will have a chain-specific trading halt logic, which would have kept the network mostly operational
- Re-establish Testnet and engage with high-quality node operators to run the testnet.
In addition, the RouterV3 will be audited by both Halborn and TrailOfBits soon.
To keep up to date, please monitor community channels, particularly Telegram and Twitter:
- Twitter: https://twitter.com/thorchain_org
- Telegram Community: https://t.me/thorchain_org
- Telegram Announcements: https://t.me/thorchain
- Reddit: https://reddit.com/r/thorchain
- Gitlab (primary): https://gitlab.com/thorchain
- Github (secondary): https://github.com/thorchain
- Medium: https://medium.com/thorchain