Published in THORChain

Post-mortem: ETH Router Exploits 1 & 2, and premature Return To Trading Incident

A review of ETH Router Exploits 1 and 2, the premature return to trading, the fixes and network response, and the 5-Pronged Recovery Plan.

Summary

THORChain suffered two back-to-back exploits on its ETH Router. The first took all the ETH from the system via an attack contract that sat in front of the Router, and the second took all the economically significant ERC20s via an attack contract that sat behind the Router.

In both cases the exploits were able to trick the Bifrost into reporting that it had received assets it had not. The root cause was a Bifrost interface that did not fully account for the degrees of manipulation that can occur in smart contract events.

No other chains or assets were affected.

The THORChain team and community have kicked off a 5-Pronged Plan to address, fix and recover. The five prongs are detailed below.

The THORChain treasury will cover all losses to LPs. Nodes are not affected.

Exploit 1 — ETH

The attacker deployed a contract that sat in front of the Router, which was able to call the deposit() function of the Router. The ability for the Router to be wrapped was recently made available to support ecosystem development. The full scope of this was not assessed thoroughly at the time.

The attack contract simply diverted the msg.value back to the attacker, calling into the Router with a value of 0. The Bifrost read the msg.value instead of the emitted deposit event. Reading the value this way is necessary to support Router upgrades, but it should not have been used for deposits.

The fix was to enforce that for a deposit action, only the deposit event is read.
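The difference between the two ways of observing a deposit can be sketched as follows. This is a minimal illustration, not the Bifrost code: the `Tx` and `DepositEvent` types, both function names, the 200 ETH figure and the memo are hypothetical. The point is that a wrapping contract controls the value of the outer call independently of what actually reaches the Router.

```python
from dataclasses import dataclass
from typing import List

ETH = 10**18  # wei per ETH


@dataclass
class DepositEvent:
    """Hypothetical model of a deposit event emitted by the Router."""
    amount_wei: int
    memo: str


@dataclass
class Tx:
    """Hypothetical model of an observed Ethereum transaction."""
    value_wei: int                       # value attached to the outer call
    deposit_events: List[DepositEvent]   # events the Router emitted


def observed_amount_vulnerable(tx: Tx) -> int:
    # Trusts the outer transaction value, which a wrapper contract can
    # keep for itself while forwarding 0 to the Router.
    return tx.value_wei


def observed_amount_fixed(tx: Tx) -> int:
    # Credits only what the Router itself reported in its deposit events.
    return sum(event.amount_wei for event in tx.deposit_events)


# Illustrative attack: the wrapper receives 200 ETH (made-up figure) and
# calls the Router's deposit() with a value of 0.
attack_tx = Tx(
    value_wei=200 * ETH,
    deposit_events=[DepositEvent(amount_wei=0, memo="illustrative-memo")],
)

print(observed_amount_vulnerable(attack_tx) // ETH)  # amount wrongly credited
print(observed_amount_fixed(attack_tx) // ETH)       # amount actually deposited
```

Under these assumptions the first observer credits the full outer value, while the second credits nothing, because the Router never received anything.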

Attacker Wallet: 0x3a196410a0f5facd08fd7880a4b8551cd085c031

Contract Address: 0x4a33862042d004d3fc45e284e1aafa05b48e3c9c

Tornado Address: 0x4b713980d60b4994e0aa298a66805ec0d35ebc5a

A full write-up is available here:
https://thearchitect.notion.site/THORChain-Incident-07-15-7d205f91924e44a5b6499b6df5f6c210

Impact (~$8M USD)

The attacker deposited fake ETH into the contract many times, swapping to other ERC20s, which artificially raised their prices and paid large amounts in fees. Finally, they were able to siphon out the ETH by forcing a refund (using a deliberately bad memo).

In all they siphoned ~4,200 ETH from the system, and caused a huge spike in arbitrage volumes.

Premature Return to Trading

The network was rapidly halted by nodes to limit the impact. Around 700 ETH was retained in pending outbounds. A subsequent update was then released to purge these outbounds and save the 700 ETH. The system thought it had 13,000 ETH, but it only had 700 ETH, so the update also contained a store migration to correct the balance. This store migration would cause the price of ETH in the system to go from $350 (13k ETH) to around $7000 (700 ETH), to reflect the actual pool balances.
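The size of the price jump follows from the pool balances. A rough sketch, assuming the pool price of ETH is the value of the other side of the pool divided by the ETH depth, and that the other side is left unchanged by the migration (both are simplifying assumptions; `implied_price` is a hypothetical helper):

```python
def implied_price(other_side_usd: float, eth_depth: float) -> float:
    """Pool price of ETH implied by the two sides of the pool."""
    return other_side_usd / eth_depth


recorded_eth = 13_000   # ETH the system thought it had
actual_eth = 700        # ETH it actually had
price_before = 350.0    # USD per ETH implied by the recorded balance

# Value of the other side of the pool, held fixed across the migration.
other_side_usd = price_before * recorded_eth

price_after = implied_price(other_side_usd, actual_eth)
print(price_after)  # 6500.0
```

The result, $6,500, is in line with the "around $7000" quoted above.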

The plan was to complete the update and allow arbitrage to sell the ETH down from $7k to $2k (actual price), so the brief to the admins was to enable trading after the upgrade.

The upgrade process was not adequately war-gamed. The upgrade instructions should have been to restart thord, update, then immediately shut the Bifrost service down. This is because there is a narrow window of time from 67% updated to 100% updated where the old logic still applies, but the network is operational. Ideally a mimir should have been in place to halt signing programmatically.

LP Withdrawals

Once 67% had updated, the network restarted and began processing txIns. What wasn’t planned was that ETH LPs began withdrawing asymmetrically to ETH to take advantage of the fact that they were getting a claim on 13k ETH, when there was only 700 ETH.

The correct response to this was to ask Nodes to shut down their Bifrosts to stop the withdrawals (the system was rapidly becoming insolvent), OR, have in place a mimir to halt withdrawals. This mimir setting hadn’t been built because of the system’s philosophy to never block withdrawals.

In the heat of the moment the on-duty mimir admin incorrectly inferred that the right response was to enable trading, so that the ETH price would correct and the abuse from ETH LPs would stop. This was as per the brief, but it was premature, since the ETH price hadn’t yet been updated by the store migration. The end result was that re-enabling trading caused arbitrage agents to buy cheap ETH, instead of selling expensive ETH. By buying the cheap ETH, they took the remaining ETH in the system and the network went insolvent.
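Why the ordering mattered can be shown with a small sketch. The direction of arbitrage depends only on whether the pool price sits below or above the market price; the figures are those quoted above ($350 before the migration, around $7k after it, $2k market), and `arb_direction` is a hypothetical helper, not part of THORChain.

```python
def arb_direction(pool_price_usd: float, market_price_usd: float) -> str:
    """Return what a profit-seeking arbitrageur does against the pool."""
    if pool_price_usd < market_price_usd:
        # ETH is cheap in the pool: arbs buy it, draining ETH from the pool.
        return "buy ETH from pool"
    if pool_price_usd > market_price_usd:
        # ETH is expensive in the pool: arbs sell into it, adding ETH.
        return "sell ETH into pool"
    return "no arbitrage"


MARKET_PRICE = 2_000.0

# Trading enabled before the store migration: pool still prices ETH at $350.
print(arb_direction(350.0, MARKET_PRICE))    # buy ETH from pool

# Trading enabled after the store migration: pool prices ETH near $7,000.
print(arb_direction(7_000.0, MARKET_PRICE))  # sell ETH into pool
```

The same switch therefore either drains or refills the pool, depending on whether the balance correction has landed first.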

Nodes were asked again to halt.

Exploit 2— ERC-20s

The fixes to the issues above were then put in place and pushed out to the network. The fix also contained an ability to halt an entire chain and programmatically stop withdrawals. The fix also contained logic to divert the arb transactions to the treasury since the system was insolvent and could not fulfil them. This required the Bifrosts to be online to restart.

What was unknown at the time was that there was another critical vulnerability in the ETH Router. The attacker deployed a fake router contract that emitted a deposit event whenever it was sent ETH. The attacker then called returnVaultAssets() on the THORChain Router with a small amount of ETH, passing the fake router as the Asgard vault. The THORChain Router forwarded the ETH to the fake Asgard, which emitted a fake deposit event with a malicious memo. The Bifrost treated this as a normal deposit and, because the memo was deliberately invalid, refunded the attacker.
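This post does not describe the fix for the second exploit. One plausible defensive check, assumed here purely for illustration, is to honour a deposit event only when it was emitted by the known Router address. The `Log` type and `accept_deposit` function are hypothetical; the two addresses are the Router and attack contract listed in this section.

```python
from dataclasses import dataclass

# Router address as listed in this post.
TRUSTED_ROUTER = "0xc145990e84155416144c532e31f89b840ca8c2ce"
# Attack contract address as listed in this post.
ATTACK_CONTRACT = "0x700196e226283671a3de6704ebcdb37a76658805"


@dataclass
class Log:
    """Hypothetical model of an event log seen by an observer."""
    emitter: str      # address of the contract that emitted the event
    event: str        # event name
    amount_wei: int
    memo: str


def accept_deposit(log: Log, trusted_router: str = TRUSTED_ROUTER) -> bool:
    # A deposit counts only if the trusted Router itself emitted it.
    # Events with the same shape from any other contract are ignored.
    return log.event == "Deposit" and log.emitter.lower() == trusted_router.lower()


fake = Log(emitter=ATTACK_CONTRACT, event="Deposit", amount_wei=10**18, memo="bad-memo")
real = Log(emitter=TRUSTED_ROUTER, event="Deposit", amount_wei=10**18, memo="ok-memo")

print(accept_deposit(fake))  # False
print(accept_deposit(real))  # True
```

Filtering on the emitter rather than on the event shape matters because any contract can emit an event with an identical signature.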

Contract Address

Transaction 1

Transaction 2

Transaction 3

Transaction 4

Transaction 5

Transaction 6

Last Transaction By An Attacker

Router: 0xc145990e84155416144c532e31f89b840ca8c2ce

Vault: 0xf56cba49337a624e94042e325ad6bc864436e370

Attack contract: 0x700196e226283671a3de6704ebcdb37a76658805

Attack wallet (spawned from Tornado Cash): 0x8c1944fac705ef172f21f905b5523ae260f76d62

Impact (~$8M USD)

  • 966.62 ALCX
  • 20,866,664.53 XRUNE
  • 1,672,794.010 USDC
  • 56,104 SUSHI
  • 6.91 YFI
  • 990,137.46 USDT

5-Pronged Recovery Plan

The problems above have simple solutions, but the real question is why they happened, not how to fix them.

It is unrealistic to expect that THORChain will ever be free from attack, so big picture thinking is needed, beginning all the way from the code to the live network. Why were critical vulnerabilities in the code for so long? Why were they abused by black hats before white hats found them? Why was THORChain able to send out so much of the TVL so quickly? Why didn’t the system react faster?

The answers can be summarised as follows.

Problem 1: The ETH Bifrost Code was unaudited

The THORChain state machine and the BNB Bifrost code were audited as part of Single Chain Chaosnet, but the updated MCCN state machine and its new MCCN Bifrosts were not. An audit was scheduled with Trail of Bits, but it unfortunately had not begun at the time of the first exploit.

Fix: Stop and Audit. Both Trail of Bits and Halborn Security are underway with two simultaneous audits.

Problem 2: There was no Official Bounty Program.

As part of Single Chain Chaosnet a bounty program had been released, but it was not refreshed as part of MCCN. This was overlooked. Thus there were no clear incentives or campaigns to onboard white hats and have them find vulnerabilities.

Fix: Commission a Bounty Program with Immunefi.

Problem 3: There is no ongoing “Red Team”

THORChain is an inside-out exchange. Exchanges have active security teams, even with locked-down proprietary exchange engines. THORChain needs a continuously run, 24/7 Red Team to review each new PR line by line, as well as actively monitor the network.

Fix: Commission a Red Team with Halborn Security.

Problem 4: THORChain has no active security monitoring

THORChain’s autonomous and decentralised nature was the sword it fell on. It happily executed attack transactions and there was nothing anyone could do to intervene. The only response was for all nodes to shut down their machines.

Fixes:

  • Automatic Solvency Checker: halts the network as soon as an insolvency is detected (pro-actively and re-actively).
  • Node Operator Timeout: any node can call a time-out of the network for 25 mins if they suspect anything. This gives each of the 36 Node Operators the ability to time out an attack when they observe it.
  • Outbound Throttling: the txOut queue is throttled to artificially delay the settlement of transactions when there are sudden spikes.
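The first of these fixes can be sketched in a few lines. This is only an illustration of the idea, comparing what the system believes it holds with what the vaults actually hold; the function names and the balance dictionaries are hypothetical, and the ETH figures are those from the incident above.

```python
from typing import Dict, List


def insolvent_assets(recorded: Dict[str, float], actual: Dict[str, float]) -> List[str]:
    """Assets where the vault holds less than the system believes it holds."""
    return sorted(
        asset for asset, owed in recorded.items()
        if actual.get(asset, 0.0) < owed
    )


def should_halt(recorded: Dict[str, float], actual: Dict[str, float]) -> bool:
    # Halt as soon as any asset is under-collateralised.
    return bool(insolvent_assets(recorded, actual))


# Recorded vs actual ETH as in the incident; the USDC line is made up.
recorded = {"ETH": 13_000.0, "USDC": 100.0}
actual = {"ETH": 700.0, "USDC": 100.0}

print(insolvent_assets(recorded, actual))  # ['ETH']
print(should_halt(recorded, actual))       # True
```

A check like this only detects the problem; the Node Operator Timeout and Outbound Throttling are what would buy time to act on it.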

Problem 5: There is no Protocol Insurance

Whilst the treasury is able to cover the insolvencies, the treasury won’t exist forever. The solution is to insure all non-RUNE TVL with a DeFi Insurance Provider, using collateral and income from the system’s own reserves.

Fix: Engage with DeFi Insurance Protocols to attempt to insure the entire protocol.

Treasury

The network has a ~$16m insolvency to deal with. The plan is:

  1. 1/3rd ($5.3m) will be directly contributed from the treasury assets
  2. 1/3rd ($5.3m) will be loaned from Iron Bank using RUNE collateral and paid off later
  3. 1/3rd ($5.3m) will be arbed into the network after it is brought back online for trading.

To fund (2) and to partially cover (1), a large Public Fund Raising event will be commissioned after the network is operational, in the vicinity of $10m-$20m. This will be planned and executed in the public domain when the time comes.

Return to Operation

The above fixes need to be in place first; they will take 2–3 months to set up fully.

However, the network can be brought online in stages once enough of the code is thoroughly checked and the bounty program has solicited enough of the major bugs (if any). The guided timelines are:

  1. Network Restart (send RUNE, Bond, receive Block Rewards) — early August
  2. BNB Chain online — August
  3. UTXO Chains online — September
  4. ETH Chain online — October

The Overview is detailed extensively here:

Gantt Chart:

Mainnet

Assuming all the fixes are in place, and the network is brought back online, is solvent, and can achieve stability, the timeline to Mainnet should be expected to be EoY 2021 or early 2022.

Mainnet is simply the definition that the network is stable and secure.

Community

To keep up to date, please monitor community channels, particularly Telegram and Twitter.
