Killer Whale Attack: Recovering a hijacked EOS chain

CryptoLions
Sep 9, 2018

OVERVIEW

The Jungle Testnet went down because someone abused our faucet and elected non-producing BPs. This isn’t really a threat vector for the Mainnet, because there’s no faucet, but the recovery was good practice for a worst-case scenario.

WHAT HAPPENED

On Friday, September 8th, just before the CryptoLions team boarded a plane to return to Ukraine from a business trip to Germany and Amsterdam, we discovered the Jungle Testnet was down.

Apparently, several thousand accounts registered on the Jungle Testnet and accessed our Jungle EOS faucet. They probably used a script. Accounts had names like “msb3ddwuvbmh” and “ms5mwjpydtif”. The total number of Jungle EOS distributed was about 60 million.

The attacker used these Jungle EOS to vote in a bunch of non-producing Block Producers. They were newly registered BPs, probably part of the attack.

Yes, the door was open for such an attack, but hey, it’s a testnet. It’s here to be stressed and broken. Here’s a very funny article from EOS Metal about how we broke the Jungle Testnet repeatedly before the launch of the main net.

WHAT WE DID

We considered simply restarting Jungle from scratch, after over 13M blocks. There’s a strange transaction in block #1310498 which makes resynchronizing annoying: BPs resynchronizing from the beginning must downgrade to version 1.1.0 while processing that block. It’s a pain.

Restarting the chain would have cleaned that up, but doing so would have been a missed opportunity for some practice.

Tristan from Block Genesys, a long-time part of the Jungle Testnet community, was an enormous help throughout. It was his idea to recover the chain from backup, which is essentially a fork starting from the moment of the backup. As far as we know, it was the first time such a thing was attempted on a major testnet.

(It would be really cool if the attacker re-started the non-forked chain and called it “Jungle Classic”.)

We’d made a backup before the update to 1.2.4, approximately 12 hours before the attack.

We stopped the nodes.
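For context, here is roughly what the restore looked like on a node. This is a hedged sketch rather than our exact commands; the data directory path and archive name are placeholders that will differ per setup:

# stop nodeos before touching its data directory (sends SIGTERM by default)
pkill nodeos

# the backup itself had been taken ~12 hours earlier, before the 1.2.4 update:
#   tar czf jungle-backup.tar.gz data/

# restore: set the bad state aside and unpack the backup
mv data data.bad-chain
tar xzf jungle-backup.tar.gz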

We created a new EOS keypair to use as the key that allows other peers to connect.
In the config:

allowed-connection = specified
peer-private-key = ["!!CREATED_PEER_PUB_KEY!!","!!CREATED_PEER_PRIV_KEY!!"]
# add trusted peers
peer-key = "PUB_KEY"
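The keypair itself can be generated with cleos. A minimal sketch (the printed values are placeholders, not keys we actually used):

cleos create key --to-console
# Private key: 5K...   -> second element of peer-private-key
# Public key:  EOS...  -> first element of peer-private-key, and what trusted peers list as a peer-key

(On older cleos versions the pair prints without the --to-console flag.)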

When we restarted, we were careful to allow only producers who were on the forked chain to take part. We changed the peer list, and everyone changed their config file:

allowed-connection = specified 
peer-private-key = ["!!CREATED_PEER_PUB_KEY!!","!!CREATED_PEER_PRIV_KEY!!"]
peer-key = "EOS7hfccXQ8odW7VWM4YrLWENbytKf8j3yPFS8XaZepGRB4UudHpt"
peer-key = "EOS6VV8ckDAFaXQ49v46dRhSnUsJp7umg6iUaaxQM8Jx3xEn1eWWG"
peer-key = "EOS8Jz1HyztUmg5kyduaQDBqxyEHkXputrK82E6yhZ6bXQkNPZevK"
peer-key = "EOS8WEaLDWnnBbRdYdefHzqumi1Jx3NEquk7wdTkEQyfa3iJXGxqN"
peer-key = "EOS7r8AgEiiCj4DwRK5MYqkpV4AfVZJUeTkHKkQvVYcoLFLz7qPz4"
peer-key = "EOS8C1cwEe3siBCTt7DETksEErvE4R8Y39RdGuQyBv1GrufH5d5Yc"
peer-key = "EOS8k77CLBkL44g8mPmgNXa9ctAm2a4fZCtLfjYwussyLnEupK2zs"
peer-key = "EOS6JGkzhKP9tnyMiHK3u8wscNYhaNiXp8bCjBgxsL8h1kUEvDDos"
peer-key = "EOS6fLjpVkJ4o5HDQcVmJKsX3SNJzyeLjRtoTnahHwVKtoQno51Fb"
peer-key = "EOS5WKL9nsJBZy19FWbxpLq7Rpv3obNJyFJxDWKT42y8BjL86tUGb"
peer-key = "EOS6qQfLGoFoGzAmyTPyuyNrF1KjSosDpHMfjT9dwYvLRBirC4n4G"
peer-key = "EOS7vbuUkMTsn17eLSs4LQZUCr5V8H2ovFBAy2yMaFVpbHkmnmTVz"
peer-key = "EOS87gwptFFoywK3nBo2VvZLZ3a4EpryyAsigwTC4xnmaEH9FsmQm"

We also created a checkpoint indicating a block that is unique to the forked chain. This is the first block in the fork.

checkpoint = [13433968, "00ccfc70c981d1fea9a647eb856468feaef5ff6b6b85155e22283f6d4beb8b0b"]
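The block id for a checkpoint like this can be read from any node that is already on the fork. A sketch, assuming a local node on the default HTTP port:

# the "id" field of the returned JSON is what goes into the checkpoint
cleos -u http://127.0.0.1:8888 get block 13433968

With a checkpoint set, a node refuses any chain that has a different block id at that height, which keeps it off the attacker’s chain.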

Synchronization lasted from two to five hours, depending on the BP.

The producers resynchronized. We kept connections private until the forked chain was higher than the bad chain.
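A quick way to see when that happens is to compare head block numbers on both sides. A sketch; the endpoints are placeholders:

# head_block_num on the forked chain
cleos -u http://FORKED_NODE:8888 get info | grep head_block_num
# head_block_num on the hijacked chain
cleos -u http://BAD_NODE:8888 get info | grep head_block_num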

KEY LESSONS

for EOS:

* Recovering a chain from a backup works. You need to manually limit your resynchronization to BPs who are committed to the fork.

* Hard Replay also worked.

We realized that all the legitimate transactions between the time of the backup and the time of the attack were lost. Good thing it’s just a testnet.

A better approach to forking the chain from a previous state might be to do a hard replay until the offending block.

./start.sh --hard-replay --truncate-at-block …

We tried this with our history node.

for Jungle:

* It’s probably a good idea for us to limit access to our faucet and/or to account creation. (As of now, we’ll probably just add Captcha to the faucet.)

* It might also be a good idea to simply have more tokens voting, so that an attacker would need to get even more from the faucet.

* Participation remains important. HUGE thanks to Tristan from Block Genesys, Josep from EOS Barcelona, Xavier from EOS Costa Rica, Bart, Alex, Fenix from AtticLab, Xing, Dominik, Nate from Aloha EOS, Dioni from EOSMetal, Todd, Jacky from JEDA, John, Michael from EOS DAC, and everyone else whom we may have missed.

REMAINING QUESTIONS

* Aside from doing a hard replay with --truncate-at-block, there may be another approach using the --fix-reversible-blocks flag, which recovers the reversible block database if that database is in a bad state (see the sketch after this list).

* Part of testnet operations is active communication and responsiveness from node managers. This remains one of the biggest challenges of running a testnet.
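A sketch of that reversible-blocks approach, using the same start script convention as the hard replay above (whether it would have been enough here is exactly the open question):

./start.sh --fix-reversible-blocks

Since the reversible database only holds blocks that are not yet irreversible, this presumably could not rewind past the last irreversible block the way a truncated hard replay does.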

🦁🦁🦁

CryptoLions is an EOS Block Producer based in Ukraine. We strive to make EOS more valuable by building projects that improve the ecosystem, by setting the standard for transparency and accountability, and by popularizing EOS all over the world.

website: http://cryptolions.io/
github: https://github.com/CryptoLions
telegram: https://t.me/CryptoLions_io
steemit: https://steemit.com/@cryptolions
twitter: https://twitter.com/EOS_CryptoLions
medium: https://medium.com/@roar_65307
youtube: https://www.youtube.com/channel/UCB7F-KO0mmZ1yVBCq0_1gtA
