Incident Report August 25

Technical Description of the Events

Team Nimiq
Aug 27, 2018

On August 25, 2018, in the late evening (UTC), the Nimiq Network experienced a short period in which no new blocks were produced after block #191361. The cause was an error in the account pruning logic of the miner code, which prevented miners from calculating valid new blocks. The blockchain itself was unaffected and continued after a fix was released, without requiring a fork.

What follows is a detailed explanation of the events. This is purely a technical post since the situation calls for clarity.

What Happened

The Nimiq blockchain stores all non-zero account balances in a Merkle tree, called the accounts tree. To keep this database of accounts as small as possible (it is synced and stored by full and light clients), accounts that transfer all their balance away are removed from the store after the emptying transaction has been processed. This removal of zero-balance accounts is called ‘pruning’ and is done automatically by all miners when processing transactions.
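As a rough illustration of pruning, the following minimal sketch models an account store that keeps only non-zero balances. It is not the Nimiq implementation: a plain dictionary stands in for the Merkle tree, and the `AccountsStore` class and its method names are hypothetical.

```python
class AccountsStore:
    """Toy stand-in for the accounts tree (hypothetical, not Nimiq's code).

    Only non-zero balances are stored; an emptied account is pruned.
    """

    def __init__(self):
        self._balances = {}

    def balance(self, address):
        # Unknown (or pruned) accounts implicitly hold a zero balance.
        return self._balances.get(address, 0)

    def credit(self, address, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balances[address] = self.balance(address) + amount

    def debit(self, address, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        remaining = self.balance(address) - amount
        if remaining < 0:
            raise ValueError("insufficient balance")
        if remaining == 0:
            # Pruning: drop the emptied account instead of storing a zero.
            self._balances.pop(address, None)
        else:
            self._balances[address] = remaining

    def __len__(self):
        return len(self._balances)


if __name__ == "__main__":
    store = AccountsStore()
    store.credit("alice", 10)
    store.debit("alice", 4)   # balance 6, account stays in the store
    store.debit("alice", 6)   # balance 0, account is pruned
    print(len(store), store.balance("alice"))
```

In this sketch the store never holds a zero-balance entry, which is the property that keeps the synced database small.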

The problem occurred when the same address had two outgoing (sending) transactions waiting for processing at the same time (in the mempool), and the second of these transactions emptied the account to a balance of zero.

When creating a new block, the miner virtually applies the transactions that will be included in the block to the current blockchain state. The resulting end balance for the sender of both transactions was zero, so the sender address was added to the list of to-be-pruned accounts twice. The root cause was a missing check to prevent an address from being added more than once. This list is then iterated over when the assembled block is validated, immediately before mining starts. During this iteration, each pruned account is removed from the list, and an exception is thrown if any addresses remain in the list at the end. Since the sender address was added twice but removed only once, this condition was true. The block was therefore rejected, which halted the work of the miners.
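The failure mode described above can be reproduced in a few lines. This is a simplified sketch, not the actual miner code: the function names `collect_prune_candidates` and `validate_block`, the `dedupe` flag, and the tuple layout of a transaction are all assumptions made for illustration. The fix shown (skipping an address that is already in the list) is one plausible way to add the missing check and may differ from the patch that was released.

```python
def collect_prune_candidates(balances, transactions, dedupe):
    """Virtually apply all transactions, then list senders ending at zero.

    Each transaction is a (sender, recipient, value) tuple (assumed layout).
    With dedupe=False an address is appended once per transaction it sent,
    which mirrors the missing check described in the text.
    """
    working = dict(balances)
    for sender, recipient, value in transactions:
        working[sender] = working.get(sender, 0) - value
        working[recipient] = working.get(recipient, 0) + value

    to_prune = []
    for sender, _recipient, _value in transactions:
        if working[sender] == 0 and not (dedupe and sender in to_prune):
            to_prune.append(sender)
    return to_prune


def validate_block(balances, transactions, to_prune):
    """Re-apply the transactions and tick off each pruned account."""
    remaining = list(to_prune)
    working = dict(balances)
    for sender, recipient, value in transactions:
        working[sender] = working.get(sender, 0) - value
        working[recipient] = working.get(recipient, 0) + value
        if working[sender] == 0 and sender in remaining:
            remaining.remove(sender)  # list.remove drops only the first match
            del working[sender]
    if remaining:
        raise ValueError(f"accounts left in prune list: {remaining}")
    return True


if __name__ == "__main__":
    # Two pending transactions from the same sender; the second empties it.
    txs = [("A", "B", 4), ("A", "C", 6)]
    start = {"A": 10}

    buggy = collect_prune_candidates(start, txs, dedupe=False)
    try:
        validate_block(start, txs, buggy)
    except ValueError as err:
        print("block rejected:", err)

    fixed = collect_prune_candidates(start, txs, dedupe=True)
    print("block accepted:", validate_block(start, txs, fixed))
```

Without deduplication the sender appears twice in the list but is removed only once during validation, so the leftover entry triggers the exception. With the check in place the list holds the sender once and validation passes.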

At this point, we think that — while the first triggering of the bug might have been an accident — the subsequent creation of such transactions constituted an intentional attack on the Nimiq Network.

After triggering the bug for the first time, the sender automated the creation of such transactions.

Timeline of Events (UTC)

19:13 — Block #191361 is mined.

19:31 — We receive an alert from our automated blockchain monitoring system about ‘few blocks in the last 20 minutes’. After checking Nimiq’s various block explorers, we see that no block has been added to the chain in the last 18 minutes. Block mining is a statistical process and gaps of a few minutes between blocks can occasionally happen, but 18 minutes is a very long time with no blocks, so we start to investigate. In our Telegram and Discord channels, we find the first reports of a problem with the miners, along with their error messages. The mining pools run by our team members show a hashrate of 0 H/s with plenty of connected devices. At this point it becomes obvious that the blockchain is stuck because miners cannot assemble new blocks. We inspect the last few mined blocks for any noticeable transactions and explore how the error that miners are seeing can be explained.
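To give a feel for why an 18-minute gap stands out, the sketch below estimates how likely a gap of a given length is under normal operation. It rests on two assumptions that this report does not state: block discovery is modeled as a Poisson process, so gaps between blocks are exponentially distributed, and the mean block time is taken to be one minute. The helper name `gap_probability` is hypothetical.

```python
import math


def gap_probability(gap_minutes, mean_block_minutes=1.0):
    """Probability that the wait for the next block is at least gap_minutes.

    Assumes exponentially distributed gaps between blocks, with a mean of
    mean_block_minutes (one minute by default, an assumption of this sketch).
    """
    if gap_minutes < 0 or mean_block_minutes <= 0:
        raise ValueError("gap must be >= 0 and mean block time must be > 0")
    return math.exp(-gap_minutes / mean_block_minutes)


if __name__ == "__main__":
    for minutes in (3, 5, 18):
        print(f"gap >= {minutes:2d} min: {gap_probability(minutes):.2e}")
```

Under these assumptions a gap of a few minutes is uncommon but expected from time to time, while an 18-minute gap has a probability on the order of one in tens of millions, which is why it points to a fault rather than bad luck.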

20:24 — We identify the issue and start applying and testing various potential fixes on our mining servers.

20:35 — A working quick fix is found and shared internally for further testing.

20:42 — We commit the fix in the GitHub core repository under the ‘jeff/miner_fix’ branch. We are intensely testing the fix on our own miner infrastructure.

21:07 — First community members notice the GitHub branch with the fix and deploy it to their own miners.

21:26 — A new block (#191362) is mined by nimpool.io, processing all waiting transactions and breaking the deadlock for all miners, thus enabling the blockchain to continue regularly, albeit with a much reduced difficulty.

21:35 — The same combination of triggering transactions is again pushed to the mempool (after block #191383), preventing all non-updated miners from creating valid blocks. Only patched miners are now working. The next valid block is mined after 63 minutes.

21:51 — Beepool’s @Blub releases his mining client update with the fix.

22:29 — SushiPool pushes an updated client that contains the fix.

22:30 — Throughout the night, a few more short breaks in mining occur because only a subset of miners, though a growing one, is running the fix.

01:34 — Skypool announces that it also deployed the fix.

02:30 — We merge the fix to the master branch and release it as version 1.3.1. It is now officially released as stable and announced to the community.

We apologize for any inconvenience this complex bug may have caused. We are, however, glad to report that the fix is now in the latest release of nimiq-network/core, and we urge all users to upgrade to the new version v1.3.1.

Team Nimiq

DISCLAIMER: None of these statements should be viewed as an endorsement of or recommendation for Nimiq, any cryptocurrency, or any investment product. Neither the information nor any opinion contained herein constitutes a solicitation or offer by the creators or participants to buy or sell any securities or other financial instruments, or to provide any investment advice or service.
