The best way to attack PoW blockchains for profit

Blockchain systems rely on participants who behave honestly, e.g. the miners in the case of proof-of-work blockchains like Bitcoin. A rational but malicious participant, who cares only about his own profit rather than the health of the network, can choose between participating honestly and attacking the network, e.g. via a double-spending attack. How should he choose? Here’s the guide: a high-level summary of the research paper On the Security and Performance of Proof of Work Blockchains by Arthur Gervais and co-authors.

What does a participant stand to gain from either honest or dishonest behavior? In the case of honest mining, i.e. verifying blockchain data and appending new blocks, the profits are the block reward and the transaction fees. These should exceed the costs of buying and operating the hardware. In the case of malicious behavior, he stands to gain whatever amount he manages to double-spend, provided the attempt succeeds.

In my recent posts, I explored how the proof-of-work consensus mechanism secures blockchains against attacks. We have learned that an attacker, even with less than 50% of the mining power but some luck, can make a block with malicious information appear legitimate for a short time (e.g. have the double-spending transaction receive multiple confirmations), until the rest of the network catches up and produces a longer chain with the correct information. So the calculation has two sides: a malicious participant stands to gain the double-spent amount if his attack is successful, but he also gives up the block rewards he could have earned by mining honestly instead of spending his mining power on a competing chain that makes his double-spend appear legitimate.
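To put a number on "some luck", here is a minimal sketch of the simplified model from the original Bitcoin whitepaper (not the more detailed model of the paper summarized here). It assumes the attacker controls a fixed share `q` of the mining power and that the merchant waits for `z` confirmations; the function names are my own.

```python
import math


def catch_up_probability(q: float, z: int) -> float:
    """Probability that an attacker with mining-power share q ever
    catches up when he is z blocks behind the honest chain."""
    if not 0.0 <= q <= 1.0 or z < 0:
        raise ValueError("need 0 <= q <= 1 and z >= 0")
    p = 1.0 - q  # share of the honest miners
    if q >= p:
        return 1.0  # a majority attacker always catches up eventually
    return (q / p) ** z


def double_spend_success(q: float, z: int) -> float:
    """Whitepaper estimate of the probability that a double-spend
    succeeds against a merchant who waits for z confirmations."""
    if not 0.0 <= q <= 1.0 or z < 0:
        raise ValueError("need 0 <= q <= 1 and z >= 0")
    p = 1.0 - q
    if q >= p:
        return 1.0
    lam = z * q / p  # expected attacker progress while z honest blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


if __name__ == "__main__":
    for z in (1, 2, 6):
        print(z, double_spend_success(0.3, z))
```

The success probability drops quickly with every extra confirmation as long as the attacker stays below 50%, which is why the number of confirmations matters so much in what follows.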

If you haven’t caught up with consensus building on proof-of-work blockchains, check out my last post and sign up for the newsletter to never miss out on an article again.

What is the right strategy to employ? When should the adversary try double-spending instead of mining, and how much effort should he put into confirming the double-spending transaction? To come to a rational decision, he needs to factor in not only his own mining power but also factors outside his control. Among the vital ones are the stale block rate of the network, the mining power of the other participants, the mining costs (up-front hardware costs as well as running costs such as electricity), and his own connectivity in the network, which determines how quickly his block announcements spread and how much of the network learns about them.

At each time step (e.g. whenever the next block is published), the adversary can choose between several actions, most importantly to (continue to) mine honestly, to try a double-spending attack, or to abandon a double-spending attempt. Each of these actions changes the “state” of the world, e.g. after an attempt to double-spend, the (relevant) state of the world is “there is a double-spend attempt with 0 confirmations to defend”.
The right tool to reason about a world with “states” and “discrete events” that happen with a certain probability is a Markov Decision Process (MDP). MDPs are a mathematical model for finding the best possible policy, that is, which action to take in each state to maximize a goal (here, the monetary gain). An MDP models the states the world can be in and the actions that move it from one state to another. It takes a probabilistic view of the world, so each possible transition happens with some probability. Moreover, some transitions yield a reward or a loss. Figure 1 shows a graphical depiction of a Markov Decision Process.

Figure 1: A graphical depiction of an MDP with states s_0, s_1, s_2 and actions a_0, a_1 as well as two rewards of -1 and +5. (Figure created by MistWiz on Wikimedia Commons.)
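To make the idea of a "best policy" concrete, here is a minimal sketch of value iteration, a standard way to solve small MDPs. The two-state attack model below is a toy I made up for illustration and is far simpler than the MDP in the paper; the state names, the discount factor and all numbers (`alpha`, `p_success`, `value`) are assumptions.

```python
def q_value(outcomes, values, gamma):
    """Expected discounted return of one action, given its possible outcomes."""
    return sum(p * (reward + gamma * values[nxt]) for p, nxt, reward in outcomes)


def value_iteration(mdp, gamma=0.95, tol=1e-9, max_iter=10_000):
    """mdp[state][action] is a list of (probability, next_state, reward)."""
    values = {state: 0.0 for state in mdp}
    for _ in range(max_iter):
        delta = 0.0
        for state, actions in mdp.items():
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    # The optimal policy picks the action with the highest value in each state.
    policy = {}
    for state, actions in mdp.items():
        policy[state] = max(actions, key=lambda a: q_value(actions[a], values, gamma))
    return values, policy


def toy_attack_mdp(alpha, p_success, value):
    """Hypothetical toy model: rewards are measured in block rewards.
    alpha = attacker's mining share, value = amount to double-spend."""
    return {
        "mining": {
            "mine": [(1.0, "mining", alpha)],          # expected honest income per step
            "attack": [(1.0, "attack_pending", 0.0)],  # forgo income, start an attack
        },
        "attack_pending": {
            "continue": [(p_success, "mining", value), (1.0 - p_success, "mining", 0.0)],
            "abandon": [(1.0, "mining", 0.0)],
        },
    }


if __name__ == "__main__":
    for value in (5.0, 10.0):
        _, policy = value_iteration(toy_attack_mdp(0.3, 0.1, value))
        print(value, policy["mining"])
```

Even in this toy, the optimal action flips from mining to attacking once the double-spend amount is large enough, which is the kind of threshold behavior the paper quantifies with a much richer model.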

In the research paper mentioned in the introduction, Arthur Gervais and his co-authors built MDPs for a rational attacker and asked what the attacker should do to successfully double-spend against some fixed number of confirmations, i.e. to append multiple blocks to the fork of the chain that contains the double-spending transaction.

What does the analysis tell us? For each fixed number of confirmations k that the double-spend has to survive, there is a threshold that the double-spent amount needs to exceed. Otherwise, honest mining is more profitable. This outcome is nicely illustrated by the following graphs. The x-axis shows the mining power of the attacker: fix a value for it and read off the corresponding value of the curve of interest on the y-axis. Different values of k (the desired number of confirmations) lead to different curves.

The y-axis in Figure 5 shows how many successive blocks are expected to be mined before a double-spending attack succeeds: for an adversary with ~30% mining power and 6 confirmations, the expected number of blocks is roughly 100.

Figure 5, taken from Arthur Gervais’s paper.

Figure 10 shows the reward needed for a double-spending attack to make sense: the expected reward for fraudulent behavior must be larger than the reward for honest mining. The y-axis shows the required reward from fraudulent behavior as a multiple of the block reward, i.e. as a multiple of the reward for non-fraudulent behavior.

Figure 10, taken from Arthur Gervais’s paper.

The figure also contrasts Ethereum and Bitcoin, both of which use proof-of-work algorithms to establish consensus on the blockchain. The key difference between the proof-of-work mechanisms of these two systems is the block time, i.e. the time between the generation of two consecutive blocks. With shorter block times, the stale block rate increases: the time between two blocks is much shorter in Ethereum, so two participants find a block within a very short time span more often, which temporarily forks the chain. The network then continues mining, and eventually one continuation leads to a longer chain (see the purple vs the black blocks in the graphic below). The block that is discarded in the process is called a stale block.
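A rough back-of-the-envelope sketch shows why shorter block times mean more stale blocks. It assumes that blocks arrive as a Poisson process and that a new block needs a fixed delay to reach the rest of the network; both the model and the example numbers are my simplifications, not measurements from the paper.

```python
import math


def approx_stale_rate(propagation_delay_s: float, block_interval_s: float) -> float:
    """Probability that a competing block is found while a fresh block is
    still propagating, under a simple Poisson-arrival assumption."""
    if propagation_delay_s < 0 or block_interval_s <= 0:
        raise ValueError("delay must be >= 0 and block interval must be > 0")
    return 1.0 - math.exp(-propagation_delay_s / block_interval_s)


if __name__ == "__main__":
    delay = 2.0  # hypothetical propagation delay in seconds
    for interval in (600.0, 15.0):  # 10-minute blocks vs a hypothetical 15-second block time
        print(interval, approx_stale_rate(delay, interval))
```

With the same propagation delay, the chain with the shorter block interval ends up with a much higher share of stale blocks, and every stale block is honest mining power that does not secure the main chain.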

For a double-spending attack to be the rational option for an attacker controlling 20% of the network’s mining power, the double-spent amount must be worth about 100 block rewards on the Bitcoin network if the attack has to survive 6 confirmations (the solid line).
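To translate such a threshold into an actual amount, multiply it by the block reward. The small sketch below does just that; the block reward of 12.5 BTC is an assumed example value, transaction fees are ignored, and the function names are hypothetical.

```python
def min_profitable_double_spend(threshold_in_block_rewards: float,
                                block_reward: float) -> float:
    """Smallest double-spend amount for which attacking beats honest mining,
    given a threshold read off the curve (in multiples of the block reward)."""
    if threshold_in_block_rewards < 0 or block_reward <= 0:
        raise ValueError("threshold must be >= 0 and block reward must be > 0")
    return threshold_in_block_rewards * block_reward


def attack_is_rational(double_spend_value: float,
                       threshold_in_block_rewards: float,
                       block_reward: float) -> bool:
    """True if the amount to double-spend exceeds the break-even threshold."""
    return double_spend_value > min_profitable_double_spend(
        threshold_in_block_rewards, block_reward)


if __name__ == "__main__":
    # Threshold of about 100 block rewards, assumed block reward of 12.5 BTC.
    print(min_profitable_double_spend(100, 12.5))
```

Under these assumptions, a double-spend of a few hundred BTC would not be worth the attacker's effort, while a much larger one would be.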

What’s the takeaway? We’ve known early on that attacks on blockchain consensus systems are possible, but the numbers for a rational attacker confirm that our systems are fairly safe: unless there are huge amounts of money/Bitcoins to be stolen (and successfully cashed out within the short period of k=6 blocks, about 60 minutes on Bitcoin), it is more economical to participate honestly in the network. The paper by Arthur Gervais and his co-authors is also one of the first to conduct a formal analysis of a rational attacker: it provides us with a valuable tool to reason about system security, despite the implication that you can use it as a guide to attacking blockchain systems. The paper also analyses another type of attack, selfish mining. Having a mathematically sound analysis of attacks helps us strengthen blockchain systems against malicious attackers.

