MESS Testing Results Report

Created by OpenRelay

Ethereum Classic Labs

Published in

etc_core

10 min readOct 13, 2020

Background

The ETC Artificial Finality working group has been working to establish a mechanism to prevent large 51% attacks from enabling well-funded attackers from executing double spends over the course of several hours. We have evaluated several different options, ultimately settling on a variation of Modified Exponential Subjective Scoring, or MESS.

Options Considered

The working group discussed and tested several different possible options, including reorg caps, permapointing, flash pointing, continuously probable reorganizations, greedy mining, and more. All of these solutions had relatively high complexity, had possible vulnerabilities for attackers to exploit, were generally hard to quantify in terms of the protection they offered, and ultimately introduced an “edge of eternity” risk, where an attacker could potentially bifurcate the network in a way that it cannot recover without operator intervention.

MESS avoids most of these issues fairly elegantly. MESS simply provides an additional, subjective weight to consider beyond the total difficulty of a chain segment. The subjective weight increases with the amount of time since the common ancestor of the two chain segments, but never reaches a point where reorgs are impossible. This gives an easily quantifiable level of protection while at the same time avoiding “edge of eternity” scenarios.

Testing

During the early stages of the discussion, while we were exploring several different alternatives, we developed a Python-based modeling framework where we could very quickly evaluate different solutions. Implementation of different scenarios typically required only a few lines of code and allowed us to evaluate the effectiveness of different protection strategies, as well as test hypothetical attacks to circumvent these protection strategies.

Once we settled on MESS as the best option of those evaluated, we set up a test network known as “Messnet.” Messnet consisted of 151 servers spread across 39 different data centers operated by 2 different cloud service providers on 5 different continents. This allowed us to deal with realistic block propagation times and test peer-to-peer interactions comparable to a real-world network. The network used a “fake PoW” block sealing mechanism with variable pseudo-random Poisson-distributed block production rates, which allowed us precise control over the hashing power of all nodes across the network. The tests could be observed in real-time online at mess.canhaz.net.

Based on the MESS artificial finality curve, we were able to calculate how much hashpower an attacker needed to have for a successful attack over a specified period of time. Our test runner is a program that orchestrates both the attackers and defenders in the tests, to ensure that the attacker was able to hit the relative difficulty targets. This revealed a handful of issues which required client updates to resolve. For the changelog of updates, please see the Testing: Changelog section.

Peering Requirements

In the early implementation of MESS, two features caused an unexpected feedback loop. The first was a limitation that MESS would only be enforced on nodes with at least 10 peers; the idea being that until a node has sufficient peers to see a diversity of blocks, there is not sufficient information to enforce artificial finality. The second was that if a node received a block that failed the artificial finality checks, it would disconnect from the peer that provided the block.

In Messnet, many nodes had peer counts very close to 10. This lead to the following behavior:

Node has 10 peers.
Node receives attacker’s block from a peer.

3. Node rejects attacker’s block for failing artificial finality, and drops the peer.

4. Node now has 9 peers, and MESS is disabled.

5. Node accepts the attacker’s block and forwards it to other peers.

6. Other peers drop the node for forwarding a bad block. Now they have fewer than 10 peers.

7. The block propagates through the network as nodes fall below the 10 peer threshold when they reject the artificial finality block.

We ultimately made two changes to resolve this feedback loop. We lowered the number of peers for MESS enforcement to 5, which is generally a safer number to assume healthy nodes will have. We also stopped dropping peers for forwarding blocks that fail the artificial finality threshold, to avoid the receipt of an attacker’s block creating the conditions for block acceptance.

Block Validity Evaluation

In the early implementation of MESS, a block which failed artificial finality would be rejected as an invalid block. This meant that later blocks in the same chain would never be evaluated, as they depended on an invalid block. This meant that the MESS threshold must be overcome by the first block to exceed the current chain’s total difficulty, which is impossible beyond more than a few minutes. This introduced an unintentional “edge of eternity” vulnerability, where an attacker could potentially split the network right at the point where the blocks could not overcome the MESS threshold.

This was resolved by considering blocks that fail the artificial finality threshold to be valid blocks, but not having sufficient total difficulty to be considered the best available block. This means that these blocks will still be propagated through the network, but stored as sidechains. If those sidechains eventually become heavy enough to overpower the artificial finality threshold, they will become the dominant blocks, eliminating the edge of eternity attack vector.

Use of Current Block Time Instead of Proposed Block Time

In earlier versions of MESS, we used (proposed.Time — commonAncestor.Time) as the input to the curve function to determine the antigravity value; this is the amount of time since the proposed block (attacker’s block) diverged from the block shared with the current block (honest chain’s block). We observed a test where we expected the attacker to fail instead succeed. Upon investigation, we realized that it had succeeded because an earlier block had overcome the antigravity curve, even though the most recent blocks would not. By itself, this was not terribly concerning, but we then realized that proposed.Time is a value the attacker has some opportunity to manipulate. By incrementing each block on the attacker’s chain by 1 second instead of the number of seconds that elapsed in real time, the attacker would create fewer blocks and likely achieve a lower total difficulty score, but the difficulty curve they need to overcome would be much smaller, potentially allowing an attacker to defeat MESS with less hashing power than indicated by the curve.

By changing the input to the curve function to be (current.Time -commonAncestor.Time) we achieve similar results to what were intended, while suppressing the attacker’s ability to manipulate the input to the antigravity function.

Test Results

For the final version of the client, we conducted tests under the following conditions:

In this table, the finality threshold is calculated from the test duration and the finality curve. Because block propagation takes time and the actual difficulty ratio varies a bit given the probabilistic nature of the tests, we set the attacker’s target difficulty ratio high enough to leave about two minutes for block propagation before the finality threshold would no longer be met on tests we intended the attacker to win, and undershot by about two minutes for tests we intended the attacker to lose.

As you can see from the table, despite attackers achieving total difficulties higher than the defenders’ chain segments, the attacker succeeds only when the actual difficulty ratio exceeds the finality threshold.

Bifurcation Tests

The possibility of network bifurcation may be considered the riskiest side effect of subjective arbitration. A network partition must be expected to split the “honest” mining power of the network. This causes lower relative hash rates as well as a schizophrenic honesty. It is expected that the MESS algorithm effectively minimizes and resolves bifurcations consistently and predictably. In order for network bifurcations to persist under MESS, the network must be manipulated to an extreme and precise balance of segregated hash power.

Tests were run under the following conditions:

An “evil” adversary who produced network-valid blocks, but would refuse to import those of honest miners.
An honest miner whose block production rate and timing exactly matched that of their adversarial counterpart. The two produced blocks in lock step.

Despite myriad variations and trials, it became apparent that so long as the honest miner is willing to mine on top of the adversary’s blocks — and thus “tip the balance” — it would be practically impossible to reproduce even in the controlled environment. If two evil miners alone controlled the network’s block production, bifurcations could be produced consistently, as expected.

Testnet Client Changelog

Desired and/or necessary updates to the core-geth client facilitated by MESSnet testing are listed below.

V1: Initial implementation of MESS based on etclabscore/core-geth:master, including the additional “fake PoW” logic enabling pseudo-random but variable-controlled block production rates following a Poisson distribution without the CPU demands or limits of real Proof-of-Work.
V2: An API addition `admin_maxPeers` facilitating better control over node relationships.
V3: An API addition `miner_mustEtherbases` facilitating an “evil” mode for antagonist peers by installing artificial consensus incompatibilities even for network-connected nodes.
V4: Adjustments to MESS enable/disable safety mechanisms; lowering MinFinalityPeers from 10 to 5, and raising head staleness check intervals from 130 to 390 seconds. MinFinalityPeers was originally set to 10 (double the existing MinSyncPeers default), but was reevaluated as both too conservative for the test network and the main network. The head staleness interval was raised in response to a concern raised in discussion on Github (here), which notes that a too-low staleness check interval might be used by a sophisticated attacker in an edge-case exploit.
V5: Added MESS arbitration logic to core-geth’s `writeKnownBlock` method to fix an implementation bug which allowed nodes processing antagonist chain segments multiple times to eventually adopt them.
V6: Implemented an upgraded MESS algorithm from the original sin curve using `float64`variable types to one proposed via https://github.com/ethereumclassic/ECIPs/issues/374#issuecomment-694156719 which uses integer maths implemented with `big.Int` variable types. This change also refactored the implementation to more precise arbitration condition sites.
V7: Changed to use `current.Time` instead of `proposed.Time`, preventing attackers from being able to manipulate the finality curve

Ramifications of the Finality Curve

The MESS artificial finality curve means that the amount of hashing power an attacker must provide to achieve adoption by existing nodes on the network grows dramatically with longer reorganizations. The following table shows the finality threshold at several different points in time. In order to be successful in executing a reorg for the specified amount of time, an attacker must be able to sustain the hashing power greater than the rest of the network times the finality threshold for the duration of the attack.

Under this model the hashing power required for an attack is multiplied by the finality threshold relative to the cost required for an attack absent MESS.

The cost of executing an attack in dollar terms can be represented as:

H n * F D * D * C

Where:

● H n : The hashing power of the network

● F D : The finality threshold for a given duration

● D: The duration

● C: The cost per second of hashing power1

Organizations such as exchanges that are relying on finality can generally assume that an attacker won’t spend more money executing an attack than the value of the transaction they are trying to revert. Thus if a transaction is valued at $100,000, the transaction can be assumed to be approximately final when D grows high enough that H n * F D * D * C exceeds $100,000. So if H n = 1.98 TH/s, and and C = $2,000 TH/s/hour, then a $100,000 transaction can be assumed to be finalized after 9,116 seconds (2.53 hours), as beyond that point sustaining an attack will cost more than the doublespent transaction is worth.(2) A $1,000,000 transaction can be considered final after 29,326 seconds, or 8.15 hours under the same formula. It’s worth noting that there are several other factors impacting the total cost of a double spend; mining rewards can help offset the costs, while news of a successful attack may impact the price of the doublespent tokens and reduce the reward. These factors are harder to work into a simple equation, but organizations relying on finality may want to pad estimates to account for these variables.

(1) For this formula, we assume a static price for hashing power. In practice, hashing power is sold on a market place, and each additional unit of hashing power is likely to be more expensive than the last.
(2) It should be noted that attackers may revert multiple transactions with a single double spend attack, so exchanges should be cognizant of their total outstanding exposure, not just the exposure created by individual transactions.

Conclusion

The implementation of MESS raises the cost of executing a successful double spend attack in a manner that is both significant and predictable, giving organizations confidence in their reliance on the Ethereum Classic Chain. The July 31st attack had a duration of 12 hours; if MESS had been in place the attacker would have required 31 times more hashing power than they needed for the original attack. Bitquery estimated that the attack cost $192,000, and that they likely doublespent $5.6 million worth of ETC. Assuming the attacker had been able to acquire 31x more hashing power at the same price point, the attack would have cost $5.9 million to execute, so the attacker would have had a net loss. In practice, each additional unit of hashing power gets more expensive, so even if the market could provide 31x more hashing power (which does not appear to be the case), it would have been at a much steeper price point.

The impact that MESS has on the cost of executing attacks is substantial enough to make large attacks impractical, and gives organizations like exchanges a simple formula for evaluating their risk tolerance, allowing them to continue supporting ETC within timeframes that are practical for conducting business.