SDBS #19 | Tuning qPoS Blockchain Reorganizations

Stealth
stealthsend
5 min read · Apr 27, 2019

Stateful aspects of block validation are delayed as long as possible • Stakers continue production when disconnected

In this week’s update, I describe improvements to blockchain reorganization in Stealth’s Quantum Proof-of-Stake (qPoS). Unlike other blockchains, which have only one type of reorganization, qPoS has two. The first type, which I call a U-turn, is shared with other blockchains. The second type is called a rollback. While U-turns and rollbacks are conceptually straightforward, the conditions under which they should occur are nuanced and require careful tuning of the decision logic.

— — — — — — —

Two Types of qPoS Blockchain Reorganizations

Stealth’s qPoS has two types of blockchain reorganizations, illustrated in Figure 1.

Figure 1: U-turn and Rollback Blockchain Reorganizations

The top panel in Figure 1 shows an example of a U-turn, in which the chain turns back from the losing chain (capital L) to the winning chain (capital W). A U-turn can be conceptualized as removing blocks and then adding blocks. In the example, the chain starts at block a. Blocks are removed going back to block b, then added going forward to block c. This is similar to taking the wrong way at a fork in the road: to fix the wrong turn, first go backwards on that road, then make a U-turn at the fork and go forward in the right direction.

A rollback, shown in the bottom panel of Figure 1, is even simpler. Here the client realizes that block d is on the wrong chain, removes blocks going back to block e, and then waits for new blocks that may be on the correct chain, adding them at block e.
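The remove-then-add idea can be sketched in a few lines of Python. This is only an illustration, assuming a chain is a plain list of block identifiers; the function names `fork_index`, `u_turn`, and `rollback` are hypothetical and are not taken from the Stealth client.

```python
from typing import List


def fork_index(current: List[str], other: List[str]) -> int:
    """Return the index of the last block both chains share, or -1 if none."""
    shared = -1
    for i, (mine, theirs) in enumerate(zip(current, other)):
        if mine != theirs:
            break
        shared = i
    return shared


def u_turn(current: List[str], winning: List[str]) -> List[str]:
    """Remove blocks back to the fork, then add the winning chain's blocks."""
    fork = fork_index(current, winning)
    chain = list(current)
    while len(chain) - 1 > fork:       # go backwards on the wrong road
        chain.pop()
    chain.extend(winning[fork + 1:])   # go forward in the right direction
    return chain


def rollback(current: List[str], keep_through: str) -> List[str]:
    """Remove every block after `keep_through`; nothing is added afterwards."""
    return current[: current.index(keep_through) + 1]


# Block names loosely follow Figure 1: a -> b -> c for the U-turn, d -> e for the rollback.
losing = ["genesis", "b", "l1", "a"]
winning = ["genesis", "b", "w1", "c"]
print(u_turn(losing, winning))
print(rollback(["genesis", "e", "x1", "d"], "e"))
```

The only difference between the two operations in this sketch is the second half: a U-turn already knows which blocks to add, while a rollback stops at the earlier block and waits.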

U-turns and rollbacks result from very different decision logic. To make a U-turn, a client must have two different chains to compare. In Figure 1, these chains are W (winning) and L (losing). A client decides the winning chain by comparing a metric known as chain trust, which indicates the reliability of a chain. For proof-of-work (PoW), the most reliable chain has the most work. For proof-of-stake (PoS), the most reliable chain uses the most coin age, which is the product of the number of coins and the time since they were last used to produce a block in a process called staking. For qPoS, the most reliable chain is the one with the best record of validator performance, quantified by a metric called weight. Staker weight is discussed in the whitepaper.
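As a rough illustration of comparing chains by trust, the sketch below scores each block and sums the scores. Coin age follows the definition above (coins times time since last staking). Summing per-block staker weights to get a qPoS chain trust is an assumption made for this example, as are the tie rule and all function names; the whitepaper defines how weight is actually used.

```python
from typing import Iterable


def coin_age(coins: int, seconds_since_last_stake: int) -> int:
    """PoS coin age: number of coins times the time since they last staked."""
    return coins * seconds_since_last_stake


def chain_trust(block_scores: Iterable[int]) -> int:
    """Hypothetical chain trust: the sum of per-block scores.

    The score could be work (PoW), coin age (PoS), or staker weight (qPoS).
    Simple summation is an assumption, not the documented qPoS formula.
    """
    return sum(block_scores)


def pick_chain(trust_present: int, trust_other: int) -> str:
    """Switch only if the other chain is strictly more trusted (assumed tie rule)."""
    return "other" if trust_other > trust_present else "present"


# Made-up staker weights for the blocks after the fork on chains L and W.
trust_l = chain_trust([5, 2, 1])
trust_w = chain_trust([5, 7, 6])
print(trust_l, trust_w, pick_chain(trust_l, trust_w))
print(coin_age(100, 3600))
```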

qPoS rollbacks occur when a chain has no chance of winning in the case of a fork. Because the full set of active qPoS validators (called stakers) is known, it is straightforward to determine whether a chain could never win a fork. Specifically, if a chain does not have participation from at least 51% of known validating weight, it cannot continue. This metric for participation is called power, which I introduced in SDBS 16.
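The power test can be sketched as a comparison of participating weight against total known weight. The 51% threshold comes from the text above; how participation is measured (for example, over which window of blocks) is not specified here, so the `participants` set, the sample weights, and the function names are assumptions.

```python
from typing import Dict, Set

POWER_THRESHOLD_PERCENT = 51  # threshold stated in the text


def participating_weight(weights: Dict[str, int], participants: Set[str]) -> int:
    """Total weight of the known stakers that are participating in this chain."""
    return sum(w for staker, w in weights.items() if staker in participants)


def must_roll_back(weights: Dict[str, int], participants: Set[str]) -> bool:
    """True if the chain has less than 51% of known validating weight."""
    total = sum(weights.values())
    active = participating_weight(weights, participants)
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return 100 * active < POWER_THRESHOLD_PERCENT * total


# Made-up weights for three stakers.
weights = {"s1": 40, "s2": 35, "s3": 25}
print(must_roll_back(weights, {"s1"}))
print(must_roll_back(weights, {"s1", "s3"}))
```

With these sample weights, a chain signed only by `s1` holds 40% of the weight and would be rolled back, while one signed by `s1` and `s3` holds 65% and could continue.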

— — — — — — —

Brief Review of The qPoS Asynchronous Network Clock

Before I talk about this week’s improvements, I should quickly review the asynchronous network clock, discussed in detail in SDBS 17. This network clock is equivalent to the staker queue, wherein stakers are scheduled to sign blocks. The queue’s time updates on new blocks, advancing to the timestamp of each new block. Each new block’s timestamp comes from a hardware clock that is synchronized to what I call official time (time given by government computers). The queue keeps time because nodes disregard blocks whose timestamps seem out of order relative to other blocks. An important aspect of this system is that qPoS nodes don’t compare any timestamps to hardware clocks, as hardware clocks can be unreliable. In short, block timestamps are only compared to other block timestamps to assess the validity of blocks. A somewhat surprising result of this approach is that the excellent timekeeping of the asynchronous network clock (the queue) is an emergent property of the timestamp validation protocol, not of directly synchronizing the queue to any specific clock.
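A toy version of this idea is sketched below: the queue's time is simply the timestamp of the newest accepted block, and validation never reads the local hardware clock. The strictly-increasing rule and the `StakerQueue` class are assumptions for illustration; the actual validation rules are described in SDBS 17.

```python
from dataclasses import dataclass


@dataclass
class StakerQueue:
    """Hypothetical stand-in for the queue; `time` is the newest accepted block timestamp."""
    time: int

    def accept(self, block_timestamp: int) -> bool:
        # Compare only against block-derived time, never the local hardware clock.
        if block_timestamp <= self.time:
            return False  # out of order relative to earlier blocks, so disregard it
        self.time = block_timestamp  # the queue advances to the new block's timestamp
        return True


queue = StakerQueue(time=1000)
print(queue.accept(1005), queue.time)
print(queue.accept(1003), queue.time)
```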

— — — — — — —

Key Improvements to Reorganization This Week

I spent a lot of time this week debugging difficult reorganization scenarios in which the network would heal from fracturing exceedingly slowly, or not at all. Fracturing is when the network of computers (nodes) breaks into individual nodes because of communication errors. In SDBS 16, I describe the testnet I use and how it is purposefully designed to be very problematic. This week, I discovered two design principles that address many catastrophic reorganization events.

First, new blocks can only be validated provisionally unless they are linked to the head block. Provisional validation means that any checks on a block that rely on a synchronized registry must wait until the block is connected to the head block. This connection happens after U-turns, and could happen after rollbacks in some cases. A potential exploit of the lack of full validation is that blocks could spoof stakers and be submitted without valid signatures, creating a type of denial-of-service (DoS) attack. I am working on ways to mitigate this exploit.
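The split between provisional and full validation might look something like the following sketch. Everything here is hypothetical: the `Block` fields, the three-way result, and especially the registry, which maps a staker ID to an expected signature string as a placeholder for real signature verification against registered keys.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Block:
    hash: str
    prev_hash: str
    staker_id: str
    signature: str


def stateless_ok(block: Block) -> bool:
    """Checks that need no registry state (here, just that fields are present)."""
    return bool(block.hash and block.prev_hash and block.staker_id)


def validate(block: Block, head_hash: str, registry: Dict[str, str]) -> str:
    """Return 'invalid', 'provisional', or 'valid'."""
    if not stateless_ok(block):
        return "invalid"
    if block.prev_hash != head_hash:
        # Not linked to the head block: the registry may be out of sync for
        # this block, so stateful checks are deferred.
        return "provisional"
    # Placeholder for real signature verification against the registry.
    expected = registry.get(block.staker_id)
    return "valid" if expected is not None and block.signature == expected else "invalid"


registry = {"staker-1": "sig-1"}
spoofed = Block("h9", "unknown-parent", "staker-1", "bogus")
print(validate(spoofed, "h1", registry))
print(validate(Block("h2", "h1", "staker-1", "sig-1"), "h1", registry))
print(validate(Block("h2", "h1", "staker-1", "bogus"), "h1", registry))
```

The first call shows the exploit surface described above: a block with a bogus signature is still accepted provisionally because it does not link to the head block.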

Second, in the event of a network disruption, stakers continue to create blocks and add them to their blockchains, because a node cannot immediately know whether it is isolated or whether one or more of its peers have gone down. In these events, the blocks a staker adds to its own blockchain may orphan valid blocks received from the network. When these orphans pile up, the situation eventually leads to a rejected valid block that triggers a rollback. This specific type of rollback will orphan all new blocks. When a staker orphans all new blocks, it becomes stuck.

The general solution to this situation is to test whether the rejected block would be a valid block on an alternate chain (i.e., treat these types of rejected blocks as orphans). Then, test whether this alternate chain is more reliable than the node’s present chain. Finally, the protocol reorganizes to the alternate chain if it is more reliable than the present chain.
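The three steps of this solution can be sketched as follows. This is a minimal outline under stated assumptions, not the client's logic: chains are lists of block identifiers, and the alternate-chain lookup and the trust function are passed in as hypothetical callables.

```python
from typing import Callable, List, Optional

Chain = List[str]


def handle_rejected_block(
    block: str,
    present_chain: Chain,
    find_alternate_chain: Callable[[str], Optional[Chain]],
    trust: Callable[[Chain], int],
) -> Chain:
    """Return the chain the node should be on after seeing a rejected block."""
    # Step 1: treat the rejected block as an orphan and look for a chain on which it is valid.
    alternate = find_alternate_chain(block)
    if alternate is None:
        return present_chain
    # Step 2: test whether the alternate chain is more reliable than the present chain.
    if trust(alternate) > trust(present_chain):
        # Step 3: reorganize to the alternate chain.
        return alternate
    return present_chain


# Toy data: the isolated staker built two blocks of its own on top of genesis.
present = ["genesis", "mine1", "mine2"]
network = ["genesis", "n1", "n2", "n3"]


def lookup(block: str) -> Optional[Chain]:
    return network if block in network else None


# Chain length stands in for chain trust in this toy example only.
print(handle_rejected_block("n3", present, lookup, len))
print(handle_rejected_block("junk", present, lookup, len))
```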

— — — — — — —

Hondo
