A deeper look into the design of Nervos Consensus Protocol
This article is part 3 of “The Making of a Consensus Protocol” series, in previous articles, we explained Why we love Nakamoto Consensus (NC) and Forget about the TPS Competition (why other alternative consensus protocols are not secure enough to be used as a layer 1 consensus).
In this article, we will take a deeper look into the design process of Nervos consensus protocol — NC-Max, a variant of Nakamoto Consensus with higher throughput.
Some background before you start:
Ren is currently a researcher at Nervos (mainly working on the Nervos consensus protocol) and also a PhD student of Bart Preneel (designer of RIPEMD 160, the hash function you use to compute from your Bitcoin public key to your Bitcoin address) at COSIC Research Group KU Leuven (the birthplace for AES, the advanced encryption standard which is used in all of your cell phones). In 2017, after conducting research finding the design flaws in scaling proposed bitcoin unlimited, Ren was invited to intern at Blockstream and worked with Pieter Wullie and Gregory Maxwell. Recently, Ren’s paper “Lay Down the Common Metrics: Evaluating Proof-of-Work Consensus Protocols’ Security” is accepted at IEEESSP (Oakland).
Nakamoto Consensus has a natural throughput limit. If you want to raise the throughput there are two things you can do: the first is to increase the block size (like Bitcoin cash, Bitcoin Unlimited), the second is that to lower the block interval, which will increase the orphan rate. The orphaned blocks consume bandwidth, but not contribute to transaction confirmation. As the orphan rate increase, the security of the system goes down and the throughput goes down.
When the orphan rate is really high it’s easy for an attacker to secretly generate a longer chain. The attacker does not need 51% percent of mining power, it can override the blockchain with much less mining power. Those orphaned blocks consume a lot of bandwidth of public nodes, which negatively affects the throughput of the system.
If we want to break the throughput limit, we must find some way to lower the orphan rate.
The Block Propagation and Orphaned Blocks in Bitcoin
The orphaned blocks come from the block propagation delay, if one block is discovered when another block is being propagated, one of them is destined to be orphaned.
If block propagation takes place instantly, there will be no orphans. How can we make sure that blocks can be propagated fast enough? (Bitcoin developers already put a lot of effort to accelerate block propagation.)
Let’s take a deeper look at why some block propagates slower than others.
Some blocks are propagated slower because those blocks have more fresh transactions - fresh transactions were generated within, say 10 seconds, before it is embedded in a block - and not synchronized within the network, hence miners will have to synchronize those transactions before they can further propagate those blocks.
In the current implementation of Bitcoin, if everything goes well, a block will very likely to propagate through a protocol called compact block, as shown in the picture below:
In a compact block, transactions are not actually propagated as the entire transaction which is 200 or 300 bytes. It is propagated as a compact block transaction id. The transaction ids are compiled into a compact block, which is much smaller than an actual block.
When node A propagate a compact block to node B and there is no fresh transaction in it, node B can immediately transfer those compact blocks to all its neighbors, which is nice.
However, if there are fresh transactions in the block, node B would have to first synchronize those transactions from node A, then verify those signatures of those transactions, which also takes time. Only when the validity of the entire block is validated, can node B keep propagating this block. This is the main source of the Bitcoin block propagation delay.
The Ethereum Approach: A Bad Attempt
Ethereum simply shortens the block interval, but Ethereum has a problem that it is impossible to verify the transaction validity before receiving the actual block. Because the validity of the transaction depends on the order of the transaction within the block, therefore if you want to verify a transaction you must wait until the block is ready.
Thus every transaction in every block is considered a fresh transaction. Ethereum has a poorly maintained network protocol and there has been no update since 2015. There’s a lot of redundancy in transaction propagation.
An Ethereum client will propagate the same transaction seven times to different nodes, which means for each node it will receive the same transaction seven times.
Ethereum has inconsistency in different clients. The two main Ethereum clients are basically separate from each other. Very few nodes are connected in those two different networks, that is why Ethereum has a very high orphan rate of up to 30% sometimes, and very low transaction per second.
The big miners have no incentive to accelerate block propagation, “If my block can be propagated slower, it means I can perform a de facto selfish mining, it’s in my interest, why would I accelerate block propagation?”
And uncle reward surely does not help, because even if mine is orphan I still can get some reward.
Small miners cope with header-first block propagation and empty blocks: they don’t propagate the content of the entire block but just propagate the header because it’s much faster; and if you don’t know what transaction will be in the last block, you probably will mine an empty block to make sure that your block will not contain a transaction in conflict with the previous block — this is bad because those empty blocks do not contribute to transaction confirmation.
The Nervos Approach: NC-Max
There are three major innovations in NC-Max:
1. Two-step transaction confirmation to reduce ophan rate;
2. Dynamic block interval and block reward to best utilize bandwidth;
3. Considering all blocks in difficulty adjustment to defend against selfish mining.
1. Two-step Transaction Confirmation to Reduce the Ophan Rate
As shown in the picture, we modified and increased several fields in the Bitcoin block format:
First, the traditional transaction confirmation zone can only contain transactions that are proposed several blocks before — for example between 2 blocks before and 5 blocks before — only those transactions can be confirmed, new transactions cannot be confirmed in the transaction confirmation zone.
We add an “uncle block headers” zone that allows miners to embed as many uncle blocks as possible into a block. The uncle blocks should propagate their headers and their transaction proposal zone in the block and uncle blocks do not count in block size limit. So miners are not discouraged to include uncle blocks in their block.
We also added a transaction proposal zone, which may contain some fresh transactions and only contains truncated transaction ids, which is a shorter version of transaction id so the transaction proposal zone can be very small. The complete transaction is synchronized after propagating the block. So they are synchronized in parallel, they will not affect the block propagating procedure. And they do not affect the block validity as long as the hash checks.
In the transaction proposal zone, there may have invalid transactions, double spending transactions and miners may refuse to provide some transactions in this transaction zone, those are also fine.
As shown in the two graphs at left, our compact block is slightly larger than Bitcoin’s compact block. The compact block is always propagated immediately to their neighbors.
For the newly proposed transactions, if some of them are missing in your transaction pool, they can be propagated after you send out the compact block. Those are parallel process and don’t affect each other, those transactions are verified and propagated and to the next neighbor.
You may ask:
What if a miner refuses to provide the complete version of proposed transactions?
This will not affect block propagation because blocks are propagated regardless of whether there are fresh transactions in their transaction proposal zone. The other miners will still proceed with mining because there are enough proposed transactions to confirm.
What if miners incorporate these proposed-without-broadcast transactions in their later blocks to gain a de facto selfish mining advantage?
In NC, the advantage in slow block propagation is always useful in finding the next block. For miners, they can only tell the block header and transfer the block really slowly after they find a block, during this process you are the only one who can mine the block.
However, In NC-Max, it can only be used to slow down propagation n block afterward, but only if that block is found by the attacker. Because this propagated without broadcast transactions, only the selfish miners know those transactions, and only the selfish miner can use it as an advantage. However, it can not be used in the next block because there needs to be a gap.
A miner can only mine transactions that are proposed between 2 or 5 blocks before that, you can not mine a transaction proposed in the previous block, but only if that block is found by an attacker.
As the above picture shows, in NC when the selfish miner finds the block, it can immediately start mine the h+1 block, however, honest miners can only start mining after they receive the full block — that’s the selfish miner’s advantage during the block propagation period.
In NC-Max, when a selfish miner finds a block h, an honest miner can immediately start mining block h+1.
If the selfish miner wants to utilize this proposal without broadcasting transactions, it has to find blocks n block afterward, only then can the selfish miner utilize this advantage. However, this happens less often. You can not make sure that after 7 blocks there will be a block mined by me, it’s very hard to predict that.
2. Dynamic Block Interval and Block Reward to Best Utilize Bandwidth
NC-Max uses a different difficulty adjustment mechanism which targets a fixed orphan rate (counted as uncles in the last difficulty adjustment period):
If the orphan rate in the last difficulty period is below the target orphan rate, the difficulty will go down, the block interval will go down and the throughput will go up.
In other words, very few orphans mean that the network can synchronize transactions faster, which means we can increase the throughput without harming the decentralization.
Otherwise the difficulty increase, the block interval will increase and the throughput will decrease. A higher orphan rate means that the network can not process these many transactions in the difficulty adjustment period.
The block reward is proportional to the inverse of the expected block interval, so the expected total reward per difficulty adjustment period is fixed. For example: if we have 10 minutes per block, each block has 12.5 Bitcoin; if we had 5 minutes per block, each block can have 6.125 Bitcoin. So the issue rate of the currency is always fixed.
3. Considering All Blocks in Difficulty Adjustment to Defend against Selfish Mining
In NC-Max, the difficulty adjustment mechanism counts all blocks, including uncles when estimating total mining power.
In NC, without the selfish mining attack, the attacker finds 3 blocks in 10, the honest miner finds 7. With the attack, the attacker finds 3 blocks in 7, the honest miners find 4, 3 honest blocks become orphaned, as the main chain grows slower, the difficulty lowers, the attacker can find more blocks with the same mining power.
In NC-Max, the difficulty will stay the same as the difficulty adjustment mechanism counts all blocks. The attacker can not find more blocks with the same mining power, therefore selfish mining is no longer profitable!
In conclusion, NC-Max makes use of orphaned blocks. We want to reduce the number of orphans by two-step transaction confirmation. After the orphan is reduced, we use the remaining ones as an indicator of bandwidth utilization to adjust the throughput, and given that the uncle information is embedded in the blockchain, we can use them to make selfish mining unprofitable (yeah!)
3. A detailed look into the design of NC-Max, a variant of Nakamoto Consensus with higher throughput.
P.S. NC-Max is a temporary name for Nervos consensus protocol — if you have any good idea on how it can be named please tell us in comments :)