Blockchain System Consensus: Decentralization

aelf Developer
Published in aelf, Jul 10, 2020

All blockchain systems are distributed systems, but they differ from conventional ones. A traditional distributed system responds to growing business volume by spreading the work across multiple machines to increase capacity, and it handles business-critical workloads by eliminating single points of failure to improve availability. The business scenarios handled by a blockchain system can be just as complex as those of a general distributed system. What makes blockchain systems worth taking seriously, however, is that they can solve the problem of data consistency in the presence of malicious nodes, that is, the Byzantine Generals Problem.

In the blockchain world, there is no centralized server. The system is a P2P network formed by enthusiasts, beneficiaries, and others. No node in the network can be trusted directly; any of them might be malicious, which is a problem that general distributed systems never have to consider. This matches the assumption of the Byzantine Generals Problem: there is no centralized leadership. When the generals want to attack a city, they must all agree on the time of the attack. The question then arises: if the generals propose different attack times, or if some of the generals have become traitors, how can the loyal generals reach a consensus?

Similarly, in a P2P blockchain network, how can all nodes reach a consensus on a given transaction (that is, on modifying each node’s local database according to that transaction)?

In the 1982 paper The Byzantine Generals Problem, Leslie Lamport and his co-authors Robert Shostak and Marshall Pease proved that, with unsigned messages, an effective algorithm exists when fewer than one-third of the generals are traitors. No matter how the traitors interfere, the loyal generals can always reach a consistent result. If there are too many traitors, consistency can no longer be guaranteed.

We therefore assume that malicious nodes make up less than 1/3 of the blockchain P2P network; otherwise the blockchain system is considered to have failed. Under this assumption, the next and hardest problem to solve is: in a blockchain system where fewer than ⅓ of the nodes are malicious, which data should be selected as the data on which final consensus is reached?
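The one-third bound can be made concrete with a little arithmetic. The sketch below assumes the classic formulation for unsigned messages, in which n nodes tolerate f traitors only if n ≥ 3f + 1; the function names are illustrative and not part of AElf.

```python
def max_tolerated_traitors(total_nodes: int) -> int:
    """Largest f such that total_nodes >= 3 * f + 1."""
    if total_nodes < 1:
        raise ValueError("total_nodes must be positive")
    return (total_nodes - 1) // 3


def can_reach_agreement(total_nodes: int, traitors: int) -> bool:
    """True if the number of traitors stays within the tolerated bound."""
    return 0 <= traitors <= max_tolerated_traitors(total_nodes)
```

For example, under this bound a network of 4 nodes tolerates 1 traitor, while a network of 3 nodes tolerates none.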

Put another way: if a node wants the blockchain system to reach consensus on the data it provides, what does it need to do? It needs to provide a proof that convinces the blockchain system to accept that data.

With this in mind, we can discuss how to design a standard consensus interface for a blockchain system.

Standard consensus interface design

Based on the above, the blockchain consensus process can be laid out as follows:

  1. Node A prepares a block and broadcasts it to the P2P network.
  2. After receiving the block, the other nodes in the P2P network run a series of verifications and decide whether to append the block to their local longest chain.
  3. When most nodes in the entire blockchain system (for example, more than 2/3) hold the same block hash at a given local block height, we can consider the blockchain to have reached consensus at that height.
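Step 3 can be sketched as a simple count. The snippet below assumes we can somehow collect the block hash each node holds at one height (a hypothetical mapping from node id to hash) and treats "most" as strictly more than 2/3 of all nodes; it is an illustration of the rule, not AElf code.

```python
from collections import Counter
from typing import Mapping, Optional


def agreed_block_hash(hashes_by_node: Mapping[str, Optional[str]],
                      total_nodes: int) -> Optional[str]:
    """Return the hash held by more than 2/3 of all nodes, or None."""
    # Nodes that have no block at this height report None and are not counted.
    counts = Counter(h for h in hashes_by_node.values() if h is not None)
    if not counts:
        return None
    block_hash, votes = counts.most_common(1)[0]
    # Integer comparison avoids floating point: votes / total > 2 / 3.
    return block_hash if 3 * votes > 2 * total_nodes else None
```

Counting against the total number of nodes, rather than only the nodes that answered, keeps silent or lagging nodes from inflating the majority.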

If a service is to help Node A and the other nodes in the blockchain complete the whole consensus process, it should provide roughly two kinds of functionality:

  1. When A knows nothing and asks, the service must tell A (who, in the blockchain world, is uniquely identified by a public key) whether it can currently produce a block and how to get that block accepted by other nodes;
  2. When nodes other than A receive a block broadcast by A, they must be able to verify whether the block is legal. Because the code is open source, every node can run the same verification logic.

If a node verifies this block and finds it legal, that node has reached consensus on the block generated by A. Because the verification service on every node uses the same logic, all nodes in the blockchain network will reach the same verdict on the legitimacy of the block. As a result, whether this block will be added to the longest chain across the P2P network is predictable.

AElf Consensus Common Interface Standard

From here on, we design the AElf universal consensus interface based on the two kinds of services identified in “Standard consensus interface design”.

First, it should be clear that both kinds of consensus services (requesting block production instructions and verifying new blocks) are read-only interfaces; calling them does not modify any account state on the blockchain network.

Second, these interfaces are actually called by the AElf main chain code, so their design must follow the main chain’s logic for producing and verifying blocks (indeed, in the main chain code these interfaces map almost one-to-one onto the consensus service).

We discuss the two interfaces separately.

Request Consensus Command

Let us continue with node A from the previous example. A has synchronized to the current longest AElf chain, and the current time is 13:59:56 on January 1, 2020. A is an honest node (it has not modified its local main chain code) and has just synchronized a block (that is, it received a block from another node on the network, verified it successfully, and updated its local ledger). After the best chain (a data structure the node maintains for its local blockchain) is updated, an event is published on the Event Bus. One purpose of this event is to prompt node A, through the event subscription and handling mechanism, to ask the consensus service what it should do next. When asking, A passes its public key to the consensus service.

The core logic of the consensus service lives in a smart contract, because only then can its code be guaranteed to be identical on every node in the blockchain world (a node running different code is either trying to do evil or hard forking). After a calculation, complex or simple, that takes up to a few milliseconds, the consensus smart contract returns a message to node A. How this message is generated depends on the chosen consensus mechanism, but whatever the mechanism, the message should answer the following:

  • When can A generate blocks?
  • If A can generate blocks, how should A go about it: that is, what kind of block can A generate under the current consensus? This information is referred to here as the additional hint.

What if A cannot generate a block? I maintain that in the blockchain world every node can in principle generate blocks; some consensus choices (PoS consensus in particular) simply do not want most nodes to have block-producing ability. In that case, the service only needs to return to A a time 100 years in the future (that may be a bit exaggerated; a few months would work just as well). A could then produce a block only by waiting out the hundred years without any new block appearing in the meantime, because any valid newly synchronized block gives A a fresh block production time: another hundred years away. Persistence is victory.

It is easy to see how PoW could be implemented on top of this interface. Time: immediately. Additional hint: empty.
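The two cases just described, "produce immediately" and "effectively never", can be sketched as follows. The Command type, the function names, and the slot table are hypothetical and only loosely mirror the idea of a consensus command; real consensus contracts compute these values from on-chain state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Mapping

# Hypothetical "never" offset: roughly one hundred years.
FAR_FUTURE = timedelta(days=365 * 100)


@dataclass(frozen=True)
class Command:
    arranged_mining_time: datetime  # when the node may produce a block
    hint: bytes                     # consensus-specific, opaque to the caller


def pow_command(now: datetime) -> Command:
    """PoW: anyone may start mining right away, with an empty hint."""
    return Command(arranged_mining_time=now, hint=b"")


def slot_based_command(public_key: str, now: datetime,
                       next_slots: Mapping[str, datetime]) -> Command:
    """Toy slot-based consensus: producers get their slot, others get 'never'."""
    slot = next_slots.get(public_key)
    if slot is None:
        # Not a block producer: answer with a time far in the future.
        return Command(arranged_mining_time=now + FAR_FUTURE, hint=b"")
    # Never schedule in the past; an overdue slot means "right now".
    return Command(arranged_mining_time=max(slot, now), hint=b"")
```

Returning a far-future time instead of an error keeps the caller's logic uniform: every node always receives a time and simply waits for it.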

In the AElf main chain, the consensus service updates the consensus scheduler as soon as it receives the time information in the consensus feedback. If the scheduler already holds an unfinished task, that task is cancelled and replaced with one for the new time point. In other words, the consensus scheduler is a singleton object that holds at most one pending consensus task.

Next comes the long countdown.

We return to the example of node A. Assume that after requesting the consensus command, A gets a time: 14:00:00 on January 1, 2020, which is 4 seconds away. Additional hint: NextRound (a hint from the AEDPoS consensus, meaning that A will end the current round of block production and update the production order of all delegated block-producing nodes for the next round). The scheduler is therefore updated immediately to fire a block production event in 4 seconds. What happens during these 4 seconds? If blocks from other nodes arrive and pass verification, the best chain is updated, the handler of the resulting event runs, and the node asks the consensus service for a consensus command again (in the code this operation is called TriggerConsensus). Correspondingly, the consensus scheduler keeps being reset: 3.5 seconds, 3 seconds, 2.5 seconds, 2 seconds, …
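The scheduling behaviour described above, a single pending task that is replaced on every new consensus command, can be modelled in a few lines. This is a minimal sketch that uses plain millisecond counters instead of real timers; the class and method names are illustrative and are not AElf's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PendingTask:
    fire_at_ms: int
    action: Callable[[], None]


class ConsensusScheduler:
    """Holds at most one pending block production task."""

    def __init__(self) -> None:
        self._pending: Optional[PendingTask] = None

    def new_event(self, fire_at_ms: int, action: Callable[[], None]) -> None:
        # Any previously scheduled task is dropped and replaced.
        self._pending = PendingTask(fire_at_ms, action)

    def remaining_ms(self, now_ms: int) -> Optional[int]:
        """Time left until the pending task fires, or None if idle."""
        if self._pending is None:
            return None
        return max(self._pending.fire_at_ms - now_ms, 0)

    def tick(self, now_ms: int) -> bool:
        """Run the pending task if it is due; return True if it ran."""
        if self._pending is None or now_ms < self._pending.fire_at_ms:
            return False
        task, self._pending = self._pending, None
        task.action()
        return True
```

In this model, each newly synchronized block would lead to another new_event call with the freshly computed time, which is why the remaining time appears to count down as 3.5 seconds, 3 seconds, and so on.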

Finally, the clock reaches 2 PM. Node A begins preparing a block under the control of the consensus scheduler. At this point, by our design, apart from the block production time that has already served its purpose, the only information A has is the additional hint it received from the consensus service earlier.

Then, in the AElf system, A sends the additional hint to the consensus service and (besides packaging transactions) calls two further services:

  • Get consensus block header information
  • Get consensus system transactions

As mentioned earlier, one purpose of the interface that requests consensus commands is to help the produced block pass verification. In AElf, the series of verification steps for a block includes two that relate to consensus: before execution, the block header is verified; after execution, the change to the consensus contract’s state is checked against the information in the block header.

An analogy may help. Picture a .NET programmer signing in at an offline DNT event. He takes out a message confirming his registration and shows it to the organizer. This message is like the block header: without it, the organizer’s staff would simply ignore him. Next, the staff ask him for his mobile phone number and look it up in the roster. This is like the verification a blockchain node performs after executing the consensus transactions. Only after this step can the staff safely let the programmer in.

In summary, the “request consensus command” service needs three interfaces. They are described in Protobuf as follows:

service ConsensusContract {
    rpc GetConsensusCommand (google.protobuf.BytesValue) returns (ConsensusCommand) {
        option (aelf.is_view) = true;
    }
    rpc GetConsensusExtraData (google.protobuf.BytesValue) returns (google.protobuf.BytesValue) {
        option (aelf.is_view) = true;
    }
    rpc GenerateConsensusTransactions (google.protobuf.BytesValue) returns (TransactionList) {
        option (aelf.is_view) = true;
    }
}

message ConsensusCommand {
    int32 limit_milliseconds_of_mining_block = 2; // Time limit of mining next block.
    bytes hint = 3; // Context of Hint is diverse according to the consensus protocol we choose, so we use bytes.
    google.protobuf.Timestamp arranged_mining_time = 4;
    google.protobuf.Timestamp mining_due_time = 5;
}

message TransactionList {
    repeated aelf.Transaction transactions = 1;
}

For the safety and stability of the chain, ConsensusCommand contains, in addition to the next block production time (arranged_mining_time) and the additional hint (hint), a time limit for producing the block (limit_milliseconds_of_mining_block) and the latest broadcast time (mining_due_time). The block production service uses the latter two as a reference: if a time limit is exceeded, the produced block does not need to be broadcast (even if it were broadcast, it could not pass verification on other nodes). An empty block is better than disrupting the order of block production.
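How a block producer might use these two limits can be sketched as below. The exact rule AElf applies is not spelled out here, so this sketch assumes the simplest reading: the block must be ready before both the per-block time limit (counted from arranged_mining_time) and mining_due_time have passed. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def broadcast_deadline(arranged_mining_time: datetime,
                       limit_milliseconds_of_mining_block: int,
                       mining_due_time: datetime) -> datetime:
    """Earlier of 'start + time limit' and the latest broadcast time."""
    time_limit_end = arranged_mining_time + timedelta(
        milliseconds=limit_milliseconds_of_mining_block)
    return min(time_limit_end, mining_due_time)


def should_broadcast(block_ready_at: datetime,
                     arranged_mining_time: datetime,
                     limit_milliseconds_of_mining_block: int,
                     mining_due_time: datetime) -> bool:
    """Assumed rule: broadcast only if the block was ready before the deadline."""
    deadline = broadcast_deadline(arranged_mining_time,
                                  limit_milliseconds_of_mining_block,
                                  mining_due_time)
    return block_ready_at <= deadline
```

A block that misses the deadline is simply dropped by its producer, which matches the idea that skipping a block is preferable to disturbing the production order.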

Block verification

While the request for a consensus command deserves detailed discussion, there is little to say about the block verification interfaces, because the verification logic depends entirely on the consensus mechanism.

The interfaces themselves are nothing new: one verifies the block header before the consensus transactions are executed, and the other verifies that the consensus state changes made by executing them are consistent with what the block header promised. Both verification interfaces take a byte array as input, which means they accept arbitrary data; the implementer of a given consensus is responsible for deserializing it in the concrete verification logic.

service ConsensusContract {
    rpc ValidateConsensusBeforeExecution (google.protobuf.BytesValue) returns (ValidationResult) {
        option (aelf.is_view) = true;
    }
    rpc ValidateConsensusAfterExecution (google.protobuf.BytesValue) returns (ValidationResult) {
        option (aelf.is_view) = true;
    }
}

message ValidationResult {
    bool success = 1;
    string message = 2;
}
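To make the two verification steps more tangible, here is a minimal sketch of a toy consensus whose extra data is a JSON object with a miner and a round number. The encoding, the field names, and the checks are all assumptions made for illustration; a real consensus defines its own format and deserializes the bytes itself.

```python
import json
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class ValidationResult:
    success: bool
    message: str = ""


def _decode(extra_data: bytes):
    """Deserialize the opaque bytes; return None if they are not a JSON object."""
    try:
        value = json.loads(extra_data)
    except ValueError:
        return None
    return value if isinstance(value, dict) else None


def validate_before_execution(extra_data: bytes,
                              known_miners: Set[str]) -> ValidationResult:
    """Header check: was the block produced by a miner we recognize?"""
    header = _decode(extra_data)
    if header is None:
        return ValidationResult(False, "extra data cannot be decoded")
    miner = header.get("miner")
    if not isinstance(miner, str) or miner not in known_miners:
        return ValidationResult(False, "unknown miner")
    return ValidationResult(True)


def validate_after_execution(extra_data: bytes,
                             round_in_state: int) -> ValidationResult:
    """State check: does the executed state match what the header promised?"""
    header = _decode(extra_data)
    if header is None:
        return ValidationResult(False, "extra data cannot be decoded")
    if header.get("round") != round_in_state:
        return ValidationResult(False, "state does not match block header")
    return ValidationResult(True)
```

Both functions take raw bytes, mirroring the BytesValue parameters above: the caller never needs to know what the consensus stores inside.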
