If you use the Internet, it’s highly likely that you recall the various times that AWS has gone down. With major sites like Reddit, Netflix, Instagram, Airbnb, and others completely dependent on AWS for their infrastructure, it is pretty hard to miss the various failures over the years. And even though these systems were architected with load balancing and various measures taken to mitigate such disasters, there’s really nothing that can be done when the hosting provider goes down.
While these failures may seem reasonably spaced out and minor, they have huge implications in a world continuously being eaten by software. When you consider a future where autonomous cars’ software is run on this sort of architecture, failure (even if just for a minute) is completely unacceptable. And it’s this future mindset of disintermediation that has led the blockchain movement to take hold in an attempt to improve on the existing standard.
And while many scoff at the idea of providing uptime guarantees that exceed those of Amazon’s AWS, this is not as bold a claim to those in the decentralized community. In fact, Bitcoin just hit 10 years of nonstop block production, something very few services as large can tout.
But anyone inside the community is well aware of the scaling problems innate to such systems, especially when using Proof of Work as a means of achieving consensus. So, as the transition to more efficient consensus mechanisms slogs on, we see the emergence of second-layer solutions that take value from one of the main Proof of Work chains and use it to fund state channels, Plasma, sidechains, and so on.
Each of these still has its caveats in terms of security, efficiency, capital lockup, and data availability. At SKALE, we have built a scalable sidechain system to serve as a decentralized scaling solution for Ethereum.
In this article, we detail how we have addressed these concerns by combining best-in-class, proven research with world-class engineering to create a consensus mechanism that can run over 20,000 TPS in a standalone environment and thousands of TPS in a fully connected Ethereum environment, with average block times of 1–2 seconds.
Over time, the SKALE Network will support a variety of consensus mechanisms. At Network Launch, though, SKALE will be supporting the consensus protocol outlined below.
Note: If you have not read our brief technical overview, read it for more context before continuing.
The protocol assumes that the network is asynchronous with an eventual delivery guarantee, meaning that all nodes are assumed to be connected by a reliable communications link. Links can be arbitrarily slow, but will eventually deliver messages.
The asynchronous model described above is similar to that of the Bitcoin and Ethereum blockchains. It reflects the state of the modern Internet, where temporary network splits are normal but eventually resolve. The eventual delivery guarantee is achieved in practice by the sending node making multiple attempts, with exponential backoff, to transfer the message to the receiving node until the transfer is successful.
Each sending node maintains a separate outgoing message queue for each receiving node. To schedule a message for delivery to a particular node, it is placed into the corresponding outgoing queue. Each of these outgoing queues is serviced by a separate thread, allowing messages to be delivered in parallel so that failure of a particular node to accept messages will not affect receipt of messages by other nodes.
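The per-node queues and retry behavior described above can be sketched as follows. This is a minimal illustration, not SKALE's implementation: the class name `OutgoingQueues`, the `send(peer, msg)` callback (assumed to return True on success), and the delay values are all hypothetical.

```python
import queue
import threading
import time

class OutgoingQueues:
    """One FIFO queue and one worker thread per receiving node (illustrative)."""

    def __init__(self, peers, send, base_delay=0.01, max_delay=1.0):
        self.send = send              # hypothetical transport: send(peer, msg) -> bool
        self.base_delay = base_delay  # first retry delay in seconds (assumed value)
        self.max_delay = max_delay    # cap on the retry delay (assumed value)
        self.queues = {p: queue.Queue() for p in peers}
        for peer in peers:
            threading.Thread(target=self._serve, args=(peer,), daemon=True).start()

    def schedule(self, peer, msg):
        # Scheduling a message only touches the queue of its destination.
        self.queues[peer].put(msg)

    def _serve(self, peer):
        q = self.queues[peer]
        while True:
            msg = q.get()
            delay = self.base_delay
            # Retry until the transfer succeeds, doubling the wait each time.
            while not self.send(peer, msg):
                time.sleep(delay)
                delay = min(delay * 2, self.max_delay)
            q.task_done()
```

Because each queue has its own thread, a peer that keeps rejecting messages only delays its own queue; delivery to the other peers continues in parallel.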
Note: Each user transaction is assumed to be an Ethereum-compatible transaction, represented as a sequence of bytes.
A node is required to create a block proposal directly after its TIP_ID moves to a new value. TIP_ID is incremented by 1 once the previous consensus round completes. TIP_ID will also move if the Catchup Agent appends blocks to the blockchain.
To create a block proposal, a node will:
- Examine its pending transaction queue.
- If the total size of transactions in the pending queue is less than or equal to the MAX_BLOCK_SIZE, the node will fill in a block proposal by taking all transactions from the queue.
- In the case that the total size of transactions in the pending queue exceeds MAX_BLOCK_SIZE, the node will fill in a block proposal of MAX_BLOCK_SIZE by taking pending transactions from the queue in order of oldest to newest received.
- The node will assemble block proposals with transactions which are ordered by SHA-3 hash from smallest value to largest value.
- In the case that the pending queue is empty, the node will wait for BEACON_TIME, and then, if the queue is still empty, make an empty block proposal containing no transactions.
MAX_BLOCK_SIZE: The maximum size of the block body in bytes. Currently, we use MAX_BLOCK_SIZE = 8MB and may consider self-adjusting block size to target a particular average block commit time in the future.
BEACON_TIME: The time between empty block creations. If no one is submitting transactions to the blockchain, empty beacon blocks will be created. Beacon blocks are used to detect normal operation of the blockchain.
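The proposal-creation steps above can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, the BEACON_TIME value is a placeholder, filling stops at the first transaction that does not fit, and `hashlib.sha3_256` (NIST SHA-3) stands in for the SHA-3 hash named in the text. Ethereum tooling typically uses Keccak-256, which produces different digests.

```python
import hashlib
import time

MAX_BLOCK_SIZE = 8 * 1024 * 1024  # bytes; the 8MB figure from the text
BEACON_TIME = 3.0                 # seconds; placeholder value, not from the source

def build_proposal(pending, max_block_size=MAX_BLOCK_SIZE):
    """pending: list of raw transaction bytes, oldest first."""
    chosen, size = [], 0
    for tx in pending:
        # Assumption: stop at the first transaction that would exceed the limit.
        if size + len(tx) > max_block_size:
            break
        chosen.append(tx)
        size += len(tx)
    # Order the selected transactions by hash, smallest value first.
    return sorted(chosen, key=lambda tx: hashlib.sha3_256(tx).digest())

def propose(get_pending, beacon_time=BEACON_TIME, sleep=time.sleep):
    """get_pending: hypothetical callback returning the current pending queue."""
    pending = get_pending()
    if not pending:
        sleep(beacon_time)
        pending = get_pending()  # if still empty, the result is an empty beacon block
    return build_proposal(pending)
```

Sorting by hash gives every node the same transaction order for the same set of transactions, regardless of the order in which each node received them.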
After block proposal creation, the creating node will send both the block proposal and the hashes of the transactions which compose the proposal to the rest of the network. Upon receipt, the receiving node will reconstruct the proposal from hashes by matching hashes to messages in its pending queue. For transactions not found in the pending queue, the receiving node will send a request to the sending node. The sending node will then send the bodies of these transactions to the receiving node, allowing for the receiving node to reconstruct the block proposal.
Note: Nodes do not remove transactions from the pending queue at the time of proposal. The reason for this is that at the proposal time there is no guarantee that the proposal will be accepted.
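The reconstruction step on the receiving node can be sketched as below. The names `reconstruct` and `fetch_missing` are hypothetical, the pending queue is modeled as a dictionary keyed by hash, and `hashlib.sha3_256` is again an assumed stand-in for the transaction hash.

```python
import hashlib

def tx_hash(tx: bytes) -> bytes:
    # Assumed hash function; the real network may use a different one.
    return hashlib.sha3_256(tx).digest()

def reconstruct(proposal_hashes, pending, fetch_missing):
    """Rebuild a proposal from its transaction hashes.

    pending: dict mapping hash -> transaction bytes (local pending queue).
    fetch_missing: hypothetical callback that asks the sending node for the
    bodies of the given hashes and returns a dict hash -> bytes.
    """
    missing = [h for h in proposal_hashes if h not in pending]
    fetched = fetch_missing(missing) if missing else {}
    for h, body in fetched.items():
        # Never trust a body that does not hash to what was requested.
        if tx_hash(body) != h:
            raise ValueError("transaction body does not match requested hash")
    # The pending queue is read but not modified, matching the note above.
    return [pending[h] if h in pending else fetched[h] for h in proposal_hashes]
```

Only the bodies that the receiver does not already hold cross the network, which keeps proposal distribution cheap when most transactions have already been gossiped.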
After the proposal P has been reconstructed, the receiving node will add P to its proposal storage database and send a receipt back to the sending node along with a signature share for P. The sending node will wait until it collects signature shares from a supermajority (>⅔) of nodes (including itself) and then will create a supermajority signature S. The sending node will then broadcast this supermajority signature S to each of the other nodes in the network.
Note: Each node is in possession of a BLS private key share PKS[I]. Initial generation of key shares is performed using the Joint-Feldman Distributed Key Generation (DKG) algorithm, which runs at the creation of the S-Chain and whenever validators are shuffled.
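The supermajority rule used when collecting signature shares is simple arithmetic, sketched below. The shares are treated as opaque bytes and no BLS operations are performed; `supermajority` and `ShareCollector` are hypothetical names used only for illustration.

```python
def supermajority(n: int) -> int:
    """Smallest count strictly greater than two thirds of n."""
    return (2 * n) // 3 + 1

class ShareCollector:
    """Tracks signature shares received for one proposal (illustrative)."""

    def __init__(self, n: int):
        self.n = n
        self.shares = {}  # node index -> opaque signature share

    def add(self, node_index: int, share: bytes) -> bool:
        # A repeated share from the same node is counted only once.
        self.shares[node_index] = share
        return len(self.shares) >= supermajority(self.n)
```

For a chain of 16 nodes, `supermajority(16)` is 11, so the sending node can build S once it holds 11 distinct shares, its own included.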
In further consensus steps, every node voting for proposal P must provide a data availability receipt by including supermajority signature S in its vote. This ensures that any proposal which wins consensus will be available to every honest node.
Once supermajority signature S has been created for proposal P, Asynchronous Binary Byzantine Agreement (ABBA) is initiated for block finalization. For each of the N supermajority signatures created, an ABBA instance is executed in which nodes vote yes or no on whether they have the proposal and its supermajority signature in their proposal storage database.
Upon completion of all ABBA instances:
- A vote vector containing yes or no for each proposal is created.
- If there is only one yes vote, the corresponding block proposal P is committed to the blockchain.
- If there are multiple yes votes, P is pseudo-randomly picked from the yes-voted proposals using pseudo-random number R (generated from our randomness beacon). The index of the winning proposal among the yes-voted proposals is R modulo N_WIN, where N_WIN is the total number of yes-voted proposals.
- In the rare case where all votes are no, an empty block is committed to the blockchain. The probability of an all-no vote is very small and decreases as N increases.
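The selection rules above reduce to a small function. This is a sketch: the name `pick_winner` is hypothetical, the vote vector is modeled as a list of booleans, and R is passed in as a plain integer rather than drawn from a randomness beacon.

```python
def pick_winner(votes, r):
    """Return the index of the winning proposal, or None for an empty block.

    votes: list of booleans, one per proposal (True means the ABBA result was yes).
    r: pseudo-random integer R shared by all nodes.
    """
    yes = [i for i, v in enumerate(votes) if v]
    if not yes:
        return None               # all-no vote: commit an empty block
    return yes[r % len(yes)]      # N_WIN = len(yes); a single yes always wins
```

With votes `[True, False, True, True]` and R = 7, the yes-voted proposals are 0, 2 and 3, N_WIN is 3, and 7 mod 3 = 1 selects proposal 2. Since every honest node holds the same vote vector and the same R, every node arrives at the same winner.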
The consensus described above uses an Asynchronous Binary Byzantine Agreement (ABBA) protocol. We currently use a variant of ABBA derived from Mostefaoui et al. Any other ABBA protocol P can be used, as long as it satisfies the following properties:
- Network model: P assumes asynchronous network messaging model described above.
- Byzantine nodes: P assumes that fewer than one third of nodes are Byzantine.
- Initial vote: P assumes that each node makes an initial vote of yes (1) or no (0).
- Consensus vote: P terminates with a consensus vote of either yes or no, where if the consensus vote is yes, it is guaranteed that at least one honest node voted yes.
At this point, a new block has been successfully committed to an S-Chain. To finish up, we’ll review the case of reboots and crashes in the network as well as our Catchup Agent.
Reboots and Crashes
During a reboot, the rebooting node will become temporarily unavailable. To peer nodes, this will look like a temporarily slow network link. After the reboot, messages destined for the node will be delivered, so the protocol allows a reboot to occur without disrupting the operation of consensus.
In the case of a hard crash, where a node loses consensus state due to a hardware failure or a software bug that prevents the node from being online, its peers will continue attempting to send messages to it until their outgoing message queues overflow, causing them to drop older messages. We target dropping messages older than one hour from message queues.
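One way to realize the one-hour target is to prune by message age, as in the sketch below. This is an assumption about the mechanism, not a description of SKALE's queues: the class `AgingQueue` and its injectable `clock` parameter are hypothetical.

```python
import time
from collections import deque

MAX_AGE_SECONDS = 3600  # the one-hour target mentioned in the text

class AgingQueue:
    """Outgoing queue that discards messages older than max_age (illustrative)."""

    def __init__(self, max_age=MAX_AGE_SECONDS, clock=time.monotonic):
        self.max_age = max_age
        self.clock = clock    # injectable so the behavior can be tested
        self.items = deque()  # (enqueue_time, message), oldest first

    def put(self, msg):
        self.prune()
        self.items.append((self.clock(), msg))

    def prune(self):
        cutoff = self.clock() - self.max_age
        # Entries are in arrival order, so expired ones are always at the front.
        while self.items and self.items[0][0] < cutoff:
            self.items.popleft()

    def pending(self):
        self.prune()
        return [m for _, m in self.items]
```

A peer that stays offline for longer than the age limit therefore misses those messages and has to recover the corresponding blocks through catchup instead.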
While a node is undergoing a hard crash, it is counted as a Byzantine node for each consensus round. This allows fewer than ⅓ of nodes to be experiencing hard crashes simultaneously. In the case where ⅓ or more of the nodes experience a hard crash, the remaining nodes can no longer form a supermajority of more than ⅔, so consensus will stall and the blockchain may lose its liveness.
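The crash tolerance follows directly from the requirement of a supermajority larger than ⅔, and can be checked with integer arithmetic. The function names below are hypothetical.

```python
def max_crashed(n: int) -> int:
    """Largest f with 3 * f < n, i.e. strictly fewer than one third of n nodes."""
    return (n - 1) // 3

def consensus_can_progress(n: int, crashed: int) -> bool:
    # The nodes still online must be able to form a supermajority of more than 2/3.
    return 3 * (n - crashed) > 2 * n
```

For a 16-node chain, `max_crashed(16)` is 5: with 5 nodes down, the 11 remaining nodes still exceed ⅔ of 16, while with 6 down the 10 remaining nodes do not.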
Such a catastrophic failure will be detected through the absence of new block commits for a set time period. At this point, a failure recovery protocol utilizing the Ethereum main chain for coordination will be executed. Nodes will stop their consensus operation, sync their blockchains, and agree on time to restart consensus. Finally, after a period of mandatory silence, nodes will start consensus at an agreed point.
A separate Catchup Agent continuously running on each node is responsible for ensuring that the node’s blockchain and block proposal database are synced with the network. The Catchup Agent continuously makes random sync connections to other nodes. Any node discovering that it has a smaller TIP_ID than its peer will download the missing blocks, verify the supermajority threshold signatures on the received blocks, and commit them to its chain.
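A single catchup round might look like the sketch below. It is heavily simplified: chains are plain Python lists, TIP_ID is modeled as the chain length, a peer connection is modeled as direct access to the peer's list, and `verify` is a hypothetical callback standing in for supermajority threshold signature verification.

```python
import random

def catchup_once(chain, peers, verify, rng=random):
    """Sync with one randomly chosen peer and return the new TIP_ID.

    chain: local list of blocks, mutated in place.
    peers: list of peer chains (each a list of blocks).
    verify: hypothetical callback, True if a block's signature is valid.
    """
    peer = rng.choice(peers)
    # Only blocks beyond the local tip are missing; a shorter peer yields none.
    for block in peer[len(chain):]:
        if not verify(block):
            break  # stop at the first block that fails verification
        chain.append(block)
    return len(chain)
```

Running this in a loop on every node means a node that falls behind converges on the network tip without any central coordinator.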
When a node comes back online after a hard crash, it will immediately start this catchup procedure while simultaneously participating in consensus for new blocks: it accepts block proposals and votes according to the consensus mechanism, but does not issue its own block proposals. The reason for this is that each block proposal requires the hash of the previous block, so a node will only issue its own block proposal for a particular block ID once it has finished the catchup procedure.
With such an agent running on each node, nodes that have experienced a hard crash will be able to easily rejoin block proposal after re-syncing their chains.
Want to Learn More?
Want to stay in the loop with the SKALE Developer community? Find a team for the event? Get feedback? Learn more about SKALE? Join our Discord, the home for SKALE developer conversations.
SKALE’s mission is to make it quick and easy to set up a cost-effective, high-performance sidechain that runs full-state smart contracts. We aim to deliver a performant experience to developers that offers speed and functionality without giving up security or decentralization.