Blockchain scaling solutions 4–1 :: Ethereum Sharding

Decipher, the blockchain research group at Seoul National University, publishes a series of articles on blockchain scaling solutions. The fourth topic of the series is Ethereum’s “Sharding” and will be addressed into two articles : 4–1. Sharding Overview and 4–2.Sharding Q & A.

The purpose of this article is to better understand Sharding, one of the Ethereum’s blockchain scaling solutions. Ethereum had proposed a phased development roadmap of Sharding from phase 1 to 6. As of yet, the most specifically defined phase (phase 1) is retired while other phases are being developed sequentially. (As of June 8, 2018) The development roadmap of Sharding is as follows :

Figure 1.

In this article, I addressed Sharding phase 1 in order to provide a clear overview of Sharding. More details about Sharding will be discussed in 4–2, Sharding Q&A.

Sharding background

Sharding is a solution proposed by Ethereum to address scalability issue of blockchain such as plasma and Raiden networks. While Plasma and Raiden networks are off-chain solutions, Sharding is an on-chain solution which improves the performance of mainchain by changing the protocol of the mainchain itself.

Applying an on-chain solution requires the hard fork of the main network while the off-chain solution eliminates the need for a hard fork by adding other systems outside the main chain.

Moreover, Sharding is designed based on Proof of Stake (PoS) algorithm to support Ethereum’s supposed transition from Proof of Work (PoW) to PoS.

About Sharding

To begin, Sharding is an on-chain solution that divides the entire network to store transactions by sections. These transactions are then processed in parallel to solve blockchain scalability problems. In short, data is divided and stored into separate shards to be processed.

Figure 2.

The concept “Sharding” originates from the database field. In the context of Database, sharding is the process of horizontally partitioning the table to process and store large amounts of data.

Meanwhile, Ethereum sharing divides the main chain into k shards. Each shard handles the entire transactions on the network parallelly. As a result, the overall throughput of the network is improved by multiple shards which is in contrast to the current main chain’s transactions processing mechanism. For instance, suppose there are 100 transactions. With 10 shard chains, each shard chain could process 10 transactions simultaneously on average.

Features of Shard Chain

To better understand sharding, one must be aware with the components and key terms of the shard chain. However, Sharding related components and terminology continue to change as the sharding research progresses. But again, this article is based on the phase 1 specification.

Figure 3.

Collation: Collation plays the same role as a block in the main chain of shard chain. Collation is consisted of a collation header and a transaction list. The collation header contain the information that constitutes the collation and submitted to the main chain via the proposer’s signature. Here, the transaction list is the list of transactions in the collation.

Figure 4.

Proposer : A proposer essentially collect transactions, create a proposal and then submit it to a collator. Also, a proposal is an unverified collation.

Collator: Verifies the proposal submitted by the Proposer. For each period, a collator is assigned to one shard, which is randomly selected before a certain period of time.

Executor: Pass the Collation header to the Sharding Manager Contract (SMC) in the main chain. This changes the actual state of the shard chain. (Executor appears in Sharding phase 3)

Period: The submission period of shard chain’s collation header to the main chain. Unit here is the number of blocks in the main chain. For example, if PERIOD_LENGTH = 5, it is 1 period that generated 5 blocks.

Lookahead period: Collator is pseudo-randomly assigned by the SMC before they validate collation in the shard chain. In the same vein, “Lookahead period” indicates to which shard chain each collator has been assigned before a certain period. For instance, if LOOKAHEAD_PERIODS = 4, it means collator is allocated to the shard chain before 4 periods. Collator can then secure their time to download the state information of the shard chain in advance.

Sharding Manager Contract (SMC)

SMC is a smart contract that plays significant role in shard chain. The SMC connects the main chain and shard chain, manages the collator, proposer, and collation tree. The role of the SMC is essential for the shard chain to participate in the main chain. The main roles of SMC are as follows.

Let’s take a look at the structure of shard chain and the role of the SMC

Figure 5.
  1. PoS system: The SMC manages deposit of collator. If collator did something wrong in the shard chain, then one’s deposit could be slashed by the SMC.

2) Pseudo-randomness sampling: The SMC pseudo-randomly assigns collator to shard chain in the collator pool. This reduces the possibility of collator’s attack on a particular shard by preventing collator from knowing which shard one will be assigned to.

Figure 6.

3) Collation header validation: Verify the collation header submitted by the shard chain. The SMC is validated through the addHeader function and must be validated before collation is enabled.

4) Cross-shard communication: For shard-to-shard transaction transmission, a receipt must be created in the main chain. And the SMC manages this receipt. When a user in a shard chain creates a receipt, the receipt is passed to the other shard chain via the SMC and consumed to transfer the shard transaction. This will be implemented in Sharding phase 4.

5) On-chain governance: SMC plays a central role in on-chain governance. Collators’ votes are processed through the SMC so that these votes can be made on-chain.

Shard Chain Mechanism Process

Now let’s see how the shard chain actually works. I want you to have the components and terminology of the shard chain in mind to understand the mechanism process of Shard chain.

Figure 7.

(1) First, a network participant who wants to become a proposer deposits balance through SMC.

(2) Similarly, a network participant who wishes to be a collator deposits through the SMC.

(3) Collators periodically check the SMC status to see if they are selected as collator.

(4) Collators are assigned pseudo-randomly to each shard chain by the SMC. They download the previous state of the shard during the lookahead period. Each collator receives the proposal bid from the proposer that proposed chosen proposal.

(5) Then Proposer submits the proposal to the collator which includes transactions. (Proposal means a collation that is not yet verified. When a proposal is chosen by Collator, proposer receives transaction fee from the transaction originator.

(6) The collators proceed to a vote to verify whether the transactions in each proposal are valid or not.

(7) If more than 2/3 collators approve the proposal’s transactions valid, then the proposal is a valid collation.

Collator invokes the add_header function and sends the newly created collation header to the SMC after the vote. The collation header uploaded via the SMC is connected to the main chain.

Sharding will add functionality such as executor and cross-shard transactions over the course of the phase. At first we looked at the most basic shard chain operation process and incentive structure. Let’s look at the fork-choice rule of the shard chain.

Fork-choice Rule of Shard Chain

Like the main chain, there is a problem with choosing which chain to branch in the shard chain. The current main chain has a fork-choice rule that selects the longest side when a branch occurs. However, the Fork-choice rule in the shard chain is more complex.

In the case of sharding, basically both the main chain and the shard chain are long. That is, 1) choose the longest shard chain, 2) the main chain takes the longer one. Let’s take a closer look at the figure below.

Figure 8.

Sharding In Phase 1, the fork choice rule in the shard chain depends on the longest main chain. That is, when a branch occurs, the choice of a shard chain should not be simply the longest shard chain, but the longest shard chain in the longest main chain.

For example, in Figure 8, the main chain containing block B3 is the longest chain. Therefore, it can be seen that block B3 is valid and collation C3 is valid. Where score is the height of the block or collation.

Figure 9.

As shown in Figure 9, when the block B3 ‘is added, the above chain and the underlying chain are tied together. In this situation, according to the fork choice rule in the main chain, the valid blocks at the time of tangency are randomly determined.

At this time, block B3 is selected as valid block, and collation C3 belonging to B3 is valid collation.

Figure 10.

In Figure 10, a new block B4 ‘is added to the bottom chain. Now the bottom chain is the longest main chain. In this case, when we compare the scores of collation C3 and collation C2, collation C3 score> collation C2 score. This means that the shard chain with collation C3 is the longer chain of shards. However, the choice of the shard chain is dependent on the main chain. The collation included in the longest main chain is collation C2, not collation C3. Therefore, although the score of collation C2 is lower, collation C2 becomes a valid collation and is chosen as a valid shard chain. This is the Fork-choice Rule of the shard chain.

In this article, we have focused on phase 1, where concrete specs are presented to help understand overall sharding. As mentioned earlier, the scalability problem in the block chain must be solved. Sharding is very interesting in that it is an on-chain solution and is designed to be applied after Etherium’s PoS transition.

In the case of sharding, there is no white paper, so the official document is limited. The Etherium Foundation is still in the process of shaping sharding, and information about sharding is stored in various places, such as prysmaticlabs, ethresearch, and github. If you want to learn about the latest sharding information, you need to update continuously.

At the time of writing this article, Sharding continues to develop. I will continue to study sharding and upload articles to Decipher Decipher(@decipher-media).

I addressed more details in the next article, Sharing Q & A and code review.

--

--