Understanding complexities in Blockchain Sharding, and TomoChain’s direction for implementation
In simple terms, sharding is about parallelizing transaction processing. Generally speaking, it is done by partitioning the whole blockchain network into sub-networks, each of which is in charge of processing a subset of transactions. As an illustration, sharding is similar to a country with multiple banks where each bank is responsible for a subset of user transactions (internal transactions), instead of a single central bank that verifies the transactions of all users.
As a mandatory requirement, a blockchain sharding solution must provide the same functionality as a non-sharded blockchain: an account (i.e. a bank user) should be able to transact with any other account in the network. In a decentralized application infrastructure, e.g. Ethereum, Zilliqa, or TomoChain, this statement can be broken down into the following specific requirements:
- Any account should be able to transact with any other account, regardless of the respective shards these two accounts are within.
- Any account should be able to transact with any other smart contract/decentralized application (DApp) regardless of the respective shards the account and the contract/DApp are within (even if transaction execution might fail because of permission control in smart contracts).
- Any smart contract should be able to call functions of any other smart contract regardless of the respective shards the two contracts are within.
Besides these functional requirements, security is mandatory for any sharding solution. It is unacceptable for the system to lose user funds or to allow double spending.
Designing and implementing such a sharding solution is a daunting task that requires sophisticated and well-verified validation mechanisms to coordinate all nodes in the network in a decentralized way. State sharding and network sharding are the two mainstream sharding directions.
- Network sharding requires a node in the network to store the full chain state.
- State sharding requires a node to store only a portion of the chain state, which results in more lightweight chain data size.
Both struggle with the same very difficult problem: handling cross-shard transactions in a way that gives users a seamless experience.
The state-of-the-art of sharding implementation in Ethereum and Zilliqa
Zilliqa has been following network sharding, which is capable of linearly scaling the transaction throughput. In Zilliqa, there is only one chain for transaction data, which is stored in all shard nodes. Cross-shard transfer is easily done as follows:
- Account A in shard SA sends 10 Zils to account B in shard SB
- A’s balance is debited and B’s balance is credited in SA, and the transaction is broadcast to all other nodes
- All other nodes then update the balances of A and B
Although having the same state on all nodes makes cross-shard transfers easy, Zilliqa's sharding seems to struggle with cross-shard smart contract calls, which can result in conflicting states in different shards. Let’s illustrate with an example:
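The three steps above can be sketched in a few lines of Python. This is only an illustrative model, not Zilliqa's actual implementation: the `ShardNode` class, the `cross_shard_transfer` helper, and the starting balances are all hypothetical. The point it shows is that when every node holds the full balance table, a cross-shard transfer is just the same update replayed everywhere.

```python
from dataclasses import dataclass, field


@dataclass
class ShardNode:
    """Hypothetical node that keeps the full balance table (network sharding)."""
    name: str
    balances: dict = field(default_factory=dict)

    def apply_transfer(self, sender, receiver, amount):
        # Assumed rule: reject overdrafts so funds are never created.
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount


def cross_shard_transfer(origin, others, sender, receiver, amount):
    # The sender's shard processes the transfer first.
    origin.apply_transfer(sender, receiver, amount)
    # The transaction is then broadcast and every other node replays it.
    for node in others:
        node.apply_transfer(sender, receiver, amount)


genesis = {"A": 100, "B": 5}  # made-up starting balances
sa = ShardNode("SA", dict(genesis))
sb = ShardNode("SB", dict(genesis))
cross_shard_transfer(sa, [sb], "A", "B", 10)
print(sa.balances, sb.balances)
```

Because a plain balance transfer of this kind gives the same result on every node that replays it, all copies of the state stay identical.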
- Account A in shard SA and account B in shard SB call the same function in smart contract S in shard SS
- Transactions from A and B are executed concurrently in shards SA and SB, and S’s state is updated in SA and SB, respectively.
- SA receives the transaction from B and overwrites S’s state in SA
- SB receives the transaction from A and overwrites S’s state in SB
Because SA and SB execute the two transactions in different orders, S ends up with different states in SA and SB. Solving this problem is a non-trivial task in network sharding. Indeed, Zilliqa proposes a special shard for processing all smart contract calls. This solution causes a load balancing problem: the smart contract processing shard has to accept much more load than the other shards, which in turn causes network congestion if a large number of DApps are deployed onto the chain.
State sharding, as in Ethereum, seems to mitigate the smart contract congestion problem because the smart contract load is probabilistically distributed equally across all shards. State sharding also reduces the chain data size required by a node, as it only needs to store a portion of the chain state. Unfortunately, because different shards store different states, cross-shard smart contract calls and data availability become extremely difficult to handle.
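The ordering conflict described above can be reproduced with a toy contract. The sketch below assumes a hypothetical contract function `set_value` that simply overwrites a stored value; the transaction tuples and the `run` helper are likewise made up for illustration.

```python
def set_value(state, v):
    # Hypothetical contract function: overwrite the stored value.
    return {**state, "value": v}


tx_a = ("A", 1)  # account A calls set_value(1)
tx_b = ("B", 2)  # account B calls set_value(2)


def run(order):
    # Replay the transactions in the given order on a fresh copy of S's state.
    state = {"value": 0}
    for _sender, v in order:
        state = set_value(state, v)
    return state


state_in_sa = run([tx_a, tx_b])  # SA executes A's call, then receives B's
state_in_sb = run([tx_b, tx_a])  # SB executes B's call, then receives A's
print(state_in_sa, state_in_sb)
```

Any contract function whose effect depends on execution order behaves this way, which is why the shards need an agreed total order for calls to the same contract.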
Let’s look at an illustration before explaining why cross-shard transactions and data availability are very hard problems in sharding. The following figure shows an example of a cross-shard transaction scheme that deals with the transaction fee issue. The scheme works fine in the normal case, in which the transaction is executed successfully in Shard2 and the transaction fee is enough for processing.
Nevertheless, the complexity lies in the abnormal cases where the transaction fee is not enough. In this case, the gas refund transaction will not be executed, and the user loses funds: the amount has already been subtracted in Shard1, but the smart contract execution fails in Shard2.
Also, if the transaction execution in Shard2 is reverted, how should we revert the transaction execution in Shard1? A solution such as sending a proof that the transaction was reverted in Shard2 might work. However, if the transaction runs out of gas, no revert transaction would be processed in Shard1.
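The failure modes discussed above can be captured in a toy model. The sketch below is not the scheme from the figure; it assumes a simplified flow in which Shard1 debits the value and the gas budget up front, Shard2 spends `exec_cost` gas on execution, and sending a revert proof back to Shard1 costs `refund_cost` gas. All names and numbers are hypothetical.

```python
def cross_shard_call(balance, value, gas_budget, exec_cost, refund_cost,
                     reverts=False):
    """Return (sender balance in Shard1, outcome) for one cross-shard call."""
    # Shard1: debit the value and the whole gas budget up front.
    balance -= value + gas_budget

    # Shard2: execution consumes gas; running short means out of gas.
    if gas_budget < exec_cost:
        # No gas is left to pay for any message back to Shard1.
        return balance, "out of gas: value lost"

    gas_left = gas_budget - exec_cost
    if not reverts:
        return balance, "executed"
    if gas_left >= refund_cost:
        # A proof of the revert travels back and Shard1 restores the value.
        return balance + value, "reverted and refunded"
    return balance, "reverted: value lost"


normal = cross_shard_call(100, 10, gas_budget=5, exec_cost=3, refund_cost=1)
reverted = cross_shard_call(100, 10, gas_budget=5, exec_cost=3, refund_cost=1,
                            reverts=True)
out_of_gas = cross_shard_call(100, 10, gas_budget=2, exec_cost=3, refund_cost=1)
print(normal, reverted, out_of_gas)
```

In this model the user only gets the value back when enough gas survives to pay for the return message, which is exactly the case that an out-of-gas failure rules out.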
This problem becomes very complicated if there is a chain of dependent smart contract calls. Imagine that the transaction execution in Shard2 would call another smart contract in Shard3, which in turn calls another smart contract in Shard4…
To the best of my knowledge, there is no existing solution for this problem yet. There are workarounds, as in Quarkchain and TomoChain, which deploy all dependent smart contracts in a chain of smart contract calls on the same shard to avoid any cross-shard smart contract calls. However, this approach might bring the system to a state where many smart contracts are deployed on the same shard because they all depend on the same library.
Ethereum proposes load balancing in its sharding roadmap. Basically, load balancing here means redistributing load from overloaded shards to underutilized shards. Unfortunately, Ethereum is still struggling with the implementation of its first phase of sharding and Proof-of-Stake. What’s worse, it is very difficult to move a smart contract from one shard to another. This problem has no solution thus far.
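To see how a shared library concentrates contracts on one shard, here is a minimal sketch of dependency-based co-location. It is an assumption-laden illustration, not the placement logic of Quarkchain or TomoChain: the `colocate` function, the round-robin assignment of groups, and the contract names are all hypothetical.

```python
def colocate(contracts, calls, num_shards):
    """Place every group of mutually dependent contracts on one shard."""
    parent = {c: c for c in contracts}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    # Contracts linked by a call must end up in the same group.
    for caller, callee in calls:
        parent[find(caller)] = find(callee)

    groups = {}
    for c in contracts:
        groups.setdefault(find(c), []).append(c)

    # Assumed placement rule: hand out shards to groups round-robin.
    placement = {}
    for i, members in enumerate(groups.values()):
        for c in members:
            placement[c] = i % num_shards
    return placement


contracts = ["dex", "game", "wallet", "safe_math", "oracle"]
calls = [("dex", "safe_math"), ("game", "safe_math"), ("wallet", "safe_math")]
placement = colocate(contracts, calls, num_shards=4)
print(placement)
```

Three otherwise unrelated applications land on the same shard only because they call the same library, while the remaining shards stay mostly empty.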
The second problem for state sharding is data availability during reshuffling (shard transition). In Ethereum, TomoChain, and Zilliqa, reshuffling happens periodically, once per period called an epoch. At the end of each epoch, nodes are reshuffled from one shard to another so that the system stays resilient even if colluding nodes try to take over a shard. Simply stated, a node that is moved must fetch the state of its target shard, which might take longer than the epoch itself if the shard state is large, leaving the node unable to do useful work. This problem can be solved by requiring all nodes to store the state of all shards, but that makes running a node far more expensive. If there are 1000 shards, every node would have to store the data of all 1000 shards plus the chain data of the root/main chain (the root-shard chain architecture seems to be the standard architecture for sharding in blockchains with smart contracts). The more it costs to run a node, the less decentralized the system becomes, because only a few operators can afford the running cost.
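A back-of-the-envelope calculation makes the timing problem concrete. The state sizes and bandwidth below are made-up figures chosen only for illustration, and the 30-minute epoch matches the approximate TomoChain epoch time; the helper functions are hypothetical and ignore verification time and protocol overhead.

```python
def sync_time_minutes(state_gb, bandwidth_mbps):
    """Time to download a shard state, ignoring verification and overhead."""
    bits = state_gb * 8 * 1000**3            # decimal gigabytes to bits
    seconds = bits / (bandwidth_mbps * 1000**2)
    return seconds / 60


def can_join_in_time(state_gb, bandwidth_mbps, epoch_minutes):
    # A reshuffled node is only useful if it syncs before the epoch ends.
    return sync_time_minutes(state_gb, bandwidth_mbps) < epoch_minutes


small = sync_time_minutes(10, 100)   # assumed 10 GB state, 100 Mbit/s link
large = sync_time_minutes(50, 100)   # assumed 50 GB state, same link
print(round(small, 1), round(large, 1))
print(can_join_in_time(10, 100, 30), can_join_in_time(50, 100, 30))
```

Under these assumptions a 10 GB shard state fits comfortably within a 30-minute epoch, while a 50 GB state does not, so the node would be reshuffled again before it ever finished syncing.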
TomoChain’s sharding implementation status
TomoChain has been following the state sharding direction with some customizations (please refer here for TomoChain’s sharding). An initial implementation has been developed internally. However, since the epoch time in TomoChain is quite short (around 30 minutes), this may create issues once sharding is enabled on Mainnet. We have also been actively re-investigating the transaction fee problem in cross-shard transactions, as described previously, in order to solve it rigorously.
On the other hand, we are currently focusing on growing the ecosystem around the TomoChain blockchain by building enterprise-oriented applications, the decentralized cryptocurrency exchange protocol TomoX, and blockchain games on TomoChain. We believe these are the core applications and the right path towards the mass adoption of blockchain in daily activities. It is meaningless to build a highway if only a small group of people use it.
Because of the complexity of sharding and our focus on ecosystem growth, TomoChain has decided to postpone the implementation and deployment of sharding on Mainnet. We will continue our deep research on sharding and focus on growing the utilization of the TomoChain blockchain ecosystem. Once user transaction volumes reach the current capacity of TomoChain, we will enable sharding on TomoChain Mainnet.