How to Scale Ethereum: Sharding Explained
UPDATED to reflect the latest research changes for Ethereum 2.0!
The scalability debate is front and center in the crypto community. With major incidents such as the CryptoKitties debacle clogging up the entire Ethereum network over the span of a few days, it is well known that the biggest public blockchains do not scale in their current state.
So what are the approaches the community has decided to take? The solution is two-fold. The first approach is to improve scaling through off-chain solutions, also known as layer-2 scaling, where some transactions are handled off the blockchain and only interact with it sparingly. The other approach is to modify the design of the protocol altogether to fix the fundamental problems with parallelizability the blockchain faces. Unfortunately, many of us protocol devs often look at these problems and instantly feel put off by the immense difficulty they pose.
Although we’re still in the early stages of Ethereum, the community is filled with some of the smartest minds in tech, with so many innovations happening at breakneck speed. It’s easy to feel that there are smarter devs out there that are probably way more qualified to tackle monumental problems such as scalability, but this feeling is what’s holding us back. Truth is, the community is willing and ready to help anyone who wants to get involved, and yes that includes YOU! This post will break down the current approach the Ethereum core team is taking towards sharding and expose its current limitations and paths for improvement. By the end of this post, you’ll know enough to explore this problem on your own and who knows, maybe you’ll be the one to build the first sharding client!
As the number of transactions on Ethereum keeps going up and up, we have no time to lose. Let’s get started.
What is Sharding?
Currently, every single node running the Ethereum network has to process every single transaction that goes through the network. This gives the blockchain a high degree of security because of how much validation goes into each block, but it also means that the entire blockchain is only as fast as its individual nodes rather than the sum of their parts. Transactions on the EVM are not parallelizable today, and every transaction is executed in sequence globally. The scalability problem then comes down to the idea that a blockchain can have at most 2 of these 3 properties: decentralization, security, and scalability.
If we have scalability and security, it means our blockchain is centralized, which is what allows it to have higher throughput. Right now, Ethereum is decentralized and secure.
How can we break this trilemma to include scalability in the current model? Well, can’t we just increase the block size, or in Ethereum’s case, the GAS_LIMIT, to increase throughput? While in theory this can work, the more we keep increasing it, the more mining will centralize around nodes running on supercomputers, raising the barrier to entry into the system.
A smarter approach is the idea of blockchain sharding, where we split the entire state of the network into a bunch of partitions called shards that contain their own independent piece of state and transaction history. In this system, certain nodes would process transactions only for certain shards, allowing the throughput of transactions processed in total across all shards to be much higher than having a single shard do all the work as the mainchain does now.
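To make the partitioning idea concrete, here is a minimal sketch of splitting work across shards. The shard count, the hash-based assignment rule, and the names `shard_of` and `partition` are all assumptions made for illustration; the post does not specify how accounts would actually be mapped to shards.

```python
import hashlib
from collections import defaultdict

SHARD_COUNT = 4  # hypothetical value; the post does not fix a number of shards


def shard_of(address: str, shard_count: int = SHARD_COUNT) -> int:
    """Map an address to a shard by hashing it (an illustrative rule only)."""
    digest = hashlib.sha256(address.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def partition(transactions, shard_count: int = SHARD_COUNT):
    """Group transactions by the shard that owns the sender's account."""
    by_shard = defaultdict(list)
    for tx in transactions:
        by_shard[shard_of(tx["sender"], shard_count)].append(tx)
    return dict(by_shard)


# Each group could now be processed by a different set of nodes in parallel.
txs = [{"sender": f"0x{i:040x}", "value": i} for i in range(100)]
groups = partition(txs)
```

Every node that applies the same rule ends up with the same grouping, which is what lets different nodes take responsibility for different shards.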
Before we dive into how a sharded blockchain actually works, let’s go over some important vocabulary:
- State: the entire set of information that describes a system at any point in time. In Ethereum, this is the current account set containing current balances, smart contract code, and nonces at a given moment. Each transaction alters this state into an entirely new state.
- Transaction: an operation issued by a user that changes the state of a system
- Merkle Tree: a data structure that can store a large amount of data via cryptographic hashes. Merkle trees make it easy to check whether a piece of data is part of the structure with very little time and computational effort.
- Receipt: a side-effect of a transaction that is not stored in the state of the system, but is kept in a Merkle tree so that its existence can be easily verified by a node. Smart contract logs in Ethereum are kept as receipts in Merkle trees, for example.
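To see why Merkle trees make membership checks cheap, here is a simplified sketch of a binary Merkle tree with inclusion proofs. It is an illustration under assumptions: Ethereum itself uses Merkle Patricia tries with Keccak-256, while this example uses a plain binary tree with SHA-256, and the function names are hypothetical. It expects at least one leaf.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 here purely for illustration)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Return the root hash of a simple binary Merkle tree."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves, index):
    """Collect the sibling hashes needed to prove leaves[index] is in the tree."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1  # the neighbour that shares a parent with index
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the path from a leaf up to the root using only the proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

The verifier only needs the leaf, the root, and one hash per tree level, so the proof grows logarithmically with the number of receipts rather than linearly.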
With this in mind, let’s take a look at how Ethereum 2.0 would work. We will create a sidechain known as a random beacon chain that stores hashes of main chain blocks in its own blocks. This sidechain will be a full Proof of Stake system implementing Casper FFG and will provide a source of distributed randomness that will allow us to build a sharding system on top of it.
The problems with sharded blockchains become more apparent once we consider the possible attacks on the network. A major problem is the Single-Shard Takeover Attack, where an attacker takes over the majority of collators in a single shard to create a malicious shard that can submit invalid collations. How do we solve this problem?
The Ethereum Wiki’s Sharding FAQ suggests random sampling of validators on each shard. The goal is that validators will not know in advance which shard they will be assigned to. Every shard will get assigned a set of validators, and the ones that will actually be validating will be randomly sampled from that set.
To begin, we will deploy a contract on the main chain called the Validator Registration Contract, where people will burn 32 ETH in exchange for becoming a validator in this sidechain. The beacon chain will periodically check for registered validators and queue up those that have burned ETH into the contract. This beacon chain will serve as a coordination device for a sharding system, as it will allow for distributed pseudorandomness that will be critical for selecting committees of validators on shards. The source of randomness needs to be common to ensure that this sampling is entirely compulsory and can’t be gamed by the validators in question.
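A rough sketch of how a common random seed could drive committee selection is shown below. Everything here is an assumption for illustration: the function name `sample_committee`, the way the seed is mixed with the shard ID, and the use of Python's `random` module, which is not cryptographically secure and only stands in for the beacon chain's randomness.

```python
import hashlib
import random


def sample_committee(validators, shard_id: int, seed: bytes, committee_size: int):
    """Pick a committee for one shard from a shared seed.

    Every node that knows the same seed derives the same committee, and
    nobody can predict the result before the seed is revealed.
    """
    material = seed + shard_id.to_bytes(8, "big")
    # Derive a per-shard integer seed from the shared randomness.
    shard_seed = int.from_bytes(hashlib.sha256(material).digest(), "big")
    rng = random.Random(shard_seed)  # illustrative PRNG, not production-grade
    return rng.sample(validators, committee_size)


validators = [f"validator-{i}" for i in range(100)]
committee = sample_committee(validators, shard_id=10, seed=b"beacon-output", committee_size=5)
```

Because the selection depends only on the shared seed and the shard ID, a validator cannot choose which shard to land on, which is the property that makes a Single-Shard Takeover Attack harder.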
On each shard, we would have nodes called proposers that would be tasked with creating a cross-link on the beacon chain, which is a specific structure that encompasses important information about the shard in question.
These cross-links are like mini-descriptions of the state and the transactions on a certain shard. A typical cross-link would tell us the following information:
- Information about what shard the collation corresponds to (let’s say shard 10)
- Information about the current state of the shard before all transactions are applied
- Information about what the state of the shard will be after all transactions are applied
- Signatures from at least 2/3 of all collators on the shard affirming shard blocks were legitimate
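The fields listed above can be sketched as a small data structure with a 2/3 acceptance check. This is a hypothetical model, not the spec's actual format: the names `CrossLink` and `is_accepted` are made up for this example, and real signature verification is replaced by a set of validator IDs that are assumed to have signed.

```python
from dataclasses import dataclass, field


@dataclass
class CrossLink:
    shard_id: int            # which shard this cross-link describes
    pre_state_root: bytes    # shard state before the transactions are applied
    post_state_root: bytes   # shard state after the transactions are applied
    signers: set = field(default_factory=set)  # stand-in for verified signatures


def is_accepted(link: CrossLink, committee: set) -> bool:
    """Accept the cross-link only if at least 2/3 of the committee signed it."""
    valid = link.signers & committee  # ignore signatures from outsiders
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return len(committee) > 0 and 3 * len(valid) >= 2 * len(committee)


committee = {1, 2, 3, 4, 5, 6}
link = CrossLink(shard_id=10, pre_state_root=b"old", post_state_root=b"new",
                 signers={1, 2, 3, 4})
accepted = is_accepted(link, committee)  # 4 of 6 signers meets the 2/3 threshold
```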
What about if a transaction happens across shards? For example, what if I send money from an address that is in shard 1 to an address in shard 10? One of the most important parts of this system is the ability to communicate across shards, otherwise we have accomplished nothing new.
An initial idea is to use the concept of receipts to make this work.
Raul (Address on Shard 1) Wants to Send 100 ETH to Jim (Address on Shard 10)
- A transaction is sent to Shard 1 that reduces Raul’s balance by 100 ETH and the system waits for the transaction to finalize
- A receipt is then created for the transaction. It is not stored in the state but in a Merkle tree, so its existence can be easily verified against the Merkle root
- A transaction is sent to Shard 10 including the Merkle receipt as data. Shard 10 checks if this receipt has not been spent yet
- Shard 10 processes the transaction and increases the balance of Jim by 100 ETH. It then also saves the fact that the receipt from Shard 1 has been spent
- Shard 10 creates a new receipt that can then be used in subsequent transactions
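The receipt flow above can be simulated in a few lines. This is a toy model under stated assumptions: the `Shard` class and its methods are hypothetical, the destination shard looks up the receipt directly in the source shard where a real system would check a Merkle proof, and neither the wait for finality nor the follow-up receipt created by Shard 10 is modeled.

```python
class Shard:
    """A toy shard holding balances, issued receipts, and spent receipts."""

    def __init__(self, shard_id, balances):
        self.shard_id = shard_id
        self.balances = dict(balances)
        self.receipts = set()  # receipts issued by this shard
        self.spent = set()     # receipts from other shards already redeemed here

    def send(self, sender, dest_shard, recipient, amount):
        """Debit the sender and issue a receipt for the destination shard."""
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        nonce = len(self.receipts)  # keeps otherwise identical receipts distinct
        receipt = (self.shard_id, dest_shard, recipient, amount, nonce)
        self.receipts.add(receipt)
        return receipt

    def redeem(self, receipt, source):
        """Credit the recipient if the receipt is genuine and unspent."""
        src_id, dest_shard, recipient, amount, _ = receipt
        if dest_shard != self.shard_id or src_id != source.shard_id:
            raise ValueError("receipt is not addressed to this shard")
        if receipt not in source.receipts:  # stand-in for a Merkle proof check
            raise ValueError("unknown receipt")
        if receipt in self.spent:
            raise ValueError("receipt already spent")
        self.balances[recipient] = self.balances.get(recipient, 0) + amount
        self.spent.add(receipt)  # remember it so it cannot be replayed


shard1 = Shard(1, {"raul": 150})
shard10 = Shard(10, {"jim": 0})
receipt = shard1.send("raul", dest_shard=10, recipient="jim", amount=100)
shard10.redeem(receipt, source=shard1)
```

Recording spent receipts on the destination shard is what prevents the same receipt from being redeemed twice.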
This Sounds So Complex for Solidity Devs and Ethereum Users to Understand! How Will We Educate Them on Sharding?
They won’t need to understand it. Sharding will exist exclusively at the protocol layer and will not be exposed to developers. The Ethereum state system will continue to look as it currently does, but the protocol will have a built-in system that creates shards, balances state across shards, gets rid of shards that are too small, and more. This will all be done behind the scenes, allowing devs to continue their current workflow on Ethereum.
Beyond Scaling: Super-Quadratic Sharding and Incredible Speed Gains
To go above and beyond, it is possible that Ethereum will adopt a super-quadratic sharding scheme (which in simple English means a system built from shards of shards). Such complexity is hard to imagine at this point but the potential for scalability would be massive. Additionally, a super-quadratically-sharded blockchain will offer tremendous benefits to users, decreasing transaction fees to negligible quantities and serving as a more general purpose infrastructure for a variety of new applications.
Resources and Where to Get Started
OK, so now you want to start coding a sharded blockchain! How do you begin? At the most basic level, the proposed initial implementation will not work through a hard fork, but rather through a sidechain known as a random beacon chain that will serve as a proof of stake + sharding system.
This beacon chain will manage validators and their sampling from a global validator set and will take responsibility for the global reconciliation of all shard states. Vitalik has outlined a fantastic reference doc for implementing sharding here: https://ethresear.ch/t/convenience-link-to-full-casper-chain-v2-spec/2332/4
To explore this beacon chain architecture in detail and to learn more about how the system works, check out the following resources:
- Sharding FAQ: https://github.com/ethereum/wiki/wiki/Sharding-FAQ
- Sharding in Go: https://github.com/prysmaticlabs/geth-sharding
- Beacon Chain Research Synopsis: https://docs.google.com/document/d/19KyosgCFdsv_UzVdaApmIr0zh37Va0Tck1cca5uEtss/edit#
Wanna Join My Team?
Are you familiar with the inner workings of the Ethereum protocol? Are you a golang developer? Do you want to work with me and a team of developers building the first sharding client for Ethereum 2.0? Check out Prysmatic Labs, a team funded by the Ethereum Foundation to implement sharding.
Check out our contributing guidelines and our open projects on GitHub. Each task and issue is grouped into the Phase 1 milestone along with the specific project it belongs to (beacon chain, validator node tasks, etc.).