This post aims to summarize the research efforts around phase 2 of Ethereum 2.0, with specific focus on the relay network and fee market challenge. A number of proposals, each with varying trade-offs, have been made across various platforms, so this should serve as a reasonably comprehensive compilation to allow new researchers to get up to speed.
Before diving into relay networks, let’s first take a detour to discuss one of the primary bottlenecks (if not the primary bottleneck) on Ethereum since its inception: state.
Ethereum’s state is the set of account balances, contract code, and contract storage. (In Bitcoin, state is the UTXO set.) Whenever transactions are executed, the state (a ~45 GB data set in uncompressed form) must be both read from and written to. While there were earlier concerns around the fundamentally unscalable nature of Ethereum’s computational model, it turns out that these state accesses are by far the most expensive part of transaction execution, and indeed disk I/O is the limiting factor for running an Ethereum full node.
Note that, in order to guarantee being able to execute an arbitrary transaction in this stateful paradigm, a full node needs to keep the entire state in an accessible location (such as in-memory, or on swap space). While this is borderline acceptable in the single-blockchain case, it would not work with sharding with shuffled committees. If each time a validator was assigned to a new shard, they had to sync the entire state of that shard, the system would be equivalent to just having a single chain with blocks of size shard_block_size * shard_count. Enter: stateless clients.
Sharding and the Stateless Client
The stateless client paradigm (originally called “storageless”) vastly reduces the burden of executing transactions by enabling validators to only store a compressed representation of a chain’s state, commonly referred to as an accumulator. Typically accumulators are constant-sized (e.g., a Merkle root of the state), though they may also be logarithmically-sized.
The essence of the stateless client model is that each transaction includes a witness against the current accumulator that contains all the required information to execute that transaction. In the case of Ethereum, with a Merkle accumulator like a Sparse Merkle Tree (SMT), this would be Merkle branches to the state elements touched by the transaction during its execution.
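As a minimal sketch of that idea, assuming a plain binary Merkle tree over SHA-256 rather than a Sparse Merkle Tree, a validator holding only the root can check a state element against the branch shipped with a transaction. The helper names (`build_tree`, `make_branch`, `verify_branch`) and the toy account encoding are illustrative and not taken from any client.

```python
import hashlib


def h(left: bytes, right: bytes) -> bytes:
    """Hash two child nodes into their parent."""
    return hashlib.sha256(left + right).digest()


def build_tree(leaves):
    """Return all levels, leaves first and root last (leaf count must be a power of two)."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_branch(levels, index):
    """Collect the sibling hash at every level on the path from a leaf to the root."""
    branch = []
    for level in levels[:-1]:
        branch.append(level[index ^ 1])
        index //= 2
    return branch


def verify_branch(root, leaf, index, branch):
    """Recompute the root from a leaf and its branch; this is all a stateless validator needs."""
    node = leaf
    for sibling in branch:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root


# Toy state of four accounts (hypothetical encoding).
leaves = [hashlib.sha256(f"account{i}:balance={100 * i}".encode()).digest() for i in range(4)]
levels = build_tree(leaves)
state_root = levels[-1][0]        # the only thing the validator stores
witness = make_branch(levels, 2)  # shipped alongside the transaction
assert verify_branch(state_root, leaves[2], 2, witness)
```

The witness grows with the logarithm of the state size, which is why the validator's storage can stay constant while transactions carry the proof data.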
Among other applications, this process can be used to greatly speed up initial sync times of Ethereum nodes, as seen with Beam Sync.
Side note: cryptographic accumulators, such as RSA accumulators, have better properties than Merkle accumulators, but rely on either a trusted setup or cutting-edge, un-audited cryptography. As such, it will take a while for them to be safely usable in production systems.
There is one challenge with stateless clients however: when using Merkle accumulators, witnesses become outdated after each atomic bundle of execution. If the execution model is that each transaction includes a witness and that transactions are executed in order, then the original witness of each subsequent transaction would become outdated and need to be updated. Fortunately, updating witnesses for Merkle accumulators involves zero hashing overhead. If the execution model is that witnesses are attached to a “package” of transactions, then individual witnesses don’t need to be updated, but rather, merged into a multiproof.
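The staleness problem, and why refreshing a Merkle witness needs no extra hashing, can be sketched as follows. This assumes a four-leaf binary SHA-256 tree and two transactions executed in order; the function names and the `(depth, index)` keying are hypothetical. The nodes recomputed while executing the first transaction are simply substituted into the second transaction's branch.

```python
import hashlib


def h(a: bytes, b: bytes) -> bytes:
    return hashlib.sha256(a + b).digest()


def root_from_branch(leaf, index, branch):
    """Recompute the root implied by a leaf and its branch."""
    node = leaf
    for sibling in branch:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node


def path_nodes(leaf, index, branch):
    """Nodes recomputed when a leaf is written, keyed by (depth, index), plus the new root."""
    nodes, node = {}, leaf
    for depth, sibling in enumerate(branch):
        nodes[(depth, index)] = node
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return nodes, node


def refresh_branch(index, branch, updated_nodes):
    """Swap in siblings that an earlier transaction already recomputed (no hashing here)."""
    fresh = []
    for depth, sibling in enumerate(branch):
        fresh.append(updated_nodes.get((depth, index ^ 1), sibling))
        index //= 2
    return fresh


leaves = [hashlib.sha256(bytes([i])).digest() for i in range(4)]
h01, h23 = h(leaves[0], leaves[1]), h(leaves[2], leaves[3])
old_root = h(h01, h23)
witness_tx1 = [leaves[1], h23]  # branch for leaf 0
witness_tx2 = [leaves[2], h01]  # branch for leaf 3

# tx1 writes leaf 0; executing it yields the new intermediate nodes as a by-product.
new_leaf0 = hashlib.sha256(b"updated").digest()
updated_nodes, new_root = path_nodes(new_leaf0, 0, witness_tx1)

# tx2's original witness still points at the old root, so it is now outdated.
assert root_from_branch(leaves[3], 3, witness_tx2) == old_root
fresh = refresh_branch(3, witness_tx2, updated_nodes)
assert root_from_branch(leaves[3], 3, fresh) == new_root
```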
If we assume that all users run full nodes, and can keep their witnesses to state elements up-to-date, then the system mostly works out-of-the-box. However, in a sharded context this may be an unreasonable assumption, as this would require users to fully validate each shard they have state in; as mentioned above, this is exactly what we don’t want, as it would be equivalent to just having a single chain with a larger block size.
To alleviate this issue, relayers (also referred to as state providers) are introduced. Relayers are responsible for providing witnesses to users, and are paid for this service. Unlike users, they can limit their activities to a single shard trivially. As it would be unreasonable for relayers to know in advance what state elements they should keep witnesses for up-to-date, they must instead store state (hence, state provider), so as to be able to provide witnesses on the fly.
Introducing relayers, however, also introduces a whole host of other complex challenges, and remains one of the more significant open research questions for a full deployment of Eth 2.0. First, if not balanced properly, block proposers (i.e., validators) will be incentivized to be relayers, with associated costs potentially being in large part inaccessible to the average user. Additionally, a way of collecting appropriate fee payments from both users and relayers (i.e., the fee market) without also introducing significant overhead isn’t immediately apparent. In other words, while validating stateless transactions is easy, producing the necessary witnesses is non-trivial.
Relay Network and Fee Market Proposals
The phase 2 design space is large and has seen a number of different proposals for handling execution. Each proposal also brings with it a potentially different way of paying fees, along with a different mechanism for transactions to be relayed from user to block proposer. This section will attempt to provide a concise but nonetheless comprehensive summary of the different proposals and analyze the behavior of the relay network and fee market for each.
Early proposals that could lead to phase 2 saw the separation of shard chain block production into three distinct sets of actors: proposers, collators, and executors. Collators would be responsible for collecting transactions into blocks (collations), proposers would be responsible for proposing that a particular collation extend the canonical [shard] chain, and finally executors would commit to a new state root based on the proposed collation.
This proposal was never fully fleshed out; as such, research into fee payment mechanisms was not conducted. It was retired as, among other things, the incentives of the system encouraged actors to be proposers, collators, and executors at the same time in order to maximize their rewards.
Phase 1 and Done
The turn of the year 2018 saw the rise of using scalable blockchains as data availability layers, with a base layer providing high-throughput global data availability for second-layer execution engines, such as zk-rollups (and later optimistic rollups). Unfortunately, initial suggestions that Phase 1 (sharding, with data blobs sans any execution) would be useful as a data availability layer for, e.g., Eth 1.x while waiting for Phase 2 to be deployed turned out to fall short; a separate execution layer can’t actually access the data (giving Eth 1.x oracle access to all shard data would be equivalent to it having very large data-only extension blocks), and bridging it over would be no different than simply including it in the execution layer in the first place!
This was highlighted by Casey Detrio in his seminal post Phase One and Done: eth2 as a data availability engine. The post notes that shards in phase 1 can be used for bridged data availability without state sharding, by adding only minimal execution sharding. In this scheme, shards can 1) authenticate data (i.e., insert a list of transactions into an authenticated data structure such as a Merkle tree) and 2) perform simple stateless execution, such as verifying a zero-knowledge proof against authenticated data.
Defining the programs that would be used to process the data blobs is done by uploading contracts (which would later become “precompiles” and then “execution environments”) to the beacon chain. Contracts would presumably be client-interpretable bytecode, the exact choice of which is orthogonal to the whole of the proposal.
This “phase one and done” mostly sidestepped the fee payment challenge, as it was a challenge with previous proposals and remained a challenge for subsequent proposals, instead choosing to focus on laying out the minimal changes needed to make Eth 2.0 sustainably useful in as short a timeframe as possible. There was one suggestion in passing to pay fees using a contract on Eth 1.x (i.e., have the rollup contract pay shard block proposers on Eth 1.x for including a data blob on Eth 2.0 with bridged processed data availability). While certainly doable, this would inextricably tie Eth 2.0’s usability to Eth 1.x’s survival in perpetuity.
Phase 2 Proposal 1
Following Detrio’s “phase one and done” proposal, Vitalik Buterin revealed a new concrete proposal for phase 2 in his phase 2 proposal 1. In this proposal, execution scripts (initially called “beacon chain contracts”) live on the beacon chain. Ether never leaves the beacon chain, instead being deposited into execution scripts (in order to pay fees). Each shard has its own completely independent state and execution.
Note that contrary to what their name would suggest, execution scripts define the virtual machine with which to execute transactions, and are not smart contracts as we’ve seen deployed on Ethereum to date. The data model and opcodes must be fully defined by the execution script in some client-interpretable, metered, code (e.g., ewasm).
For the purposes of fee payments, each execution script is essentially a layer-2 system, with its own internal fee-paying mechanisms that shard validators don’t “understand.” This obliviousness is Actually a Feature™, as requiring shard validators to be able to parse and account for a different fee mechanism (and potentially a different currency) would lead to enormous implementation and economic complexity along with exploitation vectors, to the point of being unusable. In large part, the same arguments against economic abstraction hold against execution script-specific fees.
As validators can’t collect fees from within an execution script, they need to be paid externally to it. This is accomplished with a special (i.e., enshrined) execution script, in which anyone may post a transaction to the effect of “if this data blob for this execution script on this shard at this slot is included, I’ll send the includer a payment.” These actors are relayers (originally called “operators”), and are responsible for collecting user-transactions for the non-enshrined execution scripts, each paying them an execution-script-specific fee, and paying shard block proposers (producers) for including this blob through the enshrined execution script.
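The conditional payment described above could be modelled roughly as below. This is only a sketch under assumed names (`InclusionOffer`, `FeeScript`, `post_offer`, `settle`); the proposal itself does not specify data structures, and how the script learns that a blob was included is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InclusionOffer:
    """'If this blob for this script on this shard at this slot is included, pay the includer.'"""
    relayer: str
    script_id: int
    shard: int
    slot: int
    blob_hash: bytes
    fee: int


class FeeScript:
    """Toy stand-in for the enshrined execution script (hypothetical API)."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.offers = []

    def post_offer(self, offer):
        if self.balances.get(offer.relayer, 0) < offer.fee:
            raise ValueError("relayer cannot cover the offered fee")
        self.offers.append(offer)

    def settle(self, shard, slot, script_id, blob_hash, proposer):
        """Pay the proposer for every offer matching the blob that was actually included."""
        paid, remaining = 0, []
        for o in self.offers:
            matches = (o.shard, o.slot, o.script_id, o.blob_hash) == (shard, slot, script_id, blob_hash)
            if matches and self.balances[o.relayer] >= o.fee:
                self.balances[o.relayer] -= o.fee
                self.balances[proposer] = self.balances.get(proposer, 0) + o.fee
                paid += o.fee
            else:
                remaining.append(o)
        self.offers = remaining
        return paid


script = FeeScript({"relayer": 10})
script.post_offer(InclusionOffer("relayer", script_id=1, shard=3, slot=42, blob_hash=b"blob", fee=4))
script.settle(shard=3, slot=42, script_id=1, blob_hash=b"blob", proposer="proposer")
print(script.balances)
```

Note that the validator never inspects the user-level fees inside the blob; it only needs to understand this one script.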
An excellent summary of phase 2 proposals up to this point was compiled by William Villanueva in his post: A Journey Through Phase 2 of Ethereum 2.0.
Phase 2 Proposal 2
Drawing inspiration from Detrio’s earlier work, Buterin’s second proposal for phase 2 further simplifies the first proposal by removing state from shards and instead using the beacon chain to keep track of shard-specific, execution environment (EE, previously execution script)-specific state roots.
This has the benefit of allowing EEs to specify their own accumulator format (e.g., a Sparse Merkle Tree, a red-black tree, a cryptographic accumulator, etc.), rather than needing to enshrine a specific accumulator format as with previous proposals.
The general transaction flow would now look like this: users craft transactions and send them to relayers. Relayers add witnesses (in the form of Merkle branches, if a Merkle accumulator is used) in exchange for fees from users in an unspecified manner, and package multiple transactions together. Relayers would send these packages to block proposers, paying them an in-protocol fee for including this package in the next [shard] block. This in-protocol fee, as with proposal 1, is paid through an enshrined EE that all validators are required to understand.
While this scheme is certainly a step up in terms of ease of being a validator, as only minimal shard state is required, the lack of an in-protocol fee payment mechanism from users to relayers does not alleviate the relay network issue. First, using an out-of-band payment scheme reduces users’ privacy. Second, it is not obvious how to create an entirely off-chain system (that does not rely on another, external, blockchain) that guarantees that users receive appropriate witnesses and relayers get paid for providing the required witnesses, atomically. Third, the potentially high friction of out-of-band payments and high computational requirements of being a relayer (as relayers must have all state for the EEs and shards they wish to serve, in order to be able to generate witnesses) can be a centralization concern, with only a few “gatekeeper” relayers between the users and the Eth 2.0 network in the worst case.
Bring Back the Mempool
Rather than having relayers pay block proposers conditionally, which would require non-trivial double-commit-reveal schemes to ensure atomicity, a new proposal by Buterin emerged: one fee market EE to rule them all. In this proposal, EEs themselves would have balances. When executing a package of user-transactions, the EE will output a receipt, which can then later be consumed by an enshrined fee-payment EE to transfer Ether balance from the EE to the block proposer. The fee-paying EE would thus be able to “look back” at previous receipts to process fee payments. Topping up the balance of the EE would presumably be the responsibility of the relayers (or some otherwise interested party).
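A rough sketch of the receipt flow, with assumed names throughout (`Receipt`, `FeePaymentEE`, `top_up`, `consume`): an EE emits a receipt when it executes a package, and the enshrined fee-payment EE later consumes that receipt at most once to move balance from the EE to the proposer. The receipt fields are a guess at what would be needed, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    """Output of an EE after executing a package (fields are illustrative)."""
    ee_id: int
    shard: int
    slot: int
    proposer: str
    fee: int


class FeePaymentEE:
    """Toy enshrined fee-payment EE that looks back at earlier receipts."""

    def __init__(self):
        self.ee_balances = {}
        self.proposer_balances = {}
        self.consumed = set()

    def top_up(self, ee_id, amount):
        # Presumably done by relayers or another interested party.
        self.ee_balances[ee_id] = self.ee_balances.get(ee_id, 0) + amount

    def consume(self, receipt):
        """Pay the proposer named in a receipt; each receipt can be consumed only once."""
        if receipt in self.consumed:
            return False
        if self.ee_balances.get(receipt.ee_id, 0) < receipt.fee:
            return False
        self.ee_balances[receipt.ee_id] -= receipt.fee
        self.proposer_balances[receipt.proposer] = (
            self.proposer_balances.get(receipt.proposer, 0) + receipt.fee
        )
        self.consumed.add(receipt)
        return True


fee_ee = FeePaymentEE()
fee_ee.top_up(ee_id=7, amount=5)
receipt = Receipt(ee_id=7, shard=0, slot=100, proposer="validator-12", fee=3)
assert fee_ee.consume(receipt)      # first consumption pays out
assert not fee_ee.consume(receipt)  # replaying the same receipt is rejected
```

Because payment is settled from the receipt after the fact, the relayer and the proposer never have to negotiate atomically at inclusion time.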
This proposal has the benefit of not requiring a complex scheme to negotiate payments between relayers and block proposers. However it still does not address how users interact with relayers, instead opting to suggest payment channels be used.
Building upon the original proposal, Villanueva then suggested bringing back the mempool. Here, relayers take on a reduced role as state providers, being responsible only for providing witnesses but not packaging user-transactions. Block proposers (or, more generally, shard validators) maintain a mempool and merge witnesses to create their own packages. Given that each EE could use a different accumulator format, the EE will need to specify a merge method that block proposers can use to merge two or more witnesses together, e.g., multiple Merkle branches into a Merkle multi-proof.
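For a Merkle accumulator, one possible shape of such a merge method is sketched below. It assumes a binary SHA-256 tree addressed by generalized indices (root is 1, children of `g` are `2g` and `2g + 1`); the function names are hypothetical. Merging takes the union of the branches and drops every node the verifier can derive from the leaves themselves.

```python
import hashlib


def h(a: bytes, b: bytes) -> bytes:
    return hashlib.sha256(a + b).digest()


def build_nodes(leaves):
    """Map generalized index -> node hash for a tree with a power-of-two leaf count."""
    n = len(leaves)
    nodes = {n + i: leaf for i, leaf in enumerate(leaves)}
    for g in range(n - 1, 0, -1):
        nodes[g] = h(nodes[2 * g], nodes[2 * g + 1])
    return nodes


def branch_for(nodes, g):
    """Single-leaf witness: sibling hashes on the path from g up to the root."""
    branch = {}
    while g > 1:
        branch[g ^ 1] = nodes[g ^ 1]
        g //= 2
    return branch


def merge(witnesses, leaf_indices):
    """Merge branches into one multiproof, omitting nodes derivable from the leaves."""
    derivable = set()
    for g in leaf_indices:
        while g >= 1:
            derivable.add(g)
            g //= 2
    return {g: node for w in witnesses for g, node in w.items() if g not in derivable}


def verify_multiproof(root, leaves_by_index, multiproof):
    """Recompute parents bottom-up from the known nodes and compare against the root."""
    known = {**multiproof, **leaves_by_index}
    for g in range(max(known), 1, -1):
        if g % 2 == 0 and g in known and g + 1 in known:
            known[g // 2] = h(known[g], known[g + 1])
    return known.get(1) == root


leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]
nodes = build_nodes(leaves)
touched = [8, 9, 13]  # generalized indices of the leaves three transactions touch
branches = [branch_for(nodes, g) for g in touched]
multiproof = merge(branches, touched)
assert verify_multiproof(nodes[1], {g: nodes[g] for g in touched}, multiproof)
print(sum(len(b) for b in branches), "hashes separately,", len(multiproof), "merged")
```

Other accumulator formats would need their own merge routine, which is why the method has to be specified per EE.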
As state providers only need to provide witnesses (usually in the form of Merkle branches), there is a group of actors that are perfectly suited to provide this service already: light client servers. By encouraging more light client servers through better tooling and reduced costs, this goes a long way towards preserving permissionlessness. However, the use of high-friction payment channels for users to pay state providers remained a pain point.
Alternate Phase 2 Architecture
More recently, Buterin published an alternate phase 2 architecture proposal, which makes shards completely stateless. The key difference here is the addition of a full, stateful, expressive, state transition engine (e.g., the EVM) in the beacon chain. This engine acts as a “scheduler,” keeping track of EE state roots. The scheduler also allows for multi-shard execution: it can inspect shards and slots in a deterministic order, ensuring that multiple transactions across multiple shards to the same EE are executed correctly (by weaving pre- and post-state roots, ignoring any shard blocks that are invalid or out of order).
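The weaving of pre- and post-state roots can be illustrated with a small sketch. The ordering by `(slot, shard)` and the names `ShardBlockClaim` and `schedule` are assumptions made for illustration; the proposal only requires that the order be deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardBlockClaim:
    """What a shard block asserts about one EE: it moved the state from pre to post."""
    shard: int
    slot: int
    ee_id: int
    pre_state_root: bytes
    post_state_root: bytes


def schedule(ee_roots, claims):
    """Apply claims in a fixed order, skipping any whose pre-state root does not line up."""
    roots = dict(ee_roots)
    applied = []
    for claim in sorted(claims, key=lambda c: (c.slot, c.shard)):
        if roots.get(claim.ee_id) != claim.pre_state_root:
            continue  # invalid or out of order: ignored by the scheduler
        roots[claim.ee_id] = claim.post_state_root
        applied.append(claim)
    return roots, applied


claims = [
    ShardBlockClaim(shard=2, slot=2, ee_id=7, pre_state_root=b"r0", post_state_root=b"rX"),
    ShardBlockClaim(shard=0, slot=2, ee_id=7, pre_state_root=b"r1", post_state_root=b"r2"),
    ShardBlockClaim(shard=1, slot=1, ee_id=7, pre_state_root=b"r0", post_state_root=b"r1"),
]
final_roots, applied = schedule({7: b"r0"}, claims)
assert final_roots == {7: b"r2"}
assert len(applied) == 2  # the claim built on the stale root r0 at slot 2 is skipped
```

Since every node processes the claims in the same order, they all agree on which shard blocks took effect for a given EE without any shard holding state.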
The fee payment mechanism for this proposal is largely unchanged compared to the previous one, though a fee-paying EE is not needed as the scheduler has enough expressivity to handle managing EE balances and consuming receipts natively.
Shard Chain Simplification Proposal
At DevCon 5, Buterin unveiled a major refactoring of the Eth 2.0 sharding architecture, drawing inspiration from Near’s Nightshade “shard chunk” design. In this new proposal, rather than each shard chain maintaining its own separate fork choice rule, with faster blocks and crosslinks being formed in a staggered manner, shard blocks are produced in lockstep with beacon chain blocks, and all shards are cross-linked every beacon block (in the normal case). To accommodate this increased number of shard crosslinks over the previous architecture, the number of shards was decreased to 64 from 1024 and the throughput of each shard was increased somewhat to keep the overall system throughput close to the same.
Fee markets are made simpler by virtue of a fundamental change in mindset: thanks to the vastly reduced shard count, cross-shard communication is simpler, and it’s no longer an issue for users to hold ether in each shard to pay transaction fees directly to block proposers (again through the scheme of EEs outputting receipts). By removing one half of the fee market issue, only the relayer network (i.e., state provider) challenge remains.
A non-exhaustive list of desirable properties of a relayer network/fee market for Eth 2.0 is discussed here.
- Users, relayers, and block proposers should all be appropriately paid/serviced. Users should receive witnesses they pay for, relayers should be paid for providing witnesses, and block proposers should be paid for including transactions in a block. The payment from relayers (or, indirectly, EEs) to block proposers can be handled in-protocol, but the payments from users for both witnesses and block inclusion are not so straightforward.
- The user payment scheme should be low-friction, to allow users to easily switch relayers. Payment channels rooted in an external blockchain, with pre-collateralization and liveness requirements, are anything but. Ideally this should be handled in-protocol.
- Tying in to the previous point, it should be difficult to enforce a blacklist on users. Unfortunately, due to the fact that witnesses are effectively an implicit access list, this may be somewhat unavoidable.
- Validators should not need to “understand” how fees are paid internally to every EE, as that would be exceedingly complex to both implement and build an efficient market around.
- Users should not have to run a full node (e.g., to maintain their witnesses themselves). We want a protocol that is light-client friendly.
- Last but not least, Denial-of-Service (DoS) resistance is important. Depending on how witnesses are merged/refreshed, it may be possible to DoS validators. Care must be taken so as to not allow such an exploit.
Research around a relayer network and fee market system that allows for a good user/light node experience while simultaneously not placing undue burden on validators, all while remaining as permissionless as possible, has gone through a number of design iterations over the past year. Looking forward, we should expect additional phase 2 proposals that further refine and improve this aspect of the Eth 2.0 system.