A Journey Through Phase 2 of Ethereum 2.0

Will Villanueva
May 15

Vitalik Buterin recently put together his first public proposal around Ethereum 2.0, phase 2, and followed up with additional abstractions. There are lots of moving pieces, and I wanted to take the time to summarize what is being proposed at a higher level and give others the opportunity to take a breadth-first approach to understanding the new thought space.

First of all, contrary to popular belief, phases 0 to 2 do not need to be worked on in complete sequence. Large portions of each phase can be worked on in parallel.

In general, this article will assume some prior knowledge of the Ethereum 2.0 landscape. As a jumping-off point, I suggest you dive into some of these materials:

https://docs.ethhub.io/ethereum-roadmap/ethereum-2.0/eth-2.0-phases/
https://media.consensys.net/exploring-the-ethereum-2-0-design-goals-fd2d901b4c01
https://media.consensys.net/state-of-ethereum-protocol-2-the-beacon-chain-c6b6a9a69129
https://github.com/ethereum/wiki/wiki/Sharding-roadmap
https://www.youtube.com/watch?v=S262StTwkmo

Ethereum 2.0 development has been split into 3 phases for its initial launch. Phase 0 focuses on the beacon chain and the core pieces needed to build the foundation of the other phases, such as networking, signature schemes, and randomness. Phase 1 focuses on the mechanics of the 1024 independently operating shard chains. In phase 2, the focus is on the execution engine, the transaction model, the account model and more. In essence, it is what brings Ethereum 2.0 to life and opens up state execution and computation.

In Vitalik Buterin’s first proposal — which saw influence from Casey Detrio’s recent post — he suggests a different approach than what people may have originally assumed. His approach builds a light layer 1 protocol with a heavier focus on layer 2 within the shard chains. This layer 2 does not actually represent plasma or state channels, but instead refers to shard chains operating in a more generalized/open format. This approach is a general paradigm shift and may take a bit to really grasp and digest. However, its strength lies in the fact that it provides a high degree of flexibility. It should make it simpler to introduce changes in the future as research continues. This contrasts with Ethereum 1.0’s approach, which locks the whole system into one model and, as a consequence, requires major protocol-wide updates to make changes to the system. In the conclusion of this article, after a more foundational understanding is introduced, we’ll dive into some of the pros & cons of this new system.

To describe all the moving pieces, I’m going to walk you through the journey of how you may first move your funds into a shard chain and begin transacting. :) Let’s begin.

Moving 1.0 ether into the 2.0 chain

In order to move your ether from the old Ethereum 1.0 chain into the Ethereum 2.0 chain, you’ll need to burn your ether by depositing it into a contract on the old chain. The new chain has a protocol and voting period to recognize this deposit and surface your ether into the beacon chain. Other than depositing into the contract on the old chain, you do not need to do anything else. The current protocol’s voting/inclusion system will bring your funds into the beacon chain automatically. Over time, this system may be phased out for another one as the legacy chain becomes less active. However, discussions are still happening on what this transition could look like.

In existing specifications, you will automatically become a validator/staker if you deposited at least 32 eth. However, most people will likely want to bring their ether into a shard chain so they can use it within a number of applications, contracts, or wallets. In Vitalik’s proposal, you have the ability to do this via a number of steps. However, it is important to first understand that there is no concept of native ether in any of the shard chains. This is where the construct of layer 2 within the shard chains begins to surface. Before diving deeper into the steps of how you would bring your ether into a shard chain, let’s cover a couple of pieces of background knowledge.

Beacon Chain Contracts

The beacon chain now stores smart contracts (recently called Threads) known as beacon chain contracts. These contracts are not analogous to regular smart contracts you would deploy for your application on Ethereum 1.0 (or smart contracts such as multisig wallets that you would set up to represent your account in the eth2 account abstraction scheme). Those would live within the shard chains. In contrast, beacon chain contracts will represent execution environments or transaction frameworks as a whole. For example, you could have a different beacon chain contract representing a different implementation or framework around Ethereum. One may represent an account-based Ethereum transaction/storage model, while another could represent a UTXO-based Ethereum model. In practice, there should not be a plethora of beacon chain contracts. There should only be a few — especially at first.

One caution as you build your mental model of this new system: from the explanation above, you may assume each beacon chain contract would represent a different virtual machine. This assumption is not true. Instead, the contracts define/enforce state and pure functions, similar to precompiles on current Ethereum 1.0. This model should make more sense as we work our way through this article.

In many current discussions, executor and execution environment may also be used to refer to beacon chain contracts as a whole.
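To make “state plus pure functions” a little more concrete, here is a hedged, conceptual sketch. None of these names come from the proposal; they are illustrative assumptions only:

# Conceptual sketch only: a beacon chain contract is not a separate VM.
# It is closer to a precompile: a bundle of pure functions plus the state
# they own. All names below are illustrative assumptions.
class BeaconChainContract:
    def __init__(self, functions, state_schema):
        self.functions = functions        # e.g. {"depositToShard": ..., "transfer": ...}
        self.state_schema = state_schema  # how this environment lays out its storage

    def call(self, name, state, *args):
        # Every call is a pure function of the environment's own state and its
        # arguments; nothing outside the environment (no native ether, no global VM).
        return self.functions[name](state, *args)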

Buying into a framework

Now that we have a little background, let’s continue our journey. We’ve decided we want to bring our eth into the accepted account-based Ethereum model. Let’s say this model is defined within beacon chain contract 0 (or lives in the 0th index of the list of contracts). Also, we’ve already brought our ether from eth 1.0 into the beacon chain. It lives in a validator/staking account, but we are choosing not to become an active validator. Instead, we want our ether to surface into a particular shard. In this case, let’s say we’re interested in buying into shard 5 (my favorite cryptokitties application lives there and I’ve been dying to play :)).

In order to bring our ether into shard 5, we need to transfer our beacon chain eth into beacon chain contract 0. Thankfully, a new transaction type called withdrawToContract is introduced into the beacon chain. We can add additional data to this call. In our case, we would include the address on the shard we are sending our eth to, in addition to the shard number, 5. The data we need to send would be defined within the beacon chain contract.

{
    # Your account in the beacon chain holding the eth
    "validator_index": uint64,
    "data": {
        "shard": 5,
        "address": address
    },
    "pubkey": BLSPubkey,
    "signature": BLSSignature
}

In reality, this data gets issued into a receipt within the beacon chain and included in the beacon chain block data by the block proposer. Basically, the beacon chain stores a list of receipts in every block.
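To make the later examples easier to follow, here is a hedged sketch of roughly what such a withdrawal receipt could contain. The proposal does not pin down the exact fields here; these names are assumptions chosen to line up with the receipt.withdrawal.data and receipt.amount reads in the depositToShard example below:

{
    # The withdrawToContract call that produced this receipt
    "withdrawal": {
        "validator_index": uint64,
        # The extra data supplied by the user (e.g. the shard and address above)
        "data": bytes
    },
    # Amount of beacon chain eth locked into the beacon chain contract
    "amount": uint64
}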

Now, our ether is locked up and we are able to use it within shard 5. Before diving into the process of claiming this ether, let’s cover a bit more background on the connection between the shard chain and the beacon chain contract.

Shard Chain Contract and State Execution

As discussed above, shard chains do not have a concept of native ether. Instead, you lock your ether into a beacon chain contract and then use it on the shard chain. To understand this process, it’s good to visualize and understand the connection between the shard chain and beacon chain contracts. In essence, the shard chains and their state execution functions will be a reflection and integration of the framework defined in the beacon chain contracts.

Within each shard chain’s block, a global state is generated. State within a shard chain always maps directly to a beacon chain contract. Let’s start off with the example Vitalik uses:

{
    # What we think of as the actual "state"
    "objects": [[StateObject, 2**256], 2**64],
    # Receipts
    "receipts": [Receipt],
    "next_receipt_index": uint64,
    # Current slot
    "slot": uint64,
    ...
}

Each index in the objects list maps to a beacon chain contract index. If there are two beacon chain contracts, then objects will only have two entries. If account-based Ethereum is beacon chain contract 0 and UTXO-based is contract 1, then indices 0 and 1 would reflect the state for each, respectively. Slot is just another name for the block number on the shard. Within each index, we have [StateObject, 2**256], which is just a key-value store with 2²⁵⁶ (256-bit) keys. Each StateObject would contain the following fields:

{
    # Version number for future compatibility
    "version": uint64,
    # Contents
    "storage": bytes,
    # StateObject can be removed if it expires (ie. now > ttl)
    "ttl": uint64
}
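To make the nesting concrete, here is a hedged sketch of how a lookup and a write could work (get_state_object and set_state_object are illustrative helpers, not part of the proposal):

def get_state_object(state, ee_index, key) -> StateObject:
    # objects[ee_index] is the key-value store owned by beacon chain
    # contract (execution environment) ee_index; keys are 256-bit values.
    return state["objects"][ee_index][key]

def set_state_object(state, ee_index, key, obj: StateObject):
    # Writes go into the same per-environment key-value store.
    state["objects"][ee_index][key] = obj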

We’ll chat about the ttl later, but it represents state expiration and fits into the larger discussion around rent. Storage is just an arbitrary byte array and will be structured/defined by the framework laid out within the beacon chain contract. However, recent research is pointing to a new approach, and we may not even need nodes to store state at all, which could mean no need for a strict ttl defined at the protocol layer. More on this in a bit… For now, let’s move our ether!

Moving Ether into the Shard Chain

The beacon chain contract has a function called depositToShard.

def depositToShard(state: BeaconState,
                   receipt: WithdrawalReceipt,
                   proof: WithdrawalReceiptRootProof):
    # Verify Merkle proof of the withdrawal receipt
    assert verify_withdrawal_receipt_root_proof(
        get_recent_beacon_state_root(proof.root_slot),
        receipt,
        proof
    )
    # Interpret receipt data as an object in our own format
    receipt_data = deserialize(receipt.withdrawal.data, FormattedReceiptData)
    # Check that this function is being executed on the right shard
    assert receipt_data.shard_id == getShard()
    # Check that the account does not exist yet
    assert getStorageValue(hash(receipt_data.pubkey)) == b''
    # Set its storage
    setStorage(hash(receipt_data.pubkey), serialize(EthAccount(
        pubkey=receipt_data.pubkey,
        nonce=0,
        value=receipt.amount
    )))

The comments should provide the necessary background, so there’s no need to go too deep here. Essentially, we are just submitting the original withdrawal receipt that was printed on the beacon chain. This function gets executed within the shard and performs the Merkle proof check to make sure the receipt is valid. If it is valid, you’re set and new storage is written with your account data:

EthAccount {
    pubkey: BLSPubkey,
    nonce: 0,
    value: 32eth
}

Transferring funds

Now that we have an account on the shard chain, we should be able to transfer some of our funds to another account. This is simple and only a matter of adding another function to the beacon chain contract:

def transfer(sender: bytes32,
             nonce: uint64,
             target: bytes32,
             amount: uint64,
             signature: BLSSignature):
    sender_account = deserialize(getStorageValue(sender), EthAccount)
    target_account = deserialize(getStorageValue(target), EthAccount)
    assert nonce == sender_account.nonce
    assert sender_account.value >= amount
    assert bls_verify(
        pubkey=sender_account.pubkey,
        message_hash=hash(nonce, target, amount),
        signature=signature
    )
    setStorage(sender, EthAccount(
        pubkey=sender_account.pubkey,
        nonce=sender_account.nonce + 1,
        value=sender_account.value - amount
    ))
    setStorage(target, EthAccount(
        pubkey=target_account.pubkey,
        nonce=target_account.nonce,
        value=target_account.value + amount
    ))

Again, no need to go too deep here. This function just reduces the balance in our account and increases the balance in the receiving account. Remember, these functions are executed within the shard’s consensus layer. A BLS signature scheme is used here to stay consistent with the current work on the beacon chain, but other signature schemes could be used. In reality, there should be extra code here to create a new account if storage does not yet exist for the receiving account.
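As a rough illustration of that missing piece, here is a hedged sketch (not part of the proposal). It assumes the transfer also carries the recipient’s pubkey, since accounts in this framework are keyed by the hash of a pubkey; get_or_create_account is a hypothetical helper:

def get_or_create_account(key: bytes32, pubkey: BLSPubkey) -> EthAccount:
    stored = getStorageValue(key)
    if stored == b'':
        # No storage yet for this account: initialize it with a zero balance
        return EthAccount(pubkey=pubkey, nonce=0, value=0)
    return deserialize(stored, EthAccount)

The transfer function would then load the target through this helper before crediting the amount.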

Let’s do a recap on everything so far :)

Recap on current journey so far

So far, we’ve taken our eth and moved it into the Ethereum 2.0 beacon chain from the Ethereum 1.0 chain. We also locked the eth into a beacon chain contract, which allowed us to use it on shard 5. Finally, we transferred some of our eth to another account on that shard. So far, we’ve started simple. The current framework only represents a basic account model with a balance. We could definitely make it a bit more complex and begin including smart contracts and sophisticated state executions, but let’s hold off for a bit. I want to add just a little more of a foundation, then we’ll begin talking about a more sophisticated transaction model.

Fee Markets

In the current journey, we’ve stated that shard chains do not have any concept of native ether. In reality, your ether will surface into a shard chain differently based on the framework or beacon chain contract you buy into. This may bring a couple of questions to mind. For example, if there is no native currency, how does a block producer get paid? In our example, this framework pegs its currency 1:1 with beacon chain ether. Therefore, the block producer probably won’t mind accepting the ether or currencies defined in a beacon chain contract. However, this brings forward even more questions.

Does this mean a block producer needs to buy into every execution environment or framework that has been established? Would the block producer need to establish verification, conversions and security analysis on every beacon chain contract framework to ensure a transaction would be worth it? This may get really burdensome for the block producer and is not entirely efficient.

This interesting thread and this followup from Vitalik dive deeply into this topic. As a general summary, there is a bit of a paradigm shift introduced here. Instead of nodes managing a network of mempools, the responsibility shifts to a number of relayers (the first linked post calls them operators).

If you would like a transaction included, you would broadcast your transaction to a network of relayers — not unlike the current process of broadcasting to nodes operating mempools. Those relayers would take responsibility for organizing, ordering and validating the transactions they receive. They would likely group the most lucrative transactions together, since those yield the highest fees for themselves. After organizing a set of transactions into a tentative block, the relayer will estimate its gas payout. The relayer will then offer a flat fee that it is willing to pay to the block producer if the producer includes its set of transactions. If included, the block producer can collect the fee directly from the relayer. There are a number of implementation details to consider that we won’t get into. For example, there may need to be a staking mechanism to make sure the relayer pays the block producer. Also, a function may be added to the main Ethereum beacon chain contract or to a shard chain contract in order to repay the block producer. We won’t worry about that for now and instead simplify the basic process through an example.

Let’s assume a basic Transaction from a user is as simple as follows under beacon chain contract 0:

{
    'to': address,
    'from': address,
    'amount': uint64,
    'gas_price': uint64,
    'signature': bytes
}

Let’s assume the relayer submits a BlockProposal (the list of transactions it has organized):

{
    'transactions': [Transaction],
    'signature': bytes,
    'fee': uint64
}

In turn, we would include an additional function on the beacon chain contract:

def process_block(block: BlockProposal):
    assert verify_signature(block, block.signature)
    for transaction in block.transactions:
        process_transfer(transaction)

Where process_transfer is as follows:

def process_transfer(tx):
    assert verify_signature(tx, tx.signature)

    gas_fee = TRANSFER_GAS * tx.gas_price

    # Load the sender and recipient accounts from storage
    to_account = get_storage(tx.to)
    from_account = get_storage(tx.from)

    # The sender must cover the amount plus the gas fee
    assert from_account.balance >= tx.amount + gas_fee
    to_account.balance += tx.amount
    from_account.balance -= (tx.amount + gas_fee)

    # Credit the gas fee to the relayer's account
    relayer = get_storage(get_relayer())
    relayer.balance += gas_fee

    set_storage(tx.to, to_account)
    set_storage(tx.from, from_account)
    set_storage(get_relayer(), relayer)

Since the actual account abstraction model around fees and gas payments is still evolving, we keep this simple and express a base fee, TRANSFER_GAS.
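One detail the sketch above leaves out is the flat fee the relayer offered the block producer. Here is a hedged illustration of how that payout could look within this framework (pay_proposer and get_block_proposer are hypothetical helpers, not from the proposal):

def pay_proposer(block: BlockProposal):
    # Hypothetical: after the relayer's transactions are processed, move the
    # promised flat fee from the relayer's account to the block producer's.
    relayer = get_storage(get_relayer())
    proposer = get_storage(get_block_proposer())
    assert relayer.balance >= block.fee
    relayer.balance -= block.fee
    proposer.balance += block.fee
    set_storage(get_relayer(), relayer)
    set_storage(get_block_proposer(), proposer)

This is the kind of function that, as mentioned above, could live either in the main beacon chain contract or in a shard chain contract.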

Soon, we will talk a little further on how this model can be extended to deal with actual code execution and account abstraction. Also, a small detail was skipped above — I did not describe how the process_transfer function was properly executed (the process of running additional wasm code within the contract). We will talk about that in a moment, but first let’s add one more layer of complexity. We’re going to talk about how we can completely remove the concept of state from shard chains.

Shard Chains Don’t Need State

To get a deeper dive into this concept, take a look at Vitalik’s proposal here. A couple ideas are introduced in his writeup:

  1. State, or the objects field we described earlier, does not need to be maintained within a shard node. The concept of it may still exist, but it exists at the application layer and not at the consensus layer; nodes that are not interested in “plugging in” to this specific execution environment need not be aware of it.
  2. Check-ins or crosslinks on the beacon chain get compressed state check-ins (i.e., a Merkle root hash).

This concept could actually nullify the original discussion around storage ttl, poking and expiration. Since no state is stored, there’s likely no need to make the protocol consider expiration. However, the concept could still be included in order to reduce storage requirements on the relayer network.

I don’t want to dive too deeply into the concept of stateless clients since it could occupy an entire blog post in and of itself. However, I’ll do my best to give a really brief summary. In the examples above, you’ll notice we used a function get_storage. This function would likely map to a runtime function (EEI) which connects the ewasm or web assembly environment to a value stored within a running node’s local database. This would operate in line with the EVM opcode SLOAD.

A stateless system suggests that you don’t really need to maintain storage or state in a node. Instead, you can just include witness data in the transaction that covers all the storage the functions need to access. Essentially, the transaction submits its own database and includes a proof for it. For example, let’s assume we have the following as the storage for the execution environment linked to our beacon chain contract 0:

[
    ...,
    EthAccount{
        nonce: 3,
        value: 1232,
        pub_key: BLSPubkey 
    },
    EthAccount{
        nonce: 12,
        value: 22,
        pub_key: BLSPubkey
    },
    ...
]

We can assume there are a plethora of entries, and we can merkleize this list of storage. As a result, we will be able to generate a Merkle root hash from the data. Instead of storing the entire state in each shard block, we can instead just store the Merkle root. Instead of just including a signature with my transaction, I would include witness_data. Witness data would include a combination of my signature and the Merkle branches needed to prove the current state of my account and the receiving account in a transfer. I just need to provide Merkle branches for the state values the transaction accesses. In a transfer, that would just be my account state and the receiving account state. In this case, nodes no longer need to keep a big database of the current active storage. Instead, they can use the witness data in every transaction as a database. Stateless clients are fascinating and I encourage you to dive into this writeup to understand more.
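To make witness data a bit more concrete, here is a hedged, simplified sketch of merkleizing a list of serialized accounts and checking a single Merkle branch. This is illustrative only; the real witness format would be defined per execution environment:

import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    # Build a binary Merkle tree over already-serialized leaves
    # (assumes len(leaves) is a power of two for simplicity).
    layer = [h(leaf) for leaf in leaves]
    while len(layer) > 1:
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def verify_branch(leaf: bytes, branch: list, index: int, root: bytes) -> bool:
    # Recompute the root from one leaf plus its sibling hashes (the witness).
    node = h(leaf)
    for sibling in branch:
        if index % 2 == 0:
            node = h(node + sibling)
        else:
            node = h(sibling + node)
        index //= 2
    return node == root

A transfer’s witness_data would carry branches like these for the sender and recipient accounts, letting a stateless node check them against the state root stored in the block.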

You may ask, if the nodes no longer keep track of storage, will users need to keep track of it themselves? Maybe they keep it in local storage? What happens if they lose the data and no longer can provide witness_data? Does this mean they lose access to their funds and account?

These are excellent questions, and the answer is pretty cool. The relayers who submit the proposed blocks to the block producer would actually be incentivized to keep this storage. Other types of nodes could also exist as separate third parties as an alternative. Also, users can store their state locally and only request third party or relayer help if they lose the storage. The third party market or relayers can charge appropriate fees in order to provide this service. In essence, storage and storage fees can be entirely removed from the core protocol. All the core protocol needs is a Merkle root hash or some other compressed commitment. The witness format can be defined per execution environment or per beacon chain contract. We continue to have flexibility.

The new stateless approach has another benefit: we can check these state roots into the beacon chain crosslinks. Let’s elaborate just a little bit.

Beacon Chain Crosslinks

Every 6 minutes, the current block hash of each shard chain is checked into the beacon chain. This data checkin is known as a Crosslink. This crosslink establishes finality. Essentially, when there is communication between separate shards or the beacon chain needs to verify a receipt on a particular shard, it can wait for finality to be established by waiting for a Crosslink to show up on the beacon chain. At that point, a Merkle proof can always be generated to show that any receipt or transaction on that shard did actually occur. We will chat a little about the details surrounding cross shard communication in a bit.

In the new stateless model, we can make current implementations in phase 0 and phase 1 much more efficient. We no longer need to have separate shuffling of committees.

To explain a bit deeper, the original phase 0 spec created a persistent committee and an epoch committee of attesters. As a validator or staker, you would have two jobs. You would validate slots within the beacon chain as part of an epoch committee, and you would validate slots within a shard chain as a part of a persistent committee. The persistent committee would manage attestations, votes and validation on the shard chains for an extended period of time (~1–2 weeks) until shuffling to a different shard. The epoch committee, on the other hand, would vote for crosslinks and finality on a particular shard for only one slot in an epoch. This committee would be shuffled on a per-epoch basis (~6 minutes) and would validate a separate shard chain crosslink after each shuffle.

Now that we do not need to maintain storage on the shard chains, we can merge these two committees. Originally, they were separate because syncing a new shard as a full node after the persistent period could take hours or days. In the stateless client model, this time is reduced significantly. We can now shuffle every hour or two. This gives us a few additional gains:

  1. Reduced shuffling period on the shard chain gives us better security and simplicity (no need to spend days to sync a full node and stakers have less time to collude)
  2. Longer shuffling period on crosslink/beacon chain committees gives us more network stability and reduces the load on syncing a new light client every 6 minutes

Cross Shard Transactions

Cross shard transactions are largely out of the scope of this blog post. We are writing a number of POCs right now and contributing to active research in this area. Expect write-ups and further blog posts on this topic soon. However, the following are good write-ups on the current discussion:

https://ethresear.ch/t/fast-cross-shard-transfers-via-optimistic-receipt-roots/5337
https://ethresear.ch/t/phase-2-pre-spec-cross-shard-mechanics/4970

One thing to note is that you do not need to actually wait the epoch period (~6 minutes) in most cases and can optimistically combine transactions on multiple shards.

Expanding to Full State Execution

If you’ve made it this far, congratulations! There has been a lot of information to absorb. At this point, you really should just dive deeper into Vitalik’s original proposal (which gives an overview of full state execution) and followup discussion. If you are trying to engage in a lighter read, you may want to skip forward to the conclusion which covers general pros and cons of this proposal. However, I’ll give a brief explanation.

The execution environment in Ethereum 2.0 will run on ewasm, a subset of WebAssembly intended to be run alongside node runtime functions that map to specific opcodes. The WebAssembly operations and opcodes will all be metered, and the execution engine will calculate gas use for the blocks of code that are executed. To dive deeper into how gas mechanics may work, the term account abstraction will guide you in the right direction. Additionally, each execution environment or beacon chain contract can build its own account abstraction and gas mechanics.

Let’s start with a general idea of how this may look. :)

EthAccount would likely add an additional field, code:

EthAccount {
    pubkey: BLSPubkey,
    nonce: 0,
    value: uint64,
    code: bytes
}

This change means you would no longer distinguish between contract accounts and externally owned accounts (EOAs). In current Ethereum 1.0, the storage is different between an account managed by your wallet, such as MetaMask, and a deployed contract. Check here to learn more.

Next, you would definitely need to update the process_block function we described earlier. It would likely need a series of wrapping functions to establish a proper calling environment. For example, tx.sender, tx.executor and more would need to be set. Also, you would define the account abstraction, gas limit rules and more. In the proposal, a set of EEI functions is included that can be added to an execution environment or beacon chain contract. In process_block and the set of wrapping functions, you would use:

executeCode(code: bytes, data: bytes) -> bytes

The wrapping functions would likely abstract and load the code automatically from the appropriate EthAccount.
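As a hedged illustration only (process_transaction, set_calling_environment, and the tx fields used below are assumptions, not part of the proposal), such a wrapping function might look roughly like this:

def process_transaction(tx):
    # Load the target account; its code field carries the account's logic
    account = deserialize(get_storage(tx.to), EthAccount)
    # Hypothetical: set up the calling environment visible to the code
    set_calling_environment(sender=tx.sender, executor=tx.executor)
    # Run the account's code against the transaction data; the engine meters
    # each wasm operation and charges gas per this environment's rules
    return executeCode(account.code, tx.data)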

Double Spending Protection via Bitfields

Part of the phase 2 proposal adds a receipts list to both the beacon chain and shard chains. These receipts are extremely important. For example, the depositToShard function we called earlier utilizes a receipt from the beacon chain. Additionally, shard chains also store receipts in their blocks. The shard chain receipts are used for a few purposes:

  1. Verification in cross shard transactions via Merkle proof

Sending funds into a receiving shard requires burning the funds on the source shard. The receiving shard needs to run a proof to make sure the funds were in fact burnt.

  2. Verification in sending funds back into the beacon chain

The same mechanism as above applies, in addition to the mechanism for originally claiming funds via depositToShard.

The structure of a Receipt on a shard chain is as follows:

{
    # Unique nonce
    "receipt_index": uint64,
    # Execution Script (beacon chain contract index) that the receipt is created by
    "executor": uint64,
    # Address that it is intended for
    "target": bytes32,
    # Data
    "data": bytes
}

One major concern with these methods is ensuring a receipt is not used twice — especially in the stateless model. Since you cannot practically include an exclusion proof in every block, there has to be another way. The proposal discusses a check_and_set_bitfield_bit function to keep track of used receipts. The function’s arguments are as follows:

check_and_set_bitfield_bit(bitfield_id: uint64, bit: uint64):

The bitfield_id would map to a receipt index. Each shard chain and the beacon chain increments a receipt index on every published receipt. The bit argument would map to a secondary identifier: beacon chain receipts may be tracked by bit = 0, and each shard chain would be tracked by its own bit (shard_id + 1, so up to SHARD_COUNT + 1 values). Having the secondary identifier is important since the shard chains and the beacon chain will have collisions on the receipt index alone. Calling the function would set the bit within the bitfield chunk to 1. If it is already 1, the call would fail an assertion, meaning the receipt has already been consumed and a double spend attempt is in progress.
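As a hedged sketch of the idea (the storage layout and BITFIELD_CHUNK_SIZE are assumptions; the proposal only describes the function’s behavior):

def check_and_set_bitfield_bit(bitfield_id, bit):
    # Load the bitfield chunk for this receipt index; empty storage means all zeros
    key = hash(bitfield_id)
    chunk = getStorageValue(key)
    if chunk == b'':
        chunk = bytes(BITFIELD_CHUNK_SIZE)
    byte_index, bit_index = bit // 8, bit % 8
    # If the bit is already set, this receipt was already consumed: abort
    assert (chunk[byte_index] >> bit_index) & 1 == 0
    # Otherwise mark the receipt as consumed
    new_chunk = bytearray(chunk)
    new_chunk[byte_index] |= 1 << bit_index
    setStorage(key, bytes(new_chunk))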

Conclusion

This is definitely a dense article. I hope it helps in your journey of understanding the current and evolving specification around Ethereum 2.0, phase 2. I’m also hoping it helps as a secondary resource if you want to read through the current proposals.

In closing, I’ll list the pros and cons of this approach vs. having the rules set strictly at the layer 1 protocol level. Thanks to Vitalik for summarizing these:

Pros

  1. Less risk of consensus forks
  2. Faster time to deployments (updates to execution/beacon contracts vs. having to update the core protocol and client code)
  3. Easier path to upgrade the environment in the future without hard fork governance politics
  4. Less code that needs to be written/repeated in different client implementations
  5. Ability to test different approaches in parallel on the same base layer

Cons

  1. The execution environments/beacon chain contracts would need to be audited meticulously to ensure there are no bugs/issues
  2. Less hard fork politics, but more standardization politics
  3. Risk that the total complexity of the consensus layer, layer 2, relay networks and secondary environments is greater

My thanks to Matt Garnett for his involvement in writing this article. Also, thanks to John Adler for helping in the review and editing process.

We’re currently hiring for an additional Ethereum 2.0 phase 2 researcher. If the work and research presented in this article interests you, please reach out to me. Matt Garnett and I have started a research team and effort fully focused on eth 2.0, phase 2 called Quilt. We are funded by Consensys R&D.

Also, follow me on Twitter :)
