Fraud proofs and virtual machines

Chris Buckland
11 min readOct 22, 2021

--

Rollups provide blockchain scaling by moving execution into a different domain. Transaction data is put on the main chain, but the transactions are not executed there. Instead they’re executed on the rollup network, and periodically a commitment to the current rollup state is posted back to the main chain. This means that the mainnet nodes are no longer required to execute the transactions, reducing their load.

But how can the main chain be convinced that the committed state of the rollup is correct ? Rollups come in two different categories, which solve this problem in separate ways:

  1. zk-rollups: Commitments are proven correct by an accompanying zero knowledge proof.
  2. Optimistic rollups: Commitments are accepted optimistically, but validators check them and submit a fraud proof if they find a commitment to be incorrect. In the case of optimistic rollups the commitment is also called an assertion, as the state is being asserted and not proven.
Optimistic rollups require validators to check state transitions and submit proofs of fraud if they find it

In this post we’ll focus only on optimistic rollups, and the fraud proofs that keep them secure. We’ll look at how they’re currently employed by two popular optimistic rollups — Arbitrum and Optimism — and what the current fraud proof research directions are.

For more details on how rollups work in general I can recommend this blog post.

Fraud proofs

Current fraud proofs come in two flavours:

  1. Non-interactive
  2. Interactive

Non-interactive fraud proofs allow a party to prove the incorrectness of an assertion without the participation of any other parties. They do this by executing all the state transitions between two asserted commitments to show that the resultant commitment differs from the asserted one. Non-interactive fraud proofs have the benefit of being simple to reason about and design. However, unless using a zk-proof, the transition between the two assertions must be small enough to be executed on-chain which, combined with the current abilities of Ethereum, places severe limitations of the transitions that can be effectively verified by this style of fraud proof.

Interactive fraud proofs require two or more parties to work together to show that an assertion is valid or invalid. In practice current designs involve a defender (the party making the assertion), and a challenger. The challenger requests that the defender bisect their assertion into sub assertions, for which the challenger may then pick the sub assertion they disagree with. They continue by requesting bisection of the sub assertion, and so on, until they reach an assertion that represents a small enough operation to be executed on-chain. Interactive fraud proofs introduce the complexity of cooperation between parties. The incentives involved there make them inherently more complicated to design safely. However, the only limitation regarding execution is that a single operation must be executable on-chain, meaning that transactions and blocks in the rollup are not restricted by any of the L1 limits.

So what fraud proofs do some of the current optimistic rollups employ?

Optimism

Non-interactive fraud proofs are the most appealing place to start, and indeed this is the direction Optimism took for their first version. However, for V2 they’re moving towards using interactive proofs. To see why let’s take a closer look at the limitations of a non-interactive fraud proof.

Non-interactive fraud proofs need to execute the full state transition between two assertions. Optimism chose the transaction level as the granularity for the state transition, hoping to enable users to be able to execute normal Ethereum transactions on their rollup.

However, re-executing an Ethereum transaction that occurred on the rollup back on the Mainnet Ethereum isn’t natively supported by Ethereum. This is because transactions have access to the underlying state and accounts of the chain, which differ between the rollup and mainnet.

In order for Optimism to re-execute their transactions on mainnet they needed to replace opcodes that access state and accounts with synthetic ones which make calls to a specific contract pre-populated with relevant state. A total number of 18 opcodes have been replaced in this fashion, with a further 6 being removed altogether. Bytecode with these opcodes replaced by function calls is said to run inside the Optimistic Virtual Machine (OVM). Optimism forked the Solidity compiler to allow developers to compile their contracts for the OVM.

Lifetime of a contract in Optimism

Contracts translated from EVM to OVM bytecode suffer from two major limitations:

  1. Replacing state access instructions with contract calls severely bloats the bytecode size of some contracts. Ethereum has a ~25kB limit on contract size meaning that after being translated from EVM to OVM some contracts would require refactoring or need to be broken up in order to make it possible to redeploy them during a fraud proof.
  2. Making calls instead of using native state access requires a lot more gas, so transactions running on the OVM effectively had a lower gas limit than their equivalents being run on Ethereum.

Both of these limitations affect contract developers and users, requiring a lot of effort for developers to re-write contracts that were designed to run on Ethereum mainnet. Given that contract development is an expensive, time consuming, and security intensive process these limitations pose a significant hurdle to the adoption of Optimism V1 and is the main reason for their move towards interactive fraud proofs.

Arbitrum

A rollup currently using interactive fraud proofs in production is Arbitrum. Their protocol allows a defender and challenger to bisect each other’s assertions until they find a single step they disagree on, then execute it on-chain. The size of the single step chosen by Arbitrum is that of a single instruction. Executing a single instruction of a virtual machine requires providing access to the internal state of the virtual machine. This includes the stack and memory, as well as state and globals. An interpreter for the virtual machine must be written that can run on Ethereum, and execute single instructions.

Since they would need to write a virtual machine interpreter Arbitrum chose to write a new one optimised for their use case of proving execution of single steps, rather than writing an interpreter for the EVM. An example of an optimisation they made was to change the structure of the VM memory to be immutable, so it can always be accessed in constant time rather than logarithmic as the current EVM structure would require. They called this rollup optimised virtual machine the AVM (Arbitrum Virtual Machine).

Lifetime of a contract in Arbitrum

AVM contracts and transactions do not have the same limitations as OVM ones, since transactions are never re-executed in full they can even exceed the limits on Ethereum mainnet if Arbitrum nodes choose to allow it. Writing a new VM has other advantages. Since anything written in AVM code can be proven on Ethereum, the Arbitrum developers can include new opcodes as long as they can be backed by EVM opcodes. This allows them to easily incorporate other node behaviour, like reading transactions from their Inbox contracts, into the set of operations that can be proven correct. Arbitrum can use the same type of fraud proof for any behaviour of their node they wish to make provable.

One major disadvantage of writing a new VM is that it may not always be compatible with Ethereum, and where compatibility is possible extra effort may be required to achieve it.

Diverging from Ethereum

Both of the above approaches diverge from the EVM, which poses engineering challenges for Arbitrum, Optimism and their users. Developers writing contracts for the EVM will expect their contracts to run on the AVM and the OVM, they’ll expect existing tools to work and JSON-RPCs to be consistent with Ethereum. In order to achieve this the rollups may need to shim node JSON-RPC endpoints to behave consistently, which is possible in some cases but not in others.

An example of where Arbitrum have achieved parity is in accepting EVM transactions. Arbitrum have written an EVM to AVM compiler which itself runs on the AVM. This means that transactions signed and submitted in EVM bytecode can be compiled in a provable fashion as part of their execution.

An example of where neither rollup has so far been able to achieve complete parity is in the gas semantincs: Optimism has replaced some opcodes which may require more gas, and Arbitrum has tuned gas costs to reflect the cost of on the AVM.

Converging on Ethereum

Optimism’s choice of fraud proof has led to significant restrictions for their users, and presents a barrier to adoption. They’re now turning to interactive fraud proofs to remove those limitations..

However, unlike Arbitrum’s current version, they don’t want to design a new virtual machine but instead try to find a representation of the EVM that can be efficiently proven on-chain. Doing so is an attempt to retain 100% compatibility with the EVM and Ethereum. They may then be able to make minimal changes to existing Ethereum client codebases and have them run as rollup clients, inheriting the security and hard work that has been put into them. It should also mean that they achieve compatibility with all existing JSON-RPC endpoints and the tools that rely on them without any additional effort.

It appears Arbitrum have realised the value of this approach as well, and in their latest blog post they’ve outlined an approach that will move them in that direction.

There are number of ways this could be approached:

  1. Extend the instruction set to include full block validation, and break down any large single steps into smaller sub steps. The Macula project is an attempt at this. It aims to generate an execution trace for the full execution of an Ethereum block, then write an interpreter for this execution trace. Thereby allowing all operations in a block to be bisected to a single step and then executed on-chain. It deals with opcodes that may not be efficient for proving, such as copying memory, by breaking them down into sub-instructions the sum of which would equate to the execution of the original instruction.
Lifetime of a contract in Macula
  1. Use an existing architecture more amenable to proving and implement an EVM interpreter for it. The Cannon project extracts the parts of Geth used for verifying a block. It then compiles those parts of Geth to an instruction set more amenable to proving — such as MIPS or RISC-V. When this MIPS-Geth is used to verify a block an execution trace is recorded.It is this execution trace that two parties can bisect to find a single instruction they disagree upon, which would then be executed on chain using an on-chain MIPS interpreter. This also appears to be the direction Arbitrum plan to take, however instead of MIPS they intend to use the WASM standard.
Lifetime of a contract in Cannon
  1. Create a zero knowledge proof of the single step execution. Instead of executing the single step on chain the contract could instead verify that a proof of execution of that single step is valid. Proving large state transitions, such as full transactions, in the zk is currently not possible in short time periods, however producing proofs for a single step should be feasible. This approach is currently being investigated by researchers at Oasis Labs and UC Berkeley, but was also described in the Arbitrum paper.
Lifetime of a contract with a single step zk

Each of these methods will provide fraud proofs that have no adoption hampering limits for users and allow rollup nodes to run existing implementations of Ethereum inheriting their security, maintainability and compatibility with Ethereum mainnet.

But they all require the more complicated interactive style of fraud proof. So is it possible to achieve the above properties using a non-interactive fraud proof?

Well actually, yes! There are currently two known approaches to this:

Zero knowledge proofs for the full state transition. There are still a number of challenges that need to be overcome before zk proofs for full transactions can be produced in short (block time) periods of time. However if the zero knowledge proof is being used as a fraud proof it doesn’t need to be efficient to produce the proof. Currently fraud proofs provide a long challenge period (~1 week) to give time for challengers to validate the current chain, generate a fraud proof and submit it to Ethereum. Even if generating a fraud proof takes a very long time (eg 1–2 days), then this challenge period can be extended with little impact to the overall protocol. Like the interactive fraud proofs, this style of fraud proof can exceed the limits of L1. This approach is currently being researched by Sumo, Consensys.

Lifetime of a contract with a full state transition zk

Bare metal fraud proofs. Optimism’s original goal was to execute EVM code on the EVM, but they were unable to since the mainnet EVM wouldn’t have access to the rollup state. But is it possible to design a system where this would be possible?

This is the approach taken by Oasis Labs. Their blockchain separates execution into a fast and slow path. Sub-committees of the consensus group execute transactions against a state root. The resulting state root is then stored on a base chain that’s validated by the full consensus group. If any member of a sub-committee disagrees with the resulting state root they can request the full consensus group re-execute any disputed state transition. A single honest validator is required in a sub-committee in order to inherit the security of the base chain — the same security as optimistic rollups.

This approach fits neatly into Ethereum’s stateless client concept. In this paradigm consensus nodes only store a state root of the chain. Transactions are then accompanied by the state they access, and a witness to prove the state is part of the state root. Stateless clients accept transactions accompanied by state and execute them against their stored state root to produce a new state root. If instead of executing received transactions against their stored state root they executed them against a provided one then they would be able to re-execute transactions for other Ethereum compatible chains. This functionality could be added as a new transaction type that behaved as follows:

  1. The data contains a block of rollup transactions with accompanying state and witnesses, and a state root.
  2. The client executes the block of transactions against the state root to produce the next state root.
  3. The tuple of (prev state root, transaction block hash, next state root) would then be stored in a global contract allowing any smart contracts to lookup this information and perform slashing etc. based on the results.

With a new transaction type like this on stateless clients, Ethereum would be able to fully re-execute state transitions for other Ethereum compatible chains. It could offer a simple non-interactive fraud proof for Ethereum compatible optimistic rollups. This fraud proof runs directly on the EVM, rather than inside a virtualised environment hosted in the EVM — a bare metal fraud proof.

Lifetime of a contract with bare metal fraud proofs

Summary

Optimistic rollups require fraud proofs to remain secure. We’ve taken a look at two popular rollups and seen how their choice of fraud proof has either caused limitations for their users and contract developers (Optimism), or not maintained complete compatibility with Ethereum (Arbitrum). Both of these rollups are moving towards complete compatibility with Ethereum, without any new limitations, and are using interactive fraud proofs to do this. However interactive fraud proofs are more complicated to construct safely, and there are current research avenues to find efficient non-interactive fraud proofs that do not impose additional limitations and can still remain compatible with Ethereum.

Thanks to Patrick McCorry, Dawn Song, Bennet Yee and Nicholas Liochon for their feedback and input.

--

--