Illustration: Optimus Prime vs. Wing Zero
Disclaimer: the author of these lines has been working on ZK-Rollup since the inception of the concept and is inevitably biased. However, my experience puts me in a good position to deeply analyse and compare both solutions from the technological standpoint.
TL;DR: brief summary
Optimistic Rollup is a promising technology for scaling general-purpose smart contracts on Ethereum in the near term. If built relatively quickly, it can offer an easy way to migrate existing dapps and services with a reasonable degree of security/scalability tradeoffs. This will enable ETH 1.0 to keep up with growing demand.
ZK Rollup is a more sophisticated technology. It can be used for token transfers and specialized applications today. Implementing general-purpose smart contracts will take a little longer, and even more research work is required to efficiently wrap EVM in zero-knowledge proofs. Once ZK Rollup is fully developed, however, all existing Ethereum dapps and services will be able to migrate to it smoothly.
ZK Rollup will fix several fundamental issues with Optimistic Rollup:
- Eliminate a nasty tail risk: theft of funds from OR via intricate yet viable attack vectors;
- Reduce withdrawal times from 1–2 weeks to a few minutes;
- Enable fast tx confirmations and exits in practically unlimited volumes;
- Introduce privacy by default.
Optimistic Rollup is great news for ZK Rollup. The transition to L2 scaling requires significant changes in wallets, oracles, dapps and user habits. Optimistic Rollup can help to prepare the ecosystem for this move, bringing scale to those dapps that cannot yet be built on ZK Rollup today. This will give ZK Rollup time to mature and make its adoption completely seamless, while maintaining Ethereum’s growth momentum.
What is a Rollup?
A Rollup is a Layer-2 scaling solution similar to Plasma: a single mainchain contract holds all funds and a succinct cryptographic commitment to a larger “sidechain” state (usually a Merkle tree of accounts, balances and their states). The sidechain state is maintained by users and operators offchain, without reliance on L1 storage (which is the source of the biggest scalability win).
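To make the idea of a "succinct cryptographic commitment" more concrete, here is a minimal sketch of a Merkle root computed over a list of accounts. The leaf encoding, the use of SHA-256 and the padding rule are assumptions chosen for illustration; actual Rollup implementations use their own tree layouts and hash functions.

```python
import hashlib
from typing import List, Tuple

Account = Tuple[str, int, int]  # (address, balance, nonce): hypothetical layout


def h(data: bytes) -> bytes:
    """SHA-256, used here only as a stand-in hash function."""
    return hashlib.sha256(data).digest()


def leaf(address: str, balance: int, nonce: int) -> bytes:
    # Hypothetical leaf encoding: hash of a simple text serialization.
    return h(f"{address}:{balance}:{nonce}".encode())


def merkle_root(leaves: List[bytes]) -> bytes:
    """Fold a list of leaf hashes into a single 32-byte root."""
    if not leaves:
        return h(b"")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by repeating the last node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


accounts: List[Account] = [("alice", 100, 0), ("bob", 50, 0), ("carol", 0, 0)]
old_root = merkle_root([leaf(*a) for a in accounts])

# After alice sends 30 to bob, only the 32-byte root has to change on L1.
accounts_after: List[Account] = [("alice", 70, 1), ("bob", 80, 0), ("carol", 0, 0)]
new_root = merkle_root([leaf(*a) for a in accounts_after])
```

However many accounts the tree holds, the mainchain contract only needs to store the 32-byte root, while the full account data lives offchain.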
What differentiates Rollup from Plasma is that it solves Plasma’s huge problem, data availability, by publishing some data for each transaction via the L1 network (in Ethereum specifically, tx CALLDATA is used for this purpose). Thousands of transactions can thus be bundled up (rolled up) together in a single Rollup block. While the costs of this approach grow strictly linearly (O(n) in the number of transactions), it provides a practical 100-fold improvement in throughput, because CALLDATA is way cheaper than L1 storage and computation.
Rollup has been repeatedly endorsed by Vitalik Buterin as his favorite Layer-2 scaling solution.
Depending on how the correctness of state transition is guaranteed, there are two Rollup flavours: ZK Rollup and Optimistic Rollup. A brief history of both solutions is nicely presented here.
What is a ZK-Rollup (ZKR)?
In a ZK-Rollup, operator(s) must generate a succinct Zero-Knowledge Proof (SNARK) for every state transition, which is verified by the Rollup contract on the mainchain. This SNARK proves that there exists a series of transactions, correctly signed by owners, which update the account balances in the correct way, and which lead from the old Merkle root to the new one. It is thus impossible for the operators to commit an invalid or manipulated state.
More technical details can be found here and here. You can play around with the live demo of Matter Labs’ ZK Rollup for ERC-20 token transfers.
What is an Optimistic Rollup (OR)?
In an Optimistic Rollup, the new state root is published by operator(s) without being checked every time by the Rollup smart contract. Instead, everybody hopes that the state transition is correct. However, if an incorrect state transition is published, other operators or users (who MUST observe what’s going on in the L1 Rollup contract, executing every single transaction) will be able to point to the invalid transaction and revert the incorrect block, slashing malicious operators.
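The check that an observer performs can be sketched roughly as follows: re-execute the published transactions on the previous state and compare the result with the state root the operator claimed. This is a simplified illustration, not the Plasma Group design. The transfer-only state model, the JSON-based `state_root` stand-in and all function names are assumptions, and a real fraud proof is verified on L1 rather than in an offchain script.

```python
import hashlib
import json
from typing import Dict, List, Tuple

State = Dict[str, int]     # address -> balance (hypothetical state model)
Tx = Tuple[str, str, int]  # (sender, recipient, amount)


def state_root(state: State) -> str:
    # Stand-in commitment: hash of canonical JSON instead of a Merkle root.
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def apply_txs(state: State, txs: List[Tx]) -> State:
    """Re-execute a block of transfers, rejecting overdrafts."""
    new_state = dict(state)
    for sender, recipient, amount in txs:
        if amount < 0 or new_state.get(sender, 0) < amount:
            raise ValueError("invalid transfer")
        new_state[sender] = new_state.get(sender, 0) - amount
        new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def is_fraudulent(pre_state: State, txs: List[Tx], claimed_root: str) -> bool:
    """True if the block should be challenged with a fraud proof."""
    try:
        expected = state_root(apply_txs(pre_state, txs))
    except ValueError:
        return True  # the block contains a transaction that cannot be executed
    return expected != claimed_root


pre = {"alice": 100, "bob": 50}
txs = [("alice", "bob", 30)]
honest_root = state_root(apply_txs(pre, txs))
forged_root = state_root({"alice": 100, "bob": 80})  # bob credited, alice not debited

honest_block_is_fraud = is_fraudulent(pre, txs, honest_root)
forged_block_is_fraud = is_fraudulent(pre, txs, forged_root)
```

The sketch also shows why observers must execute every single transaction: without recomputing the post-state, there is no way to tell an honest root from a forged one.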
Flexibility: general-purpose computation
Although Optimistic Rollup could be used for specialized applications, the most important innovation of the Plasma Group is the OVM: Optimistic Virtual Machine. OVM enables the implementation of arbitrary smart contract logic. Almost anything that is possible in Ethereum is also possible in the OVM, including composability of smart contracts. It can be based on EVM, EWASM or any other virtual machine.
The nice thing about OVM is that if used with EVM, it will support writing code in Solidity. Because of this, large parts of the existing codebase can be ported onto OR with little effort.
It would be ideal if OVM could directly reuse existing EVM bytecode, but it’s probably not that simple. A proper implementation will require changes of the transaction data (CALLDATA) format and sophisticated Truebit/Plasma Leap style implementation of challenge/response protocol for fraud proofs. This is likely to lead to divergence from EVM to properly handle the edge cases, meaning that some work will still be required to adapt existing contracts for OVM.
Another challenge to implementation lies in the fact that fraud proofs for large blocks can require more gas than permitted by the L1 block gas limit. These fraud proofs must then be broken down into multiple ETH transactions.
All existing implementations of ZK-Rollup (including the one by yours truly) have so far focused only on specialized operations such as token transfers or atomic swaps. There are several major reasons for this.
First, there was no efficient technique for succinct recursive proof composition for different zero-knowledge proofs (ZKPs), which would be required to aggregate the execution of different smart contracts in a single block. The best we had was Groth16 over the cycles of elliptic curves (used by Coda), which required computation over long fields and would be totally inefficient for large computations.
Second, even if we had shorter fields, Groth16 would require a separate trusted setup ceremony for each smart contract and for each new version! Obviously, this would be absolutely unrealistic.
The only efficient ZKP technique we had without trusted setup was FRI-based STARKs. However, the verifier is succinct only for a limited class of problems (those expressible as succinct arithmetic circuits). A STARK verifier must execute each constraint of the computational statement being proved at least once, which means we cannot iterate over a collection of heterogeneous smart contracts.
All of this started to change with the advent of SNORKs, a new generation of ZKPs based on a slightly different set of cryptographic primitives — most notably, polynomial commitment schemes. Pioneered by Sean Bowe in Sonic, it was followed in summer 2019 by PLONK and Marlin. All of them have one thing in common: while trusted setup is still required, it now would be universal and updateable. Done once, it could be reused for any number of different programs at any time.
However, the Kate polynomial commitment scheme used in these proof systems would still require efficient cycles of elliptic curves for recursion, which are currently not available. This is why we are super excited about the most recent, fully succinct and transparent (no trusted setup) proof systems, such as Halo, SuperSonic, Fractal, and something exciting Matter Labs team is currently working on.
Long story short: the barriers to building general purpose smart contracts on ZKPs have now been removed. ZK Rollup is perfectly able to support the same programming model as EVM (including seamless composability and interoperability). The first contracts will likely require specialized DSLs, although the learning curve for Solidity developers won’t exceed 1 day. Eventually, given the current pace of advances in ZKP prover technology, we expect all existing ETH (and even EWASM) contracts to be efficiently portable with minimum effort.
Scalability & transaction costs
- For Optimistic Rollup, John Adler’s current estimate is about 4k gas per transfer tx, post EIP2028/Istanbul.
- This translates into ~100 TPS.
- With BLS signature aggregation, this number can go up to ~500 TPS (in order not to break EVM compatibility, tx params will likely remain long).
- If the EVM compatibility is broken, the throughput could theoretically grow up to the limits of ZKR.
Realistic throughput cap (token transfers): 500 TPS.
This is probably fine for now.
- For ZK Rollup, the public data per transfer tx in Matter Testnet currently takes 16 bytes, which will cost 272 gas post EIP2028/Istanbul.
- Additionally, there will be an amortized cost of the proof, estimated at something like 300k gas.
- Even if we assume a worst case scenario with a 1M gas proof cost, the estimated ceiling will still be over 2140 TPS for transfers.
- In some discussions I have heard people argue that ZKPs incur significant computational overhead and are therefore expensive. In reality, the computational cost is negligible compared to the cost of gas, which is the real bottleneck because of censorship-resistant decentralization. We also expect this factor to go down significantly with time.
Realistic throughput cap (token transfers): over 2000 TPS — Visa-like scale.
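The ZK Rollup estimate above can be reproduced as a back-of-the-envelope calculation. The per-transfer gas (272) and the proof costs (300k and 1M gas) are the figures quoted above. The block gas limit of 10M, the 15-second block time and the simplification of one proof per L1 block are my assumptions for this sketch, so the exact output will differ somewhat from the numbers in the text.

```python
def transfers_per_second(
    block_gas_limit: int,
    proof_gas: int,
    gas_per_transfer: int,
    block_time_s: float,
) -> float:
    """Throughput if one L1 block carries one proof plus as many transfers as fit."""
    usable_gas = block_gas_limit - proof_gas
    transfers_per_block = usable_gas // gas_per_transfer
    return transfers_per_block / block_time_s


BLOCK_GAS_LIMIT = 10_000_000  # assumption, not taken from the article
BLOCK_TIME_S = 15.0           # assumption, not taken from the article
GAS_PER_TRANSFER = 272        # public data cost per transfer, as quoted above

typical_tps = transfers_per_second(BLOCK_GAS_LIMIT, 300_000, GAS_PER_TRANSFER, BLOCK_TIME_S)
worst_case_tps = transfers_per_second(BLOCK_GAS_LIMIT, 1_000_000, GAS_PER_TRANSFER, BLOCK_TIME_S)

print(f"typical proof cost:    {typical_tps:.0f} TPS")
print(f"worst-case proof cost: {worst_case_tps:.0f} TPS")
```

Under these assumptions, even the worst-case proof cost leaves the result above 2000 TPS. The proof is a fixed cost per block, so it barely moves the result, while the per-transfer public data cost dominates.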
However, for a lot of use cases ZK Rollup will offer much more significant savings, because large pieces of data that are not required to reconstruct the state transition delta can be omitted from the public data (by moving them to the ZK circuit witness). The core insight is this: while OR always requires users to publish the complete transaction input, in ZK we can flexibly choose between 1) transaction input minus witness not affecting the state transition, and 2) transaction output only. This choice can be implemented quite elegantly and without a lot of complexity.
- In multisig wallets, wallets with Argent-style account abstractions or decentralized exchanges, users need to submit signatures to be verified by the contract. These sigs are not required for state delta updates and can be omitted from public data.
- Contracts like Gnosis’ Dfusion dutch DEX require large dataset inputs which do not directly affect storage, but are only used to verify the results of computations.
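As a rough illustration of the multisig case, the sketch below compares the bytes each Rollup type would have to publish for a transaction authorized by two signatures. The numbers are hypothetical: a 16-byte state delta (borrowed from the transfer figure above) and two 65-byte ECDSA signatures. The `RollupTx` type and both function names are made up for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RollupTx:
    state_delta: bytes       # data needed to reconstruct the state change
    signatures: List[bytes]  # needed only to check authorization


def public_bytes_optimistic(tx: RollupTx) -> int:
    # OR must publish the complete input so that anyone can re-execute it.
    return len(tx.state_delta) + sum(len(s) for s in tx.signatures)


def public_bytes_zk(tx: RollupTx) -> int:
    # In ZKR the signatures can move into the circuit witness, checked by the proof.
    return len(tx.state_delta)


# Hypothetical 2-of-3 multisig transfer with placeholder (all-zero) contents.
multisig_tx = RollupTx(state_delta=bytes(16), signatures=[bytes(65), bytes(65)])

or_bytes = public_bytes_optimistic(multisig_tx)
zk_bytes = public_bytes_zk(multisig_tx)
print(f"OR publishes {or_bytes} bytes, ZKR publishes {zk_bytes} bytes")
```

The more signatures or auxiliary inputs a transaction carries, the larger the gap becomes, since in this model the ZKR public data stays constant.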
Post ETH 2.0
Since any Rollup will reside in a single shard, it is unlikely that the costs of CALLDATA (and thus Rollup transaction costs) will change much, unless bandwidth generally becomes cheaper.
Both Rollups are equally well suited to support meta-transactions and account abstraction.
Unlike payment channels, all funds in a Rollup are held by a single smart contract. Since Rollup is IMHO the most promising scaling direction, we should see a large number of users moving into it and a lot of value being concentrated in this one contract. With tens or hundreds of millions (or maybe even billions) of dollars worth of assets at stake, the Rollup contract becomes an extremely attractive honeypot for high profile hackers. Under these conditions, if an attack has a good chance of success, it will probably be attempted no matter how intricate.
The security model of OR is based on two assumptions:
- At least 1-of-N honest participants who execute all OR transactions and will submit a fraud proof if an invalid state transition is published;
- Strong censorship-resistance of underlying L1 network.
1-of-N honest participants
As for the first part, it’s realistic to expect that only the operators of the Rollup will be actually monitoring and executing transactions. Normal users will have neither incentives to do so, nor technical capabilities to process transactions at high load (if they could, where would the scaling come from?). Luckily, operators are naturally incentivized to check each other’s blocks for correctness, because creating a block on top of an invalid one is a slashing condition.
1-of-N honest operators is a reasonable assumption with enough credible participants. However, since the number of active participants is limited (hundreds?), some sophisticated attacks could include: targeting the infrastructure of all the operators (very hard but not infeasible), bribing/blackmailing DevOps engineers to secretly install malicious code, targeting update distribution channels for rollup software, etc., and of course a combination thereof. These attacks are hard and should be actively protected against, but they are much more realistic than, let’s say, trying to attack Ethereum miners in the same way, especially because a successful attack on OR will go unnoticed until completion.
Strong censorship-resistance of L1
The second assumption is a tricky one. Indeed, the design of Ethereum provides economic mechanisms which are very effective in preventing ordinary censorship. Yet, these mechanisms stop functioning in the presence of anti-mechanisms. An attacker can create a fully automated bribing mechanism to coordinate a 51% attack by miners, which will prevent honest miners from including fraud proofs in their blocks. Interestingly, the direct cost of this attack for participating miners is zero, not counting social costs which might arise from the response of an angry community if the censorship can clearly be attributed. This part is also tricky, because the mechanism elegantly provides plausible deniability to the participants in the attack: “Given the credible commitment by the attacking majority, if I don’t participate, my blocks will be abandoned, so I must be doing this not for the sake of profit, but rather to avoid losses”.
I invite the reader to follow a discussion of this attack and a recent analysis of 51% censorship attacks by Vitalik Buterin. Below I will share some interesting insights.
This type of attack is, unfortunately, very realistic under PoW. There is no effective way to punish anonymous miners for participation in it.
After the transition to PoS, the community will be in a position to punish censoring miners by slashing their stake, if the broad social consensus is reached on this. After all, a censorship attack like this could be considered an aggression against the entire network (although one could also argue that the miners simply honestly follow the protocol and are not obligated to behave in any way contrary to their best economic interests). However, post DAO-fork, this will be a very controversial discussion, to say the least, with an unpredictable outcome. In a recent community poll by Vitalik, 63% voted strictly against any manual intervention in the immutable blockchain to bail out users, regardless of the extent of the attack. Needless to say, wiping out the stake of even one validator would be extremely difficult to push through, let alone wiping out the stake of the majority.
Update Nov 26 2019: more research on collusion has recently been published, as well as a new attack on fraud proofs in PoS environment, which demonstrate that the censorship attack risks for OR in PoS are at least as high as in PoW.
A more realistic way to withstand this kind of attack is a quick mobilization of the community in a UASF (user-activated soft fork) to force miners to include certain transactions. This scenario is complicated from both an engineering and a social perspective, and will definitely require a relatively long challenge period window for fraud proofs: minimum 1 week, better 2. At the same time, given that the major DeFi operators are effectively in a position to decide the outcome of such a fork, and that it is in their best interest to avoid loud, disruptive events, their best bet could be to just silently comply with the attacker (which will keep Ethereum on the longest chain and produce a less controversial result than a successful soft fork).
To summarize: the risks of fraud proof censorship are relatively low but non-negligible. With a 1–2-week fraud proof challenge period and not too much money at stake, the OR is probably fine: operator/miner collusion is not going to be worth the hassle and the risks. However, as the value in the rollup grows, the lurking Black Swan would become more and more worrisome, at least to people as paranoid as yours truly.
In a ZK Rollup, every state transition is verified by the Rollup smart contract before it becomes effective. It is strictly not possible for operators to steal the funds or corrupt the Rollup state. ZKR relies on the censorship-resistance of L1 only for its liveness, not for its security. There is no need for anyone to monitor the ZKR: after a block is verified, user funds are always guaranteed to be eventually retrievable even if operators refuse to cooperate.
Thus, ZKR embodies more fully the foundational ideals of crypto: achieving resilience by replacing trusted parties with cryptography and game-theoretical incentive alignment.
For completeness, however, I must mention several other potential risks specific to ZKR.
If the ZKPs used in a ZK Rollup require a universal trusted setup, we end up with the 1-of-N honest participants assumption. This might or might not be an acceptable risk, depending on the number and quality of the participants. But safe is safe, which is why I’m very excited about the recent advances in efficient trustless SNARKs, especially the construct we at Matter Labs are currently working on.
The newest generation of SNARKs uses more solid and battle-tested cryptographic primitives than Groth16. Matter Labs’ work mentioned earlier is based on FRI and is therefore even plausibly post-quantum secure. However, to be completely calm, two mitigation strategies should be applied:
- A large bounty must be deployed with much lower security parameters than the actual production version, similar to the RSA challenge. If a practical attack is ever discovered, the challenge will be broken by researchers years before breaking the production code becomes feasible.
- All state transitions must be sendable only by the operators of the ZKR, who will essentially serve as a 2-Factor protection layer.
Due to the problems mentioned in the security section above, Optimistic Rollup can only be safe with a 1–2-week fraud proof challenge window. No transaction can be considered final until this time passes — neither an internal Rollup tx nor an exit.
Unfortunately, there is no quicker way for an end-user to check whether the transaction is final or not, than by executing all the transactions for the entire last challenge period. It’s important to note that users cannot rely on pure game-theoretical guarantees of block finalization, because a bug (or a hack) in a node of a single operator can still lead to reverts.
Time-to-finality (under PoW): 2 weeks.
Time-to-finality (under PoS): 1 week.
Currently ZKPs are quite computationally intense. At present, for a block of 1000 tx we can have 20 minute proof generation time on ordinary server hardware.
Ongoing GPU prover implementations (by Matter Labs and Coda) promise to speed up proof generation by at least 10x. In the not so distant future, specialized hardware will likely offer much higher computational power. Eventually, we expect to see block confirmation under 1 min.
Time-to-finality (now): 20 min.
Time-to-finality (future): under 1 min.
Fast confirmations for intra-Rollup transactions
In both types of Rollup, it is possible for operators to issue instant transaction confirmations to the users by putting up certain security deposits which will be slashed if the transaction is not included in the promised block. This provides an economic guarantee of finality.
This approach has several limitations. It works well for transfers of fungible tokens, but it gets difficult with NFTs (which might have no market value, or whose owners would not want to “sell” them immediately under any circumstances) and with generalized contract calls (because it’s not easy to exactly quantify the monetary loss if some previous transaction in the chain gets reverted; a simple example: how much of the operator’s money should be at stake for you to accept a stablecoin oracle price broadcast as final?).
Fast exits are similar to fast intra-Rollup confirmations. Operators can cooperate with liquidity providers to initiate withdrawals of fungible tokens to users immediately, without waiting for the exit transaction to become final in the Rollup.
This requires a significant amount of collateral which will be proportional to the time-to-finality. Assuming realistic near-future finality times of 1 week for OR and 5 min for ZKR, OR would require 2000 times more collateral to support the same weekly withdrawal volume as ZKR.
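The 2000x figure follows from simple arithmetic. Here is a minimal sketch, assuming a steady withdrawal flow and assuming that each unit of liquidity is locked for exactly one time-to-finality period before it can be reused. The weekly volume is an arbitrary placeholder.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10080


def required_collateral(weekly_withdrawal_volume: float, time_to_finality_min: float) -> float:
    """Capital locked at any moment, given a steady flow of fast withdrawals."""
    withdrawals_per_minute = weekly_withdrawal_volume / MINUTES_PER_WEEK
    return withdrawals_per_minute * time_to_finality_min


WEEKLY_VOLUME = 1_000_000.0  # placeholder volume, in any unit of account

or_collateral = required_collateral(WEEKLY_VOLUME, MINUTES_PER_WEEK)  # 1 week finality
zkr_collateral = required_collateral(WEEKLY_VOLUME, 5)                # 5 min finality

ratio = or_collateral / zkr_collateral
print(f"OR needs about {ratio:.0f}x the collateral of ZKR")
```

The ratio is simply 10080 / 5 = 2016, which is where the round figure of 2000 comes from. It does not depend on the chosen volume.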
Optimistic Rollup can support any privacy solution available on L2 Ethereum (mixers, etc). Since OR itself is L2, any privacy solution implemented on it will live as L3. This might lead to even more fragmentation of privacy services, and as a result to small anonymity sets, which renders the utility of privacy very low (as we can observe even with zcash, where transactions are not shielded by default).
To achieve real privacy, systems must support it by default. From the technological perspective, ZKR can at some point easily support confidential transactions for token transfers at the protocol level by default, as well as differentiate between public and private smart contracts (ZK ZK Rollup style).
At the same time, building fully anonymous transactions zcash-style (i.e. hiding not only amounts, but also the participants of the transaction) would require changing the storage model of ZK Rollup from account-based to UTXO-based, which would create too many problems and is unlikely to happen.
Optimistic Rollup is currently in the PoC stage. We will hopefully see a production-grade implementation coming soon. If it turns out to be relatively easy to port existing code, projects will gradually start adopting it and building new infrastructure: L2 support will appear in wallets, oracles will start broadcasting to OR, etc.
ZK Rollup is already more mature with regard to specialized applications (such as ERC-20 token transfers), but will travel a more gradual path with fully generalized smart contracts. Eventually, it will be possible to port any EVM- and WASM-based smart contract to ZK Rollup — and at the current pace of technological development, this is not likely to take years.
Similar infrastructure changes in wallets, oracles and other smart contract components must be made for both Rollup types. This requires a significant amount of work which will be accelerated as more projects become interested in L2 scaling tech. Since Optimistic Rollup makes a promise of generalized EVM-based smart contracts earlier than ZK-Rollup, it will provide a huge boost to the community’s motivation to adopt L2.
For users and dapps, jumping from one Rollup to another will be easier than the initial migration from ETH to L2. Bridges will make this process even smoother. Because of this ease of switching, my personal take is that the solution that develops a significant edge in UX will likely become the sole winner in the long term.
No matter the outcome, this is going to be a very important and exciting evolution to observe. And the ultimate winner will be the Ethereum community in any case.