ETH Scaling 2: How Optimistic Rollup and ZK Rollup Work

Gerald Lee
Jul 12, 2023


The main difference between Layer 2 solutions and independent chains lies in data availability (DA). For a Layer 2, DA is controlled by Ethereum, while the Layer 2 itself is responsible only for computation. This means the security of user data is fully guaranteed by Ethereum; Layer 2 acts merely as a tool and cannot act maliciously. At most, it can inconvenience you, for example by not including your transactions in a block (censorship).

However, having Ethereum control the data availability (DA) also has its drawbacks. The biggest issue is that its scalability is far inferior to that of independent chains. Ethereum’s transactions per second (tps) cannot match those of independent chains, and the gas fees are also higher compared to independent chains. After all, Layer 2 needs to package transactions onto the Ethereum network, which consumes both time and gas.

Despite this difference, Layer 2 mechanisms share many similarities with independent chains. For example, Optimistic Rollup is similar to Plasma, and ZK Rollup is similar to Validium.

The mainstream Layer 2 solutions are as follows:

  • Optimistic Rollup: This is the Layer 2 version of Plasma. Optimistic Rollup employs a fraud-proof mechanism.
  • ZK Rollup: This is the Layer 2 version of Validium. ZK Rollup utilizes zkSNARK or zkSTARK algorithms.

Optimistic Rollup

Optimistic Rollup inherits the fraud-proof mechanism from Plasma to ensure data validity. Its basic idea is to optimistically accept the operator's submitted results on Ethereum and then leave a window during which third parties can submit challenges.

Working principle

The working principle of Optimistic Rollup (op rollup) can be described as follows (diagram source: https://medium.com/ethereum-optimism/ovm-deep-dive-a300d1085f52):

As you can see, the overall approach is similar to Plasma:

  1. Op Rollup periodically submits transaction data and state root to Ethereum.
  2. Optimistic: Ethereum initially optimistically accepts the operator’s submissions by putting the transaction data and state root on-chain.
  3. Fraud proof: There is a window period (challenge period), typically around 7 days, during which third parties can submit fraud proofs and validate potential fraud.
  4. Incentive mechanism: Rewards and penalties are determined based on the outcome of challenges.
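The four steps above can be sketched as a toy model (all names and structures here are illustrative, not any real rollup's contract interface):

```python
import hashlib
import time

CHALLENGE_PERIOD = 7 * 24 * 3600  # challenge window of ~7 days, in seconds

def state_root(state: dict) -> str:
    """Hash of the Layer 2 state; a real rollup would use a Merkle root."""
    return hashlib.sha256(repr(sorted(state.items())).encode()).hexdigest()

class RollupContract:
    """Toy model of the op rollup contract on Ethereum (Layer 1)."""

    def __init__(self):
        self.batches = []

    def submit_batch(self, txs, root):
        # Steps 1-2: optimistically accept the operator's data and state root.
        self.batches.append({
            "txs": txs,
            "root": root,
            "submitted_at": time.time(),
            "challenged": False,
        })

    def challenge(self, index, fraud_proof_valid):
        # Step 3: during the window, any third party may submit a fraud proof.
        batch = self.batches[index]
        in_window = time.time() - batch["submitted_at"] < CHALLENGE_PERIOD
        if in_window and fraud_proof_valid:
            # Step 4: the operator would be slashed and the challenger rewarded.
            batch["challenged"] = True

    def is_final(self, index):
        batch = self.batches[index]
        return (not batch["challenged"]
                and time.time() - batch["submitted_at"] >= CHALLENGE_PERIOD)

contract = RollupContract()
contract.submit_batch(["alice->bob:1 ETH"], state_root({"alice": 9, "bob": 2}))
print(contract.is_final(0))  # False: the 7-day window has not elapsed yet
```

A valid fraud proof submitted inside the window marks the batch as challenged, so it never finalizes; otherwise the batch becomes final once the window elapses.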

The comparison between op rollup and Plasma is as follows:

  • Fraud proofs: both optimistically accept the operator's results and rely on fraud proofs with a challenge period.
  • Data availability: op rollup posts the full transaction data to Ethereum, whereas Plasma keeps transaction data off-chain with the operator.
  • Security consequence: because the data is on Ethereum, op rollup users can always reconstruct the Layer 2 state, while Plasma users depend on the operator making the data available.

In addition, Optimistic Rollup (op rollup) has some notable features:

  1. Proof-of-Stake (PoS) Mechanism: The selection of operators (validators) often utilizes a PoS mechanism, which facilitates the implementation of incentive mechanisms. However, there can also be fully centralized operators.
  2. Data Compression: Operators need to package and submit all transaction data to Ethereum. To save storage space, data compression techniques are employed. There is a good example that demonstrates how the data submitted to Ethereum by op rollup is decoded.
  3. OVM (Optimistic Virtual Machine): In the event of fraud, the corresponding Layer 2 transactions need to be re-executed on Ethereum. Op rollup incorporates a dedicated virtual machine (called the OVM in Optimism and the AVM in Arbitrum) as an embedded contract on Ethereum, which simulates the Layer 2 environment and re-executes the corresponding transactions. Here, you can find a detailed explanation of how the OVM works.
  4. Challenge Count: When fraud occurs, there may be multiple rounds of interaction between the challenger and the asserter, typically involving two strategies:

(1). Single-round: This is the strategy used by Optimism. It involves a single round of interaction: the challenger issues a challenge, and the OVM re-executes the transaction to determine the final outcome. The process is relatively simple but comes at a higher cost: because the OVM must re-execute the transaction, the challenger has to provide additional data to Layer 1 Ethereum, which increases the on-chain data and incurs corresponding gas fees.

(2). Multi-round: This is the strategy used by Arbitrum. The core idea is to divide the execution of a transaction into multiple steps and locate the smallest executable step in dispute. The process is as follows:

  • The challenger initiates a challenge against a specific transaction.
  • The asserter (the one being challenged) splits the transaction using a bisection protocol and waits for the challenger to proceed.
  • The challenger challenges the problematic half.
  • The asserter further subdivides the steps into halves and waits for the challenger to challenge the problematic half. This process continues until the challenger determines the smallest step with an issue.
  • The smallest step is re-executed on Layer 1.
  • The challenge concludes.

It is important to note that in the multi-round protocol, full re-execution stays off-chain: each bisection round posts only small claims (essentially hashes) on-chain, and Layer 1 (l1) verifies the validity of the moves and re-executes only the smallest disputed step. Since posting a hash-sized claim is far cheaper than re-executing an entire transaction on Layer 1 Ethereum, multi-round interaction beats single-round interaction in terms of cost. This reduces the burden on Layer 1 and improves overall cost-effectiveness and execution efficiency.
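The bisection idea can be sketched with a toy dispute over a list of arithmetic steps (a heavy simplification of Arbitrum's protocol; all names here are illustrative):

```python
def run_step(state, step):
    """One minimal execution step; here, just adding a number."""
    return state + step

def bisection_game(agreed_trace, asserted_trace):
    """Find the single disputed step via repeated halving.

    Both traces claim the state after every step. They agree at index 0
    and disagree at the last index; return i such that step i -> i+1 is
    the smallest disputed unit.
    """
    lo, hi = 0, len(agreed_trace) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if agreed_trace[mid] == asserted_trace[mid]:
            lo = mid  # the disagreement lies in the second half
        else:
            hi = mid  # the disagreement lies in the first half
    return lo

steps = [1, 2, 3, 4, 5, 6, 7, 8]

honest = [0]
for s in steps:
    honest.append(run_step(honest[-1], s))

cheat = honest[:5]             # the asserter is honest for the first 4 steps,
cheat.append(honest[5] + 100)  # then lies about the state after the 5th step
for i in range(6, len(honest)):
    cheat.append(run_step(cheat[-1], steps[i - 1]))

bad = bisection_game(honest, cheat)
# Layer 1 re-executes only this one step to settle the whole dispute:
print(bad, run_step(honest[bad], steps[bad]) == cheat[bad + 1])  # 4 False
```

The game needs only log2(n) rounds to shrink an n-step dispute down to one step, which is why it is cheap even for long transactions.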

5. Liquidity Provider (LP): a mechanism introduced to address the slow withdrawal times faced by users. The LP acts as an intermediary: a user transfers assets that are still in the withdrawal process to the LP, and once the LP confirms the withdrawal is undisputed, it pays the user directly, charging a fee. The LP then sees the remaining withdrawal (including any challenge process) through to completion itself.
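A minimal sketch of the LP fast-withdrawal flow, assuming a hypothetical 0.3% fee and integer amounts in wei:

```python
ONE_ETH = 10**18  # amounts in wei, to keep the arithmetic exact
FEE_BPS = 30      # hypothetical LP fee: 30 basis points (0.3%)

def fast_withdraw(amount_wei, lp_liquidity_wei):
    """The LP pays the user immediately, minus a fee, then claims the full
    slow withdrawal itself once the 7-day challenge window has passed."""
    if amount_wei > lp_liquidity_wei:
        raise ValueError("LP lacks liquidity for this withdrawal")
    fee = amount_wei * FEE_BPS // 10_000
    paid_now = amount_wei - fee
    lp_claim_after_window = amount_wei  # LP recoups principal plus the fee
    return paid_now, fee, lp_claim_after_window

paid, fee, claim = fast_withdraw(10 * ONE_ETH, lp_liquidity_wei=100 * ONE_ETH)
print(paid / ONE_ETH, fee / ONE_ETH)  # 9.97 0.03
```

The user trades a small fee for instant liquidity; the LP takes on the 7-day wait and any challenge risk.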

Summary of Optimistic Rollup

  • Wrongdoing is prevented through an economic model: with fraud proofs, blocks reach finality on Layer 1 only after a challenge period of about 7 days.
  • Withdrawals are slow and require a 7-day waiting period.
  • Liquidity Providers (LPs) can effectively address the issue of slow withdrawals.
  • Smart contracts are supported.
  • Ethereum controls the data availability (DA).
  • Optimistic rollup has a simple process and high user acceptance, currently ranking higher in Total Value Locked (TVL) than zk rollup.

ZK Rollup

Zk rollup adopts zk-SNARK or zk-STARK mathematical algorithms to ensure data validity, unlike optimistic rollup, which relies on an economic model for security. Zk rollup is more efficient and secure, but it has higher technical and hardware requirements, which makes operator participation harder.

Working principle

ZK algorithm

The zk algorithm is the core of the entire zk rollup and is highly complex. Taking zk-SNARK as an example, the core idea of the algorithm can be briefly described as follows:

  1. Verifier and Prover

In the zk-SNARK algorithm, there are two roles (diagram source: https://medium.com/@kotsbtechcdac/introduction-to-zero-knowledge-proof-the-protocol-of-next-generation-blockchain-305b2fc7f8e5):

1). Prover: Submits a proof to the verifier to demonstrate its validity.

2). Verifier: Verifies the validity of the proof.

Vitalik has written a brilliant article here: (https://medium.com/@VitalikButerin/quadratic-arithmetic-programs-from-zero-to-hero-f6d558cea649)

And we will reference his examples. The entire zk-proof process is as follows:

1). There is a formula f(x) = x³ + x + 5.

2). The prover wants to prove two conclusions:

Conclusion 1: The prover claims that when the input x is 3, f(x) equals 35.

Conclusion 2: The prover has performed the complete computation.

3). The prover submits the proof to the verifier.

4). The verifier validates the honesty of the prover through the proof.

2. Constraints and Polynomials

It is crucial to note what the prover wants to prove here:

Conclusion 1: The prover claims that when the input x is 3, f(x) equals 35.

Conclusion 2: The prover has executed the complete computational process.

Let’s see how to prove the two conclusions:

  • Conclusion 1, The prover claims that when the input x is 3, f(x) equals 35.

This conclusion seems easy to prove: the verifier can simply substitute x = 3 into the formula and compute the result. That works here because the formula is trivial. But if a single computation were expensive, say 10 minutes per run, the cost of the verifier recomputing it would be prohibitive. What we actually want is for the verifier to validate the result without re-executing the computation. This mirrors NP problems, where a proposed solution can be verified far more cheaply than it can be found.

In the context of a real zk-rollup scenario, the prover is responsible for transaction execution, while the verifier does not re-execute the transactions for validation. In this case, Conclusion 1 needs to be proven through Conclusion 2.

  • Conclusion 2, The prover executed the complete computational process.

The purpose of this conclusion is to infer Conclusion 1. Now, think about how to prove a result correct without recomputing it. The approach taken by the prover here is to divide the equation into individual small steps, where each step is called a constraint. There are four constraints as follows:

sym_1 = x * x
y = sym_1 * x
sym_2 = y + x
out = sym_2 + 5

Then, the prover hands over the inputs and the generated values (x, sym_1, y, sym_2, out) from each small step as proofs to the verifier. These values are intermediate values of the equation, which means that if (x, sym_1, y, sym_2) is correct, then out must also be correct.
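The flattening and checking above can be made concrete with a small sketch (illustrative only; a real system checks the constraints algebraically rather than by direct comparison):

```python
def generate_trace(x):
    """Prover side: execute f(x) = x^3 + x + 5 step by step, recording
    every intermediate value (the witness)."""
    sym_1 = x * x
    y = sym_1 * x
    sym_2 = y + x
    out = sym_2 + 5
    return {"x": x, "sym_1": sym_1, "y": y, "sym_2": sym_2, "out": out}

def check_constraints(t):
    """Verifier side: each step is checked as a relation between values,
    not re-derived from scratch."""
    return (t["sym_1"] == t["x"] * t["x"] and
            t["y"] == t["sym_1"] * t["x"] and
            t["sym_2"] == t["y"] + t["x"] and
            t["out"] == t["sym_2"] + 5)

trace = generate_trace(3)
print(trace["out"], check_constraints(trace))   # 35 True

# Tampering with any value in the witness breaks some constraint:
print(check_constraints(dict(trace, out=40)))   # False
```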

The verifier can draw the conclusion:

1). The prover definitely executed the complete computational process. Otherwise, there wouldn’t be correct intermediate values.

2). When x=3, f(x)=35 is correct because all the intermediate values are correct.

You must be thinking, if we verify all the intermediate values, wouldn’t it be equivalent to the verifier re-executing the logic? That’s right, but here let’s assume that there is a method where the intermediate values can be verified without performing the computations. In that case, the logic mentioned above holds: if the inputs and intermediate values are correct, then the result is correct. The reason is that they satisfy the constraints.

A constraint is equivalent to a relational description. For example, the constraint “sym_1 = x * x” describes the relationship that must hold between sym_1 and x.

For example, someone asks you to guess an animal and gives you a few constraints:

  • It quacks.
  • It is a domestic bird.
  • It can swim.

Based on these constraints, you would most likely guess that it is a duck. It could also be a goose because the constraints are not very specific, and some geese also quack.

Returning to the main topic, a constraint describes the relationship between the input and output. The core essence of the zk-SNARK algorithm is as follows: The prover provides a triple (input, output, intermediate values) as proof, and the verifier verifies the correctness of the output based on the constraint.

The next question is how to verify the intermediate values. Naturally, it is not feasible to recompute each one, as that would be equivalent to the verifier re-executing the formula. Mathematicians therefore introduced a trick, which can be summarized as combining multiple constraints into a single final constraint; by checking that one constraint, the verification is completed in a single step. The general process is as follows:

1). R1CS (Rank-1 Constraint System): The multiple constraints are organized into a system of equations, which is then transformed into vector operations.

2). QAP (Quadratic Arithmetic Program): Using multi-point value interpolation, the vector operations are transformed into a polynomial equation.

In this way, the four constraints above are ultimately transformed into a single polynomial equation. You can try it here (https://asecuritysite.com/encryption/go_qap). This achieves the goal of combining multiple constraints into one.
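The "many checks collapsed into one" idea can be illustrated with a random linear combination of constraint residuals, a probabilistic trick in the spirit of the Schwartz-Zippel lemma (this is only an illustration of the principle, not the actual R1CS-to-QAP construction):

```python
import random

P = 2**61 - 1  # a prime modulus; real systems use a pairing-friendly field

def residuals(t):
    """One residual per constraint; all four are zero iff the trace is valid."""
    return [
        t["sym_1"] - t["x"] * t["x"],
        t["y"] - t["sym_1"] * t["x"],
        t["sym_2"] - (t["y"] + t["x"]),
        t["out"] - (t["sym_2"] + 5),
    ]

def combined_check(t, rng=random):
    """Collapse all constraints into one check: evaluate sum(res_i * r^i)
    at a random point r. If any residual is nonzero, the sum is nonzero
    with overwhelming probability."""
    r = rng.randrange(1, P)
    return sum(res * pow(r, i, P) for i, res in enumerate(residuals(t))) % P == 0

good = {"x": 3, "sym_1": 9, "y": 27, "sym_2": 30, "out": 35}
bad = dict(good, y=28)  # a tampered intermediate value

print(combined_check(good), combined_check(bad))  # True False
```

One evaluation at a random point covers all four constraints at once, which is exactly the advantage the polynomial form provides.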

There are two advantages of introducing a polynomial equation as the final constraint:

  • By performing one computation, all intermediate values and results can be verified.
  • Different polynomial equations can have significant differences. Even a slight change in coefficients results in vastly different curve forms. This reduces the likelihood of conflicts, meaning that the solution to polynomial A will not coincidentally be a solution to polynomial B.

The above is the basic principle of the SNARK algorithm. Due to limited space, many details are omitted, such as PCP (Probabilistically Checkable Proof) random sampling: not all constraints need to be checked; instead, a few randomly chosen constraints are verified. In zk-SNARK, the real data is additionally hidden using homomorphic techniques such as elliptic-curve pairings, which move the computation domain from number fields to elliptic curves.

For further details of the algorithm, you can refer to the resource “Why and How zk-SNARK Works: Definitive Explanation”. https://arxiv.org/pdf/1906.07221.pdf

3. Circuit

Circuits are also a challenging aspect of zk-SNARK algorithms. Why do we need to use circuits?

In the previous example, we demonstrated how to prove a formula. Programmers, however, deal with code, so the code must first be converted into formulas. In fact, any algorithm can be written as a mathematical formula; those familiar with functional programming will have a deep appreciation of this.

For example, there is a piece of code:

def qeval(x):
    y = x**3
    return x + y + 5

The conversion into a mathematical formula is f(x) = x³ + x + 5.

To perform zk-SNARK proof, further conversion is required to transform it into a formula that generates intermediate values.

sym_1 = x * x
y = sym_1 * x
sym_2 = y + x
out = sym_2 + 5

Upon careful observation, the transformed result actually consists of simple addition and multiplication operations, perfectly corresponding to the addition and multiplication gates in digital circuits.

Therefore, it is possible to skip writing the initial code altogether and directly express the logic using circuits. As a result, such code emerges (here is a complete circom example):

// 1. verify sender account existence
component senderLeaf = HashedLeaf();
senderLeaf.pubkey[0] <== tx_sender_pubkey[0];
senderLeaf.balance <== account_balance;

component senderExistence = GetMerkleRoot(levels);
senderExistence.leaf <== senderLeaf.out;
for (var i = 0; i < levels; i++) {
    senderExistence.path_index[i] <== tx_sender_path_idx[i];
    senderExistence.path_elements[i] <== tx_sender_path_element[i];
}
senderExistence.out === account_root;

// 2. verify signature
component msgHasher = MessageHash(5);
msgHasher.ins[0] <== tx_sender_pubkey[0];
msgHasher.ins[1] <== tx_sender_pubkey[1];

component sigVerifier = EdDSAMiMCSpongeVerifier();
sigVerifier.enabled <== 1;
sigVerifier.Ax <== tx_sender_pubkey[0];
sigVerifier.Ay <== tx_sender_pubkey[1];

// 3. check that the root of the new tree is equivalent
component newAccLeaf = HashedLeaf();
newAccLeaf.pubkey[0] <== tx_sender_pubkey[0];
Exactly: this is ordinary transaction-execution logic, implemented as a circuit.

Therefore, in zk rollup, the purpose of using circuits is to break down the logic into individual intermediate steps, a process called flattening. This is done to generate constraints and intermediate values. Ultimately, it is transformed into a polynomial expression to participate in zk calculations.

4. Summary of ZK algorithm

In summary, the process of zk-SNARK algorithm is as follows:

1). Convert the code into circuits (constraints) to generate intermediate values (trace).

2). Transform the constraints into polynomials for proof verification.

3). The prover runs the circuit code to generate the proof, which includes inputs, outputs, and intermediate values.

4). The verifier utilizes the polynomials to verify the proof.

This way, the verifier can verify the correctness of the output without re-executing the computations. When combined with encryption algorithms, it becomes possible to verify the correctness of the output even when certain input or intermediate values are hidden.

Combined with the circom tool, the description above maps directly onto a real zk rollup Layer 2 development process.

Please note that the constraints in the smart contract on L1 correspond to the circuit on L2. In other words, if the circuit on L2 is modified, it cannot pass verification on L1. Now, let’s consider the following: In zk rollup, can the operator steal my tokens? For example, if Alice transfers 10 ETH to Bob, can the operator change it to 5 ETH?

The answer is obviously no, because Alice's transaction is used as an input to the zk algorithm. If it is tampered with, the signature-verification circuit will not pass. Even if the operator tampers with the circuit itself, the circuit logic will then differ from the smart contract deployed on Layer 1, so verification will naturally fail.
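A toy sketch of why tampering fails (heavily simplified: a real circuit verifies an EdDSA signature against Alice's public key, as in the circuit above; here a keyed hash stands in for the signature):

```python
import hashlib

def toy_sign(secret, tx):
    """Stand-in for Alice's signature over the transaction contents."""
    msg = f'{tx["from"]}->{tx["to"]}:{tx["amount"]}'
    return hashlib.sha256((secret + msg).encode()).hexdigest()

def signature_constraint(secret, tx):
    """Circuit constraint: the signature must match the transaction data
    that the operator actually fed into the circuit."""
    return tx["sig"] == toy_sign(secret, tx)

alice_secret = "alice-secret"
tx = {"from": "alice", "to": "bob", "amount": 10}
tx["sig"] = toy_sign(alice_secret, tx)

print(signature_constraint(alice_secret, tx))  # True: honest operator

tampered = dict(tx, amount=5)  # operator rewrites 10 ETH as 5 ETH
print(signature_constraint(alice_secret, tampered))  # False: no valid proof
```

With the amount changed, the signature constraint cannot be satisfied, so the operator cannot produce a proof that Layer 1 would accept.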

Complex logic, especially blockchain transaction logic, becomes markedly more involved when implemented as circuits. It also requires recording a large number of intermediate values (often referred to as traces or witnesses), which consumes resources. It is therefore not surprising that current zk algorithms share two common problems:

1). High hardware requirements.

2). Difficulty in general computation, meaning that it is easy to support simple transfer transactions but challenging to support general EVM (Ethereum Virtual Machine) functionality. However, various zkEVM (Zero-Knowledge Ethereum Virtual Machine) solutions have emerged, gradually addressing this issue.

The ZK Rollup process

Once you understand the zk algorithm, the process of zk rollup becomes easy to grasp. Let's compare it with op rollup (diagram source: https://trapdoortech.medium.com/l2-deep-dive-into-ovm-e2229052ed00).

In terms of process, zk and op differ mainly in their proof methods: op uses the OVM to handle challenges, while zk employs validity proofs. The zk rollup process is as follows:

1). ZK rollup periodically submits transaction data and state root to the Ethereum network.

2). Validity proof: Layer 2 submits zk proof, and Ethereum Layer 1 verifies the correctness of transaction execution results using algorithms.
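The two steps above can be contrasted with the optimistic flow in a toy model (names are illustrative; the stand-in verifier just recomputes a hash, whereas a real contract runs a SNARK/STARK verifier):

```python
import hashlib

def toy_verifier(txs, new_root, proof):
    """Stand-in for an on-chain validity-proof verifier: here the 'proof'
    is just a hash binding the batch to the claimed state root."""
    return proof == hashlib.sha256((repr(txs) + new_root).encode()).hexdigest()

class ZkRollupContract:
    """Toy model: a batch is final as soon as its validity proof checks."""

    def __init__(self, verify_proof):
        self.verify_proof = verify_proof
        self.state_roots = []

    def submit_batch(self, txs, new_root, proof):
        # No challenge window: an invalid proof is rejected immediately.
        if not self.verify_proof(txs, new_root, proof):
            raise ValueError("invalid validity proof; batch rejected")
        self.state_roots.append(new_root)  # final right away

rollup = ZkRollupContract(toy_verifier)
txs, root = ["alice->bob:10 ETH"], "root-1"
proof = hashlib.sha256((repr(txs) + root).encode()).hexdigest()
rollup.submit_batch(txs, root, proof)
print(rollup.state_roots)  # ['root-1']
```

Unlike the optimistic contract, there is no 7-day window: a batch either carries a valid proof and is final, or it never gets on-chain at all.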

Summary of ZK Rollup

  • Confirming transaction validity through algorithms eliminates the need for trust.
  • Withdrawals are fast, with finality taking around 10 minutes, primarily due to the time-consuming process of generating proofs.
  • Initially, supporting smart contracts was challenging, but the issue is gradually being resolved.
  • There are high technical and hardware barriers.

Comparison of ZKR and OPR

A brief comparison:

  • Security: opr relies on an economic model (fraud proofs); zkr relies on mathematics (validity proofs).
  • Withdrawals: about 7 days on opr (mitigated by LPs); on the order of 10 minutes on zkr.
  • Smart contracts: well supported on opr; improving on zkr through zkEVM efforts.
  • Barriers to entry: opr is simpler; zkr has high technical and hardware requirements.
  • TVL: opr currently leads.

Overall, it is difficult to declare a clear winner between zk rollup (zkr) and optimistic rollup (opr). While zkr seems to have architectural advantages, opr currently leads in Total Value Locked (TVL), mainly for the following reasons:

1). Zkr is theoretically more secure but also more complex, resulting in lower acceptance than opr.

2). Although opr has longer asset withdrawal times, the participation of liquidity providers (LPs) means this has minimal impact on regular users.

3). Opr has good support for the Ethereum Virtual Machine (EVM) and is more DApp-friendly, while zkr still faces challenges here. Although several zkEVM solutions have emerged recently, they have not yet been widely adopted.

4). Opr adopts an economic model, while zkr follows an algorithmic model. Although the algorithmic model is more secure, its market performance does not necessarily surpass that of the economic model, which has advantages within the ecosystem.

5). In terms of gas fees, when a block contains few transactions, zkr is more expensive because, unlike opr, it must perform additional computation to generate proofs. With higher transaction volume, zkr does not necessarily gain an advantage either; it depends on how much data is stored on-chain. Opr generally posts all transaction data on-chain, which may create the impression of higher fees than zkr, but the reality is more nuanced. For example, Polygon zkEVM posts all transaction data on-chain, while zkSync and Starknet post only a portion. Posting all data provides no fee advantage over opr, and although posting only a portion reduces gas fees compared to opr, it is criticized for its data availability (DA) issues.


Gerald Lee

I'm a senior software engineer with over 17 years of work experience, including the past 7 years in the blockchain industry.