Let’s dissect a zkSNARK! (part 2)

Yu Jiang Tham
9 min readFeb 15, 2023

--

This post continues from part 1, which reviewed what a zkSNARK is, as well as some of the math required to understand it. Now, we head deeper into the rabbit hole and take a look at going from a computer program to decomposing it into the elements required for a zkSNARK proof.

A rabbit on a journey.

30,000-ft picture

Before we go too far off the deep end, it pays to zoom out a bit and get a sense of what we’re looking at today. I’ve redrawn a diagram of the components of a proof system from Ying Tong Lai a ZK engineer and researcher who was at the Ethereum Foundation for a number of years:

Components of a proof system

Today’s focus will be on the left half, going through how we get correctness of computation through arithmetization and constraint satisfaction.

Zero Knowledge Circuits

In order to create a zkSNARK, we first must generate a arithmetic circuit. Unlike a hardware circuit board that you might be imagining, arithmetic circuits are software constructions that are essentially arithmetic circuits on finite fields (see the previous post for details) of a prime order (prime number of elements in the field), of which the prime number is chosen because it works especially well for elliptic curve cryptography.

Circuit constructions are utilized because they can easily be used for probabilistic checks by a verifier to ensure that the computations were correctly run. Because circuits can be defined as a series of polynomials, which are required for probabilistic checks, they have become the standard way to construct ZK proofs.

A hardware circuit board that is NOT like the circuits that we’re discussing

In order to visualize how a program could be turned into a circuit, we can take the following example. Imagine we have a variable in and want to check if in is equal to either 10, 15, or 25 and if so, return 1, otherwise return 0. Some pseudocode for this would look like:

if (in == 10 || in == 15 || in == 25) {
return 1;
} else {
return 0;
}

In order to do the same thing in an arithmetic circuit, we can express this as an equation (in-10)(in-15)(in-25) ?= 0. Here’s an equivalent circuit diagram:

Essentially, what’s happening here is that the left hand side of the equation will evaluate to 0 if in is 10, 15, or 25, and nonzero otherwise. The isZero component is a circuit that returns 1 if its input is equal to 0, and 0 otherwise.

(in - 10)(in - 15)(in - 25) ?= 0

Writing a Circuit

The natural question that arises from all of this is how does one go about writing a circuit for a zkSNARK? There are a number of domain-specific languages that are being used to accomplish this task.

Circom

Circom is a language and compiler that lets people write and compile arithmetic circuits for ZK proofs. It’s often used in conjunction with SnarkJS to generate keys with trusted setup parameters and to prove and verify with the compiled circuit output from Circom. Since Circom has been around for a while, it’s the gold standard for writing circuits for ZK proofs.

With that said, writing in Circom can take a little bit of getting used to. Circom is a constraint-based language and requires the programmer to constrain the inputs, intermediate signals, and outputs of the circuit to valid values. I’ll have another post that goes more in-depth on this another day.

Circom compiles down to R1CS (which we’ll discuss further down in this post), WASM, and JSON.

Noir

New on the scene from Aztec, a pioneer in the ZK space, is Noir. Noir is a Rust-like language that allows for programmers to pick up the language more naturally. For example, conditional branching is supported in Noir.

Another benefit touted by the Aztec team is that Noir compiles down to what they call an Abstract Circuit Intermediate Representation (ACIR), which will eventually allow you to utilize any SNARK-based proving system in the future.

Unfortunately, as of February 2023, there’s still a number of inconsistencies in their compiler and JS prover that make it difficult to deploy this in production. Hopefully the issues will be resolved soon.

Lurk

Lurk is a language from Protocol Labs/Filecoin that supports recursive zkSNARKs via Lisp-like expression. I haven’t dug too deeply into Lurk but it’s definitely been gaining adoption because Filecoin is such a large entity that has already been using zkSNARKs for a long time.

Arithmetization

Arithmetization is the process of taking a circuit description via a programming language and turning it into a system of mathematical constraints that enable fast verification. With this set of mathematical constraints, which we’ll dive further into below, a verifier can run a small number of algebraic checks to show with high probability that the prover is not lying. In this section we go from circuit -> R1CS -> QAP, which we’ll describe below.

Rank-1 Constraint System (R1CS)

One way of displaying arithmetic circuits is the Rank-1 Constraint System, or R1CS. By taking a circuit and transforming it into a series of matrices and vectors, we can eventually feed this into any proof system that supports R1CS to generate a zkSNARK.

To get an R1CS, we have to “flatten” our circuit into a series of mathematical constraints of the form left * right = output, or equivalently left * right — output = 0, where left is the left wire of an arithmetic circuit gate, right is the right wire, and output is the output wire.

left, right, and output wires for an arithmetic circuit gate

If we take our arithmetic circuit for (in-10)(in-15)(in-25) above, and simplify it so that we will only check if (in-10)(in-15)(in-25)=0, we can label our wires (excluding constants) and gates as such:

wires and gates labeled for our circuit (in-10)(in-15)(in-25)=0

Which gives us the following system of equations:

g0: w1 - 10 = w4
g1: w2 - 15 = w5
g2: w3 - 25 = w7
g3: w4 * w5 = w6
g4: w6 * w7 = w8

Now, a prover can come along and claim that they know a value of in that satisfies the circuit wherein w8 = 0 (put another way, the prover wants to show that they know an input value that causes the equation to equal to 0).

The prover will then set the in value to 15 and substitute this into the system of equations:

g0: w4 = 15 - 10,   w4 = 5
g1: w5 = 15 - 15, w5 = 0
g2: w7 = 15 - 25, w7 = -10
g3: w6 = 5 * 0, w6 = 0
g4: w8 = 0 * (-10), w8 = 0

The R1CS representation works because if we are able to show that these set of equations is correct, we then know that the original equation is correct. The prover can create a vector that contains all of the wire values, called a witness. The witness is the set of inputs, outputs, and all intermediate values of the arithmetic circuit. The witness vector also includes a w0 = 1 value by default (notice how we started counting from w1 in the diagram).

wires:     w0  w1  w2  w3  w4  w5  w6   w7  w8
witness = { 1, 15, 15, 15, 5, 0, 0, -10, 0}

Using the witness vector w, we can create a series of equations of the form (w . left) * (w . right) — (w . output) = 0 where . is the dot product operation (multiply each corresponding element in two vectors of the same size and sum the results to equal a scalar) which allows us to eventually generate vectors for the left, right, and output for each gate. Here’s how it will look for g0:

g0:
w [ 1,15,15,15, 5, 0, 0,-10, 0]
left [ 0, 1, 0, 0, 0, 0, 0, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
right [-10, 0, 0, 0, 0, 0, 0, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
output [ 0, 0, 0, 0, 1, 0, 0, 0, 0]

left * right - output = 0
(15 - 10) * 1 - 5 = 0

Note that in the case of an addition or subtraction gate in the above example, the addition/subtraction of the right wire gets combined in the left wire and the right wire is set to 1 to ensure the left * right — output equation holds. Also, note that w0 is used for constants, so the first element of right is equal to -10 in this case.

Now, here are the rest of the gates g1 to g4:

g1:
w [ 1,15,15,15, 5, 0, 0,-10, 0]
left [ 0, 0, 1, 0, 0, 0, 0, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
right [-15, 0, 0, 0, 0, 0, 0, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
output [ 0, 0, 0, 0, 0, 1, 0, 0, 0]

left * right - output = 0
(15 - 15) * 1 - 0 = 0
g2:
w [ 1,15,15,15, 5, 0, 0,-10, 0]
left [ 0, 0, 0, 1, 0, 0, 0, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
right [-25, 0, 0, 0, 0, 0, 0, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
output [ 0, 0, 0, 0, 0, 0, 0, 1, 0]

left * right - output = 0
(15 - 25) * 1 - -10 = 0
g3:
w [ 1,15,15,15, 5, 0, 0,-10, 0]
left [ 0, 0, 0, 0, 1, 0, 0, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
right [ 0, 0, 0, 0, 0, 1, 0, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
output [ 0, 0, 0, 0, 0, 0, 1, 0, 0]

left * right - output = 0
5 * 0 - 0 = 0
g4:
w [ 1,15,15,15, 5, 0, 0,-10, 0]
left [ 0, 0, 0, 0, 0, 0, 1, 0, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
right [ 0, 0, 0, 0, 0, 0, 0, 1, 0]

w [ 1,15,15,15, 5, 0, 0,-10, 0]
output [ 0, 0, 0, 0, 0, 0, 0, 0, 1]

left * right - output = 0
0 * -10 - 0 = 0

We then collect the left, right, and output vectors to get 2D matrices for each:

left:
[[ 0, 1, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 1, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 1, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 1, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 1, 0, 0]]

right:
[[-10, 0, 0, 0, 0, 0, 0, 0, 0],
[-15, 0, 0, 0, 0, 0, 0, 0, 0],
[-25, 0, 0, 0, 0, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 1, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 1, 0]]

output:
[[ 0, 0, 0, 0, 1, 0, 0, 0, 0],
[ 0, 0, 0, 0, 0, 1, 0, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 1, 0],
[ 0, 0, 0, 0, 0, 0, 1, 0, 0],
[ 0, 0, 0, 0, 0, 0, 0, 0, 1]]

Essentially, with these vectors and the witness w, we’re able to express that the original computation was performed with an in value of 10 and pass this data into a proving system that can generate a zkSNARK proof that is easily verifiable by a verifier.

Quadratic Arithmetic Program (QAP)

Once the R1CS process is complete, the output is a series of vectors, which is then handed to the Quadratic Arithmetic Program (QAP), turning the output vectors into a system of quadratic polynomials. Once the data is in this polynomial form, it can easily be checked by any number of interactive and non-interactive proof systems.

Other arithmetizations: AIR

As an aside, another popular type of arithmetization (amongst many others) is the Algebraic Intermediate Representation (AIR), used in both SNARKs and STARKs. AIR has three main components:

  1. a 2D matrix of the execution trace, with the columns representing intermediate field elements of the execution, and rows representing the computation steps over time, from start to end.
  2. A set of constraint equations that define algebraic relationships between the columns and rows of the matrix.
  3. A set of boundary constraints that enforce constraints on inputs, outputs, and constants.

I won’t go into too much detail on AIR today, but we can revisit the topic at a later date.

Summary

Today, we discussed how to get from a program to a circuit, subsequently converting that circuit into a form that can allow Stay tuned!

Continue to Part 3

--

--