Nova and Folding (1/2)

Veridise · 13 min read · Sep 14, 2023

This is part of an article series on accumulation, ZK folding, and NOVA:
📓
Part I: Recursive SNARKs and Incrementally Verifiable Computation (IVC)
📓
Part II: Halo and Accumulation
📓
Part III: Nova and Folding (1/2)
📓
Part IV: Nova and Folding (2/2)

Having made it this far, we are now ready to dive deep into folding and Nova. At first this part might seem a bit technical, but the main hurdle is the notation. Once we tackle that, all that remains is to enjoy the elegant folding idea.

First, a short recap of what we have covered so far.

A short recap

We started by introducing SNARKs, which allow an untrusted prover to convince a verifier that it “knows” a witness 𝑀 satisfying certain properties. The witness could, for instance, be the input and the execution trace of a computation; the proof would testify to the correctness of the computation and hence its output. And we can do this without revealing anything about the witness (input and trace) itself, in zero knowledge.

We have seen how SNARKs can be composed: instead of performing the verification itself, we can outsource it to yet another SNARK, which provides us with a proof that the verification checks out. Building on this, we can realize incrementally verifiable computation (IVC). In IVC, each step of a computation not only takes the output of the previous step, performs its operations on it and feeds the result to the next step as input; it also takes a proof from the previous step (asserting that all steps up to and including the previous one are correct) and provides the next step with a proof that the previous proof verifies and that the current step was performed correctly.

At each step, we are adding the verifier circuit for the previous step’s proof to the statement to be proven next. This would either cause the circuit to grow by a constant factor at each step (hence exponentially over the whole computation), or force us to resort to SNARKs requiring a trusted setup. Accumulation schemes provide a way out by allowing us to defer an expensive part of each verification step to a later point: the expensive parts are accumulated in a running aggregate, which is verified at the end at the cost of a single such check.

As mentioned in our last blog post, Nova takes this to an extreme by using an ingenious new primitive called a folding scheme. This allows us to fold entire statements instead of only accumulating their expensive parts. It is a very elegant way to reduce the verification of two steps to the verification of a single step.

Before we go further, we first need to clarify what we mean by a “step”.

What is a computation?

For a given computation 𝐢, we claim that we know an input such that a correct execution of the computation produces a certain public output 𝒴. We will refer to the input and the execution trace (which captures the notion of correct execution) as the witness 𝑀. In addition to the witness, there might be some other public input x fed into the computation. So given public input x and output 𝒴, we want to prove that we know a witness 𝑀 completing the picture.

The computation could be a sequence of arithmetic operations (addition, multiplication, etc.) or involve more complicated constructs like conditional statements. We have to define what we mean by a computation and how to represent it.

The easiest representation would be pseudo-code or some high-level programming language (Python, C++, etc.). This, however, does not work well for what we plan to do with it afterwards, namely obtaining a succinct non-interactive proof. As it turns out, we can always translate a computation into an arithmetic satisfiability problem. So given public input x and public output 𝒴 (you can think of these as vectors of length n and m over a finite field, say sequences of elements of the integers modulo a prime p), and a system 𝐢 of m polynomials in n + k variables, we want to prove that we know a witness 𝑀 (a vector of length k over the same field), such that

𝐢(x, 𝑀) = 𝒴

Evaluating the system of m polynomials 𝐢 at the (n + k)-tuple formed by x = (x₁,…, xₙ) and 𝑀 = (𝑀₁,…, 𝑀ₖ), we obtain the m-tuple 𝒴 = (𝒴₁,…, 𝒴ₘ). This representation is more amenable to the creation of proofs. There are many variations of such amenable representations, known as intermediate representations. Below we will concentrate on one popular intermediate representation, known as a rank-1 constraint system (R1CS). These are the statements (computational steps) whose proofs will be folded in Nova.
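
If you prefer to see this in code, here is a minimal Python sketch (the prime, the polynomial and all variable names are made up purely for illustration): a single polynomial in one public input x and one witness variable 𝑀, evaluated over the integers modulo a small prime.

# A toy arithmetic satisfiability instance over the integers mod p.
p = 101                       # a small prime; real systems use ~256-bit primes

def C(x, w):
    # one polynomial in n + k = 2 variables: C(x, w) = x*w + w^2 (mod p)
    return (x * w + w * w) % p

x = 7                         # public input
w = 5                         # secret witness
y = C(x, w)                   # public output: (35 + 25) mod 101 = 60
assert C(x, w) == y           # "knowing w" means being able to satisfy this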

Coming up with good intermediate representations and turning computer programs into them is an art in itself. For some computations this is immediate. For instance, if the computation is multiplying matrices, we just have to write the operations performed on the matrix entries as polynomials. For other computations, take conditional statements for example, this is trickier. Just to give you an idea, suppose we check a bit b and, according to whether it is 0 or 1, we let a variable c be a² or a³:

if b == 0 then c = a²;
else c = a³;

In terms of polynomials we would rewrite this as

c = (1 - b) * a² + b * a³
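
A quick Python sanity check (with made-up values) confirms that the polynomial agrees with the branching program for both values of the bit b:

p = 101
a = 3
for b in (0, 1):
    branch = a**2 % p if b == 0 else a**3 % p     # the if/else program
    poly = ((1 - b) * a**2 + b * a**3) % p        # its polynomial encoding
    assert branch == poly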

We will leave the details of turning computer programs into intermediate representations for another article. Alternatively, you can have a look at the wonderfully written post by Vitalik Buterin, “Quadratic Arithmetic Programs: from Zero to Hero”.

Rank 1 Constraint Systems (R1CS)

Let’s come back to the definition of the intermediate representation R1CS (closely related to QAPs, quadratic arithmetic programs).

These are arithmetic circuits (polynomial equations) of a particular shape:

We combine the constant 1, the public input vector x = (x₁,…, xₙ), the public output vector 𝒴 = (𝒴₁,…, 𝒴ₘ) and the witness 𝑀 = (𝑀₁,…, 𝑀ₖ) into a single vector

z = (z₁,…, zᵣ) = (1, x₁,…, xₙ, 𝒴₁,…, 𝒴ₘ, 𝑀₁,…, 𝑀ₖ)

Note: we require z₁ = 1 and r = 1 + n + m + k.

We only allow quadratic (degree 2) polynomials in z₁,…, zᵣ in the system defining the arithmetic circuit, hence the name quadratic arithmetic program. As we have moved the public outputs 𝒴ᵢ into the vector z, we set the right-hand side equal to zero. Hence each of the polynomials in our system looks like

A • z₁² + B • z₁z₂ + … (all degree 2 terms)
+ a • z₁ + b • z₂ + … (all degree 1 terms)
+ const. (degree 0 term) = 0

Now as z₁ = 1, we do not in fact need a separate constant (degree zero) term: the constant can be absorbed into the linear term a • z₁.

Rank-1 constraint systems allow only quadratic polynomials of a particular type. Namely, we restrict the quadratic (degree 2) terms to expressions of the form

(a₁ • z₁ + … + aᵣ • zᵣ) • (b₁ • z₁ + … + bᵣ • zᵣ).

This is in fact the reason for the name rank 1. Moving the linear terms to the right-hand side, we obtain equations of the form

(a₁ • z₁ + … + aᵣ • zᵣ) • (b₁ • z₁ + … + bᵣ • zᵣ) = c₁ • z₁ + … + cᵣ • zᵣ

for coefficients a₁, …, aᵣ, b₁, …, bᵣ, c₁, …, cᵣ. In practice most of these coefficients will be zero, so the polynomials are sparse.

Note that this is only one of the polynomials. To define the arithmetic circuit we allow several polynomials of this form:

(aᵢ,₁ • z₁ + … + aᵢ,ᵣ • zᵣ) • (bᵢ,₁ • z₁ + … + bᵢ,ᵣ • zᵣ) = cᵢ,₁ • z₁ + … + cᵢ,ᵣ • zᵣ

for various values of i.

To keep things simple, we will not dwell on the double indices and matrix notation below, but keep in mind that we have several equations of this form!

Notation and terminology, instances and witnesses

A bit of terminology and notation before we move on: a particular choice of public input and output (a choice of x₁,…, xₙ, 𝒴₁,…, 𝒴ₘ) is called an instance (we will sometimes denote it by I), and this instance is said to be satisfied by a corresponding witness W in case all the quadratic equations hold.

So we will write the vector z in the form (1, I, W). The coefficients aᵢ, bⱼ, cₖ are said to define a structure; the variables corresponding to input/output can be thought of as parameters giving various instances. Rather than writing out the linear combinations (and remember there are several of these, one for each equation), we will denote

aᵢ,₁ • z₁ + … + aᵢ,ᵣ • zᵣ, bᵢ,₁ • z₁ + … + bᵢ,ᵣ • zᵣ and cᵢ,₁ • z₁ + … + cᵢ,ᵣ • zᵣ by A(Z), B(Z) and C(Z) respectively. So A takes a vector Z and maps it to the vector whose entries are the various linear combinations, one per equation. If you like matrix notation, you can think of A as the matrix in which each row is given by the coefficients (aᵢ,₁, …, aᵢ,ᵣ) of one of the equations, so that A(Z) is simply the multiplication of this matrix by the column vector of the zᵢ. As A consists of a bunch of linear combinations, it is linear, i.e. we have

A(u • Z + v • Z’) = u • A(Z) + v • A(Z’)

for scalars u, v and vectors Z, Z’, similarly for B and C.
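
This linearity is easy to confirm numerically. A tiny numpy sketch (with a made-up coefficient matrix standing in for A) illustrates it:

import numpy as np

p = 101
A = np.array([[1, 2, 3], [4, 5, 6]])    # toy coefficient matrix, one row per equation
Z, Z2 = np.array([7, 8, 9]), np.array([1, 0, 2])
u, v = 5, 11
assert np.array_equal(A @ (u * Z + v * Z2) % p,
                      (u * (A @ Z) + v * (A @ Z2)) % p)   # A(u•Z + v•Z') = u•A(Z) + v•A(Z')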

Now, as A(Z), B(Z) and C(Z) are vectors (made up of the various linear combinations), we can rewrite the equations as

A(Z) ○ B(Z) = C(Z)

where A(Z) ○ B(Z) denotes the component-wise product of the vectors A(Z) and B(Z), i.e. it is the vector whose i-th entry is the product of the i-th entry of A(Z) and the i-th entry of B(Z). By the way, this operation is also known as the Hadamard product.
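
In matrix language, checking satisfaction becomes a one-liner. Here is an illustrative numpy sketch with a single toy constraint of our own (encoding x • 𝑀 = 𝒴 with Z = (1, x, 𝒴, 𝑀); this is not taken from any Nova codebase):

import numpy as np

p = 101
A = np.array([[0, 1, 0, 0]])     # row selecting x from Z = (1, x, y, w)
B = np.array([[0, 0, 0, 1]])     # row selecting w
C = np.array([[0, 0, 1, 0]])     # row selecting y

def satisfied(Z):
    # A(Z) ○ B(Z) = C(Z): Hadamard product of matrix-vector products, mod p
    lhs = (A @ Z % p) * (B @ Z % p) % p
    rhs = C @ Z % p
    return np.array_equal(lhs, rhs)

x, w = 7, 5
Z = np.array([1, x, (x * w) % p, w])   # instance (x, y) plus witness w
assert satisfied(Z)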

Finally, if we want to separate Z into the instance I and witness W parts, we rewrite this as

A(1, I, W) ○ B(1, I, W) = C(1, I, W)

For IVC, as we iterate the same function F, the structure remains the same at each step: the coefficients aᵢ, bⱼ, cₖ of each of the polynomials are identical for every step. Only the input and output change, giving us different instances.

Each step of the computation is thus a system of quadratic polynomial equations of the above type with fixed coefficients; the changing input and output give us a new instance I each time, for which we have to provide a proof of knowledge of a witness W.

Now that we know what the steps are, we can finally talk about how to combine (fold) them together.

The core idea: folding

Say we want to fold two R1CS instances with the same structure. Let us unwind this a bit.

Suppose we have represented the computation in one step by an R1CS structure, i.e., by a system of quadratic equations of a particular form:

A(Z) ○ B(Z) = C(Z)

Having two steps of this computation (with different input and output) means we have two instances of it. So there are instances I₁ and I₂, for which the prover has witnesses W₁ and W₂. The prover wants to convince the verifier that it knows witnesses W₁ and W₂ such that the following two systems (each verifying one step, corresponding to a different instance, i.e., input-output pair) are satisfied:

A(1, I₁, W₁) ○ B(1, I₁, W₁) = C(1, I₁, W₁) and A(1, I₂, W₂) ○ B(1, I₂, W₂) = C(1, I₂, W₂).

Now the idea of folding is that the prover should be able to come up with a new instance I and a corresponding witness W (for the same R1CS structure, i.e., for the same A, B, C) such that, if it can convince the verifier that it knows a witness W for the instance I, then the verifier will be convinced that the prover knows witnesses W₁ and W₂ for the instances I₁ and I₂. So in some sense the prover comes up with an imaginary input and output for the same computation (the instance I) such that, if it can convince the verifier that it has a witness correctly connecting this artificial input to the output, the verifier will be convinced that the same holds for the two initial, real input-output pairs.

If life were easy…

To understand the idea behind folding, let us for simplicity assume that our equations are linear and that there is only one of them (if there are several, just apply the same reasoning to each one). So we have an equation of the form

a₁ • z₁ + … + aᵣ • zᵣ = c₁ • z₁ + … + cᵣ • zᵣ

or, collecting all terms on one side

D(z) = d₁ • z₁ + … + dᵣ • zᵣ = 0 for some d₁, …, dᵣ.

Again for simplicity, assume d₁ = 0, that instances have length 2 (input and output of length one), so they are of the form (x, y), and that witnesses have length 1, so they consist of one field element 𝑀. So r = 4. Suppose there are two instances (x, y) and (x’, y’) with corresponding witnesses 𝑀 and 𝑀’, so that

d₂ • x + d₃ • y + d₄ • 𝑀 = 0 and d₂ • x’ + d₃ • y’ + d₄ • 𝑀’ = 0

We want to fold these two instances into a single one, so that a prover capable of proving knowledge of a witness for the folded instance convinces the verifier that it could do the same for the two initial instances as well. For this we exploit the power of randomness: the verifier challenges the prover for a proof that it knows a witness for a random linear combination of the two instances. So the verifier sends a random R and requests a proof that the prover knows a witness for the input x + R • x’ and output y + R • y’, i.e., for the instance (x + R • x’, y + R • y’).
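
In Python, the whole exchange for our toy linear equation looks as follows (a sketch: the coefficients d₂, d₃, d₄ and the instances are made up):

import random

p = 101
d2, d3, d4 = 3, 5, 7                      # coefficients (d1 = 0 as assumed above)

def D(x, y, w):
    return (d2 * x + d3 * y + d4 * w) % p

def make_instance(x, y):
    # pick x, y and solve for the unique witness w with D(x, y, w) = 0
    w = (-(d2 * x + d3 * y) * pow(d4, -1, p)) % p
    return x, y, w

x1, y1, w1 = make_instance(2, 9)
x2, y2, w2 = make_instance(4, 6)

R = random.randrange(1, p)                        # verifier's random challenge
xf, yf = (x1 + R * x2) % p, (y1 + R * y2) % p     # folded instance
wf = (w1 + R * w2) % p                            # folded witness
assert D(xf, yf, wf) == 0                         # completeness: the fold is satisfied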

Why can an honest prover that knows witnesses 𝑀 and 𝑀’ solve the challenge?

Simply because of the linearity of the equations: the same linear combination of the witnesses will be a witness for the linear combination of the instances. To see this, we need to check that

d₂ • (x + R • x’) + d₃ • (y + R • y’) + d₄ • (𝑀 + R • 𝑀’) = 0

Reordering, this turns into

d₂ • x + d₃ • y + d₄ • 𝑀 + R • (d₂ • x’ + d₃ • y’ + d₄ • 𝑀’) = 0 + R • 0 = 0

Or, as D(Z) is linear, we have

D((x, y, 𝑀) + R • (x’, y’, 𝑀’)) = D(x, y, 𝑀) + R • D(x’, y’, 𝑀’) = 0 + R • 0 = 0

This property is called completeness.

Why can’t a dishonest prover that does not know witnesses 𝑀 and 𝑀’ solve the challenge?

Again because of linearity! Suppose a prover could answer random challenges; then it could do so for several random values R₁, R₂, … Each answer is a folded witness of the form 𝑀 + Rᵢ • 𝑀’, i.e., a linear equation in the two unknowns 𝑀 and 𝑀’, so from the answers to just two distinct challenges one can solve for 𝑀 and 𝑀’. In other words, from solutions to the challenges one can extract witnesses for the initial instances, so a prover that can reliably answer the challenges must in fact know them.

This property is called knowledge soundness.
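
The extraction argument can be made just as concrete. Continuing the sketch above, answers to two distinct challenges pin down 𝑀 and 𝑀’ as the solution of a 2×2 linear system:

# Answers to two distinct challenges R1, R2 are wf_i = w + R_i * w' (mod p).
R1, R2 = 11, 29
wf1, wf2 = (w1 + R1 * w2) % p, (w1 + R2 * w2) % p

# The extractor solves the 2x2 linear system for the original witnesses:
w2_ext = ((wf1 - wf2) * pow(R1 - R2, -1, p)) % p
w1_ext = (wf1 - R1 * w2_ext) % p
assert (w1_ext, w2_ext) == (w1, w2)    # both initial witnesses are recovered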

… but life is not always easy

We have seen how one can do folding in a nice linear world. Adding more inputs and outputs does not add much difficulty, only more notation.

Adding more equations can be handled similarly, by treating all the equations separately (each one is linear). But what we can express using linear constraints is very limited, and quadratic equations (in particular R1CS) are unfortunately not linear: the square of a sum is not equal to the sum of the squares (just ask Pythagoras!).

Moreover, if a₁ ≠ 0, then we also have to keep track of what the variable z₁ is doing. In particular, z₁ = 1 and z’₁ = 1 give z₁ + R • z’₁ = 1 + R for the linear combination, which violates the R1CS requirement that the first entry be 1.

Just relax!

But not all is lost. Kothapalli, Setty and Tzialla have shown how one can relax the definition of R1CS constraints (to what they call relaxed R1CS) to salvage this.

We have

A(Z + R • Z’) ○ B(Z + R • Z’) = (A(Z) + R • A(Z’)) ○ (B(Z) + R • B(Z’)) =
A(Z) ○ B(Z) + R² • A(Z’) ○ B(Z’) + R • (A(Z) ○ B(Z’) + A(Z’) ○ B(Z)).

So the problems are:

  • The coefficient R² in the second term should have been R;
  • The mixed term R • (A(Z) ○ B(Z’) + A(Z’) ○ B(Z)) should not appear; and
  • We would like to relax the condition that the first entry z₁ be 1.

The relaxation is obtained by adding additional parameters to the instance. Namely, we allow a new scalar u in front of the R1CS equation to absorb the additional factor of R, and an error (“slack”) vector to account for the mixed term.

Hence:

  • A relaxed R1CS structure consists of A, B and C as above;
  • The R1CS instance I, consisting of the public input x = (x₁,…, xₙ) and output 𝒴 = (𝒴₁,…, 𝒴ₘ), is relaxed by adding a scalar u and an error vector E;
  • The vector Z is relaxed to (u, I, W), allowing u instead of 1 as the first entry; and
  • The equation is relaxed to A(Z) ○ B(Z) = u • C(Z) + E.

Now if W₁ is a witness for the relaxed R1CS instance I₁, E₁, u₁ and W₂ is a witness for the relaxed R1CS instance I₂, E₂, u₂, we can indeed take the linear combination I = I₁ + R • I₂ of the instances I₁ and I₂, and it will be satisfied by the witness W = W₁ + R • W₂ if we define u and E for the instance I appropriately. For Z = (u₁, I₁, W₁) and Z’ = (u₂, I₂, W₂), the correct way to obtain the new scalar and slack vector while folding is as follows:

u = u₁ + R • u₂
E = E₁ + R • (A(Z) ○ B(Z’) + A(Z’) ○ B(Z) − u₁ • C(Z’) − u₂ • C(Z)) + R² • E₂.

Using the slack vectors, we have made these particular quadratic expressions (R1CS) behave as if they were linear. We leave it to the reader to check that the folding really works; for further details, including completeness and knowledge soundness, we refer to the original paper.
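
For readers who would rather let a machine do that check, here is a self-contained Python sketch of the fold (again with our own toy constraint; this is an illustration of the formulas above, not Nova’s actual implementation):

import numpy as np

p = 101
# The same toy constraint as earlier, x * w = y, now with Z = (u, x, y, w).
A = np.array([[0, 1, 0, 0]])
B = np.array([[0, 0, 0, 1]])
C = np.array([[0, 0, 1, 0]])

def mv(M, Z):
    return M @ Z % p                   # matrix-vector product mod p

def relaxed_ok(Z, u, E):
    # the relaxed equation: A(Z) ○ B(Z) = u • C(Z) + E
    return np.array_equal(mv(A, Z) * mv(B, Z) % p, (u * mv(C, Z) + E) % p)

def fold(Z1, u1, E1, Z2, u2, E2, R):
    cross = (mv(A, Z1) * mv(B, Z2) + mv(A, Z2) * mv(B, Z1)
             - u1 * mv(C, Z2) - u2 * mv(C, Z1)) % p
    return (Z1 + R * Z2) % p, (u1 + R * u2) % p, (E1 + R * cross + R * R * E2) % p

# Two ordinary (u = 1, E = 0) satisfied instances:
Z1 = np.array([1, 7, 35, 5])           # 7 * 5 = 35
Z2 = np.array([1, 2, 6, 3])            # 2 * 3 = 6
E0 = np.zeros(1, dtype=int)
assert relaxed_ok(Z1, 1, E0) and relaxed_ok(Z2, 1, E0)

Zf, uf, Ef = fold(Z1, 1, E0, Z2, 1, E0, R=17)
assert relaxed_ok(Zf, uf, Ef)          # the folded pair still satisfies relaxed R1CS

Note that even though both inputs start with u = 1 and E = 0, the folded instance acquires u = 1 + R and a non-zero slack vector E: this is exactly where the relaxation earns its keep.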

This is probably more than enough to digest in a single blog post. Next time we will see how all of this is efficiently realized using commitments, how it is implemented, and how the story continues after the Nova paper. Until then, relax and enjoy folding!

Read the other articles in this series:
📓
Part I: Recursive SNARKs and Incrementally Verifiable Computation (IVC)
📓
Part II: Halo and Accumulation
📓
Part III: Nova and Folding (1/2)
📓
Part IV: Nova and Folding (2/2)

To make sure you don’t miss the next article from this exciting series, connect with us:

Twitter | Lens Protocol | LinkedIn | Github | Request Audit
