Halo and Accumulation

Veridise · Aug 10, 2023

This is part of an article series on accumulation, ZK folding, and Nova:

📓 Part I: Recursive SNARKs and Incrementally Verifiable Computation (IVC)
📓 Part II: Halo and Accumulation
📓 Part III: Nova and Folding (1/2)
📓 Part IV: Nova and Folding (2/2)

In our previous article, we introduced the idea of recursive proof composition, where one SNARK (an outer SNARK) is used to prove that the proof produced by another SNARK (an inner SNARK) checks out. A short proof is provided for the computation of the inner SNARK’s verifier circuit on the corresponding proof. This can be used for incrementally verifiable computation (IVC), where we have a sequence of computation steps and want to verify the correct execution of all of them. Rather than providing a proof for the whole computation, at each step we produce a SNARK proving that the step was executed correctly on the output of the previous step, and that the verifier circuit of the previous step’s SNARK accepted the previous step’s proof.

How efficient is recursive proof composition?

Let’s start by having a closer look at the efficiency of this scheme.

As the statement to be proven involves the circuit of the verifier itself, the size of the statement can easily explode. Suppose we employ a SNARK whose verifier circuit is linear in the size of the statement being proven, say larger by a constant factor c. Since each step’s verifier circuit becomes part of the next step’s statement, the proving workload grows exponentially. More precisely, for a given statement, the verifier circuit is larger by a factor of c.

The next step, however, must include a proof that this verifier accepted, so that verifier circuit becomes part of the next statement to be proven, whose own verifier circuit is larger by yet another factor of c, and so on. After k steps, the statement has grown by roughly a factor of c to the power k.
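
To get a feel for the numbers, here is a rough back-of-the-envelope sketch in Python (the step size and the factor c below are made-up illustrative values, not taken from any particular SNARK):

```python
# Rough sketch (made-up numbers): if the verifier circuit is c times the
# statement it checks, and each step's statement must contain the previous
# step's verifier circuit, the statement size compounds step after step.
def statement_sizes(step_size, c, num_steps):
    sizes = []
    statement = step_size                 # first statement: just one computation step
    for _ in range(num_steps):
        sizes.append(statement)
        verifier = c * statement          # verifier circuit for this statement
        statement = step_size + verifier  # next statement: new step + previous verifier
    return sizes

print(statement_sizes(step_size=1_000, c=2, num_steps=5))
# [1000, 3000, 7000, 15000, 31000] -- growing roughly like c^k; with c < 1 it would stay bounded
```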

To make this feasible and avoid exponentially growing verifier circuits, we need to ensure that the verifier circuit is genuinely smaller than the computation it asserts. SNARKs with this property are called “fully succinct”. This, however, puts serious restrictions on the SNARKs we can use and the security assumptions they come with.

Consequently, we have to accept all the requirements these SNARKs dictate, such as the trusted setup required by SNARKs with sub-linear verification time. A trusted setup is a procedure performed once to generate a piece of data shared by the prover and the verifier; that data is subsequently used every time a proof is generated or verified. It is cumbersome, but there is no way around it, as the SNARKs that keep IVC from growing exponentially require it.

Accumulate and accelerate

There is a way to avoid the exponential growth while still being able to choose our SNARKs from a larger repertoire.

The idea is to defer the difficult part of each step to a later point, accumulating all the deferred parts into a single instance of the difficult part, one that does not grow with each step.

So rather than doing the heavy work n times for n steps, we do it only once for all of the steps.

Let’s start by identifying the computationally heavy part of the verification step. In most SNARKs this is the checking of opening claims of polynomial commitments. So next up is a quick introduction to polynomial commitments.

Polynomial commitments

Cryptographic tools allow us to commit to data. This means that we can create a short commitment from a possibly huge amount of data and share it with another party.

These commitments are binding, in the sense that after we have shared the commitment, we cannot change the data from which it was obtained. Any change to the data would result in a different commitment and hence would be detected by the other party. To be precise, since we are taking a huge amount of data (so the total space of possibilities is large) and mapping it to short commitments (for which the total space of possibilities is small), by the pigeonhole principle there are bound to be collisions (two different sets of data giving rise to the same commitment). But given the data and the corresponding commitment, the task of coming up with some other data having the same commitment should be computationally infeasible.

In addition, we might want the commitment to be hiding. Explained simply, this means the other party should learn nothing about the data from the shared commitment value alone. Think of this as putting a message in a sealed envelope: the content of the envelope cannot be changed (binding) and cannot be seen by the other party (hiding), but it can be revealed on demand.

A simple example of a commitment scheme would be to use a collision-resistant hash function (like SHA-256): obtain a hash value for the data and share it with the other party. Because the hash function is collision resistant, it is computationally infeasible to come up with a different data block having the same hash value. If we want the hiding property as well, we can add some randomness to the data block before hashing.
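
As a minimal sketch of this idea in Python (the 32-byte blinding value and the blinding-then-data layout are illustrative choices, not a standardized scheme):

```python
import hashlib
import os

# Hash-based commitment with SHA-256. The random "blinding" value provides
# hiding; revealing (data, blinding) later lets the other party recompute the
# hash and check it against the commitment.
def commit(data: bytes) -> tuple[bytes, bytes]:
    blinding = os.urandom(32)
    return hashlib.sha256(blinding + data).digest(), blinding

def open_commitment(commitment: bytes, data: bytes, blinding: bytes) -> bool:
    return hashlib.sha256(blinding + data).digest() == commitment

c, r = commit(b"a possibly huge block of data")
assert open_commitment(c, b"a possibly huge block of data", r)  # accepted
assert not open_commitment(c, b"tampered data", r)              # binding: any change is detected
```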

Suppose that we want to commit to a polynomial. We could just take a hash of the coefficients of the polynomial. Polynomials, however, have more structure than arbitrary blocks of data; we can for instance evaluate them at a given point. We might want this additional structure to be reflected in the commitment to the polynomial as well. Given a polynomial and a commitment to it, we want to have the additional feature of opening the commitment at a certain point. In other words, we want to be able to convince the other party that the polynomial we have committed to evaluates to a given value at a certain point. We are in fact proving to the other party that “there is a polynomial with the given commitment evaluating to a given value at a given point”. This is called a polynomial commitment.

Another thing we can do with polynomials is add them, or more generally multiply them by constants and then add them (take linear combinations). The space where the commitment values live might not have this property; it might not be possible to add such values. But when it does, we can ask that this structure also be reflected in the commitments: we want the commitment to a sum (linear combination) of polynomials to be the sum (linear combination) of the commitments. So given some polynomials, we can first add them up and then compute the commitment, or we can first compute the commitments and then add them up; in both cases we should end up with the same result. A polynomial commitment with this property is said to be additively homomorphic.
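
As a toy illustration, here is a Pedersen-style commitment to a polynomial’s coefficient vector, sketched in Python (the parameters are purely illustrative and far too small to be secure; the group is written multiplicatively, so “adding” commitments means multiplying them):

```python
import random

# Toy additively homomorphic commitment: C = g_0^{a_0} * g_1^{a_1} * ... (mod p).
p = 2**61 - 1   # toy prime modulus (NOT secure)
q = p - 1       # exponents live modulo the group order
gens = [pow(7, random.randrange(2, q), p) for _ in range(4)]  # one generator per coefficient

def commit(coeffs):
    c = 1
    for g, a in zip(gens, coeffs):
        c = (c * pow(g, a % q, p)) % p
    return c

f = [3, 1, 4, 1]  # coefficients of f
h = [2, 7, 1, 8]  # coefficients of h
f_plus_h = [(a + b) % q for a, b in zip(f, h)]

# Add-then-commit equals commit-then-"add":
assert commit(f_plus_h) == (commit(f) * commit(h)) % p
```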

Polynomial commitments play a crucial role in many SNARKs. In particular, opening claims of polynomial commitments have to be checked by the verifier, which constitutes the major part of the verification circuit. This is in fact the heavy computation we mentioned above.

Say Hello to Halo

This idea of speeding up recursive composition by letting the verifier defer the linear-time polynomial commitment opening checks, and accumulating them into one single such check at the very end, was introduced by Bowe, Grigg and Hopwood in Halo.

Halo is the first practical example of recursive proof composition without a trusted setup, using the discrete log assumption over normal cycles of elliptic curves.

Not having to perform the linear time computation at each verification step allows us to draw from a wider variety of SNARKs, even ones with bad verification performance. In particular, we are now able to use SNARKs that do not require a trusted setup, and do so efficiently!

In Halo the task of checking an opening claim for a polynomial commitment is turned into a two step procedure:

  1. A fast one that outputs a pair consisting of a polynomial and a commitment to it; and
  2. An expensive step checking the polynomial-commitment pair.

So to verify an opening, we need to create the pair (fast) and check it (expensive). The good news is that the expensive step can be easily accumulated if the commitment scheme is additively homomorphic. Suppose you have several instances of step 2, so you have several pairs (fᵢ, Cᵢ) of polynomials fᵢ and their commitments Cᵢ.

Rather than verifying each of the pairs separately (check that C₁ is a commitment for f₁, C₂ is a commitment for f₂, etc.), we can take a random linear combination of the polynomials and the same linear combination of the commitments. If all of the commitments check out, then as the polynomial commitment scheme is additively homomorphic, the linear combination of the commitments will be the commitment of the corresponding linear combination of polynomials.

If, on the other hand, at least one of the polynomial-commitment pairs does not check out, then with very high probability the same will hold for the linear combination. So checking a single random linear combination of all the pairs to be checked guarantees, with high probability, that all of the involved pairs are valid.

This is a cool trick used quite frequently. To check several instances, we check a random linear combination. If it checks out, we can be sure with very high probability that the same holds for all of the instances.
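
To make the trick concrete, here is a toy sketch reusing the Pedersen-style commitment from above (again with illustrative, insecure parameters): several claimed pairs are batched into one check, and a single bogus pair makes the combined check fail with overwhelming probability.

```python
import random

# Batch-check several (polynomial, commitment) pairs with one random linear combination.
p = 2**61 - 1
q = p - 1
gens = [pow(7, random.randrange(2, q), p) for _ in range(4)]

def commit(coeffs):
    c = 1
    for g, a in zip(gens, coeffs):
        c = (c * pow(g, a % q, p)) % p
    return c

def batch_check(polys, comms):
    rs = [random.randrange(1, q) for _ in polys]  # verifier-chosen randomness
    combined_poly = [sum(r * f[j] for r, f in zip(rs, polys)) % q for j in range(4)]
    combined_comm = 1
    for r, c in zip(rs, comms):
        combined_comm = (combined_comm * pow(c, r, p)) % p  # "r times C" in the group
    return commit(combined_poly) == combined_comm            # one single check

polys = [[3, 1, 4, 1], [2, 7, 1, 8], [9, 9, 9, 9]]

print(batch_check(polys, [commit(f) for f in polys]))  # True: all pairs are valid
print(batch_check(polys, [commit(polys[0]), commit(polys[1]), commit([0, 0, 0, 1])]))
# False (with overwhelming probability): the bogus third pair is caught
```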

A technical side note: it is crucial where the randomness comes from. The prover should have no control over the randomness, since by picking appropriately chosen (not random!) coefficients they might be able to forge proofs of false statements. Hence either the randomness is provided by the verifier in an interactive protocol, or a Fiat-Shamir transform is applied to make the protocol non-interactive, in which case the randomness is obtained, for example, by hashing the whole transcript up to that point.
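
A minimal sketch of the Fiat-Shamir idea (the transcript encoding and the helper name below are assumptions made for illustration, not Halo’s actual transcript format):

```python
import hashlib

# Derive the i-th random coefficient by hashing the transcript so far, so the
# prover cannot choose the randomness to their advantage.
def fiat_shamir_challenge(transcript: bytes, i: int, modulus: int) -> int:
    digest = hashlib.sha256(transcript + i.to_bytes(4, "big")).digest()
    return int.from_bytes(digest, "big") % modulus

transcript = b"statement || C_1 || ... || C_n"  # everything exchanged up to this point
r_1 = fiat_shamir_challenge(transcript, 1, 2**61 - 2)
```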

To summarize, at each step we only do the easy part. The expensive part is accumulated in a running aggregate and its verification is deferred to a later point — all expensive parts are verified together at the end, at the cost of one such check.

So at each step, rather than proving that we performed the whole verification of the proof passed on from the previous step, we prove that the verification holds under the assumption that the difficult step holds (so the proof holds conditioned on the veracity of a statement of the form “Cᵢ is a commitment to the polynomial fᵢ”).

These difficult statements are accumulated and proven all at once at the end. As long as we also include a proof that the accumulation was done properly, we are good to go! Although the verification circuit has a computationally heavy part, we only perform it once for a potentially large number of steps combined, so the cost is amortized. That is how we can afford to use a SNARK with an expensive verifier, which in return allows us to get by without a trusted setup and with simpler security assumptions.
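
As a rough sketch of what such a running aggregate could look like with the toy homomorphic commitment from earlier (illustrative and insecure, and not Halo’s actual accumulation scheme): each step folds one deferred “Cᵢ is a commitment to fᵢ” claim into a fixed-size accumulator, and only the final check is expensive.

```python
import random

p = 2**61 - 1
q = p - 1
gens = [pow(7, random.randrange(2, q), p) for _ in range(4)]

def commit(coeffs):
    c = 1
    for g, a in zip(gens, coeffs):
        c = (c * pow(g, a % q, p)) % p
    return c

acc_poly, acc_comm = [0, 0, 0, 0], 1  # running aggregate: constant size, step after step

for step in range(10):                             # many IVC steps...
    f = [random.randrange(q) for _ in range(4)]
    C = commit(f)                                  # deferred claim: "C commits to f"
    r = random.randrange(1, q)                     # fresh verifier / Fiat-Shamir randomness
    acc_poly = [(a + r * b) % q for a, b in zip(acc_poly, f)]  # cheap per-step folding
    acc_comm = (acc_comm * pow(C, r, p)) % p

print(commit(acc_poly) == acc_comm)  # one expensive check at the very end: True
```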

Nova takes this idea to the maximum: rather than some part, the entire verification step is deferred. This is achieved by the use of a newly introduced primitive called a folding scheme — an elegant way to reduce the verification of two steps to the verification of a single step.

At each step, the new statement to be verified is folded into a running statement using the folding scheme. It then suffices to verify two things: first, that the final statement resulting from all the repeated foldings checks out, and second, that the folding at each step was performed correctly.

The first is done only once at the end. The second is cheap, so the overhead it incurs at each step is pretty small.

This is probably a good point to stop for now. With the stage set, we can start talking about our protagonist in detail in the next blog post.

To make sure you don’t miss the next article from this exciting series, connect with us:

Twitter | Lens Protocol | LinkedIn | Github | Request Audit


Veridise

Hardening blockchain security with formal methods. We write about blockchain & zero-knowledge proof security. Contact us for industry-leading security audits.