Under the hood of zkSNARK Groth16 protocol (part 1)

Crypto Fairy
Coinmonks
Published in
6 min readSep 10, 2023

--

Zero Knowledge Proofs (ZKP) is a concept that has garnered much attention in the blockchain industry. Many, including myself at one point, view it as a mysterious “black box”. In this article, and the subsequent series, I will be sharing what I have learnt about one of ZKP protocols and one in particular — zkSNARK Groth16.

This series will be supplemented with working code, which can be accessed on GitHub.

But first, let’s establish what we mean by zero-knowledge proofs. ZKP is a protocol that typically involves two parties: Alice and Bob. Here’s the gist of it: Alice wants to convince Bob that she possesses a specific piece of information without actually revealing any details of that information to him. Moreover, forging such a proof is nearly impossible, unless you have a quantum computer in your pocket.

To illustrate this with an example, imagine Alice needs to share evidence of her bank account’s existence with a governmental organization. Using ZKP, she can prove she owns a bank account without disclosing which bank she uses or the amount of money in her account.

Another potential application for ZKP is in verifying specific computational processes. Let’s say there’s software that conducts forensic analysis tests. With ZKP, we can ensure that these tests were genuinely carried out, their results can be verified, and it’s nearly impossible to falsify them.

zkSNARK

Here is an image that depicts the zkSNARK setup. Unlike other ZKP protocols such as STARK and Bulletproofs, zkSNARK requires a third party during the initialization phase. This third party provides the prover and verifier with their respective keys.

The “prover” and “verifier” blocks are programs that reside on the prover’s (Alice’s) and verifier’s (Bob’s) sides, respectively. One cool aspect of the SNARK Groth16 protocol is that the verifier can be integrated into a smart contract deployed on the Ethereum blockchain. This introduces a new suite of applications to the platform and simplifies the adoption of blockchain technology in everyday life.

Lets begin with prover.

Prover

As I mentioned earlier, both the prover and verifier are programs, each run and hosted by different parties. Comparing the two in terms of computational requirements and code complexity, the prover is undoubtedly the more sophisticated of the two. Consequently, the execution time will tend to be greater on the prover’s side.

The execution time for the prover will largely depend on the program’s complexity, scaling linearly as O(n). In contrast, the verifier’s code complexity remains constant at O(1), ensuring stable verification times regardless of the statement that the prover tries to prove.

You may have seen the diagram below, which depicts the steps involved in proof generation. I’ll provide an overview of the first two steps without delving too deeply, while the subsequent steps will be illustrated with code examples and explanations.

Code

Since ZKP exists in the realm of cryptography, it implies limitation on a set of programs that can be implemented using ZKP protocols. Let’s consider an example where the prover needs to run the following program:

def compute(x, y):
a = 5*x**3 + x**2
b = 4*x**2*y**2 + 13*x*y**2 - 10*y
if y == 0:
return a
else:
return a + b

In general, this function might not seem particularly meaningful, but for our purposes, it’s sufficient. As you can see, the function contains an “if” statement. This should hint at the possibility that the code might contain other programming constructs, like loops, for instance. However, in the case of SNARKs, computations must be bounded. This means the number of iterations must be known in advance, and the number of parameters, as well as their size, should also be predetermined.

If you’re questioning the range and complexity of tasks that can be handled by zkSNARK, here’s a small snippet of code I wrote a couple of years ago: https://gist.github.com/tarassh/cb2a48966fa5db3c803cba134d00b0e4. This is an adapted version used to calculate the SSIM metric from the h264 video codec.

Returning to our example: as previously mentioned, ZKP operates within the realm of cryptography, and consequently, in the domain of mathematics. Thus, this code must be converted into an arithmetic equation. There are several libraries capable of performing this conversion, including:

  • Zokrates: The code is written in a language resembling Python.
  • Circom: It uses its own domain-specific language.
  • Pequin from the Pepper-Project: This utilizes a C-like language, similar to the gist provided above.

Bypassing the details of how the code is converted into an equation, the example provided earlier can be represented by the following equation:

out = 5*x**3 - 4*x**2*y**2 + 13*x*y**2 + x**2 - 10*y

Or:

The greater the complexity of your computation, the lengthier the resulting equation will be. This equation ultimately serves as the foundation for constructing an algebraic circuit.

Algebraic Circuit

Now, we need to break down our equation into smaller, sequential operations, adhering to the following rules:

  • Each operation must consist of two input parameters and one output parameter.
  • Only variables (like x and y in our case) can serve as input parameters. Constants must be associated with one of the input parameters..
  • Multiple addition operations can be consolidated into a single variable. Additionally, subtraction can be represented as an addition: ab is equivalent to a+(−b).

So our equation can be split into 5 constrains. Here is a list of steps of what have happened here:

First, we can replace with a new variable, v1​, and accept two input parameters: x and x. Now equation will look next:

Next we can replace y² with a new variable, v2, and equation will look like this:

And so on… Notably, if you choose a different pair of input parameters, say v1​=xy, you’ll introduce two additional constraints. The primary objective here is to identify the fewest constraints that can accurately represent this equation.

Schematically, this can be visualized as a circuit. It’s important to note that addition operations can be consolidated, as exemplified by the formula out=1×(v3+v1−10yv4+v5), where v5=13xv2 is a new variable. However, for sake of optimisation and reasons explained in the subsequent article, the formula needs to take the following form:

--

--