Code Talks: The A-to-Z on zkSnarks and Zero-Knowledge Proof
The A-to-Z on zkSnarks and why zero-knowledge proof is ideal for authentication so no one else can knows what you’re communicating.
PPIO Code Talks is an open platform for high-quality presentations and discussions on blockchain technology with the aim of engaging the community and spreading ideas. Previous Code Talks focused on two cutting edge technologies: Libra & PPIO: The Genius of Libra and What We’ve Learned and Tendermint: The Inventiveness of their Consensus Mechanism. The following is an adaption of a slideshow presentation given on July 27 by Star Li, CEO of Trapdoor, on the subject of Zero-Knowledge Proofs. Talks were also given by other prominent members of the Shanghai blockchain and internet industry.
Before introducing the technical details of zero-knowledge proof, let us review the history of zero-knowledge proof.
- 1985: Zero-Knowledge Proofs [GMR85]
- 1992: Succinct ZK[K92]
- 2013: Pinocchio (PGHR13)
- 2016: Groth16
- 2017: Bulletproofs (BBBPWM17)
- 2018: zk-STARKs (BBHR18)
Zero-knowledge proofs can be traced back to the original paper of Zero-Knowledge Proofs [GMR85] in 1985. Subsequently, a simplified ZK [K92] proof was presented in 1992. By 2013, zero-knowledge proof could be used in real life, but at a slower pace. In 2016, Groth proposed the Groth 16 algorithm, which greatly reduced computational complexity. Since then, zero-knowledge proofs have gradually been put into real commercial use. Subsequently, Bulletproofs and zk-STARKs were introduced, which alongside Groth 16, are considered to be the three current mainstream proof protocols.
What Is A “Zero-Knowledge Proof”?
If you work in blockchain development, you may have often heard of the concept of “zero-knowledge proof”. So what exactly is it and how can it be utilized?
A way to understand the zero-knowledge proof is the story of Alibaba.
Alibaba was caught by robbers. In order to save his life, he needed to prove to the robbers that he had the password to open a stone door, while at the same time, he could not tell the robbers the password. He came up with a solution: the robbers would give enough space so they couldn’t hear the password to open the door but close enough to fire an arrow if Alibaba tried to escape. From this distance, Alibaba showed the robbers the opening and closing of the stone door.
This whole process of the zero-knowledge proof is that the prover must convince the verifier that an assertion (Alibaba knows how to open the stone gate) is correct without providing any useful information to that verifier (the password of the stone gate).
zk-SNARK is an abbreviation of zero-knowledge Succinct Non-interactive ARguments of Knowledge
- Zero Knowledge：Provers do not disclose the private information they wish to prove.
- Succinct：The amount of data proved is relatively small
- Non-interactive：Little or no interaction
- ARguments：The verifier is only protected against computationally limited provers. Provers with enough computational power can create proofs/arguments about wrong statements (Note that with enough computational power, any public-key encryption can be broken). This is also called “computational soundness,’’ as opposed to “perfect soundness”.
- of Knowledge: It is not possible for the prover to construct a proof/argument without knowing a certain so-called witness.
How Does the Zero-Knowledge Proof Work?
If we want to use zk-SNARK zero-knowledge proof, we must first form a calculation of what we will prove, express it with the arithmetic circuit, and finally, generate QAP. The most important logic of zero-knowledge proof is QAP.
The workflow of the zero-knowledge proof is divided into four steps:
- The computational problem to be proved is transformed into arithmetic circuits.
- The arithmetic circuits is converted to R1CS
- R1CS is converted to QAP
- The implementation of zkSNARK algorithm based on QAP
The first three steps can be completed in advance, whereas the real proof process is mainly in step 4.
What Is QAP?
QAP is an abbreviation of Quadratic Arithmetic Program. Before understanding QAP, let’s first understand what a P problem is.
The P problem, which is what we usually say, is solvable in polynomial time. The opposite is the NP problem.
NP problem, that is, it is not solvable in polynomial time, but given a solution, it can be verified in polynomial time whether the solution is correct.
The NPC problem, which is the core of the NP problem, can be attributed to an NPC problem. If the NPC problem of an NP problem is solved, it means that the original NP problem is also solved.
The QAP problem discussed here is actually an NP problem. The QAP problem can verify whether a solution is correct in polynomial time, however, it is difficult to arrive at a correct solution in polynomial time with limited resources in a limited time.
How to verify the solution of a QAP problem?
The problem involves three sets of polynomials:
[v0(x) , v1(x), v2(x), … , vm(x) ]
[w0(x) , w1(x), w2(x), … , wm(x) ]
[y0(x) , y1(x), y2(x), … , ym(x) ]
And a target polynomial
t(x)=(x-1)(x-2)…(x-d)，(Where d is the number of circuit gates)
If given witness
u = [a1, a2, … , am]
To make a polynomial
(v0(x) + a1v1(x) + a2v2(x) +…+amvm(x))·(w0(x) + a1w1(x) + a2w2(x) +…+amwm(x)) — (w0(x) + a1y1(x) + a2y2(x) +…+amym(x))
If u is divisible by t(x), then u is a solution of QAP problem.
So how do you turn a common computational problem into a QAP problem?
First, we will flatten the computational problem into circuits, such as x3 + x + 5, which we can flatten into four circuits.
- sym1 = x * x
- y = sym1 * x
- sym2 = y + x
- out = sym2 + 5
Sym1, sym2, and y are the temporary variables used in the whole calculation process, while out is the final output of the calculation process.
Next, we convert the circuits to the R1CS form.
Let’s first put all the variables in the circuit into a vector s, and add a constant one, which is 1.
s = [one, x, out, sym1, y, sym2]
Then we can change each gate. For example, for the gate sym1 = x * x，We can think of it as x * x — sym1 = 0
x = [one, x, out, sym1, y, sym2] · [0, 1, 0, 0, 0, 0] = s · [0, 1, 0, 0, 0, 0]
sym1 = [one, x, out, sym1, y, sym2] · [0, 1, 0, 0, 0, 0] = s · [0, 0, 0, 1, 0, 0]
Where sym1 = x * x，We can express it as:
(s · [0, 1, 0, 0, 0, 0]) * (s · [0, 1, 0, 0, 0, 0]) — s · [0, 0, 0, 1, 0, 0] = 0 （I）
Similarly, we can express the other three gates in this form. They are:
- y = sym1 * x
- sym2 = y + x
- out = sym2 + 5
(s · [0, 0, 0, 1, 0, 0]) * (s · [0, 1, 0, 0, 0, 0]) — s · [0, 0, 0, 0, 1, 0] = 0 （II）
(s · [0, 1, 0, 0, 1, 0]) * (s · [1, 0, 0, 0, 0, 0]) — s · [0, 0, 0, 0, 0, 1] = 0 （III）
(s · [5, 0, 0, 0, 0, 1]) * (s · [1, 0, 0, 0, 0, 0]) — s · [0, 0, 1, 0, 0, 0] = 0 （IV）
We use the three vectors of the above four formulas (I), (II), (III), and (IV) to denote A, B, and C respectively.
We can see that the values of vectors of A, B and C are only related to the computational problem, but not to the solution of the problem. At this time, the A, B, C vector group constitutes R1CS.
Further, we can number the gate circuit: the first gate number is 1, the second gate number is 2, and so on. The vector group A is regarded as a vector function A(x) = [A0(x), A1(x), A2(x), A3(x), A4(x), A5(x)], where the number of the gate X.
Then, A0(x) is equivalent to a three-degree polynomial passing through (1,0), (2,0), (3,0), (4,5). How do we get this polynomial? The answer is the Lagrange interpolation.
With it, we can find A0(x) = 0.833x3–5x2 + 9.166x — 5.
Similarly, we can find out A1(x), A2(x),…, B0(x), B1(x), B2(x),…, C0(x), C1(x), C2(x),…
So we get the A, B, C vectors for the x polynomial
A(x) = [A0(x), A1(x), A2(x), A3(x), A4(x), A5(x)]
B(x) = [B0(x), B1(x), B2(x), B3(x), B4(x), B5(x)]
C(x) = [C0(x), C1(x), C2(x), C3(x), C4(x), C5(x)]
Evidentially, if P(x) = (s · A(x))*(s · B(x)) — s · C(x), the polynomial is 0 when x=1, 2, 3, 4, then it means that all four gate circuits are valid, which is equivalent to proving that s is the solution of the original problem.
When x=1, 2, 3, 4, the polynomial P(x) = (s · A(x))*(s · B(x)) — s · C(x) is always 0, which also means P(x) can be divisible by (x-1)(x-2)(x-3)(x-4)
Therefore, the prover only needs to prove that he owns an s such that P(x) is divisible by (x-1)(x-2)(x-3)(x-4).
In the actual proof process, you don’t need to test all the x points, but to randomly check just one point.
What Is t(x)?
As previously stated: t(x) = (x-1)(x-2)…(x-d), where d is the number of gates.
In 2016 the Groth16 algorithm was proposed. The algorithm greatly reduces the amount of data proved, and also greatly improves the speed of verification. From PPT, we can see that the validation process only needs to validate an equation, so the validation is very fast.
In Groth16, it is still necessary to provide three values: A, B, and C. The Groth algorithm have proved mathematically that at least two values can be used as proof. However, due to the huge amount of calculation, the three values is more practical.
The Groth16 algorithm involves a lot of cryptography, including elliptic curves, homomorphic secret sharing, bilinear mapping, etc.
Why does zero-knowledge proof achieve non-interactivity?
The principle is to use the CRS (Common Reference String). In fact, CRS pre-generates the random number and the number of challenges to be generated during the challenge process. It then generates their corresponding homomorphic hidden features used in the proof and verification process based on these random numbers and challenge numbers.
After that, the random numbers and challenge numbers are destroyed. These random numbers and challenge numbers are called toxic waste and are labeled as such because if they are not destroyed, they can falsify proof.
In Zcast project, in order to generate these random numbers and challenge numbers safely, we use MPC technology.
After reading this article do you have a better understanding of “Zero-Knowledge Proofs? If you still want to know more you can join us on our Discord and we’ll answer all your questions.