A Simple Explanation of Zero Knowledge Proofs
Zero Knowledge proofs are easy to understand. They are based on a line of reasoning that we commonly use in our day to day life. It is the same line of reasoning we use when we make assumptions about our surroundings.
If you sit in a room with no windows and hear the pitter-patter of rain on the roof, you assume that it’s raining even if don’t see the rain. When you come home from work and your roommate’s car is in the driveway, you assume your roommate is in your house even if you don’t see them in their physical form.
This type of thinking is integral to our day-to-day. It is also the core principle behind zero-knowledge proofs. In a zero-knowledge proof, a prover wants to convince a verifier of some public statement. However, the proof of this statement holds sensitive details that can’t be shared. To get around this, the prover creates an indirect proof. If the indirect proof is true then so is the direct proof. If the indirect proof leaks none of the aforementioned sensitive details, then it is a zero-knowledge proof.
Suppose your roommate, whom we’re going to name Barry, is hiding from you because he is introverted and doesn’t want to make small talk. But Barry doesn’t want you to blast music, which you often do when you’re alone. So he finds a different way to convince you that he is home. He leaves his car in the driveway and his jacket on the couch. Now you assume he is home even though you haven’t seen him. He has just given you a zero-knowledge proof of his presence.
Using similar reasoning, I am now going to describe a simple zero-knowledge proof protocol from the paper, “How to Prove a Theorem So No one Else Can Claim It” by Manuel Blum , 1986.
Justification for writing about this protocol
This protocol is very different from the zk-SNARKs protocols that are coming out today and being used in blockchain applications. zk-SNARKs are succinct, non-interactive, and often prove solutions over arithmetic circuits. Their succinctness requires the use of elliptic curves and their non-interactivity requires either a random oracle or a common reference string.
The protocol discussed here is simpler than all that. It doesn’t care about succinctness and it is explicitly interactive. Thus it avoids much of the complexity of zk-SNARKs. It also uses a graph, which is much easier to visualize than an arithmetic circuit.
My hope is that after reading this article, you will have a stronger intuition about the role zero-knowledge proofs play in your cryptography toolbox. And next time you find yourself drowning in a sea of elliptic curve pairings, you will think back to this simple protocol from the 80’s and find some steady ground.
A Simple Construction
First, let’s go over some terms related to Graphs.
- a Graph is composed of vertices and edges.
- A path is a sequence of edges.
- A cycle is a path that closes in on itself.
- A hamiltonian cycle is a cycle that touches each vertex in the graph exactly once. In a graph with n-vertices, a hamiltonian cycle is a cycle with n - vertices. Below are some images of small graphs, with hamiltonian cycles highlighted in red.
As you can see, each of these paths close in on themselves, making them cycles. Furthermore, these cycles touch each vertex exactly once, which makes them hamiltonian cycles.
In this protocol, a prover wants to convince the verifier that they know of a hamiltonian cycle (cycle with n vertices) in some public graph.
Imagine you deliver newspapers for a living. You have a list of all the houses you need to deliver to. If you think of a map as a graph, then the houses are vertices, and the routes between the houses are edges. You want to convince your boss that you have visited each house exactly once. But you don’t want your boss to know the order in which you visited the houses, in case she decides to pop up on you randomly.
Imagine Peter (the prover) is a paper deliverer and Veronica (the verifier) is his micromanaging boss. Peter does not want to show his route to Veronica. But he does want to prove that everyone got their newspaper (which can only happen if he visited every house once). To accomplish this, Peter needs to provide indirect proofs to Veronica for two things. First — that he knows of a secret valid paper route (which is the secret hamiltonian cycle), and second —that he acted honestly while convincing her of this.
If the protocol allows an honest Peter to prove that he knows of a secret hamiltonian cycle in the graph, then the protocol is correct. If the protocol also allows a slightly shady Peter to prove that he acted honestly, then the protocol is sound. If Peter can do all of the above without leaking any information that Veronica doesn’t already know, then the protocol is zero-knowledge.
Correctness — Assuming Peter (The Prover) is honest
For simplicity’s sake, we will not worry about soundness for the moment. We will assume that Peter is honest.
The protocol proceeds as follows. Peter creates a New Graph. The New Graph is different from the Old Graph except for the fact that it preserves the length and cardinality of each path and cycle.
For instance, if the Old Graph has 2 cycles of length 3, then the New Graph will also have 2 cycles of length 3. If the Old Graph has 1 path of length 4, then the New Graph will also have 1 path of length 4. If the Old Graph has a hamiltonian cycle (a cycle with n vertices), then the New Graph also has a hamiltonian cycle. Therefore, a hamiltonian cycle in the New Graph is indirect evidence of a hamiltonian cycle in the Old Graph.
Peter does the following. First, he takes a graph and he duplicates it. Then he switches the labels of the vertices at random, making sure to keep track of the original labels. At the end of this, he will have produced a New Graph, and a chart that maps old vertices to new vertices. He keeps both of these artifacts private.
As you can see, all path and cycle lengths are preserved from one graph to another. In particular, the secret hamiltonian cycle in the Old Graph exists in the New Graph, but with differently labeled vertices.
Because the labels of the vertices are randomly permuted, the New Hamiltonian Cycle reveals nothing about the Secret Hamiltonian Cycle except the fact that it is a cycle with n vertices (which is implicit in the definition, anyway). Therefore, this indirect proof is zero-knowledge.
Soundness: A Check On The Prover’s Honesty
If we lived in a world where everyone was honest all the time, we could stop here. Peter would hand over the New Hamilton cycle, Veronica would check for correctness (i.e. that it is a cycle with n vertices) and they would call it a day.
But in reality, Peter could easily make the New Hamiltonian Cycle be the cycle from a totally random graph — which does nothing to prove the original statement.
Given the above, Veronica needs a way to check that Peter created the New Graph correctly (i.e. that it is the same graph as the Old Graph but with permuted vertices).
Let’s recap for a second. Veronica already has a New Hamiltonian Cycle (cycle with n vertices) from some New Graph. If Veronica trusts Peter, then she could stop here, accept the proof, and pay Peter for delivering all the newspapers.
But if she doesn’t trust Peter (as she shouldn’t), then she needs to do an extra step. She needs a way to confirm that the New Graph was created correctly.
To do this, Veronica could check the correctness of the New Graph directly.
Unfortunately, if Veronica has access to chart, the New Graph and the New Hamiltonian Cycle, then she will also have access to the secret, which would completely defeat the purpose of this protocol (as demonstrated in the picture below).
To get around this, Peter generates multiple New Graphs. After each New Graph (and corresponding chart) is generated, Veronica will flip a coin, and depending on the flip, will ask for one of the two items listed below:
- (Heads) A hamiltonian cycle in the New Graph (which proves the statement is true, given that Peter is honest)
- (Tails) The New Graph and corresponding Chart (direct proof that the New Graph was constructed correctly).
Since Peter doesn’t know which item will be asked for in advance, he has to guess Veronica’s coin flip in order to fake a proof.
Suppose Peter wants to cheat. If he’s expecting that Veronica will flip heads and pick #1 , he will try to fake a proof by creating some bogus New Graph with a random hamiltonian cycle that has no relation to the Old Graph. But if she ends up flipping tails, Peter’s fake New Graph will be exposed.
On the other hand, if Peter guesses that Veronica will ask for #2, he‘ll create a correct New Graph that doesn’t have a hamiltonian cycle. But if Veronica actually flips heads and asks for #1, Peter will be exposed when he is unable to provide a hamiltonian cycle.
In order for Peter to cheat and win, he has to guess Veronica’s coin flip before she flips the coin. Over one iteration, Peter has a 50% chance of succeeding at this. Over two iterations he has a 50% * 50% = 25% chance of winning. Over many iterations, his probability of convincingly faking a proof shrinks to a tiny number.
If Vernoica is still happy after many successful trials, she can be pretty sure that Peter does indeed know the Secret Hamiltonian Cycle (or a paper route that reaches all of her clients) — which is the goal we originally set out to accomplish.
Conclusion
Hopefully, you are less confused about zero-knowledge proofs than you were when you started reading the article. If not — feel free to leave questions in the comments :) And if you enjoyed the article and want to see more like this, give it a clap or two.