# On Interactive Proofs and Zero-Knowledge: A Primer

## Joint work with Sebastian Gajek

What you can expect from this read:

- Basic understanding of the properties of proof systems, including completeness, soundness and zero-knowledge, and thus the fundamental properties of zkSNARKs, zkSTARKs and NIZKs
- The simulation paradigm, computational indistinguishability and witness extraction
- Schnorr’s protocol, including showcase proofs of completeness, soundness and zero-knowledge

Interactive proofs are essential concepts to the very notion of verifiable computation — a powerful cryptographic tool to assure the computational integrity of smart contract, program or database executions. Most importantly, the theory of interactive proofs paves the ground for non-interactive zero-knowledge proofs (NIZKs), succinct non-interactive arguments of knowledge (SNARKs), and scalable transparent arguments of knowledge (STARKs). In this article we describe the foundational theory behind interactive proof systems and hope to ease the entry for newcomers into the very exciting and promising field of SNARKs and STARKs.

# 1. Introduction

Proof systems have recently gained tremendous interest in the context of assessing the integrity of computation. For example, they can help implement a “cryptographic” auditor to certify that a mathematical function, like a search query over a database, was indeed computed, thus implying the correctness of the output in the light of a potentially malicious database owner. Zero-knowledge is a wonderful property of proof systems that guarantees privacy of the computation. It enables one to verify the computation over a dataset without revealing any partial information about the data. Against this background, zero-knowledge systems find broad application in privacy-sensitive domains like anonymous identification, database set-membership queries or blockchain payments.

## 1.1 What is a proof system?

In computational complexity theory, an interactive proof system is an abstract machine that models computation as the exchange of messages between two parties. The parties, known as prover **P** and verifier **V**, interact by exchanging messages in order to ascertain whether a statement is correct.

The prover is all-powerful and possesses unlimited computational resources, but cannot be trusted, while the verifier has bounded computational power. Messages are sent between the verifier and prover until the verifier can validate or falsify a statement. All interactive proof systems have two key properties:

**Completeness:** if the statement is true, the honest verifier (that is, one following the protocol properly) can be convinced of this fact by an honest prover.

**Soundness:** if the statement is false, no cheating/malicious prover can convince the honest verifier that it is true, except with some negligible probability.

Put in different words, the soundness property asserts that the proof verification procedure cannot be “tricked” into accepting false statements. Thus, soundness captures the verifier’s ability to protect itself from being convinced of false statements (no matter what the prover does in order to fool the verifier). On the other hand, completeness captures *simply* the ability of some prover to convince the verifier of true statements (belonging to some predetermined set of true statements). *Note that both properties are essential to the very notion of a proof system.*

## 1.2 Privacy through Zero-Knowledge

Zero-knowledge (interactive) proof systems have the remarkable additional property of yielding nothing beyond the validity of the assertion. Note, however, that the property only makes sense if the prover has some secret information — in zk-jargon referred to as the **witness** — and wants the verifier to learn no partial information about it. Hence, the third magical property of proof systems can be summarized as

**Zero-knowledge:** if the statement is true, no cheating verifier learns anything other than this fact.

The zero-knowledge literature has a canonical example to explain the property. In this example, there is a cave whose path divides into two. These two paths are connected by a door that can only be opened with a secret word. One person, called Alice, wants to prove to another person, Bob, that she knows the secret word without betraying it. For this, both parties first place themselves at the entrance of the cave. Then Alice enters and follows one of the two paths — which one is up to her. Bob cannot tell from the entrance which path Alice chose.

Then Bob enters the cave and runs to the fork. He chooses one of the two paths and calls this into the cave. Alice’s task is to get through this path to Bob. If she succeeds, she is one step closer to the goal of proving to Bob that she knows the secret word.

If Alice chose the path that Bob called, she does not need the secret word. Only if she chose the opposite path must she pass through the door to get onto the right path. This means that a malicious Alice, who is unaware of the secret word, can convince Bob with probability at most 50% in a single run of this protocol. (That’s why Alice and Bob repeat the protocol to reduce the soundness error probability.) The protocol is a zero-knowledge proof because no matter how many times Alice and Bob repeat the process, Alice, if she knows the secret word, will always emerge from the path chosen by Bob, while Bob learns nothing about the secret [1].
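The repetition argument can be sketched in a few lines of Python (a toy simulation with hypothetical path names; an honest Alice always succeeds, while a cheating Alice passes each round only by luck):

```python
import random

def cave_round(knows_secret: bool) -> bool:
    """One round: Alice picks a path at random, Bob calls a random path.
    Alice emerges correctly if she knows the secret word (she can always
    cross through the door), or if she luckily picked the called path."""
    alice_path = random.choice(("left", "right"))
    bob_call = random.choice(("left", "right"))
    return knows_secret or alice_path == bob_call

def convince(knows_secret: bool, rounds: int) -> bool:
    """Bob is convinced only if Alice succeeds in every single round."""
    return all(cave_round(knows_secret) for _ in range(rounds))

# An honest Alice always convinces Bob; a cheating Alice passes all of
# 20 rounds with probability 2^-20, i.e. roughly one in a million.
assert convince(True, 20)
```

Each extra round halves the cheater’s success probability, which is exactly the soundness-error amplification described above.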

## 1.3 The Canonical Application: Identification Schemes

One of the first applications of zero-knowledge interactive proof systems was identification schemes. In an **identification scheme** a party, called the prover Alice, aims to convince another party, the verifier Bob, that she is a particular party. This process of entity authentication is an ingredient to granting the party some particular service, such as access to the most recent episodes of Game of Thrones because the content provider identified the party as a premium member. To this end the party is in possession of a secret witness, such as a cryptographic key or password, that **no** other party, including the content provider, is aware of. Let us now exemplify the three properties in the Game of Thrones streaming case.

**Completeness** guarantees that, provided Alice is a premium subscriber, she will get access to the premium content. **Soundness** guarantees that no other party can claim to be Alice and get access to the premium content. **Zero-knowledge** ensures that no other party, including the content provider, learns the secret witness and can impersonate Alice in future identifications. The notion of zero-knowledge implies that the proof of identity is non-transferable.

At the bottom of identification schemes lies the idea that Alice knows some secret (directly linked to her identity) and she proves to Bob knowledge of the secret. To prevent a malicious Bob from impersonating Alice in the future, we expect from the protocol that Bob learns no partial information about Alice’s secret. Vice versa, a party impersonating Alice shall not be able to produce a valid proof and get access to the GoT streams.

# 2. Formal Definitions: The Devil is in the Detail

While the properties of a proof system may seem intuitive, defining completeness, soundness or zero-knowledge in a rigorous and mathematically sound way took the research community decades. In order to formalize the properties we need a model of computation to express what an interaction between two computing devices is. In cryptography the preferred choice is the Turing machine, named after the “father” of computer science, Alan Turing.

## 2.1 Interactive Turing Machine

An interactive Turing machine (ITM) is a (deterministic) multi-tape Turing machine (*see Figure 1*). The tapes are a read-only input tape, a read-only random tape, a read-and-write work tape, a write-only output tape, a pair of communication tapes, and a read-and-write switch tape consisting of a single cell. One communication tape is read-only, and the other is write-only. Each ITM is associated with a single bit, called its identity. An ITM is said to be active, in a configuration, if the content of its switch tape equals the machine’s identity. Otherwise the machine is said to be idle. While a machine is idle, its state, the locations of its heads on the various tapes, and the contents of its writable tapes are not modified.

The content of the input tape is called input, the content of the random tape is called random input, and the content of the output tape at termination is called output. The content written on the write-only communication tape during a (time) period in which the machine is active is called the message sent at that period. Likewise, the content read from the read-only communication tape during an active period is called the message received (at that period). (Without loss of generality, the machine movements on both communication tapes are in only one direction, e.g., from left to right.)

You may now want to call the interactive Turing machine **M₁** the prover **P** and **M₂** the verifier **V**, respectively. Like humans, machines need to understand a language **L** to properly interact. We do not plan to make a deep dive into language theory and complexity here. All we need to know is that the statement the prover **P** wants to prove must — very loosely speaking — be encoded in a particular language. With **(x,w) ∈ L** and **(x,w) ∉ L** we express a correct or incorrect proof statement, or in other words, whether the pair **(x,w)** is a member of the language **L**. Typically the value **x** is public and known to both the prover and verifier, and the parameter **w** (called the witness) is private and known to the prover only.

Formally a language **L** is defined as a set (possibly infinite) of strings over some finite alphabet, formed according to a specific set of rules. For example a language **L** over the alphabet **Σ** = {0, 1, 2, -, =} can have the following grammar:

- Every non-empty string that does not contain “-” or “=” and does not start with “0” is in **L**
- A string containing “=” is in **L** if and only if there is exactly one “=”, and it separates two valid strings of **L**

Under these rules, the string “2-1=1” is in **L**, but the string “=21=0-” is not. During the setup phase of a communication protocol, the interacting parties often agree upon a specific language with specific rules.
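A recursive membership check for this toy language might look as follows. Note that the two stated rules do not cover strings containing “-” on their own, so this sketch adds an assumed analogous rule for “-” (a hypothetical addition) to make the example string “2-1=1” work:

```python
def in_L(s: str) -> bool:
    """Membership check for the toy language L over {0, 1, 2, -, =}.
    Implements the two stated rules, plus an ASSUMED analogous rule
    for '-' (the article's rules as written do not cover '-')."""
    if "=" in s:
        # exactly one '=', separating two valid strings of L
        left, _, right = s.partition("=")
        return "=" not in right and in_L(left) and in_L(right)
    if "-" in s:
        # assumed rule: one '-', separating two valid strings of L
        left, _, right = s.partition("-")
        return "-" not in right and in_L(left) and in_L(right)
    # base rule: non-empty, no '-'/'=', must not start with '0'
    return s != "" and not s.startswith("0") and all(c in "012" for c in s)

assert in_L("2-1=1")        # in L, as claimed in the text
assert not in_L("=21=0-")   # not in L
```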

## 2.2 Interactive Proof Systems

A pair of interactive Turing machines **(P, V)** is called an interactive proof system for a language **L** if machine **V** is polynomial-time and the following conditions hold:

**Completeness:** *For all **(x,w)** in the language **L**, the probability that the prover **P** convinces the honest verifier **V** is overwhelming. That is,*

∀(x,w) ∈ L : Pr[ ⟨P(x,w),V(x)⟩ = 1 ] ≥ 1 − negl

**Soundness (Weak Version aka Unforgeability):** *For all **x** not in the language **L**, the probability that a cheating prover **P’** convinces the honest verifier **V** is negligible. That is,*

∀x ∉ L : Pr[ ⟨P’(x),V(x)⟩ = 1 ] ≤ negl

**Special Soundness (Strong Version aka Knowledge Extractability):** *For every **(x,w)** in the language **L**, there exists a polynomial-time algorithm **E** (the extractor) such that the witness **w** can be extracted from valid conversations between the prover **P** and the verifier **V**. That is,*

∀(x,w) ∈ L, ∃E : Pr[ E⟨P(x,w),V(x)⟩ = w ] ≥ 1 − negl

Soundness comes in two flavors. The weaker notion of soundness captures a bit the intuition of an unforgeability property. In fact, proof systems and digital signatures have much in common: through the Fiat-Shamir transformation one can turn a class of protocols known as Sigma protocols into digital signatures. We discuss the technique in a forthcoming article.

The stronger notion is often referred to as the *witness extraction* property. Protocols satisfying the property are called **proofs of knowledge**. Essentially the existence of the extractor frames in an *algorithmic way* the idea of computational knowledge. Suppose the prover **P** knows the witness **w**. Then she must have used the witness during the protocol execution (and thus encoded it somehow into the protocol messages) in order to successfully convince the verifier. So suppose we had a magical angel, the extractor, watching **P**’s back and attesting to the witness usage in the protocol execution. Clearly that would yield a strong proof of knowledge of the witness. In the computational world angels sadly do not exist. Rather, **P** and **V** are Turing machines, and so is the extractor **E**. To capture the intuition of “watching **P**’s back” we let the extractor *compute the witness* from the observation of the interaction between the prover **P** and verifier **V**. Put in other words: if you can pull the witness out of **P**, it must somehow have been there.

The point that must be stressed here is that simple passive eavesdropping on the interaction does **not** leak any partial information about the witness. Otherwise the verifier **V** would learn the witness **w** and the protocol would hardly satisfy any form of zero-knowledge. The extractor **E** (as opposed to the verifier) hence is given some extra power! In the example below we will observe that the extra power is the ability to rewind the prover **P**. Rewinding is a technique which reflects a core idea of programs: one can execute a program multiple times (think of calling a function) and it produces a deterministic outcome, if one controls the program’s random tapes/inputs. Looking ahead, rewinding the prover **P** (specifically, executing Schnorr’s protocol twice while rewinding the prover to a particular state before the second execution) already suffices to gather all the relevant information for the witness extraction. Note again that for the protocol to satisfy zero-knowledge the verifier must lack that additional extra power.
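The rewinding idea, that fixing a program’s random input makes its behavior reproducible, can be illustrated generically in Python (a minimal sketch, not tied to any particular protocol):

```python
import random

def randomized_step(seed: int) -> int:
    """A 'prover step' whose output depends only on its random tape,
    here modeled as a seeded pseudorandom generator."""
    rng = random.Random(seed)       # the random tape, under our control
    return rng.randrange(1_000_000)

# Running the program twice with the same random tape replays the exact
# same choices: this is precisely what 'rewinding' the prover exploits.
assert randomized_step(42) == randomized_step(42)
```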

**Zero-Knowledge:** *For every **(x,w)** in the language **L** and every verifier **V**, there exists a simulator **S** such that no polynomial-time distinguisher **D** can distinguish an execution of the simulated protocol from a real interaction between the prover **P** and verifier **V**. That is,*

∀(x,w) ∈ L, ∀V ∃S : | Pr[ D⟨P(x,w),V(x)⟩ = 1 ] − Pr[ D(S(x)) = 1 ] | ≤ negl

We first need to recall the very notion of computational indistinguishability to understand the simulation mechanism. Indistinguishability is a powerful concept in numerous fields of computer science. It is formalized by an efficiently computable decision-making algorithm, the distinguisher **D**, who, on input a string sampled from either of two sets, must decide from which set it was taken. Consider, for ease of exposition, the following mental experiment. Suppose a human and a machine simulating the human are hidden behind a curtain in a theatre and the audience interacts with either of the two through the curtain. After a long conversation, if the audience cannot tell apart the human from the machine (say we had a 50:50 decision), we would conclude the machine is “as good as” or “sounds/looks/behaves as” the human.

Now consider the case where we compare the interaction of the prover **P** and verifier **V** with the execution a simulator **S** produces, with the following caveat: in the real interaction between the prover and verifier, **P** makes explicit use of her witness **w**, while the simulator **S** is only given the public information **x**. Despite the information disadvantage, let’s assume the simulator produces a transcript that “looks like” an interaction between the prover and the verifier (see Figure 2).

Then it must be the case that the real protocol transcript leaks as much information as the simulated transcript; otherwise the executions would be distinguishable. By construction, however, the simulated transcript is zero-knowledge: it leaks no partial information about the witness **w**. In fact, the simulator cannot leak any information at all because it is **not** in possession of the witness. The conclusion must be that the real protocol transcript leaks no information about the witness either, as the simulated transcript is indistinguishable from, that is, as good as, the real one. Hence the real protocol is zero-knowledge.

# 3. Schnorr’s Protocol aka the “Hello World” Honest-Verifier Zero-Knowledge Proof System

One of the simplest and most frequently used proofs of knowledge, the proof of knowledge of a discrete logarithm, is due to Claus P. Schnorr, a German mathematician and cryptographer. His protocol has three rounds and is defined over a cyclic group **G** of order **q** with generator **g**. In order to prove in zero-knowledge the language **L = { (x,w) : x = gʷ }**, where the witness **w** is the hard-to-compute discrete log exponent, the prover interacts with the verifier as illustrated in Figure 3.

## 3.1 Proof System Specification

**Setup:**

- The prover **P** has secret input **w** and public input **x = gʷ**.
- The verifier **V** has only the public input **x = gʷ**.
- Both **P** and **V** have in common the public group parameters **g** and **q**.

**Protocol:**

- Prover **P** samples a random integer **r ∈ {0,…,q-1}** and sends the random group element **a = gʳ** to the verifier **V**.
- Verifier **V** chooses a random challenge **e ∈ {0,…,q-1}** and sends it to the prover **P**.
- Prover **P** responds to the challenge with the answer **z = r + e·w mod q**.
- Verifier **V** accepts the response (and thus is convinced of the proof) if and only if **gᶻ** equals **a·xᵉ**.
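The four steps can be sketched in Python over a toy group. The small parameters below are an assumption for illustration only and offer no security; real deployments use groups of roughly 256-bit order:

```python
import secrets

# Toy group parameters (ASSUMPTION: tiny and insecure, for illustration).
p = 2039   # safe prime, p = 2q + 1
q = 1019   # prime order of the subgroup
g = 4      # generator of the order-q subgroup of Z_p*

def prover_commit():
    """Step 1: P samples r and sends the commitment a = g^r."""
    r = secrets.randbelow(q)
    return r, pow(g, r, p)

def verifier_challenge():
    """Step 2: V samples a random challenge e in {0,...,q-1}."""
    return secrets.randbelow(q)

def prover_respond(w, r, e):
    """Step 3: P answers z = r + e*w mod q."""
    return (r + e * w) % q

def verify(x, a, e, z):
    """Step 4: V accepts iff g^z == a * x^e."""
    return pow(g, z, p) == (a * pow(x, e, p)) % p

# One honest run of the protocol.
w = secrets.randbelow(q)   # the secret witness
x = pow(g, w, p)           # public statement x = g^w
r, a = prover_commit()
e = verifier_challenge()
z = prover_respond(w, r, e)
assert verify(x, a, e, z)  # completeness: honest runs always accept
```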

## 3.2 Security Analysis

We now analyze the Schnorr proof system and argue that it meets the above properties.

**Completeness** is shown by a simple equivalence reformulation:

**∀r,e : gᶻ = gʳ⁺ᵉʷ = gʳ(gʷ)ᵉ = axᵉ**

**Special soundness** is shown in Figure 4 by constructing the extractor **E** that computes the correct witness **w** and thus breaks the discrete log assumption (namely, it is assumed to be a computationally hard problem to compute the witness **w** given **x = gʷ**), thus violating the weak soundness property. In other words, we reduce the protocol soundness to the discrete log problem with the help of the extractor **E**.

For the extraction to work, **E** runs two instances of the prover, denoted as **P** and **P***. During the simulation the extractor acts as the verifier **V** for the two instances and computes the witness through the following strategy:

- Extractor **E** feeds the first prover instance **P** with **x = gʷ** (and the random coins **r** to parameterize its random tapes) and waits to receive the first prover message **a = gʳ**.
- Extractor **E** chooses a random challenge **e ∈ {0,…,q-1}**, sends **e** to the prover **P** and waits to receive the response **z = r + e·w**.
- Now it rewinds the prover **P** to step 2. (Technically the extractor invokes a second prover instance **P*** with identical input. The fact that we invoke the prover with the same randomness **r** makes sure she makes the same (random) choices as **P** did for the first message **a**. Hence we have rewound **P**.)
- Extractor **E** chooses a different challenge **e’**, sends it to **P*** and waits to receive the answer **z’ = r + e’·w**.
- Lastly the witness **w** gets extracted by computing the differences of the challenge-response pairs:

**(z-z')/(e-e') = ((r+ew)-(r+e'w))/(e-e') = (e-e')/(e-e') · w = w**
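The rewinding strategy and the extraction formula can be sketched as follows. This is a toy Python demo with illustrative, insecure group parameters (an assumption, not part of the protocol); rewinding is modeled by reusing the prover’s coins **r** for both challenges, and division by **e − e’** becomes a modular inverse:

```python
import secrets

# Toy group parameters (ASSUMPTION: tiny and insecure, for illustration).
p, q, g = 2039, 1019, 4   # safe prime p = 2q + 1, g generates order-q subgroup

def extract(a, e1, z1, e2, z2):
    """Special soundness: given two accepting transcripts (a, e1, z1) and
    (a, e2, z2) with the SAME first message a but different challenges,
    recover w = (z1 - z2) / (e1 - e2) mod q."""
    assert e1 != e2
    inv = pow((e1 - e2) % q, -1, q)   # modular inverse of e1 - e2
    return ((z1 - z2) * inv) % q

# Rewinding demo: freeze the prover's randomness r, answer two challenges.
w = secrets.randbelow(q)
r = secrets.randbelow(q)          # prover's coins, reused after rewinding
a = pow(g, r, p)                  # same first message in both executions
e1, e2 = 17, 23                   # two distinct challenges
z1 = (r + e1 * w) % q             # response in the first execution
z2 = (r + e2 * w) % q             # response after rewinding to step 2
assert extract(a, e1, z1, e2, z2) == w
```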

**Zero-knowledge:** A protocol is deemed zero-knowledge if for any verifier **V** there exists a simulator that can create conversations with the same distribution as conversations between the prover and verifier. We stress that Schnorr’s protocol (as specified above) is **NOT** zero-knowledge. The reason lies in the inability to construct a simulator for all verifiers: the simulation falls short for malicious verifiers with unpredictable choices of the message **e**. For the simulation to go through, the simulator must guess the challenge **e** in advance. Given the exponential sampling space of **e**, the odds that this event happens are negligible.

Schnorr’s protocol satisfies, however, a weaker privacy notion known as *honest-verifier zero-knowledge*. In this model the verifier follows the protocol execution blindly and does not deviate from the above protocol specification. Under this simplification the simulator no longer needs to anticipate arbitrary and thus hard-to-predict behavior of the verifier. Rather, the simulator can now simulate a random choice of the challenge message **e** ahead of time, as, by assumption, an honest verifier would behave exactly like that. It is even possible to take any given value **e** and then produce a conversation where **e** occurs as the challenge. In other words, the simulator does not have to insist on choosing **e** itself; it can take an **e**-value as input. That makes the simulation stronger, as it shows zero-knowledge in the presence of statically corrupted verifiers. To see this, we construct the simulator **S**, given the public input **x = gʷ**, in the following way:

- The simulator **S** chooses **z** uniformly at random, as well as the challenge **e** the honest verifier **V** will issue.
- The simulator **S** computes the prover’s initial message as **a = gᶻ/xᵉ**. Furthermore it simulates the verifier’s challenge message with **e**.
- Finally, the simulator **S** simulates the third message with **z**.

Analysing the simulation, we see the simulator **S** creates a valid interaction of prover and verifier. The verifier accepts the simulated protocol messages because it holds that

**a·xᵉ = (gᶻ/xᵉ)·xᵉ = gᶻ**

The simulation (**a, e, z**) has exactly the same probability distribution as real conversations between the honest prover and the honest verifier. (Although it remains to prove that.)
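The simulator’s strategy can be sketched in Python (the small group parameters are an assumption for illustration only; note the simulator never touches the witness **w**):

```python
import secrets

# Toy group parameters (ASSUMPTION: tiny and insecure, for illustration).
p, q, g = 2039, 1019, 4   # safe prime p = 2q + 1, g generates order-q subgroup

def simulate(x, e=None):
    """HVZK simulator: given only the public x (no witness!), sample z and
    e, then solve for the first message a = g^z / x^e so that the
    verification equation g^z == a * x^e holds by construction."""
    z = secrets.randbelow(q)
    if e is None:
        e = secrets.randbelow(q)   # honest verifier's challenge, fixed ahead
    a = (pow(g, z, p) * pow(x, -e, p)) % p   # a = g^z * x^(-e)
    return a, e, z

w = secrets.randbelow(q)
x = pow(g, w, p)
a, e, z = simulate(x)                            # no w involved at all
assert pow(g, z, p) == (a * pow(x, e, p)) % p    # the verifier accepts
```

Since **z** and **e** are uniform and **a** is determined by the verification equation, the simulated triple matches the distribution of honest transcripts, mirroring the argument above.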

# 4. Conclusion and Outlook on Sigma-Protocols and NIZKs

One might argue that honest-verifier zero-knowledge is not a useful property, since it does not make sense in practice to assume that a verifier is honest. The point, however, is that protocols with this weaker property can often be used as building blocks in constructions that are indeed secure against *actively cheating* verifiers, as we shall see. Protocols which have the same three-move structure (commitment, challenge, response) as Schnorr’s protocol are called **Σ-protocols**. Efficient Sigma protocols exist for proving various statements, such as those pertaining to discrete logarithms. Moreover, through the Fiat-Shamir transformation they can efficiently be turned into **non-interactive zero-knowledge proofs of knowledge (NIZKs)**.

We plan to write about Sigma protocols and the Fiat-Shamir transformation, along with some concrete examples of NIZK constructions, in the next article. Stay tuned!

# Further Reading

The theory of interactive proof systems is far more elaborate than what we treated in this article. Here is a list of papers we recommend for diving deeper into this exciting field.

- Claus P. Schnorr: Efficient signature generation for smart cards. Journal of Cryptology, 4(3):239–252, 1991.
- Mihir Bellare and Oded Goldreich: On Defining Proofs of Knowledge. Crypto’92.
- Oded Goldreich: Foundations of Cryptography — Basic Tools. Cambridge University Press, 2001.