Maths & Logic for Zero-knowledge Proofs (simplified) — Part 6

A primer for understanding the mathematical insights used to build ZK.

Biswashree Dey · Published in Coinmonks · Oct 20, 2023 · 10 min read

  • Part 1 (Proof systems & NP proofs)
  • Part 2 (Computational complexity)
  • Part 3 (Interactive Proofs)
  • Part 4 (Negligible & Gain in Knowledge)
  • Part 5 (Honest verifier zero-knowledge)

Questions to answer

- HVZK is a weak notion of zero-knowledge. How can we modify HVZK to make our zk protocol more robust against malicious counterparties? (Read sections 1 to 8)

1. Perfect Zero Knowledge (PZK)

If you recall from the last part, Honest Verifier Zero Knowledge (HVZK) only defines a zero-knowledge protocol where the verifier is honest. Hence, it is a weak notion of zk and can’t really be used in practical cases where counterparties may be malicious. Perfect Zero-Knowledge (PZK) provides a stricter definition. It also considers any malicious verifier who may use any and all efficient strategies to interact with the prover and attempt to gain knowledge from them.

PZK models a protocol against all malicious verifiers by defining a simulator where the probability distribution of outputs by the simulator S is identical to that of an interaction between a prover P and any malicious verifier V*.

Perfect Zero Knowledge protocol. Source: [2] + Personal notes
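Spelled out (a sketch of the standard formulation, where L is the language being proved and “≡” means the two distributions are identical): for every probabilistic polynomial-time verifier V* there exists a probabilistic polynomial-time simulator S such that, for every x ∈ L,

S(x) ≡ (P, V*)(x)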

We observe that S(x) is perfectly identical to (P, V*)(x). But for practical computing purposes such a strict standard is not required; we only need to cover cases that are actually feasible with realistic computing power. That relaxation is exactly what CZK captures.

2. Computational Zero Knowledge (CZK)

CZK protocols cover cases where whatever a malicious verifier can efficiently compute from their interaction with the prover can also be computed using the simulator.

Computational Zero Knowledge protocol. Source: [2] + Personal notes

This means that the probability distribution of outputs by the simulator S is computationally indistinguishable from that of an interaction between a prover P and any malicious verifier V*.
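The CZK condition has the same shape as the PZK condition, only with “identical” relaxed to “computationally indistinguishable” (a sketch, using “≈c” for computational indistinguishability): for every probabilistic polynomial-time verifier V* there exists a probabilistic polynomial-time simulator S such that, over x ∈ L,

{ (P, V*)(x) } ≈c { S(x) }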

But how do we determine computational indistinguishability?

3. Probability Ensemble

When we talk about the concept of sameness or indistinguishability, we talk about comparing two infinite sequences of distributions over strings and attempting to distinguish between the two. These infinite sequences are called probability ensembles. So, in the case of ZK protocols, one ensemble will represent the output of the interaction between the prover and verifier, and the other ensemble will represent the output of the simulator.

A probability ensemble is a collection or sequence of random variables, where each random variable represents some kind of outcome or observation at a particular point in time or space. These random variables are indexed by a countable set, typically denoted I, whose elements i can represent discrete time points, spatial locations, or any other meaningful indexing scheme. Example: suppose we record the speed of a car at each hour mark of a 4-hour journey. Here i = 0, 1, 2, 3, 4.

X[0] = 45 km/hr

X[1] = 60 km/hr

X[2] = 55 km/hr

X[3] = 49 km/hr

X[4] = 65 km/hr

Probability Ensemble. Source: [1]

Here are a few concepts you need to understand to fully appreciate the above definition.

  • Index set (I): This is a countable set that represents the indexing scheme for the random variables in the ensemble. It can be thought of as the parameter that determines when or where each random variable is observed. For example, if I represents time, then each index i∈I corresponds to a specific time point. In the car speed example above, i denoted the hour mark when each measurement was taken.
  • Random Variables (Xi): For each index i∈I, there is a corresponding random variable Xi. These random variables can take on different values at each index point and represent the uncertainty or randomness associated with the process being modeled. The random variables in the ensemble are not necessarily independent; they may exhibit various dependencies or relationships with each other. In the car speed example, Xi represented the speed at each hour mark.
  • Ensemble Sequence (X1, X2, X3….Xi): The collection of all these random variables {Xi}i∈I forms the probability ensemble. It’s essentially a sequence of random variables, where each random variable represents a particular observation or outcome associated with the corresponding index point in the index set I.

Example :

Time Series: In time series analysis, i represents discrete time points, and Xi represents observations at each time point. Examples include stock prices, temperature measurements, and financial market data.
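To make the idea concrete in code, here is a minimal sketch (my own illustration, not taken from the sources) of a probability ensemble indexed by the natural numbers, where the random variable Xn is the experiment “sample a uniformly random n-bit string”:

import random

def X(n: int) -> str:
    # Random variable X_n: a uniformly random n-bit string.
    # The family {X_n} for n = 1, 2, 3, ... is a probability ensemble:
    # one random variable per index n.
    return "".join(random.choice("01") for _ in range(n))

# Draw one sample from each of the first few random variables in the ensemble.
for n in range(1, 6):
    print(f"sample from X_{n}: {X(n)}")

This is the kind of ensemble that appears in the ZK definitions, where the index n plays the role of a security parameter rather than a time point.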

4. Computational Indistinguishability

Going back to the definition of computational indistinguishability used in CZK, the output of the prover-verifier interaction and the output of the simulator are both modeled as probability ensembles. These two ensembles (prover-verifier and simulator) can be either indexed by N (the set of all natural numbers) or

Computational Indistinguishability for ensembles indexed by N. Source [1] + Personal notes

indexed by binary strings w drawn from a set of strings of arbitrary length (like “0010”, “0011”, “0000”, “1111”).

Computational Indistinguishability for ensembles indexed by binary string. Source [1] + Personal notes

Both definitions can be unified if we represent each natural number n (Definition 1) by its unary representation 1^n (Definition 2). For example,

  • 1 is represented as “1”
  • 2 is represented as “11”
  • 3 is represented as “111”
  • 4 is represented as “1111”

So, you can think of each natural number n as a unary string “1^n”, where “^n” indicates that the symbol “1” is repeated n times.

For our purpose, we will go back to the definition of computational indistinguishability indexed by N and try to understand each of its parts.

Computational Indistinguishability for ensembles indexed by N. Source [1] + Personal notes
  • Ensemble X = {Xn}n∈N: This ensemble is indexed by natural numbers N. Each random variable Xn is associated with some probabilistic experiment, and n is a parameter that describes the size or complexity of this experiment. The notation “1^n” indicates that n is encoded in unary, as a string of n ones.
  • Ensemble Y = {Yn}n∈N: Similar to X, this ensemble is also indexed by natural numbers N.
  • Pr[D(Xn, 1^n) = 1]: This represents the probability that the algorithm D (the algorithm trying to distinguish between the two ensembles), when given the input Xn and 1^n, outputs 1. In other words, it’s the probability of algorithm D accepting the output of the experiment associated with Xn when provided with the unary representation of n, i.e. 1^n.
  • Pr[D(Yn, 1^n) = 1]: Similarly, this represents the probability that the algorithm D, given the input Yn and 1^n, outputs 1. It’s the probability of algorithm D accepting the output of the experiment associated with Yn when provided with the unary representation of n, i.e. 1^n.
  • 1/p(n): This term represents an inverse polynomial function. In computational complexity theory, “p(n)” typically represents a polynomial function of n. The condition |Pr[…] − Pr[…]| < 1/p(n) must hold for every positive polynomial p and all sufficiently large n; that is, the difference between the probabilities of algorithm D accepting the outputs of Xn and Yn must be negligible as n becomes sufficiently large. “Negligible” describes a function that eventually drops below the inverse of every polynomial as its input grows. In other words, as n grows, the two ensembles become computationally indistinguishable in terms of the probabilities observed by the algorithm D.

Ensembles Xn and Yn are said to be computationally indistinguishable if the difference between the probability of D accepting a random variable from Xn and the probability of D accepting a random variable from Yn is negligible → very nearly the same or indistinguishable.

It means that even a powerful adversary (represented by algorithm D) cannot efficiently distinguish between the two ensembles X and Y, making it computationally infeasible to break certain security properties of cryptographic systems.
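As a rough illustration of the quantity being bounded (my own sketch, not from the sources), the gap |Pr[D(Xn, 1^n) = 1] − Pr[D(Yn, 1^n) = 1]| can be estimated by sampling. Here X, Y and D are toy stand-ins; in the real definition D ranges over all probabilistic polynomial-time distinguishers:

import random

def X(n):
    # Toy experiment X_n: n fair coin flips.
    return "".join(random.choice("01") for _ in range(n))

def Y(n):
    # Toy experiment Y_n: n slightly biased coin flips (55% ones).
    return "".join("1" if random.random() < 0.55 else "0" for _ in range(n))

def D(sample, n):
    # Toy distinguisher: accept (output 1) if more than half the bits are 1.
    return 1 if sample.count("1") > n / 2 else 0

def advantage(n, trials=10_000):
    # Estimate |Pr[D(X_n, 1^n) = 1] - Pr[D(Y_n, 1^n) = 1]| by sampling.
    p_x = sum(D(X(n), n) for _ in range(trials)) / trials
    p_y = sum(D(Y(n), n) for _ in range(trials)) / trials
    return abs(p_x - p_y)

print(advantage(100))

Note that these two toy ensembles are distinguishable: the estimated gap grows towards 1 as n increases. For computationally indistinguishable ensembles, the gap would eventually drop below 1/p(n) for every polynomial p, no matter which efficient D is used.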

5. Hybrid Argument

Now suppose we encounter a malicious verifier who claims that they can distinguish between the outputs of the prover-verifier interaction and the simulator, which we have asserted to be computationally indistinguishable (i.e. the verifier claims they can gain some knowledge from the zk protocol). We can employ the hybrid argument to prove that a party claiming to distinguish between two distributions that are in fact indistinguishable is actually lying.

Assume you have a party (an adversary) claiming to have the ability to distinguish between two distributions: D0 and D1.

Setup:

You create a sequence of intermediate distributions that transition from D0 to D1, step by step. Each intermediate distribution should be “close” to the previous one, gradually becoming more like the target distribution.

For example:

- Hybrid Distribution 0 (HD0): This distribution is identical to D0, the starting distribution.

- HD1, HD2, … HDq: Each hybrid is a slight modification of the previous one, where the differences between D0 and D1 are introduced in a controlled manner, one step at a time. The final hybrid HDq is identical to D1.

Proof:

You then prove, step by step, that the party’s claimed ability to distinguish the distributions is false:

- Note that HD0 is identical to D0 and HDq is identical to D1, so distinguishing D0 from D1 is exactly the same as distinguishing HD0 from HDq.

- Prove that the adversary cannot distinguish between any two adjacent hybrids HDi and HD(i+1).

- Conclude that if the adversary could distinguish D0 from D1 with noticeable advantage, it would have to distinguish at least one adjacent pair HDi and HD(i+1) with noticeable advantage, which contradicts the previous step.

HENCE:

Since every adjacent pair of hybrids is indistinguishable and there are only polynomially many steps between D0 and D1, the adversary’s overall advantage in distinguishing them is negligible. The claimed ability leads to a contradiction, so the adversary’s claim is invalid.

Proof of indistinguishability between D0 and D1 using hybrid argument. Source: [2] + Personal notes

The idea behind the hybrid argument is to create a series of intermediate “hybrid” states or scenarios between the starting state (where the protocol is assumed to be secure) and the desired final state (where the protocol achieves its security goal). Each intermediate state is analysed and proven to have a certain property, and by gradually transitioning from the initial state to the final state through these intermediate steps, the overall security property of the protocol is established.

The difference between any two consecutive hybrids is at most epsilon. If there are q such steps between D0 and D1, then the overall difference between D0 and D1 is at most q*epsilon, which remains negligible as long as q is polynomial in the security parameter and epsilon is negligible.
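In symbols (a sketch, writing Pr[D(HDi) = 1] for the probability that a distinguisher D accepts a sample from hybrid HDi, and assuming each adjacent pair of hybrids differs by at most epsilon):

|Pr[D(D0) = 1] − Pr[D(D1) = 1]|
  ≤ Σ over i = 0 … q−1 of |Pr[D(HDi) = 1] − Pr[D(HD(i+1)) = 1]|   (triangle inequality, since HD0 = D0 and HDq = D1)
  ≤ q · epsilon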

Computational indistinguishability is established by using the hybrid argument on the operations of the algorithm D which is trying to distinguish between two probability ensembles X and Y.

6. Statistical Zero Knowledge

SZK is another relaxation of the PZK protocol, but it is less drastic than CZK. Here we require that the outputs of the simulator and of the prover-verifier interaction are statistically close, i.e. their statistical difference is negligible.

Statistical Zero Knowledge. Source: [2] + Personal notes
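For completeness, the SZK condition has the same shape as the PZK and CZK conditions above (a sketch): for every probabilistic polynomial-time verifier V* there exists a simulator S such that, for every x in the language,

Δ( S(x), (P, V*)(x) ) ≤ μ(|x|)

for some negligible function μ, where Δ is the statistical difference (variation distance) defined in the next section.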

7. Statistical Difference

The statistical difference, also known as the variation distance, between two probability distributions, X and Y, is a measure of how different these distributions are.

Statistical Difference between two distributions X and Y. Source [1]
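Written out in the notation used below, the formula in the figure is:

Δ(X, Y) = (1/2) · Σ over all outcomes α of | Pr[X = α] − Pr[Y = α] |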
  • Probability Distributions X and Y: X and Y represent two different probability distributions. These distributions could be associated with random variables, events, or outcomes from some underlying experiments or processes. Each distribution assigns probabilities to various possible outcomes or values, denoted by α. In SZK, X and Y represent the probability ensembles of the outcome of the simulator and the prover-verifier interaction.
  • Variation Distance (Statistical Difference): The variation distance, denoted as Δ(X, Y), quantifies the difference between the two distributions X and Y. For each value α, it considers the probability that α occurs under distribution X and the probability that α occurs under Y, and takes the difference between these two values. It measures how much the two distributions differ in terms of the likelihood of different outcomes.
  • Absolute Difference (|Pr[Xn = α] - Pr[Yn = α]|) : Calculates the absolute difference in the probability of the outcome α between the two distributions X and Y. It tells you how much more or less likely α is in one distribution compared to the other.
  • Summation Over All Outcomes (α): The formula sums up these absolute differences over all possible outcomes α. This accounts for the differences across all potential outcomes, not just one specific outcome.
  • Normalization Factor (1/2): The factor of 1/2 is included to normalize the result to the range [0, 1]. It ensures that the variation distance is a value between 0 (when the two distributions are identical) and 1 (when the two distributions are completely disjoint).

In summary, the variation distance measures the discrepancy between two probability distributions X and Y by considering the absolute differences in the probabilities assigned to each possible outcome. It provides a quantitative way to compare how different these distributions are in terms of the likelihood of different values occurring in each of the distributions.

Smaller values of Δ(X, Y) indicate that the distributions are more similar, while larger values indicate greater dissimilarity between the two distributions.
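As a small worked example (my own sketch, not from the sources), the variation distance between two distributions represented as probability dictionaries can be computed directly from the formula above:

def statistical_difference(X, Y):
    # Delta(X, Y) = 1/2 * sum over all outcomes a of |Pr[X = a] - Pr[Y = a]|.
    # X and Y are dicts mapping each outcome to its probability.
    outcomes = set(X) | set(Y)
    return 0.5 * sum(abs(X.get(a, 0.0) - Y.get(a, 0.0)) for a in outcomes)

fair   = {"heads": 0.5,  "tails": 0.5}
biased = {"heads": 0.55, "tails": 0.45}
print(statistical_difference(fair, biased))          # 0.05 -> very similar
print(statistical_difference(fair, {"edge": 1.0}))   # 1.0  -> completely disjoint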

8. Complexity Classes Based on ZK

Complexity of Different ZK protocols. Source: [1]
  • BPP: No prover exists. Verifier can completely and perfectly simulate their supposed “interaction” with an imaginary prover.
  • PZK: Ensembles of simulator and prover-verifier interaction are perfectly identical.
  • SZK: Ensembles of simulator and prover-verifier interaction are statistically indistinguishable.
  • CZK: Ensembles of simulator and prover-verifier interaction are computationally indistinguishable.
  • IP: Includes both cases where ensembles of simulator and prover-verifier interaction are indistinguishable (perfectly or statistically or computationally → zero-knowledge) as well as distinguishable (not zero-knowledge).
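Assuming the usual definitions, the figure corresponds to the following chain of containments, with each step relaxing how close the simulator’s output must be to the real interaction, until in IP no simulator is required at all:

BPP ⊆ PZK ⊆ SZK ⊆ CZK ⊆ IP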

Now that we have covered the different types of ZK protocols, we encounter another problem. In real life, a ZK protocol is NOT run in isolation. The protocol usually exists within some context: it may be correlated with databases, or some information within it may be correlated with external information, etc.

How do we model ZK protocols that also counter any leak of knowledge even by using some external contexts?

You’ll get your answers in the next part.

Sources :

[1] Goldreich, O. (2001). Foundations of Cryptography: Basic Tools. Cambridge University Press.

[2] The 9th BIU Winter School on Cryptography | BIU Cyber Center. (2018, August 5). https://cyber.biu.ac.il/event/the-9th-biu-winter-school-on-cryptography/
