**Cryptology for blockchain data privacy protection**

Hello, I am Seunghwa Lee, a member of Coinplug’s platform development team. This article contains a brief explanation of blockchain technology, data privacy, and zero-knowledge proof.

**Contents**

0. Introduction

1. What is zero-knowledge proof

2. Terminologies related to zero-knowledge proof

3. Real life applications of zero-knowledge proof

4. To conclude

**0. Introduction**

Privacy issues around blockchain technology have drawn a lot of attention recently, and with them a technology called “zero-knowledge proof”. But why do these issues keep arising with blockchain?

The main reason blockchain is in the spotlight is that it can be used as a “public ledger” with strong decentralization. “Public ledger” literally means a ledger that everyone can see. Anyone can write to it, read its entries, and verify its consistency. Bitcoin demonstrated the usefulness of such a ledger by recording the transaction history of virtual assets on it.

However, a ledger that ‘anyone can see’ became a problem once the technology moved toward commercialization. All details, including when and how much virtual assets I send, are disclosed in the public ledger. As a result, the ‘public ledger’ technology, valued for assuring data consistency, had to face privacy issues. In other words, privacy problems always arise when building a new application on blockchain technology. In contrast, centralized databases allow only certain administrators to access the information, which prevents exposure of private data. The drawback is that such a system rules out public authentication and requires various intermediary systems to prove the consistency of data. As you can see, it is difficult to satisfy the conflicting requirements of privacy protection and public authentication at the same time.

This led people to take an interest in zero-knowledge proof, a cryptographic technique that may be able to solve this problem. In this post, I will introduce the basic terms and definitions related to blockchain and zero-knowledge proof.

**1. What is zero-knowledge proof?**

Let’s look at how these two contradicting attributes, privacy protection and public authentication, are satisfied by the zero-knowledge proof.

- This article is based on the Non-Interactive Zero-Knowledge Proof (NIZK), which is the most commonly used form. If you want to know more about non-interactive and interactive zero-knowledge proof, click here (article in Korean).
- Zero-knowledge proof was first proposed by Goldwasser et al. [GMR].

Here is the dictionary definition of zero-knowledge proof:

A method by which the prover proves to another party (the verifier) that they know a value X, without revealing any information other than the fact that they know X.

A zero-knowledge proof system consists of three algorithms: **Setup**, **Prove**, and **Verify**.

- **Setup:** generates the Common Reference String (CRS) used for the proof.
- **Prove:** using the CRS, generates a *Proof* that the content we want to prove (*Statement*) holds and that we know the confidential information (*Witness*).
- **Verify:** using the CRS, the *Statement*, and the *Proof*, checks that the prover holds the confidential information, without the confidential information itself being provided.

This proof system has three security properties: Completeness, Soundness, and Zero-knowledge.

- **Completeness:** when provers create a proof with legitimate content and confidential information, verifiers can always verify it.
- **Soundness:** proofs that pass verification cannot be generated when the prover has no legitimate confidential information.
- **Zero-knowledge:** verifiers cannot learn the confidential information from proofs.
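The three algorithms and the first two properties can be sketched with a small example. The code below is only an illustration under stated assumptions: it uses a Schnorr-style proof of knowledge of a discrete logarithm, made non-interactive by hashing (the Fiat-Shamir heuristic), rather than a CRS-based scheme, so the public group parameters stand in for the output of Setup. The parameters `p`, `q`, `g`, the helper `_challenge`, and the witness value are hypothetical toy choices and far too small for real use.

```python
import hashlib
import secrets


def setup():
    # Toy public parameters (assumption for illustration only):
    # p = 2q + 1 is a safe prime and g = 4 generates the subgroup of prime order q.
    p, q, g = 2039, 1019, 4
    return p, q, g


def _challenge(params, y, t):
    # The hash output replaces the random challenge an interactive verifier would send.
    p, q, g = params
    data = f"{p}|{q}|{g}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q


def prove(params, y, x):
    """Prove knowledge of the witness x such that y = g^x mod p."""
    p, q, g = params
    r = secrets.randbelow(q)       # fresh randomness that masks the witness
    t = pow(g, r, p)               # commitment
    c = _challenge(params, y, t)   # non-interactive challenge
    s = (r + c * x) % q            # response
    return t, s


def verify(params, y, proof):
    """Accept if g^s == t * y^c mod p; the witness x is never needed here."""
    p, q, g = params
    t, s = proof
    c = _challenge(params, y, t)
    return pow(g, s, p) == (t * pow(y, c, p)) % p


params = setup()
p, q, g = params
witness = 777                      # confidential information, kept by the prover
instance = pow(g, witness, p)      # public value y = g^x mod p

proof = prove(params, instance, witness)
valid = verify(params, instance, proof)                       # honest proof
t, s = proof
tampered_valid = verify(params, instance, (t, (s + 1) % q))   # altered proof

print(valid, tampered_valid)  # True False
```

The honest proof is always accepted, which mirrors Completeness, and the altered proof is rejected. Note that the verifier only ever sees `instance` and `proof`, never `witness`.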

To sum up,

The prover and the verifier share the content they want to prove (Statement) and the public value (Instance), while the confidential information (Witness) is kept only by the prover. The prover cannot generate a verifiable proof without the confidential information (Soundness), and the verifier cannot obtain any confidential information from the proof (Zero-knowledge).

A famous example of zero-knowledge proof is the cave example, but in this post we’ll skip the technical matters and move to a more conceptual, real-life example: adult certification.

The content that needs to be proved (Statement): Age>19

Confidential information (Witness): Age=23

Or when it comes to transactions, it could be expressed like this:

The content that needs to be proved (Statement): User1 sent money to User2

Confidential information (Witness): User1=A, User2=B, money=10 BTC

As seen from these examples, **zero-knowledge proof can prove the information you want in a public way without the need to expose confidential information in the process of proof.**

**2. Terminologies related to zero-knowledge proof**

The zero-knowledge proof concept is normally explained in jargon that combines mathematics, cryptography, and computer engineering technologies, making it difficult to understand. In this post I will explain different terms that you might encounter in documents related to zero-knowledge proof.

- The descriptions of the terms are based on the reference document of the ZKProof community [ZKPRef], a zero-knowledge proof standardization organization.

**Common Reference String (CRS)**

In a Non-Interactive Zero-Knowledge Proof System (NIZK), there is one value agreed on in advance between the prover and the verifier. In an interactive system, random values are exchanged to prevent either party from cheating. In a non-interactive system, the prover generates a proof and the verifier verifies it without exchanging any additional messages. The value that makes this possible is the Common Reference String (CRS), which can be a purely random value or a random value associated with the content you want to prove. Many people regard the CRS as the key to zero-knowledge proof, and it is used by both the prover and the verifier.

**Statement, Instance**

*Statement* is the content you want to prove. From a developer’s perspective, it is the execution of the program that you want to prove. *Instance* is the set of parameters and values that are made public. If you generate a zero-knowledge proof for issuing an electronic ID, the ID-issuing program is the *Statement*, and any public input, output, or constant values used while running the program are the *Instance*.

**Witness**

*Witness* is the confidential information that the prover will not disclose. For example, when issuing an electronic ID, personal information such as an address or a social security number is the *Witness*. This *Witness* is not disclosed at any point during the proof.

**Relation**

*Relation* is the set of all valid pairs of public information (*Instance*) and confidential information (*Witness*). If the *Statement* means “Did I get an electronic ID card?”, the *Relation* corresponds to the program that issues electronic ID cards from the confidential information.

**Language**

If *Relation* is a set of (*Instance*, *Witness*) pairs, *Language* is the set of all *Instances* that appear in the *Relation*, that is, the *Instances* for which some valid *Witness* exists. *Formal Soundness* concerns whether an *Instance* belongs to the *Language*.
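As a rough sketch of how *Relation* and *Language* fit together, the adult-certification example from section 1 can be written as a predicate. The function names, the constant `ADULT_AGE`, and the finite range of candidate ages are hypothetical choices made for illustration; a real relation would be defined over a much larger domain and would never be checked by enumerating witnesses.

```python
ADULT_AGE = 19  # public threshold from the "Age > 19" example (the Instance)


def relation(instance, witness):
    """R(instance, witness): True when the secret age exceeds the public threshold."""
    return witness > instance


def in_language(instance, candidate_witnesses):
    """An Instance is in the Language if at least one Witness satisfies the Relation."""
    return any(relation(instance, w) for w in candidate_witnesses)


ages = range(0, 121)  # assumed finite witness domain, for illustration only

print(relation(ADULT_AGE, 23))       # True: (19, 23) is a pair in the Relation
print(relation(ADULT_AGE, 17))       # False: (19, 17) is not
print(in_language(ADULT_AGE, ages))  # True: some age above 19 exists
print(in_language(200, ages))        # False: no age in the domain exceeds 200
```

A zero-knowledge proof lets the prover convince the verifier that `relation(instance, witness)` holds while handing over only `instance` and a proof, never `witness`.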

**Proof of Knowledge (PoK)**

It literally means “to prove that one has knowledge”. Generally, a *Proof* shows that the *Statement* is true (*Soundness*). This *Soundness* alone cannot show whether the prover actually knows the *Witness*. A *Proof of Knowledge* additionally satisfies *Knowledge Soundness*, meaning that the *Statement* is true and that it was proved using the *Witness*. If two secret values correspond to the same public value, a *Proof* only shows that some valid secret value exists, whereas a *Proof of Knowledge* shows that the prover actually holds a valid secret value.

**Argument**

Strictly speaking, the word *Proof* implies that no attacker, not even an all-powerful (computationally unbounded) one, can generate a cheating proof. However, this is hard to achieve efficiently. As an alternative, an *Argument* weakens the assumed attacker: it only guarantees that an attacker with bounded (non-exponential) computational capability cannot break soundness. In addition, we call it an *Argument of Knowledge (AoK)* if it satisfies *Knowledge Soundness*.
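In symbols, the difference can be sketched as follows. The notation is standard textbook notation assumed here rather than taken from this post: $\lambda$ is a security parameter, $\mathcal{A}$ is the attacker, $x$ is the *Instance*, $\pi$ is the *Proof*, $L$ is the *Language*, and $\mathrm{negl}$ is a negligible function.

```latex
% Soundness (Proof): the bound holds for every adversary \mathcal{A},
% even a computationally unbounded one.
% Computational soundness (Argument): the bound is required only for
% probabilistic polynomial-time adversaries \mathcal{A}.
\Pr\left[
  \begin{array}{l}
    \mathsf{crs} \leftarrow \mathsf{Setup}(1^{\lambda}) \\
    (x, \pi) \leftarrow \mathcal{A}(\mathsf{crs})
  \end{array}
  :
  x \notin L \ \wedge\ \mathsf{Verify}(\mathsf{crs}, x, \pi) = 1
\right] \le \mathrm{negl}(\lambda)
```

The formula is the same in both cases; only the class of attackers it quantifies over changes.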

**zk-SNARK (zero-knowledge Succinct Non-interactive Argument of Knowledge)**

A proof that is *Zero-Knowledge*, *Non-interactive*, and satisfies *Knowledge Soundness* (assuming a computational adversary). Succinct means that the size of the Proof is small and its verification is fast. The Succinctness property emerged because we expect the verification process to be faster than actually running the program. The zk-SNARK concept is one of the hottest topics in the zero-knowledge proof field because of the efficiency that comes with Succinctness and because ways of proving arbitrary programs (membership proofs, range proofs, etc.) are well known.

**3. Real life applications of zero-knowledge proof**

Let’s look at some examples of how zero-knowledge proof is actually used:

**Zcash:** Zcash is a cryptocurrency that hides transaction history by using zero-knowledge proof. It hides transaction information such as who sent how much to whom, protecting users’ privacy, while still allowing public verification by recording the transactions on the blockchain.

**Hyperledger Indy:** Indy, a DID (decentralized identifier) project within Hyperledger, uses zero-knowledge proof through the Ursa project to create DID certificates. With zero-knowledge proof, claims can be proven without exposing personal information.

**Mina protocol:** One of the problems with blockchain is that it requires a large storage capacity because the amount of block data keeps growing. Mina protocol handles this problem by proving the blocks with zero-knowledge proof. In other words, users can show that all the data recorded so far is correct with a single proof.

**ZK Rollup:** Although not yet widely used at the time of writing, ZK Rollup attempts to solve one of Ethereum’s problems (transaction scalability) with zero-knowledge proof. The size of a smart contract had to be limited because the full contents of each transaction had to be written in the block. ZK Rollup addresses this by turning the contents of transactions into a single zero-knowledge proof and recording only this proof on Ethereum.

In addition, there is a lot of research on using zero-knowledge proof to replace various authentication systems, such as login systems.

**4. To conclude**

I’ve briefly introduced the concept of zero-knowledge proof and why it is needed in blockchain. Many issues remain, such as whether a trusted third party is needed to create the CRS and how to transform what you want to prove so that it fits a given scheme, but I hope this post provides some guidance to those who are new to this concept.

*Thank you for reading my article. If you have any opinions or issues related to this blog, please contact me through:*

*Email: contact@coinplug.com*

**References**

- [GMR] S. Goldwasser, S. Micali, and C. Rackoff. Knowledge Complexity of Interactive Proofs. Proc. 17th STOC, pages 291–304. 1985.
- [ZKPRef] ZKProof Community Reference v0.2, https://docs.zkproof.org/reference.pdf