Zero-Knowledge Proofs

How to prove you know (or have) something without giving up any information about what it is you know

Grayblock
Coinmonks
Published in
11 min readNov 26, 2018

--

In open blockchains like Bitcoin, Ethereum and many others, the transaction is public and open to all, showing who is sending it, who is receiving it and how much is changing hands. You can then see into all transactions in an out of those addresses, revealing their whole history. This level of openness makes many uncomfortable. So how can we make things a little more private?

How do you hide how much you are transacting, the data you are sending or your identity?

One simple solution could be hashing the entire transaction with the public key of the recipient, making it only readable by them. This is a great solution but this creates a couple issues. That data in the output is used for validating the transaction, populating the wallet with data and serves as the input to future outputs. Hiding that data behind a hash would therefore make it unavailable. Otherwise this is a great solution. Maybe we can solve these issues one by one. First, we have to prove that the data within the encrypted transaction is valid without giving up our privacy.

This is where zero-knowledge proofs come in. Zero-knowledge proofs refer to a proof construction that allows you to convince someone that you know something without revealing any information about what it is you know in the process. In a simple example, lets say that I, the prover, want to prove to you, the verifier, that I have two different colored billiard balls, but I don’t want you knowing what colors they are. So I blindfold you and put them in your hands. I have you put them behind your back and tell you that you can switch them or not then show them to me. I can then tell you if you switched them. Of course the first time I do this, it’s a 50–50 chance I get it right. But I have you do it over and over again, each time reducing the probability I got it right again by 50%. After several iterations of this you become satisfied that there is some difference between these balls and you hand them back to me and I put them away. I therefore proved to you that I have two different color balls without you knowing anything about them.

In cryptography this can be used to prove more complex things like given a hash, the prover could convince the verifier that there indeed exists a number, which the verifier knows, that satisfies certain criteria and that hashes to this value without revealing any characteristics of this number. This is what is used in creating confidential transactions where address and transaction balances are hidden on the blockchain behind hashes, or that you own the private key in order to spend a note, therefore hiding your identity.

Zero-knowledge proofs must satisfy three properties:

  • Completeness — If the statement is true, the honest verifier is completely convinced of the knowledge shown by the prover.
  • Soundness — If the statement is false, no dishonest prover can convince the verifier that the statement is true. In practice, these are often not completely sound but statistically sound.
  • Zero-knowledge — If the statement is true, the verifier cannot gain any knowledge of the statement other than that it is true. No other information about what made the statement true can be gained.

Let’s get snarky

There are currently three cryptocurrencies (Zcash, Ethereum, INT) who have proposed the use of zero-knowledge proofs (that I know of) and only one of those has successfully implemented them. Zcash, a hard-fork of Bitcoin, became the pioneer in zero-knowledge proof application in late 2016 with the use of a specific implementation of zero-knowledge proofs called zk-SNARKs, with “zk” referring to “zero-knowledge” and SNARK referring to “Succinct Non-interactive ARgument of Knowledge”. This allows users to send transactions without revealing their address, the receiver’s address, the amount being sent or revealing either of their balances.

The succinctness of the argument refers to the speed at which the argument can be verified with the whole of the argument reduced down to only a few hundred bytes even for statements about complex programs (in Zcash, the verification only takes a few milliseconds).

For the argument to be non-interactive, the prover must supply the verifier all that they need in order to verify the proof fully. In interactive zero-knowledge setups, a “conversation” can occur where the verifier can ask questions of the prover and the prover provides answers.

In Theory

This can be broken down into 4 main steps, which are general to all SNARK implementations.

  • Encoding as a polynomial — The statement that is to be proven is compiled into a quadratic equation of 4 separate polynomials, where the equality, t(x) h(x) = w(x) v(x) within a defined field, must hold in order for the proof to be valid. Polynomials have curious mathematical properties (called homomorphic properties) that allow operations to be shifted around without effecting the result and are very hard to factor to reverse engineer their definition. This property makes them very useful for transporting, concealing and making use of secret values without them being known.
  • Succinctness by random sampling — By giving the verifier the ability to randomly evaluate the polynomials with a secret value, s, it transforms the problem from multiplying polynomials to evaluating them at a specific point for equality, t(s) h(s) = w(s) v(s), thereby significantly reducing the work needing to be done to verify. By randomly sampling them at a given point, the chance of the functions being all equal at that point without satisfying the requirements of proof entirely are very small, ensuring completeness.
  • Homomorphic encryption of the secret value — If the secret value, s, was known, the prover could construct a proof that worked only for that s, therefore convincing the verifier that they have satisfied all conditions.* In order to hide the secret value so the prover doesn’t know where their proof will be evaluated, the value is embedded into a semi-homomorphic† function to allow the use of the value in the provers polynomials without revealing the value itself, ensuring soundness. This function, E(s), is then used by the prover to evaluate their quadratic proof at given point, s, by, t(E(s)) h(E(s)) = w(E(s)) v(E(s)), which because of the homomorphic properties is the same as E(t(s)) E(h(s)) = E(w(s)) E(v(s)) and also E(t(s) h(s)) = E(w(s) v(s))
  • Zero-knowledge — The problem with passing on t(E(s)), h(E(s)), w(E(s)) and v(E(s)) is that with the verifier knowing E(s), this would reveal some information about what the functions t(s), h(s), w(s), v(s) are. Therefore, in an effort to pass zero information on, the prover multiplies the function by a random number k, so the verifier can still make sure the “shape” of the result is the same just not where it is on the axis. Essentially, the idea is that t(s) h(s) = w(s) v(s) is identical to (t(s) h(s)) k = (w(s) v(s)) k for secret number k. And if all the verifier receives is numbers, there is no way to determine k and therefore the details of the functions, effectively proving the statement without allowing the discovery of any information that formulated the proof.

Now, I conveniently and gracefully slid right over the most difficult part of this whole process, encoding the statement into a “quadratic arithmetic program” (QAP). This is not something that will be covered here but will probably be done in a separate post and will not be for the feint of heart. This step is the reason why zk-SNARKs is not a trivial thing to implement which lead to Zcash having to encode some of their consensus rules into zk-SNARKs and Ethereum to downsize the set of available smart contracts in order to make this work. Even then, for a simple transaction in Zcash, it still takes 40ish seconds to construct the proof. Ethereum, on the other hand, cannot possibly enable global use of these proofs in their smart contract environment because of gas constraints so they have proposed a small subset of precompiled smart contracts that would work with zk-SNARKs.

*Special note: In these setups, the secret value s could be used to generate falsely verifiable proofs and therefore giving the power to create coins in the network. It is therefore imperative that great lengths are taken to make sure this value is highly protected. In Zcash, a multi-party ceremony was held where each member of a group of third party actors created a part of the final value with an air-gapped computer which was then destroyed.

Homomorphic means that you can change the order of mathematical operations without changing the result. In cryptography this becomes useful because you can do mathematical operations on encrypted data and the result is the same as if you did the operations before encrypting. In specific, the product of the encryption of two messages is equal to the encryption of the product of the message. Finding complete homomorphic function pairs is in practice very difficult and the subject of intense research. In these applications we settle for “semi”.

In Practice — Private Token Transfer

In a UTXO (Unspent Transaction Output) based framework (like Bitcoin, which is the basis of Zcash), the sum of the inputs must be equal to the sum of the outputs:

output1[toAddress1] + output2[toAddress2] = input1[fromAddress] + input2[fromAddress] + ...

where output2[toAddress2] is the change leftover from sending the entire balance (sum of all the inputs) of the address to a separate “change wallet”. This creates a constant chain of transactions for each Bitcoin from their genesis to their current holder. This doesn’t allow the users any privacy besides the pseudo-anonymous relationship between person and address. To fix this, Zcash effectively moved the burden of proving the satisfying of conditions for a valid transaction from the verifier to the prover, allowing the introduction of zero-knowledge proofs between the prover and verifier to shield the details of the transaction.

In order for the transaction to be valid, the above must return true as well as proving that you have the private key to spend the inputs, input1, input2 and so on. These can all be constructed as zk-SNARK proofs, proving that you hold the private key to spend the inputs and that the outputs sum up equal to the sum of the inputs.

But now that you have successfully hidden and proven those details for valid transaction, the details of the outputs (the “to” field and the amount), are still visible. So instead of storing the output on the blockchain like what Bitcoin does, Zcash stores an encrypted hashed version of the output as a “Note” on the blockchain. The note would be encrypted with the “transmission key” (think public key) who’s private key counterpart, the “view key” would be used to decrypt the incoming transaction, populating your wallet with the input. This makes it so other users cannot see the amount that is being transferred, nor can they associate the encrypted transaction to the transmission key owner.

Transactions are now completely opaque on the blockchain and the traditional framework of being able to see who sends what, where they go and who has how much is now solved. Your traditional transaction of two outputs is now a transaction of one zero-knowledge proof for verification and two encrypted outputs. [Fig. 1]

Fig. 1 The structure of a Zcash transaction

So how then do addresses spend these notes and the blockchain keep track of that so no double spending or unauthorized creation of coin occurs?

The note includes a nullifier value that, when published, signifies that the original note has been spent. The nullifier is hashed with the recipient’s transmission key and made publicly known.

Spending a note (shielded output) would then entail publishing the nullifiers of the notes you are using to construct your new notes as the inputs of your transaction, basically destroying the old note and creating a new note, nullifier pair. You can see here again the importance of keeping that s hidden.

This transforms the blockchain from being UTXO to UNN (Un-Nullified Note) based.

In Practice — Private Smart Contracts

Transactions like described above are really just simple smart contracts. Verify these x criteria and token ownership is transferred. Therefore, applying zero-knowledge proofs to more generalized contracts should, in theory, be about being able to define the problem into a QAP that can define all of the verification criteria of that smart contract. While Ethereum has incorporated zk-SNARK compatibility into their base code, client support has not yet been rolled out. First lets look at a basic transaction to understand the fundamental differences between it and Zcash.

In an account based transaction framework (like Ethereum), there exists a mapping between addresses and balances and when sending a transaction of amount value , the following must be satisfied:

balances[fromAddress] >= value

and is done to make sure that the address has enough token to send the transaction amount.

The transaction is also sent with a nonce. Unlike the nonce used in mining (which is just a random number added to the hash in order to make it fit a pattern) this nonce is used to assure that the transactions are maintained in the order they are to be processed. If the nonce is not one or more than the nonce of the previous transaction, the transaction will be rejected.

These two validating criteria, along with proving you have the private key to sign transactions for that address, can be represented as zero-knowledge proofs fairly easily.

But what about interacting with a contract? Contracts not only require the above but also require some amount of input that is dependent on the contract itself. Implementing zk-SNARKs for a contract would entail enabling the contract to verify zero-knowledge proofs and transforming function calls or satisfying the conditions of the contract to deal with encrypted information. This adds a whole layer to existing contracts which drives gas cost for the transactions. This limits the usage of zk-SNARKs significantly in Ethereum because even for simple contracts, the gas used in order to verify the proof consume half of a block’s worth of gas.

This has driven a down-selecting of viable smart contracts for zk-SNARKs to ones that can operate with the current block gas limit and with the current efficiencies of this implementation of zk-SNARKs. These precompiled smart contracts have the ability to do simple transactions and elliptic curve operations.

Zcash is also working with Ethereum in a project called Project Alchemy or Zcash on Ethereum (ZoE). This project is looking to bridge the two blockchains to make use of the programmability of one and the privacy of the other to open up another realm of applicability. This integration would open up the creation of custom anonymous tokens that could be used in dAPPs just as current Ethereum tokens are, enabling real world use of important and sensitive contracts, e.g. voting, identity authentication or other sensitive applications.

This bridge would be constructed much like what BTCrelay has done with Bitcoin and Ethereum, as is what INT is looking to ease the enabling of with their heterogeneous multi-chain framework. For more on blockchain interoperability, read this post.

The Future

Zero-knowledge proofs find a balance of complete and ensured privacy with the selectability for wide application use. The current drawbacks associated with the resource intensiveness of proof generation is the topic of intense academic research, the latest improvements of which you can read here.

With the growing interest in privacy and applicability of blockchains, more effort will be put into developing sound and compete systems for privacy protection and some sort of zk-SNARK implementation will certainly be at the center of it.

--

--