Zero-knowledge proofs, Zcash, and Ethereum

Published in Keep Network, Sep 19, 2017

Part 3 of a series on “Privacy on the Blockchain”.

In the third part of this series, I’ll focus on zero-knowledge proofs, a building block for greater financial and data privacy in cryptocurrencies, including Zcash and Ethereum.

At a bar, you’re casually discussing privacy options with your fellow patrons. One gentleman swears by Dash’s PrivateSend. Another fellow offers to sell you the finest Monero. The bartender is an Ethereum fan — she hasn’t been concerned about privacy because it’s “on the roadmap”. Between mixers, ring signatures, and master nodes, you wonder aloud whether there’s a better tool to ensure data and financial privacy.

At the end of the bar, a thin, slightly balding man looks like he’d like to join the conversation. You notice, and smile. The man leans nervously toward you, swallows, and in a hushed tone, says…

“…zero-knowledge proofs.”

Zero-knowledge proofs

Zero-knowledge proofs are an uncomfortable topic.

Mostly, they’re uncomfortable because they make people feel stupid, or make people worry that they’ll be made to look stupid. Cryptographers and developers alike struggle with the topic.

Zero-knowledge proofs are a category of cryptographic tool with many different flavors. As a concept, they aren’t scary, and are worth taking a little time to understand.

Like most things, there are layers to the topic that can be peeled back and studied. A little analogy can go a long way to understanding what zero-knowledge proofs are and what they can do.

Stranger danger

Imagine you meet someone on the street, and they claim to know your mother — she’s in the hospital, and you need to get in the car with them right now to go see her. You’re in a pickle. You’re worried about your mother, but by now you should be feeling some serious “stranger danger”.

You need to verify that this stranger is, in fact, a family friend you can trust. So you interrogate them, asking questions they should only be able to answer if they are indeed close to the family.

Assuming you ask good questions, the protocol you’ve just invented is an example of a zero-knowledge proof. You, the verifier, are verifying that the stranger, or prover, does indeed know your mother. You’re doing this interactively, coming up with questions that are difficult to prepare for in advance, unless the prover is who they claim to be.

That’s it. A zero-knowledge proof is when a prover convinces a verifier that they have some secret knowledge, without revealing the knowledge directly to the verifier. In our example, knowledge can’t be directly revealed, because we don’t have an easy way to “serialize” and share human knowledge, like having met your mother — just the loose approximations of verbal and visual language.
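To make the prover and verifier roles concrete, here is a minimal sketch of one well-known interactive proof of knowledge, the Schnorr identification protocol. It isn't mentioned elsewhere in this post and is swapped in purely as an illustration. The group parameters are toy-sized assumptions and the function names are hypothetical. The prover convinces the verifier that they know a secret number without ever sending it.

```python
import secrets

# Toy group parameters, far too small for real use (assumption for illustration).
P = 2039   # prime modulus, P = 2*Q + 1
Q = 1019   # prime order of the subgroup generated by G
G = 4      # generator of that subgroup

def keygen():
    """Prover picks a secret and publishes only G^secret mod P."""
    secret = secrets.randbelow(Q - 1) + 1
    return secret, pow(G, secret, P)

def commit():
    """Prover's first move: commit to a fresh random value."""
    nonce = secrets.randbelow(Q - 1) + 1
    return nonce, pow(G, nonce, P)

def challenge():
    """Verifier's move: an unpredictable question."""
    return secrets.randbelow(Q)

def respond(secret, nonce, c):
    """Prover's answer mixes the secret with the committed random value."""
    return (nonce + c * secret) % Q

def verify(public, commitment, c, response):
    """Verifier checks the answer using public values only."""
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P

secret, public = keygen()
nonce, commitment = commit()
c = challenge()
response = respond(secret, nonce, c)
accepted = verify(public, commitment, c, response)
```

As in the stranger example, the verifier's question is chosen after the prover has committed, so it is hard to prepare for in advance without knowing the secret.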

Challenge / response

A good example of a common zero-knowledge proof is a cryptographic challenge-response protocol.

Your friend Zooko tweets that he’s just had a wonderful pizza, despite his long and vocal hostility toward carbs. Concerned his account has been compromised, you send a DM, asking him to encrypt¹ the message “Yes, I really just ate an entire pizza. There wasn’t even any meat on it!” with his private key. If the ciphertext he sends back can be decrypted with his known public key, you know he still has access to his Twitter account².

An important point in that example is that you, as the verifier, chose the message. If the prover had chosen the message, and Zooko’s account had been compromised, the attacker could reuse any past message Zooko had encrypted with his private key, as long as they had access to it. For example, suppose Zooko had legitimately encrypted the message “I love meat” some time in the past, and the attacker had access to the ciphertext and plaintext. The attacker, as the prover, could use that message, duping the verifier in what’s called a replay attack.

So as long as Zooko has never encrypted that message before, you’re good. In practice, you should also include a nonce, or random number, in your message to ensure that it’s unique — or better, use a signature algorithm that handles that for you, rather than asymmetric encryption.
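The replay attack and the nonce fix can be sketched in a few lines. This is only an illustration: it uses signatures, as suggested above, with a toy unpadded RSA key built from two small Mersenne primes, which is not secure. All names here are hypothetical.

```python
import hashlib
import secrets

# Toy RSA key (assumption for illustration): tiny, unpadded, NOT secure.
P = 2**89 - 1
Q = 2**107 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

def digest(message: str) -> int:
    """Hash the message down to a number smaller than N."""
    raw = hashlib.sha256(message.encode()).digest()
    return int.from_bytes(raw, "big") % N

def sign(message: str) -> int:
    """Only the holder of D (Zooko) can compute this."""
    return pow(digest(message), D, N)

def verify(message: str, signature: int) -> bool:
    """Anyone can check a signature with the public values N and E."""
    return pow(signature, E, N) == digest(message)

def fresh_challenge(text: str) -> str:
    """Verifier-chosen text plus a random nonce, so it is never reused."""
    return f"{text} [nonce:{secrets.token_hex(16)}]"

# The attacker has recorded one old, legitimately signed message.
old_message = "I love meat"
old_signature = sign(old_message)

# If the prover picks the message, the recording passes verification.
replay_accepted = verify(old_message, old_signature)

# If the verifier picks a fresh message, the recording is useless.
new_challenge = fresh_challenge("Yes, I really just ate an entire pizza.")
replay_rejected = not verify(new_challenge, old_signature)
honest_accepted = verify(new_challenge, sign(new_challenge))
```

The nonce matters because it makes every challenge unique, even if the verifier happens to ask the same question twice.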

While most zero-knowledge proofs are similarly interactive, requiring the verifier to somehow interrogate the prover, there are variants where the prover doesn’t need to respond to a challenge from a verifier. Consider, for example, proving access to a file. The prover can publish a hash of the file. The verifier can be convinced that the prover has access to the file because of the computational infeasibility of otherwise coming up with that hash.
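As a rough sketch of the file example, the prover hashes the file and publishes only the hash. The check below assumes a verifier who can compare against the same contents. The helper name and the file contents are made up for illustration.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

# Prover side: write a stand-in file and publish only its hash.
contents = b"quarterly report, final version\n"
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(contents)
    path = handle.name
published_hash = sha256_of_file(path)
os.remove(path)

# Verifier side: recompute the hash from the same bytes and compare.
verifier_hash = hashlib.sha256(contents).hexdigest()
hashes_match = published_hash == verifier_hash
```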

It should be clear that zero-knowledge proofs don’t “solve” privacy. Instead, they’re building blocks for privacy-preserving systems. Different types of zero-knowledge proofs can provide different functionality to these systems.

zk-SNARKs

Lewis Carroll wrote “The Hunting of the Snark” in 1876, coining the term for his imaginary creature.

When people in the cryptocurrency space say “zero-knowledge proofs”, they’re usually referring to a particular type of proof — zk-SNARKs.

The math underpinning zk-SNARKs is difficult to understand, but unless you’re implementing them, attacking them, or too paranoid to take a cryptographer’s word for it, you can skip the math and focus on what they do.

Let’s talk about the name. The “zk” stands for zero-knowledge. Amazingly, there are a number of other “snarks” in computer science, including a theorem prover and a type of graph, and outside of computer science, including imaginary creatures, video games, and sarcastic remarks.

This particular SNARK stands for succinct non-interactive adaptive argument of knowledge³.

You can read “succinct” as “short, and efficient enough that it can be checked in a reasonable amount of time”, which is especially important for verification.

“Non-interactive” means that SNARKs don’t require the verifier to interrogate the prover. Instead, the prover can publish their proof in advance, and a verifier can make sure it’s correct, similar to hashing a file.

Finally, an “adaptive argument of knowledge” refers to a proof of knowledge of some computation.

What does that mean, exactly? Imagine your grade-school math teacher gives you a complex arithmetic problem. Instead of providing the answer (and showing your work!), zk-SNARKs let you prove you know the answer, without actually sharing it.

That’s a neat trick, but there are some caveats.

SNARKs are resource intensive. As we’ll see when discussing Zcash, some of the computation involved makes certain use cases, including mobile and low-power device usage, difficult, though recent progress in this space has been encouraging.

There’s also the issue of losing access to a secret. SNARKs allow a user to prove they have access to a secret, but the onus is still on the user to maintain the integrity and availability of the secret. We’ll discuss this restriction in more detail when we discuss SNARKs on Ethereum.

The most significant, structural drawback to SNARKs, however, is what’s called the setup phase.

Setup phase

For each type of problem you want to solve with SNARKs, there’s an upfront communication step called the setup phase. In this phase, the circuit, or computation you want to prove, is fixed. Because of this restriction, SNARKs aren’t a good fit to run arbitrary Turing-complete smart contracts — each new contract would require a new setup phase.

To make this more concrete, each problem your math teacher gives you would need a separate setup phase. There might be one for addition, and another for multiplication. Once you’ve done the setup phase between you and your teacher for addition, it doesn’t need to be repeated again each time you’re given an addition problem. Any new sorts of problems require a new setup.

There’s another noteworthy aspect to the setup phase. In this phase, a secret is generated that allows fake proofs to be published, undetected. In a 2-party setup, that’s okay — the verifier (your math teacher) is the one generating the secret, and as long as the verifier doesn’t share the secret with the prover (you), security is maintained.

If you want to use a particular circuit publicly, with more than one verifier, there needs to be a “trusted setup”. Instead of a single verifier generating (and hopefully destroying!) the proof-manufacturing secret, a group of people can generate the secret together. As long as one of those people is honest, and destroys their share of the secret, the security of the setup is guaranteed.
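The “one honest participant” property can be illustrated with a toy model. This is not the actual Zcash ceremony protocol, only a sketch under simplifying assumptions: each participant mixes a private random share into a public parameter, and the combined secret is the product of all shares. The group parameters and function names are hypothetical.

```python
import secrets
from typing import List, Tuple

# Toy group parameters (assumption for illustration). In a group this
# small the secret could simply be brute-forced.
P = 2039   # prime modulus, P = 2*Q + 1
Q = 1019   # prime order of the subgroup generated by G
G = 4      # generator of that subgroup

def contribute(parameter: int) -> Tuple[int, int]:
    """One participant mixes a fresh random share into the parameter."""
    share = secrets.randbelow(Q - 1) + 1   # in the range 1..Q-1
    return pow(parameter, share, P), share

def run_ceremony(participants: int) -> Tuple[int, List[int]]:
    """Pass the parameter around; each participant keeps their own share."""
    parameter, shares = G, []
    for _ in range(participants):
        parameter, share = contribute(parameter)
        shares.append(share)
    return parameter, shares

def combined_secret(shares: List[int]) -> int:
    """Rebuilding the secret requires every single share."""
    secret = 1
    for share in shares:
        secret = (secret * share) % Q
    return secret

public_parameter, shares = run_ceremony(participants=6)

# Only someone holding all six shares can reproduce the exponent
# behind the public parameter.
rebuilt_matches = pow(G, combined_secret(shares), P) == public_parameter
```

In this model, if any one participant deletes their share, the product can no longer be computed from the remaining shares, which is the intuition behind the guarantee described above.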

For a more detailed, yet remarkably accessible introduction to SNARKs, check out Christian Lundkvist’s “Intro to zk-SNARKs with examples”. For more on the math, check out Zcash’s explainer, “zkSNARKs in a nutshell” or Vitalik Buterin’s series on “Zk-SNARKs: Under the Hood”.

Zcash

We’ve discussed zk-SNARKs in more than enough detail to talk about their highest-profile application, Zcash.

Zcash is a privacy-preserving cryptocurrency based on zk-SNARKs. In fact, it’s built on one particular SNARK circuit, the Zcash transaction verifier, with its own trusted setup. Zcash users can publish transactions, with public amounts, senders, and recipients, just like Bitcoin. They can also choose to publish a proof that a private transaction follows the rules of the Zcash network, concealing the sender, recipient, and amount. In Zcash parlance, these are called shielded transactions.

As a privacy coin, Zcash often draws comparisons to Monero. The two projects take very different approaches to privacy.

While Monero’s ring signatures offer plausible deniability for each transaction, the size of the anonymity set is fixed — the record for the most participants in a single Monero ring signature is 4,500.

Zcash’s shielded transactions, however, have an anonymity set spanning every coin used in a shielded transaction. This is a fundamentally stronger privacy guarantee than those offered by ring signatures.

As discussed above, Zcash also inherits the downsides of zk-SNARKs.

The burnt remains of one machine involved in Zcash’s trusted setup. Photo by Peter Todd.

To create the currency, a group of cryptographers and well-known community members came together in a complex setup ceremony. Trusting the security of Zcash means trusting those participants didn’t collude, and weren’t compelled to hand over their share of the generated secret. If the shares did survive, anyone with access could produce counterfeit coins, though notably an attacker still couldn’t unmask transactions. Peter Todd, a security expert heavily involved in Bitcoin, shared his account of his participation in the ceremony. It’s well worth the read.

The performance characteristics of SNARKs also mean private transactions can’t be computed on less powerful devices, like the popular Ledger hardware wallet.

The Zcash team has made great strides on performance since their initial release. In the pending Sapling network upgrade, users will see significant performance improvements.

Ethereum

So far in this series, we’ve focused on financial privacy. Zcash is a high-profile application in the financial space, but zero-knowledge proofs are also a great tool to help ensure data privacy.

Ethereum is the highest-profile smart contract blockchain implementation. Unfortunately, its privacy story to date is poor. All details about a smart contract are public on the Ethereum blockchain or in full-node memory. All fund senders and recipients, all transaction data, all code executed, and the state in every contract variable are visible to any observer who cares to look.

The contracts on Ethereum today that do need to maintain data privacy rely on secure commitments. These simple schemes allow a user to commit to a secret value by publishing its hash to the blockchain, later revealing the secret, either on the blockchain or off-chain.
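A hash commitment of this kind can be sketched off-chain in a few lines. This is a minimal illustration, not any particular contract’s scheme. The random salt is an added assumption that keeps short or guessable values from being brute-forced out of the published hash, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def commit(value: bytes):
    """Return (commitment, salt). Publish the commitment, keep the salt."""
    salt = secrets.token_bytes(32)
    commitment = hashlib.sha256(salt + value).hexdigest()
    return commitment, salt

def reveal_is_valid(commitment: str, salt: bytes, value: bytes) -> bool:
    """Anyone can check a later reveal against the published commitment."""
    recomputed = hashlib.sha256(salt + value).hexdigest()
    return hmac.compare_digest(recomputed, commitment)

# Commit phase: only the hash would be published.
commitment, salt = commit(b"heads")

# Reveal phase: the user discloses the salt and the value.
honest_reveal = reveal_is_valid(commitment, salt, b"heads")
forged_reveal = reveal_is_valid(commitment, salt, b"tails")
```

Once the hash is published, the user can’t later swap in a different value, since the reveal would no longer match.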

Unfortunately, by themselves, these hash / reveal constructions are incredibly limited. They have uses in gambling and simple digital asset exchange, but aren’t expressive enough to enable greater private data usage.

In Ethereum’s next protocol upgrade, Metropolis, smart contract developers will get a new privacy tool — the ability to verify zk-SNARKs efficiently on-chain.

What can we do with a SNARKs-enabled Ethereum? Certain contract variables can be effectively made private. Instead of storing the secret information on-chain, it can be stored with users, who prove they’re behaving by the rules of the contract using SNARKs. Each of these uses requires its own trusted setup, but once a circuit exists, it can be easily cloned.

Imagine an ERC20-like token that doesn’t publish individual holders’ balances, while still maintaining a public and predictable token supply, or a lending platform that keeps the terms of a loan private.

As long as your contract data has a 1-to-1 correspondence with a user of the contract, and users can be trusted with access to the secret, zk-SNARKs are a great approach.

What you can’t achieve with SNARKs on Ethereum, however, is autonomous privacy, separate from a user. SNARKs on Ethereum rely on an off-chain party keeping a secret. Without an off-chain party, there’s nowhere to keep track of the secret, rendering the proofs useless.

Privacy without users

For many consumer applications, this isn’t a heavy burden. After all, many in the public blockchain space are philosophically aligned with maintaining user control of private information.

There are other valuable uses for private data on Ethereum, for both consumers and enterprises. A few ideas that would be difficult or impossible to implement on Ethereum:

Advanced decentralized governance. Autonomous organizations can’t store private information without delegating to a user as a “secret holder”.
Autonomous trading on a number of on-chain exchanges, including the 0x project.
Contracts that maintain sole “custody” of off-chain assets. Consider an Ethereum contract that needs sole custody of a Bitcoin wallet, for example.
Delegated access to identity, medical records, or other private information. SNARKs don’t enable any sort of access control on private data, requiring users to share private information off-chain.

Privacy on public blockchains, especially autonomous privacy, is hard. In the next post, we’ll discuss private and permissioned chains, as well as other approaches to maintain data privacy.

Thanks to Laura Wallendal, Corbin Pon, Brayton Williams, and James Prestwich for reviewing early drafts of this story.

[1] Yes, I know you’d just use a signature. Try explaining replay attacks with a signature in accessible language — you have to explain deterministic signatures, which is a little much for this already wordy post.
[2] Or, alternatively, that the hacker has gained access to both the Twitter account and private key. Yikes.
[3] As originally described in Bitansky et al.
