Understanding zero-knowledge proofs without a cryptography background

Jul 6, 2023

Almost all ZK-related technical information is currently either too technical or too generic. To me, it feels as if this carries over from online articles and videos directly into offline discussions at events and meetups, which makes the whole topic quite impenetrable for non-computer-science people in web3, like myself.

I assume I’m not the only one here so let me try to challenge this.

Disclaimers:

I’m not an engineer, so I will focus more on the WHYs that I’ve understood mainly from discussions with my colleagues and other smart people I met in meetings and events. The HOWs … let’s leave them to the tech guys (for now).
I try to be (but am not necessarily) super correct with definitions. If you are a computer scientist and you see bullshit, let me know:). I promise not to be too defensive.
The article is quite long, but bear with me. If you’re the target audience (TA), it’ll be worth it.

The problem (if you feel the same, you are TA)

I’ve read maybe a dozen times by now that ZKPs allow you to prove that a statement is true without revealing the statement itself, with all those fancy examples featuring Sherlock Holmes proving the color of the ball to a color-blind Watson, the Ali Baba cave, etc. But (unsurprisingly) this doesn’t really get you closer to understanding the real applicability of the technology today and, importantly for tech entrepreneurs, its applicability tomorrow.

AI-generated picture based on input “Sherlock Holmes and Ali Baba not revealing their secrets” 😄

Then you hear about Snarks, Starks, Rollups, ZK-EVMs, and while everyone seems to speak in plain English, you only get more confused.

When you try to delve deeper, you get stuck in things such as in the figure below…

The abstraction level in-between I found only in private discussions and on Ethereum.org. You can think of this article as an aggregation of the two + a bit of my vision of this technology.

Pre-context (how my puzzles fit into a picture)

I recently came across this article, which suggests that sometime soon ZKPs could help you verify with total certainty which exact AI model you are using if, for example, you pay for ChatGPT 4 and want to ensure it’s not the 3rd version that you are talking to.

In general, it speculates on the idea that ZKPs will help verify any API we consume. This caught my particular attention, making me think that web3 without ZKPs is something like web2 without HTTPS. Wow, right?

After that, I had a conversation with my colleague, who worked in traditional banking for 20 years. He switched to DeFi lately and then built with us decentralized banking software that runs 100% on EVM smart contracts (with basic lending and borrowing). In this conversation, he mentioned that while a bank that is 100% on smart contracts sounds like (and is) a great idea, comparing it to a traditional, centralized bank is still like comparing a potbelly stove to a nuclear power plant. If you’re interested, he has written here about why, along with his vision of ZK tech for future banking.

Notably, the nuclear power plant (a centralized bank with enormous regulations and procedures) and the potbelly stove (a decentralized bank with basic but decentralized and uncensored lending/borrowing) have different use cases, meaning one is not worse than the other; they’re just different.

The point is, you will ALWAYS have a trilemma between decentralization, scalability, and security. Good life and technology are always about sacrificing nice-to-haves for the must-haves, right?

And what is the must-have in web3? Total decentralization? To me, that sounds too populist… I believe the must-have is data provenance in EVERY digital accounting system (whether blockchain-based or not, whether centralized or decentralized), namely the integrity and authenticity of data/logic. Not ensuring those two properties in your API, platform, service, or website will be like using HTTP today (some still do, by the way).

The cornerstone of partially solving the trilemma may well be ZK tech itself. For now, let’s just reflect on this idea and finally figure out what ZKPs actually are…

Tip: Split all ZKP use cases into two categories

The ones solving the privacy problem
The ones solving the scalability problem (especially mind-blowing to me…)

ZKPs solving privacy problems

Ignore this section if you already spent your hours browsing ZKPs for privacy.

Imagine there’s something you don’t want to share with somebody, but you still want to prove your point. Here are 3 examples that I use as mental models when grasping ZKPs for privacy.

1. Age verification

For example, in California and other US states, Uber/Lyft passengers under 18 must be accompanied by someone over 18. At some point those apps started asking users for their age, and some users were frustrated by it (see below).

https://www.reddit.com/r/Lyft/comments/8jhn1d/why_is_lyft_suddenly_asking_me_to_verify_my_age/

Not everyone complies, so conscientious drivers have to check passengers’ ages themselves, which is kind of weird and frustrating for the drivers as well.

This is where ZKPs fit well: a platform’s business activity becomes regulated by the government, but its users resist sharing the personal information needed to comply. To avoid losing users in such a case, ZKPs could be used to prove the point (e.g., I’m over 18) without revealing the details (exact age).
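To make “prove I’m over 18 without revealing my age” a bit more tangible, here is a minimal Python sketch of a simple hash-chain trick. It is not a general-purpose ZKP (and certainly not a Snark); it only illustrates the shape of the interaction. The issuer, the function names, and the flow are all assumptions made for illustration, and a real system would also need the issuer to sign the commitment.

```python
import hashlib
import secrets


def hash_times(value: bytes, times: int) -> bytes:
    """Apply SHA-256 to `value` repeatedly, `times` times."""
    for _ in range(times):
        value = hashlib.sha256(value).digest()
    return value


# Trusted issuer (hypothetical, e.g. an ID service that knows the real age)
def issue_credential(actual_age: int) -> tuple[bytes, bytes]:
    """Return (secret seed for the user, public commitment to the age)."""
    seed = secrets.token_bytes(32)
    commitment = hash_times(seed, actual_age + 1)
    return seed, commitment


# User (prover)
def prove_at_least(seed: bytes, actual_age: int, required_age: int) -> bytes:
    """Build a proof that actual_age >= required_age."""
    if actual_age < required_age:
        raise ValueError("cannot prove a false statement")
    return hash_times(seed, actual_age - required_age + 1)


# Platform (verifier): it never receives the age or the seed
def verify_at_least(proof: bytes, commitment: bytes, required_age: int) -> bool:
    """Hash the proof `required_age` more times and compare with the commitment."""
    return hash_times(proof, required_age) == commitment


seed, commitment = issue_credential(25)
proof = prove_at_least(seed, 25, 18)
print(verify_at_least(proof, commitment, 18))  # True
```

The platform only learns that the hash chain “lines up” for the threshold 18. Someone younger cannot produce a matching proof, because that would require running the hash function backwards.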

2. Whitelisting

Another example is proving you are “on the list” without revealing who exactly from the list you are. Imagine your Bitcoin address is whitelisted to use a certain online service, but you don’t want anyone (even the service that grants you access) to know which exact address is yours, only the fact that it’s on the whitelist. I know this still sounds a bit generic, but hopefully it’s enough to grasp the shape of the use case and its applicability.

3. Anonymous reporting (this one’s my favorite)

At one event I met a founder who built an amazing healthcare platform that uses ZKPs to let patients with incurable diseases, such as cancer, share their unique disease reports so that researchers can investigate and create a cure that’s more specific to a particular case. The problem ZKPs solve here is proving that the report is authentic (provided by a real patient) without revealing the patient’s identity (not everyone wants to share their case and identity, to avoid potential discrimination).

Here’s a link to the platform if you want to check it out: https://amrit.ai/. Their CEO, Ashwin Rathod, made a very big impression on me by targeting a very specific problem after his uncle died from cancer. He’s one of the people who sparked my interest in this technology.

ZK proofs, -Snarks, -Starks, -Rollups, ZK-EVM… Plain English, pleaaaase

In the figure below you can see approximately how I compose, let’s call it, a map of ZK tech solving scalability.

The ZK proof is the foundation on top of which everything else sits. Its goal is usually described by the generic definition you’ve probably read a dozen times. No surprise, it uses cryptographic protocols (the hows, as agreed, let’s leave to cryptographers and mathematicians for now).

ZK-starks and ZK-snarks are specific types of ZK proofs that are intended for almost the same goal but are more efficient (which matters when your computational resources are limited). I don’t want to delve deeper as it’s not the goal of this article. For more information, check out this article on Ethereum.org; I find it quite well articulated.

ZK-Rollups. Rollups are layer-2 solutions aimed at scaling Ethereum (and other L1s). Not all rollups use zero-knowledge proofs, but those that do are called ZK-rollups. This video is easy-going and really helped me understand what rollups are and why they need ZK.

ZK-EVMs are ZK-Rollups that aim to be more EVM-compatible than the general-purpose ones (meaning, with some assumptions, that they support whatever the EVM and Solidity support). In fact, some are claimed to be fully compatible. Remember, this tech is emerging while I write these words and while you read them. Everything is still unclear, but that’s what drives us, right?

To conclude, ZK proofs in general are aimed at solving the privacy problem, and the rest (Snarks, Starks, Rollups, ZK-EVM) are different types, applications, and technological tools aimed more at solving scalability. This is what brings me to my main point (find the meme below).

ZK Snarks (and co) are not used to hide the statement!

Now I’m going to be a little bit speculative but, as stated — change my mind.

I do think it’s a bad idea to say that scalability-focused ZK tech is used to hide the details of the statement. The claim is correct, but the narrative, I believe, is wrong, as it makes it hard for newbies like me to connect the dots.

Evolution of solving the scalability problem (briefly)

You can ignore this section if you immediately understand the figure below.

L1 blockchains cannot (and don’t need to) spend their computational resources validating every transaction and executing every smart contract. This is why engineers came up with L2s, which process and validate transactions on their own, outside of L1, and from time to time checkpoint their state to L1.

L2s solved scalability but decreased security, because transactions are validated on less decentralized and less censorship-resistant L2s, which are easier to manipulate.

Now, rollups look like a better solution, where transactions are batched together off-chain but verified (with some assumptions) on-chain on L1. Again, not all rollups use zero-knowledge proofs.

For example, Optimistic Rollups are built on an idea that, to me, sounds similar to the taxation system in the US. It is assumed you pay your taxes (i.e., upload correct transactions on-chain that spend existing coins), but if someone checks on you and finds out you played tricks, you will be punished. In protocol terms, it means that before transactions become final on L1, there is a challenge period during which a submitted batch can be disputed. If someone uploaded a batch of transactions where some are incorrect, a portion of their stake will be taken away. The concept is clear. Not totally secure but it works, just like the taxes in the US 😄.
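The “assume honesty, punish fraud” idea can be sketched in a few lines of Python. This is a toy model, not how any specific rollup implements it: the `Submission` class, the balance-only state, and slashing the whole stake are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass

Balances = dict[str, int]
Transfer = tuple[str, str, int]  # (sender, receiver, amount)


def apply_batch(balances: Balances, batch: list[Transfer]) -> Balances:
    """Re-execute a batch of transfers; overspending raises an error."""
    state = dict(balances)
    for sender, receiver, amount in batch:
        if amount <= 0 or state.get(sender, 0) < amount:
            raise ValueError(f"invalid transfer from {sender}")
        state[sender] -= amount
        state[receiver] = state.get(receiver, 0) + amount
    return state


@dataclass
class Submission:
    operator: str
    batch: list[Transfer]
    claimed_state: Balances  # what the operator says the result is
    stake: int               # deposit that can be slashed


def challenge(prev_state: Balances, sub: Submission) -> bool:
    """Re-run the batch; return True and slash the stake if fraud is found."""
    try:
        fraud = apply_batch(prev_state, sub.batch) != sub.claimed_state
    except ValueError:
        fraud = True
    if fraud:
        sub.stake = 0  # simplification: the whole stake is slashed
    return fraud


prev = {"alice": 10, "bob": 0}
cheat = Submission("operator", [("alice", "bob", 4)], {"alice": 6, "bob": 400}, stake=100)
print(challenge(prev, cheat), cheat.stake)  # True 0
```

Note that the challenger has to redo the whole computation to catch the cheat. That re-execution is exactly the cost that validity proofs aim to avoid.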

Then the ZK-rollups came to solve both the security and scalability problem (with some assumptions), and here’s where it gets extremely interesting, making me rethink the whole idea of the future architecture of financial (and not only) systems. Let me disclaim though that this tech is still young and in active development, meaning:

1. not everything that’s claimed can be done can actually be done today;

2. there are still lots of audits to do before this tech will be used at full commercial scale.

ZK-rollups also batch and process transactions off-chain but they use ZK to produce the so-called validity proofs of those transactions, which are then verified by L1. It means that we (almost) don’t spend L1’s computational resources on transaction validation while ensuring they are backed by L1’s security.

It sounds as crazy as if someone had come up with a political system whose decision-making is as high-quality as democracy’s and as fast as totalitarianism’s. Insane!

Example to understand ZKPs for scalability

Thanks to numerous discussions with my engineer and cryptographer colleagues, we’ve figured out an example that I use as a mental model for all ZK-for-scalability use cases.

Imagine a method whose goal is to calculate the average of a set of numbers (see figure below).
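For readers who prefer code to pictures, a minimal Python stand-in for such a method could look like this. The name `get_average` is just the Python spelling of the getAverage method used in this example; the body is an assumption about the simplest possible implementation.

```python
def get_average(numbers: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not numbers:
        raise ValueError("need at least one number")
    return sum(numbers) / len(numbers)


print(get_average([2, 1, 3]))  # 2.0
```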

Now, this may sound like nonsense, but for the sake of example, imagine that every time you use the method (i.e., provide some input and expect an appropriate output), you want to verify that it calculates the average correctly without having to:

1. review the code of the method each time you call it;

2. spend your own brain (or external calculator) “resources” to double-check that the calculation is indeed correct.

Imagine now that the method has done its work. It took 2, 1, 3 as an input and provided output 2. What ZKP does is that it allows anyone to ensure that:

The output (2) has indeed been generated using the method (getAverage) and not any other one;
The output (2) has indeed been generated based on input (2, 1, 3) and not any other one.

This means that if you audit the method just once before using it, you will be able to afterward call it and be totally sure that you keep using the same code (i.e., it has not been tampered with, it is authentic).

Now think for a second about what this means. You ensure the authenticity of logic, but without smart contracts, outside of any blockchain, outside of consensus. The code is executed somewhere else, but you will still know that one (input) was definitely processed by the other (method) and the third (output) was formed as a result.

Notably, you won’t even see the input, the output, or the method. They will all be hashed (see figure below).
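As a rough illustration of what “all hashed” could look like, the Python sketch below builds the three hashes a verifier might see. The names `METHOD_SOURCE` and `public_statement` are hypothetical. The hashes alone prove nothing about how the three values relate to each other; that link is the job of the validity proof, which is not shown here.

```python
import hashlib
import json

# Hypothetical source text of the audited method; any change alters its hash.
METHOD_SOURCE = "def get_average(numbers):\n    return sum(numbers) / len(numbers)\n"


def digest(data: bytes) -> str:
    """Hex-encoded SHA-256 hash of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def public_statement(method_source: str, inputs: list, output) -> dict:
    """What a verifier would see: three hashes, none of the underlying values."""
    return {
        "method_hash": digest(method_source.encode()),
        "input_hash": digest(json.dumps(inputs).encode()),
        "output_hash": digest(json.dumps(output).encode()),
    }


audited_hash = digest(METHOD_SOURCE.encode())  # recorded once, at audit time
statement = public_statement(METHOD_SOURCE, [2, 1, 3], 2)
print(statement["method_hash"] == audited_hash)  # True: same code as audited
```

Comparing `method_hash` with the hash recorded during the one-time audit is what lets you say “it is still the same code” without reading it again.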

Of course, in our example, not revealing the input, output, and method doesn’t make sense, as we are the ones who provided the input and received the output (just as there is no real need to deliberately hide transaction details from L1).

So it’s not really about “zero knowledge” (i.e., not revealing the statement). It’s about, if I may, “zero-resources” (i.e., not spending resources).

Just as in this example we don’t want to spend our brain capacity double-checking the method, L1 does not need to spend its computational power verifying off-chain transactions.

Just as we don’t want to audit the code every time we call the method, L1 does not need to execute the smart contracts itself.

The problem that’s solved here for both L1 and us is not about privacy. It’s totally about scalability (not spending valuable resources) while maintaining the integrity and authenticity of data.
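The “zero-resources” intuition, that checking a result can be far cheaper than producing it, can be felt with a plain Python analogy. This is not a ZK proof and not how validity proofs work internally; it is only an assumption-free toy showing the asymmetry between doing the work and checking it. The function names are made up for illustration.

```python
def find_factor(n: int) -> int:
    """Expensive side: search for the smallest non-trivial factor by trial division."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate
        candidate += 1
    raise ValueError("n is prime; no non-trivial factor")


def check_factor(n: int, factor: int) -> bool:
    """Cheap side: a single modulo operation confirms the claimed factor."""
    return 1 < factor < n and n % factor == 0


n = 10007 * 10009               # product of two primes
factor = find_factor(n)         # roughly ten thousand loop iterations
print(check_factor(n, factor))  # True, after one division
```

The party that did the heavy search plays the role of the off-chain prover, and the one-line check plays the role of L1.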

What does this really mean?

Coming back to where we started, I can imagine a future where writing code that’s not compatible with ZKPs will be bad form, just like building websites that still use HTTP instead of HTTPS.

Why? Because it would mean that clients of your code (aka service, method, API, etc.) have no way to be 100% sure that they actually consume what you tell them they do.

So I believe ZK-EVM is just the very first step, and we will keep seeing new technology frameworks that make other VMs compatible with ZKPs (see figure below).

Important to note: obviously, ZKPs are not a panacea for achieving total data provenance, just like blockchain isn’t. In both cases, whatever you ensure the provenance of has to be trusted/checked initially. In the case of ZKPs, it’s an initial audit of the code that’s later checked via validity proofs. In the case of blockchain, it’s oracles (which you presumably trust) that input data from the real world into the blockchain.

There is just one main question for which I still haven’t found an answer (if you know and made it to the end of this article, please DM me or comment): is there a certain complexity of code whose validity ZKPs will never be able to verify (is there a red line we cannot cross)? Obviously, it’s an early technology and a lot has to be invested in R&D (I get it), but is it right to assume that there is potentially no limit to the complexity of logic that the technology can validity-prove?

If the last assumption is true, then I can imagine a future, where EVERY decent app is:

ZKP-compatible
Hence integrated with L1
Hence will have validity proof checkmarks for their users

Hence we have a situation, where regular users benefit from blockchain but it’s deeply under the hood. For the user, everything will work exactly like it does in web2 but with a validity proof checkmark in their UI (which gives additional guarantees of authenticity). Maybe this is where enterprises finally really meet blockchain.

What do you think?