Lessons from the Auditing Trenches: “What do ZK Developers get Wrong?”
Zero-knowledge security is not easy. An analysis of our last 100 security audits shows that our ZK audits were twice as likely to contain a critical issue as the rest of our audits (mostly smart contracts).
Our CTO Kostas Ferles recently spoke at L2Con during EthCC, where he demonstrated examples of ZK bugs from our previous audits. He delivered a presentation titled:
Lessons from the Auditing Trenches: What do ZK Developers get Wrong?
Three ZK vulnerability examples
Below we summarize the three ZK vulnerability examples Kostas covered in the presentation:
Chapter I: The missing constraint bug
— Example 1. The business logic bug
— Example 2. The novice mathematician’s bug
Chapter II: ZK and DeFi are out of sync
— Example 3. When users decide their balance bug
Chapter I: The missing constraint bug
The missing constraint bug is one of the most typical bugs in ZK systems.
It means that a ZK developer has forgotten to include a constraint in an arithmetic circuit. An attacker can exploit this by creating a malicious proof, posting it on-chain, and stealing funds.
During our analysis of the last 100 audits, we found that missing constraint bugs (or “underconstrained circuits”) had the highest rate of being critical or high in severity across all bug types. 90% of our missing constraint bugs across all ZK audits were critical or high in severity (!).
Let’s look at an example of a missing constraint bug.
Example 1: The business logic bug
Our first example is from an EVM privacy layer application built using Circom and Solidity. It uses UTXO-based infrastructure.
In the application, users own several private keys, and one of these keys is used to create nullifiers for UTXOs. A nullifier is a private value that, once revealed, invalidates the associated UTXO.
A nullifier is intended to be a function of a UTXO and a public nullifying key. Naturally, we only want to have one nullifier per UTXO. If multiple nullifiers can be created for the same UTXO, double-spending becomes possible.
The bug in this specific application was that one of the circuits did not check that the input public nullifying key was the one corresponding to the user’s private key. Since this constraint was missing, a malicious user could pass an arbitrary public key for a private key and create multiple nullifiers for the same UTXO, allowing the attacker to spend the same UTXO multiple times.
How can we prevent bugs like these?
We can avoid such bugs by writing negative tests. This means testing the “bad case.” We pass a random public key (along with a private key and UTXO) and expect proof generation to fail. If it succeeds, we know a constraint is missing.
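The negative test described above can be sketched in plain Python. This is a toy model, not the audited application's code: SHA-256 stands in for a ZK-friendly hash, and the names `derive_nullifying_key`, `buggy_circuit`, `fixed_circuit`, and `negative_test` are hypothetical. Each "circuit" is just a function that returns whether its constraints are satisfied.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Stand-in hash (SHA-256); a real circuit would use a ZK-friendly hash."""
    digest = hashlib.sha256()
    for part in parts:
        # Length-prefix each part so that concatenations are unambiguous.
        digest.update(len(part).to_bytes(4, "big") + part)
    return digest.digest()


def derive_nullifying_key(private_key: bytes) -> bytes:
    """Hypothetical derivation of the public nullifying key from the private key."""
    return h(b"nullifying-key", private_key)


def nullifier(utxo: bytes, nullifying_key: bytes) -> bytes:
    """The nullifier is a function of the UTXO and the public nullifying key."""
    return h(b"nullifier", utxo, nullifying_key)


def buggy_circuit(private_key, nullifying_key, utxo, claimed_nullifier) -> bool:
    # Missing constraint: the supplied key is never tied to the private key.
    return claimed_nullifier == nullifier(utxo, nullifying_key)


def fixed_circuit(private_key, nullifying_key, utxo, claimed_nullifier) -> bool:
    # Extra constraint: the nullifying key must be derived from the private key.
    key_matches = nullifying_key == derive_nullifying_key(private_key)
    return key_matches and claimed_nullifier == nullifier(utxo, nullifying_key)


def negative_test(circuit) -> bool:
    """Return True if the circuit rejects a key unrelated to the private key."""
    private_key = b"alice-private-key"
    utxo = b"utxo-1"
    rogue_key = b"not-derived-from-the-private-key"
    return not circuit(private_key, rogue_key, utxo, nullifier(utxo, rogue_key))
```

Running `negative_test(buggy_circuit)` returns `False`, which flags the missing constraint, while `negative_test(fixed_circuit)` returns `True`. In the buggy version, every rogue key yields a fresh nullifier for the same UTXO, which is the double-spend described above.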
Example 2: The Novice Mathematician’s Bug
The next bug comes from our audit of a ZK verifier implemented in gnark. It is a bit trickier and involves some math.
At the core of this application was a library for arithmetic operations over the Goldilocks field. The bug was a subtle one: there was a missing constraint when calculating the inverse of a field element.
So, what is an inverse? It’s simple: the inverse of a field element x is another field element y such that the following equation holds: x*y = 1. However, there is a small caveat. If x and y fall outside of the Goldilocks range, the equation x*y = 1 can have multiple satisfying assignments. This means that for the same input, you can prove it has multiple inverses, which shouldn’t happen.
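A small numeric sketch makes the caveat concrete. It assumes the Goldilocks prime 2^64 - 2^32 + 1 and that witness values are carried in a wider native field, so a value larger than the prime is representable. The helper names are hypothetical and this is not the audited library's code.

```python
# Goldilocks prime: 2^64 - 2^32 + 1.
GOLDILOCKS_P = 2**64 - 2**32 + 1


def inverse_constraint_holds(x: int, y: int) -> bool:
    """The only relation enforced in this sketch: x * y = 1 modulo the Goldilocks prime."""
    return (x * y) % GOLDILOCKS_P == 1


def in_goldilocks_range(value: int) -> bool:
    """The range check whose absence makes the inverse ambiguous."""
    return 0 <= value < GOLDILOCKS_P


x = 7
y = pow(x, -1, GOLDILOCKS_P)   # canonical inverse, inside the Goldilocks range
y_alias = y + GOLDILOCKS_P     # different integer, same residue modulo the prime

# Both witnesses satisfy the multiplication constraint...
both_accepted = inverse_constraint_holds(x, y) and inverse_constraint_holds(x, y_alias)
# ...but only the canonical one passes the range check.
only_canonical_in_range = in_goldilocks_range(y) and not in_goldilocks_range(y_alias)
```

Without the range check, `y` and `y_alias` are two distinct "inverses" of the same `x`. Adding `in_goldilocks_range` as a constraint on the witness leaves only the canonical one.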
Because this library was central to the application and widely used throughout it, this one flaw could introduce multiple critical bugs.
How can we avoid these tricky bugs?
The answer is not as simple as in the previous example.
What you can do is document the assumptions of your modules. This way, if someone is building on top of your modules, they can see what necessary checks they need to perform.
“Pencil and paper” proofs can also help. You essentially prove a mathematical theorem that your constraints are correct.
One positive note is that part of these proofs can be automated. At Veridise, we have developed a verifier called Picus that can verify part of these proofs. It essentially ensures that your circuits are deterministic, checking that for the same input, you cannot have two outputs that satisfy the constraints.
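The determinism property can be illustrated with a brute-force search over a toy field. This sketch is not how Picus works internally, and exhaustive search does not scale to real circuits. The prime 7, the witness range, and the function names are assumptions chosen for illustration.

```python
TOY_P = 7  # tiny prime standing in for a real field modulus


def find_nondeterminism(constraint, inputs, outputs):
    """Return (x, y1, y2) if two outputs satisfy the constraint for one input, else None."""
    for x in inputs:
        satisfying = [y for y in outputs if constraint(x, y)]
        if len(satisfying) > 1:
            return (x, satisfying[0], satisfying[1])
    return None


def underconstrained_inverse(x: int, y: int) -> bool:
    # Only the multiplication is checked; y may exceed the toy field's range.
    return (x * y) % TOY_P == 1


def constrained_inverse(x: int, y: int) -> bool:
    # Same relation plus a range check on the output.
    return 0 <= y < TOY_P and (x * y) % TOY_P == 1


nonzero_inputs = range(1, TOY_P)
# Witness values range over a wider domain than the toy field itself.
wide_outputs = range(2 * TOY_P)

counterexample = find_nondeterminism(underconstrained_inverse, nonzero_inputs, wide_outputs)
no_counterexample = find_nondeterminism(constrained_inverse, nonzero_inputs, wide_outputs)
```

Here `counterexample` is `(1, 1, 8)`, since both 1 and 8 act as an inverse of 1 modulo 7, while `no_counterexample` is `None` once the range check is added.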
Chapter II: ZK and DeFi are out of Sync
Generally, you want your ZK and DeFi components to be in sync. However, they can fall out of sync when one component forgets to validate state.
Off-chain components don’t have access to on-chain data. Whenever you need a circuit to operate with on-chain data, you must pass the current data to the circuit, obtain the output, generate the proof, and post the overall proof on-chain. The on-chain component is then responsible for validating that the proof generated by the circuit is consistent with the on-chain state.
One way these out-of-sync errors can occur is if the on-chain component forgets to validate that the proof was generated using the current on-chain state.
Example 3: When Users Decide their Balance
In the third example we examine an application written in Circom and Solidity. This was a recommendation platform where users could interact privately.
Whenever users interacted with the system, they had to encrypt their balances, either on deposit or withdrawal. Users would encrypt their current balance, encrypt their new balance, and then post the encrypted data on-chain.
The application had a smart contract with multiple routes for depositing and withdrawing. However, some of these routes were missing a validation check to ensure that the proof was generated with the current balance.
For example, let’s assume a user has a balance of zero. The user informs the dApp of this zero balance and requests to deposit 1 token. The dApp acknowledges this, and the user’s balance is updated to 1. This makes sense.
However, a user could also claim a fraudulent balance. For instance, if the user had a balance of zero, they could tell the app that they wanted to deposit 1 token on top of a current balance of e.g. 1000 tokens. The user would then receive confirmation from the dApp that their new balance was 1001 tokens, even though they initially had a zero balance.
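The scenario can be modeled in a few lines of Python. This is a simplified sketch rather than the audited contract: plaintext integers stand in for the encrypted balances, `prove_deposit` and `verify` stand in for the off-chain prover and the on-chain verifier, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepositProof:
    """Public inputs of a hypothetical deposit proof."""
    old_balance: int
    amount: int
    new_balance: int


def prove_deposit(old_balance: int, amount: int) -> DepositProof:
    # Stand-in for off-chain proving: the prover chooses old_balance freely.
    return DepositProof(old_balance, amount, old_balance + amount)


def verify(proof: DepositProof) -> bool:
    # Stand-in for the verifier: it checks only the circuit's own relation.
    return proof.amount > 0 and proof.new_balance == proof.old_balance + proof.amount


class Ledger:
    """Toy model of the on-chain component with two deposit routes."""

    def __init__(self) -> None:
        self.balances: dict[str, int] = {}

    def deposit_unchecked(self, user: str, proof: DepositProof) -> None:
        # Vulnerable route: never compares proof.old_balance with stored state.
        if not verify(proof):
            raise ValueError("invalid proof")
        self.balances[user] = proof.new_balance

    def deposit_checked(self, user: str, proof: DepositProof) -> None:
        # Safe route: the proof must start from the balance currently on record.
        if not verify(proof):
            raise ValueError("invalid proof")
        if proof.old_balance != self.balances.get(user, 0):
            raise ValueError("proof was not generated from the current balance")
        self.balances[user] = proof.new_balance
```

With a fresh `Ledger`, calling `deposit_unchecked("mallory", prove_deposit(1000, 1))` leaves a recorded balance of 1001, while `deposit_checked` raises `ValueError` for the same proof. The proof itself is valid in both cases; only the comparison against stored state separates the two routes.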
Avoiding these bugs:
If DeFi systems don’t check all necessary conditions, things can go wrong. To avoid these bugs, we can rely on our good old friend: the negative test. However, in this case, we need to write a negative test that exercises both the DeFi and ZK components (both on-chain and off-chain logic).
At Veridise, we have developed novel static analysis tools (like Vanguard) that thoroughly examine your on-chain logic and check if your contracts are missing any necessary checks.
In general, syncing DeFi and ZK components requires a global understanding of your system. These components can get out of sync in many ways, such as through proof replays (except for Aleo), arithmetic DoS, finite field overflows, and many more.
Generally speaking, these types of bugs affect systems where on-chain and off-chain components are written in different languages. We’ve noticed that if your developers are out of sync, your system has a high risk of being out of sync as well.
Full presentation recording
Watch Kostas’ full presentation here:
Slides
You can download PDF slides of the presentation here.
Finally, it’s worth investing in security early on
If you’re developing zero-knowledge solutions, it’s worth investing in security early on. That means creating a realistic threat model, understanding what attacks are feasible in the framework you are using, and making sure your design prevents them.
We recommend documenting important invariants and enforcing them system-wide, writing negative tests, and integrating automated security checks into your CI/CD.
Finally, we encourage you to partner with experienced security auditors who have demonstrated experience specifically in ZK audits.