# Walkthrough of an Interactive Zero-Knowledge Proof for Sudoku Puzzle

## 🔏 Learn how to prove possession of a Sudoku solution with Zero-Knowledge and build a PoC with pure Python.

# 📖 Introduction

Without a doubt, we currently live in a data-driven society where in fact, data has become a more valuable resource than gold or oil. Seriously, consider the amount of personal data that we share online every minute, on a daily basis. Location, feelings, preferences, passwords, messages… and the list keeps growing…

Fortunately for us, symmetric and asymmetric modern crypto made it possible to protect our data against malicious adversaries attempting to eavesdrop on our communication channels. But what about the *data controllers* — the guys who we legitimately send the data to? How can consumers make sure that their data are not mishandled or abused? One way for sure is refusing to send data in the first place. But in reality, is not that simple. It’s an exchange. We exchange a bit of privacy for some kind of service that they provide right? Still, on many occasions, this exchange is not a fair one from the consumers perspective, since data handlers ask for way more data than what’s actually needed.

Yet again, cryptography may have a solution for this. What if I told you that it’s possible to avoid sharing your data after all. For instance, instead of sending a complete salary overview and job details to a letting agency for a credit check, send only a proof that you earn more than 40k per annum. This is exactly what Zero-Knowledge proofs (ZKP) provide! ( It becomes slightly more complicated than that in practice for whole lot other reasons. Like how do we trust the origin of the data etc… but you get the idea 😅 )

Although there are many ways in which you can construct ZKP, in this post, I will be giving a walkthrough (with full code snippets) on how to create a ZKP for a Sudoku puzzle solution using only hash-functions as commitments. ZKP can be quite daunting to understand initially, thus I truly believe that Sudoku puzzle is a good example to understand how ZKPs work at a very high level. Plus, Sudoku is something that most people are familiar with. But let’s just cut to the chase.

# 💡 Background Knowledge

## Zero-Knowledge Proofs

Zero-Knowledge protocols originate back to the 80s when they were first proposed by Shafi Goldwasser, Silvio Micali and Charles Rackoff in MIT [2]. More precisely, they described an *interactive proof system *that involves one party (‘Prover’) to prove to a second party (‘Verifier’) that a statement holds.

Contextualising this into the usual Alice/Bob example, we can consider the following scenario: Alice stumbled upon an online competition with a puzzle to be solved with a prize stake of £100. She asks for Bob’s help and they agree to split the price if any of them solves it. Not much time later, Bob claims that he has solved the problem and Alice asks him to share the solution. However, Bob is reluctant to sharing because he thinks that Alice may submit the solution by herself without sharing the price. Consequently, he is looking for a **secure** way to prove to Alice that he has the solution but without sharing it directly with her. Yeah, you guessed right! A ZKP is exactly what he is after!

But let’s define more thoroughly what is meant by the term “secure way”. In general, a ZKP must provide the following properties (at least with very high confidence!):

Completeness:Every invalid proof must be rejected by the verifier

Soundness:Every valid proof must be accepted by the verifier

Zero-Knowledge:Verifier does not learn any information about the statement other than its assertion.

Note that this is a very basic definition of interactive ZKPs. There a wide variety of more sophisticated interactive/non-interactive ZKPs but for the present walkthrough, this definition suffices.

## Commitment Schemes

Commitment schemes are a crucial ingredient of ZKPs and frankly, of cryptography in general. In simple terms, they are cryptographic primitives that allow a party to commit to a specific value without revealing the actual value while providing a way to reveal the value and verify the commitment in a later phase.

More formally, let **C **be a commitment scheme, it must provides these two properties:

Hiding: Hard to distinguish between commitments of different values. i.e:

Binding: There should be no way for a person who commits to one bit, to claim that he has committed to another value later:

One way to create a commitment scheme is by using one-way hash functions and cryptographically secure random bytes. It should be noted that in such case the security of the scheme is governed by the cryptographic assumptions of the hash function itself (i.e. it’s truly one-way).

To add more clarity, let’s walk through an example with the usual suspects. Alice and Bob decide to play a game of rock, paper, scissors digitally. They both make their choices and exchange them such that a winner is decided. Naturally, in a digital world, one of them has to share his/her pick first, which brings her/him in a disadvantageous position as he/she can just share a different choice after reviewing what the other player picked. This is exactly the kind of problems that commitment schemes solve!

Alternatively, they can create a commitment based on their choices and share the commitment instead of their actual choice! For instance:

Let a set:

S= {“Rock”, ”Paper”,”Scissors”}

Bob and Alice both randomly pick ** Pᴬ** and

**from**

*Pᴮ**S*respectively. Now they calculate: (

**||**-> represents concatenation)

Cᴬ= sha256(Rᴬ || Pᴬ)andCᴮ= sha256(Rᴮ || Pᴮ)

Bob shares *Cᴬ **with Alice and Alice *** Cᴮ **with Bob. Note that by now, they both committed to these values.

Finally, they share their original choices *and random bytes*** Pᴬ, Rᴬ** and

**With this information, each party can verify the commitment by hashing**

*Pᴮ, Rᴮ.***P || R**and assert their equivalence. Based on their picks the winner can be decided and none of them could have altered their initial choice since the hashes wouldn’t match.

# 🧩 Sudoku ZKP

It’s time for the main part of this article. Sudoku is a very well known puzzle that is also known to be an NP problem (in fact NP-Complete [4]) and is proven that there is a ZKP for any problem in NP [1]. Sudoku ZKP is by no means something new but I have yet to find an intuitive and clear explanation of the protocol with code examples so at the very least this is what this article aims to provide. Actually, the protocol described here is the implementation of this very interesting work from* Gradwohl et al. *[3]** **therefore** **for more formal details you can refer there. Interestingly, in their paper, they also described a physical protocol to perform the proof using a deck of cards which is fun if you want to demonstrate ZPK physically but let’s stick to the digital proof for now.

Before jumping into the code let’s see a high-level plain-English description of the protocol itself.

Alice wants to prove to Bob that she has a solution to a Sudoku Puzzle but Bob doesn’t believe her. Assume she on hold of the following puzzle and solution.

To avoid confusion let’s follow the proof in steps:

1. Alice creates a permutation of the sudoku digitis which effective is a one-to-one mapping for each digit. i.e. 1 -> 3, 2 ->8 ….

2. Additionally, she generates a random byte sequence (nonce) for each sudoku cell. This leads to 81 random nonces.

3. She applies the mapping on the numbers of her sudoku solution to obtain a masked solution. Note that by permuting the values the numbers still appear just once since it is a one-to-one mapping.

Alice splits the masked solution to sets of numbers from each Row, Column, Subgrid, and a set of the known numbers that were part of the puzzle definition in the first place. In effect, she ends up with 27+1 sets of numbers where the 27 derived from the rows, columns, subgrids must satisfy the sudoku requirement. i.e. All digits from 1–9 appear just once.

Then, she creates a commitment to the masked solution by hashing each cell with the corresponding nonce, sends that commitment to Bob and asks Bob to choose a row, columns, subgrid or to the set of known numbers.

Bob picks one and Alice sends him the nonces and permuted numbers that correspond to Bob’s choice.

In the case that Bob has picked the list of known numbers Alice also sends him the one-to-one mapping, she generated initially.

Bob then verifies that the permuted values indeed appear just once, and recreates the commitment using the nonces to verify Alice’s commitment.

In the case that Bob has picked the list of known numbers he also checks that the mappings are also correct. i.e. Let M be the mapping and n an integer 1–9: M(n) == Received List of known numbers.

These 9 steps summarize one iteration of the proof!

It is natural that there already are question marks been raised in your head about these steps so follow along for the rationale behind those steps.

💡 Steps 1, 3 are executed such that the *Verifier* does not learn anything about the solution of the puzzle.

💡Step 2, the nonces are used to stop the Verifier from performing a known plain text attack. Without nonces, Bob could just hash all numbers (1–9) and match the corresponding digest with the commitments to reveal the values.

💡 Step 5, by creating a commitment and sending it to Bob, Alice commits to her solution and therefore cannot change it (or the mapping).

💡 Steps 6,7,8,9 are the hardest to understand. By letting Bob choose any of those Alice provides a chance to the Verifier to confirm that the solution is correct. Recall that she can’t change the mapping since she committed to it. Additionally, you may be thinking:* “This proves that she may have just a solution, not that particular one”*. You are 100% correct! That is the reason that the choice of requesting the initial numbers given with the puzzle exists. That way, Alice cannot cheat because at any time Bob may choose to request those numbers and obviously they won’t match if it’s a different solution.

💡More importantly, it is necessary to stress the soundness error of this approach. Recall that soundness is the property of ZKP Verifier to accept every valid proof. In this case, since there are 28 choices (3n+1) that the Verifier can pick from, this means the protocol is only **1/28 % sound**, which implies a ** 1–1/28 soundness error!** In other words, after one iteration the Verifier is only 1/28 sure that this is a valid proof. That’s not very high, is it? For that reason, the steps above can be performed for many iterations to decrease the soundness error to a negligible value. (Bayes Rule can be used to infer the soundness rates per iteration)

I hope this makes more sense now. Let’s proceed to develop a small PoC proof using Python. You can find the full code here:

# 💻 Python ZKP PoC

Since generating and solving the sudoku is not in the scope of this post I will just use a static one for the time being but feel free to replace this with code to generate a new sudoku puzzle each time:

The sudoku puzzle is stored as a flat python list for easier processing.

Next, let’s add the code for permuting the sudoku solution using the one-to-one mapping and for creating the random nonces:

The zero element, is added after the shuffle because I don’t want it to be part of the mapping. It’s just there such the index of the numbers starts from 1.

In addition, below is the function to create a commitment of a sudoku solution:

As shown above, for commitments the SHA256 hashing algorithm is used. Furthermore, below is the code for split the puzzle into rows, columns and subgrids:

Finally here is a check to verify that all numbers from 1–9 exists and appear just once:

Now we can stitch all these functions together to create a Proof of Concept of the protocol and verify it’s correctness:

# Improvements

The code is just for demonstration purposes and is may not be cryptographically secure so please don’t use as-is in production. There are several optimizations and improvements that could be added so if you ever feel the urge, don’t hesitate to open a Pull Request in the GitHub repo.

# Conclusion

Hopefully, this blog post provided some general basic insights about ZKPs and more particularly how to prove possession of a Sudoku solution with a Zero-Knowledge proof. In future posts, we will explore more sophisticated proofs that do not require interaction between the Prover and the Verifier and also employ more complex cryptographic primitives. Stay tuned!

# References

[1] Oded Goldreich, Silvio Micali, and Avi Wigderson. *Proofs that yield nothing but their validity or all languages in np have zero-knowledge proof systems*. Journal of the ACM(JACM), 38(3):690–728, 1991.

[2] Shafi Goldwasser, Silvio Micali, and Charles Rackoff. *The knowledge complexity of interactive proof systems*. SIAM Journal on Computing, 18(1):186–208, 1985.

[3] Ronen Gradwohl, Moni Naor, Benny Pinkas, and Guy N Rothblum. *Cryptographic and physical zero-knowledge proof systems for solutions of sudoku puzzles*. In International Conference on Fun with Algorithms pages 166–182. Springer, 2007.

[4] Takayuki Yato and Takahiro Seta. *Complexity and completeness of finding another solution and its application to puzzles*. IEICE transactions on fundamentals of electronics, communications, and computer sciences, 86(5):1052–1060, 2003.