Hands-On Disseminate Proof of Ownership Without Disclosing the Underlying Private Data
This article shows how to securely share data between organizations and provides a few examples in python.
Previously I wrote an article on how to make some computations on encrypted data without decrypting it using homomorphic encryption. While such an approach is great, it is still encryption, meaning that you still have to decrypt the final result to open the real value. Additionally, there is a risk of leaking the decryption key too.
Would it be cool if we can share encrypted data that computable while the decryption key does not exist at all so no one can decrypt and see our data ever?
Fortunately, there is even a whole field in cryptography that studies such problems and it is called Zero-Knowledge Proof. ZKP allows you to prove that you know some secret(s) to the others without actually revealing it. The term “zero-knowledge” tells that no (“zero”) information about the secret is revealed, while the second party (called “Verifier”) is mathematically convinced that the originating party (called “Prover”) indeed knows the secret.
The latest inventions in ZNP allow us to run complex computations over secrets. As of now, there are 3 main methods of how to run such computation namely SNARKs, STARKs, and Bulletproofs. Here is the link where you can compare them and see their details.
One of the main properties of SNARKs is to produce none - interactive proves, meaning you don't need a connected server or API. You can verify the correctness of computations offline.
In this article, I am using SNARKs and I skip all the math, focusing on how to use it instead.
Zero-Knowledge Proof Pipeline
For example, we want to prove that person is older than 18 years old. The age of the person is a secret. We want to generate a proof that the person is at least 18 years old.
SNARKs has the following implementation steps:
- Building a constraint system
At this step, you define all the variables and their computations, with all the constraints. Data variables could be private or public. Both the private and public variables are used to generate a proof, while the public variables are used later to verify the proof.
In our example, we have 2 variables: age(private), age_of_majority(public). Our constraint system is age ≥ age_of_majority = True, where ≥ is computation, and the equal sign (=) is the constraint of the computation result to the term True. - Keypair generation
After the constraint system is built we need to generate a pair of keys: proving key and verification key. Both keys could be public, with the difference is that proving key is used to generate proofs, and verifying key is used for proof verification. - Generate a proof
At this step, we take the proving key, constraint system, set all the variables {age; age_of_majority}, and finally generate the proof. - Verify the proof
Once we have the proof we can publish it, and everyone can check it against the verification key, constraint system, and the public variables {age_of_majority} for correctness. Verifier does not know the secret, but the verifier is confident that all the constraints have been met.
Zero-Knowledge Examples for Financial Industry
ZNP could be applied widely to a whole range of problems when secret data needs to be shared between organizations.
Let's look at two typical examples:
- Mutual fund verification. An active mutual fund manager wants investors to be confident by proving daily that the fund contains only certain stocks, and nothing else, so the risk is limited within the pre-defined published set of stocks. But the manager wants to hide the individual stock positions.
- Sufficient amount’s verification. A client has at least N dollars and her/his bank account without showing the actual balance.
I implemented these examples in Python using SNARKs as a single Jupiter notebook. You can run them in your browser here in Google Colab, a free online Python sandbox, or you can download source code and run it locally.
Example 1. Mutual fund verification
An active mutual fund manager wants to prove that the fund contains only certain stocks, eg. { GOOG, AAPL }. The manager wants to hide individual stock positions.
First, let's build the constraint system against the daily closing stock and fund prices, so the manager wants to prove that:
partial_sum1 = goog_position * goog_price
partial_sum2 = aapl_position * aapl_price
price = partial_sum1 + partial_sum2
The variables goog_position, aapl_position, partial_sum1, partial_sum2 are private, and goog_price, aapl_price, and fund price is public.
This type of proof is known as inner product zero-knowledge proof.
We need to define variables and set constraints. A variable is defined by calling the method PbVariable(), once defined you need to add it to the SNARKs execution context called the protoboard, and tell the protoboard if it is private (by default) or public ( protoboard’s method setpublic() ).
So our variables are the following:
# position variables
# they are private for the fund managergoog_position = libsnark.PbVariable()
goog_position.allocate(pb)aapl_position = libsnark.PbVariable()
aapl_position.allocate(pb)# partials
partial_sum1 = libsnark.PbVariable()
partial_sum1.allocate(pb)
partial_sum2 = libsnark.PbVariable()
partial_sum2.allocate(pb)# daily close price variables
# they are public and they will used by verifiersgoog_price = libsnark.PbVariable()
goog_price.allocate(pb)
pb.setpublic(goog_price)aapl_price = libsnark.PbVariable()
aapl_price.allocate(pb)
pb.setpublic(aapl_price)# fund closing pricefund_price = libsnark.PbVariable()
fund_price.allocate(pb)
pb.setpublic(fund_price)
Next, we need to define constraints. Constrains is SNAKs are called R1SC constraints. R1CS is a constraint satisfaction and it should be expressed as three linear combinations as such:
Linear Combination1 * Linear Combination2 = Linear Combination3.
Linear Combination defines a full linear combination which should include either a variable, a constant, addition, subtraction, or multiplication with a constant.
The method that creates R1CS is the following:
libsnark.R1csConstraint(Linear_Combination1, Linear_Combination2, Linear_Combination3)
R1CS examples:
1*1 = 1
libsnark.R1csConstraint(Linear_Combination(1), Linear_Combination(1), Linear_Combination(1))
(2*A+B)*C = D, where A, B, C, D are the variables
libsnark.R1csConstraint(
Linear_Combination(A)*2 + Linear_Combination(B),
Linear_Combination(C),
Linear_Combination(D))
Now we can define R1CS constraints for our example and add it to the current protoboard:
# partial1 = goog_price * goog_position
pb.add_r1cs_constraint(libsnark.R1csConstraint(libsnark.LinearCombination(goog_price), libsnark.LinearCombination(goog_position), libsnark.LinearCombination(partial_sum1)))# partial2 = aapl_price * aapl_position
pb.add_r1cs_constraint(libsnark.R1csConstraint(libsnark.LinearCombination(aapl_price), libsnark.LinearCombination(aapl_position), libsnark.LinearCombination(partial_sum2)))# fund price = (partial_sum1 + partial_sum2) * 1
pb.add_r1cs_constraint(libsnark.R1csConstraint(libsnark.LinearCombination(partial_sum1) + libsnark.LinearCombination(partial_sum2), libsnark.LinearCombination(1), libsnark.LinearCombination(fund_price)))
The variables and constraints are all that we need to generate and save a new proving and verification key.
cs=pb.get_constraint_system_pubs()
keypair=libsnark.zk_generator(cs)
libsnark.zk_write_keys(keypair, "fund.vk", "fund.ek")
Now the fund manager is ready to create a proof for the fund. Let's say the manager has 10 positions in Google and 20 positions in Apple. The stock price is respectively 1465, 437 USD.
# private variables
pb.setval(goog_position, 10 )
pb.setval(aapl_position, 20 )
pb.setval(partial_sum1, 1465 * 10 )
pb.setval(partial_sum2, 437 * 20 )# public variables
pb.setval(goog_price, 1465 )
pb.setval(aapl_price, 437 )
# 1465 * 10 + 437 * 20 = 23390
pb.setval(fund_price, 23390)
Now, the manager can create a proof and then save it
keypair=libsnark.zk_read_key("fund.ek", cs)
pubvals=pb.primary_input_pubs();
privvals=pb.auxiliary_input_pubs();proof=libsnark.zk_prover(keypair.pk, pubvals, privvals);
libsnark.zk_write_proof(proof,pubvals,'proof')
ls -lart----*** Private inputs: 10 20 14650 8740
*** Public inputs: 1465 437 23390
total 36
-rw-r--r-- 1 root root 2479 Aug 5 03:48 fund.vk
-rw-r--r-- 1 root root 10582 Aug 5 03:48 fund.ek
-rw-r--r-- 1 root root 1412 Aug 5 03:48 proof
Our proof size is quite small (1KB) so the manager can publish it on the webpage.
And finally, anyone can check our proof along with the public parameters using the verification key.
# public variables
pb.setval(goog_price, 1465 )
pb.setval(aapl_price, 437 )
pb.setval(fund_price, 23390)pubvals=pb.primary_input_pubs();verified=libsnark.zk_verifier_strong_IC(keypair.vk, pubvals, proof);print("*** Verification status:", verified)*** Public inputs: 1465 437 23390
*** Verification status: True
Sufficient amount’s verification Example
A Bank ABC’s client has at least N dollars on her/his account
The Bank will compare the client account balance (which is private data) with the requested threshold (public data) and generates the public proof.
This is an example of the comparison constraint
Our constrain system is private_amount — N ≥ 0.
But unfortunately, R1CS constrains could be built using only addition, substruction, and multiplication, and there are no less or greater operations.
But if we can present a number in a binary format, we will be only dealing with 1 or 0. Also, remember that any positive number has the highest bit is equal to 0 and it is 1 for all the negative numbers.
So to prove A<B for positive numbers, we calculate R = A-B.
We convert R into a binary string and check if the highest bit is 0, then R is positive then A≥B.
If the highest bit is 1 then R is negative then A<B respectively.
Our new constrain system for A<B will transform into:
A - B = R. It is the main constraint
R is the same as b_R. b_R is array {0,1} of 32 elements
b_R[31] = 1 # highest bit, meaning the value is negative
The first and the last constrains are easy
# 1. (a-b) * 1 = r
pb.add_r1cs_constraint(libsnark.R1csConstraint( libsnark.LinearCombination(a) - libsnark.LinearCombination(b), libsnark.LinearCombination(1),
libsnark.LinearCombination(r))# 3. b(R)[word_size - 1]*1 == 1
pb.add_r1cs_constraint(libsnark.R1csConstraint( libsnark.LinearCombination(b_r[word_size - 1]),
libsnark.LinearCombination(1),libsnark.LinearCombination(1)))
Let’s implement the binary constraints. We call the variable R as packed and b_R as unpacked_array (all elements are private).
packed = libsnark.PbVariable()
packed.allocate(pb)
word_size = 32# array of bin representation of the decimal value
unpacked_array = []for n in range(word_size):
v = libsnark.PbVariable()
v.allocate(pb)
unpacked_array.append(v)
Our LinearCombination becomes a sum of each bit at the n position multiple by 2^n where n=[0..31]. The sum should add up to the same number as the packed value. So the R1CS constraint simply checks if sum*1 = packed.
# bin to decimal calculation
packed_combination = libsnark.LinearCombination(0);for n in range(word_size):
packed_combination += libsnark.LinearCombination(unpacked_array[n]) * 2**n# add constrain packed == packed_combination
pb.add_r1cs_constraint(libsnark.R1csConstraint( packed_combination,
libsnark.LinearCombination(1),
libsnark.LinearCombination(packed)))
The next steps are the key pair generation, proof generation, and verification. The implementation is the same as in the previous example. Let’s assume the client balance is 1000 USD but the required threshold is 900 USD.
keypair=libsnark.zk_generator(cs)val_clients_balance = 1000
val_required_threshold = 900val_unsigned_r = signed_2_unsigned(val_required_threshold - val_clients_balance )
bit_array = bitfield(val_unsigned_r)print(bit_array)
for n, bit in zip(range(len(bit_array)), bit_array):
pb.setval(b_r[n], bit)pb.setval(r_unsigned, val_unsigned_r )
pb.setval(a, val_required_threshold )
pb.setval(b, val_clients_balance )cs=pb.get_constraint_system_pubs()
pubvals=pb.primary_input_pubs();
privvals=pb.auxiliary_input_pubs();proof=libsnark.zk_prover(keypair.pk, pubvals, privvals);
verified=libsnark.zk_verifier_strong_IC(keypair.vk, pubvals, proof);print("*** Public inputs: " + " ".join([str(pubvals.at(i)) for i in range(pubvals.size())]))print("*** Verification status:", verified)--------[0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
*** Public inputs: 1000
*** Verification status: True
I hope these two examples gave you an understanding of how to build similar solutions using ZNP.
I can see a great value of SNARKs in machine learning as well when models could be tested for the quality without revealing its coefficients.
I recommend to test this code in Google Colab since all you need to run it is a browser, but you are welcome to download it from GitHub.
Thank you for reading the article.