Secret Sharing Explained

The primitive behind secure multi-party computation

Secure multi-party computation (MPC) is a cryptographic technique that lets multiple parties jointly and securely compute a function while keeping their inputs private. At Dropout Labs, we’re using this technique among others to build secure, privacy-preserving machine learning.

The primitive that powers MPC is the ability to split a piece of data into multiple encoded parts known as secret shares. On their own, the shares reveal nothing about the original data. However, if two parties perform the same operation on a set of shares and then recombine them, it is as if that operation was performed on the original data.

How does it work?

The quick and easy explanation is that it involves splitting data into two pieces (or shares), both parties do work with those shares, and then recombine those shares to get a result. Sounds simple? In practice, things get a little more complicated and often parties will need to communicate a few times during a computation, sometimes with a third party.

This post will focus on two common mathematical primitives needed in deep learning: addition and multiplication. More complex operations (e.g. division and comparison) are possible but more complicated. For example, trying to compare which number is bigger requires each party to communicate several times. While addition and multiplication seem trivial, you can use them to construct convolutional, pooling, and dense layers. We can even create various activation layers using approximations. This is enough to perform many computer vision tasks. There is even research being done into using 1-dimensional convolutions for Natural Language Processing (NLP) tasks, which could open up another branch of AI to privacy-preserving computation.

Let’s first look at this on a small scale, first a little notation and context:

  • We will have two parties (P0, P1) and one helper (P2).
  • Secret shares will be shown with some variable name and then a number signifying which half they are. e.g. a_0 and a_1 are shares of some variable a, or b_0 and b_1 are shares of b.

Creating Shares

Before we can do anything with our shares, we need to know how to make them. It turns out that this is quite easy! To generate secret shares, you simply split the number you want to turn into two values. For example, a 5 could be split into a 3 and a 2, or an 8 and -3. This is done by generating a cryptographically secure random number, and then subtracting what you are trying to share.

A function to create secret shares in Python

Using the Shares

Each party will exchange half of their shares with the other. They will then operate on their shares and exchange their result, then recombine their shares into a final answer.

In a very simple example, let’s say that P0 has the number 5 and P1 has the number 8. P0 can create shares a_0 = 3, a_1 = 2. P_1 will make shares b_0 = 9 and b_1 = -1. Next, each party needs to swap half of their shares. P0 will send a_1 to P1, and P1 sends b_0 to P0.

Each party now has the following:

As you can see, P0 has no way of figuring out the value of b as they don’t have access to b_1. b_0 is the number 9, but the other half could be any number.

Addition

An addition is the simplest operation we can perform with secret shares. Each party adds their shares and then exchange results. Using the above example, P0 would sum their shares and get 12, while P1 would get 1. Recall that the original numbers were 5 and 8. 12 + 1 = 13, 5 + 8 = 13.

More formally, addition can be described as follows:

a + b = (a_0 + a_1) + (b_0 + b_1)

We can rearrange this formula like so

a + b = (a_0 + b_0) + (a_1 + b_1)

P0 will solve a_0 + b_0 and P1 will solve a_1 + b_1. This ensures that P0 only gets a share of b, and P1 only gets one share of a.

Multiplication

Multiplication is more interesting and challenging because the parties need to communicate during the computation. We can use the same notation as above to define a multiplication using secret shares.

a * b = (a_0 + a_1) * (b_0 + b_1)

Recall from math class that the equation will expand to:

= (a_0 * b_0) + (a_0 * b_1) + (a_1 * b_0) + (a_1 * b_1)

We can see that P0 can be responsible for a_0 * b_0, and P1 can be responsible for a_1 * b_1.

The middle ((a_0 * b_1) + (a_1 * b_0)), however, presents a problem. Neither party can execute this term securely because it would require each party to have the other share. e.g. if P0 wanted to solve for a_0 * b_1 they would need b_1, but they already have b_0, this would give them access to the value of b. We want to keep b secret from P0.

Masking to the Rescue

The solution to this problem is something called masking. When we mask a share, we are introducing a new unknown number to each party that will disappear when the shares are recombined. To maintain privacy, we need a third party to generate these unknown numbers which will be used to obscure the data they don’t want to share with the other party. This means we will mask b_1 from P0 and a_0 from P1. We refer to these masks as s and t, and alpha and beta as the masked values.

The multiplication of a * b performed by P0 becomes:

z_0 = st_0 + (s_0 * beta) + (alpha * t_0) + (alpha * beta)

While the multiplication performed by P1 becomes:

z_1 = st_1 + (s_1 * beta) + (alpha * t_1)

We will start by recruiting our third party (P2) to create some values for masking. P2 generates three new values and then splits them up into shares. The first two numbers will be random, and the third is the product of those two numbers. For our above example, let’s say P2 made values 7 and 11. This would make our third number 77. To recap, each party has the following data:

The way to utilize these values is to subtract them from the original data

alpha = (a_0-s_0) + (a_1-s_1) = -2
beta = (b_0-t_0) + (b_1-t_1) = -3

Note that P2 will send the s_0 and t_0 values to P0 and the s_1 and t_1 values to P1. P0 then creates the left side of alpha (a_0-s_0) and beta (b_0-t_0), and P1 the right side (a_1-s_1) and (b_1-t_1), respectively.

Next, P0 and P1 can exchange their alpha and beta shares without revealing any information about a or b. This is because the true values of a and b are hidden by the values given by P2. Now we are ready to plug these values into the above formula.

P0 calculates:

z_0 = st_0 + (s_0 * beta) + (alpha * t_0) + (alpha * beta)
z_0 = 44 + (4 * -3) + (-2 * 5) + (-2 * -3)
z_0 = 28

and P1 calculates:

z_1 = st_1 + (s_1 * beta) + (alpha * t_1)
z_1 = 33 + (3 * -3) + (-2 * 6)
z_1 = 12

Then we combine the results:

z_0 + z_1 = 28 + 12 = 40

Recall we wanted to calculate 5 * 8. Voila!

Conclusion

With the building blocks of addition and multiplication, you can support lots of machine learning models. Layers such as dense, convolution, pooling and approximated activation functions are all possible with these two primitives. This is enough for lots of computer vision algorithms, linear or logistic regression, and more!

If you are interested in exploring this area of research further, I highly recommend checking out tf-encrypted, which puts these concepts into practice to add a secure layer on top of TensorFlow. Morten Dahl has also made some great notebooks that you can use to play around with secret sharing in a more relaxed setting.


About Dropout Labs

We’re a team of machine learning engineers, software engineers, and cryptographers spread across the United States, France, and Canada. We’re working on secure computation to enable training, validation, and prediction over encrypted data. We see a near future where individuals and organizations will maintain control over their data, while still benefiting from cloud-based machine intelligence.

Follow Dropout Labs on Twitter and star tf-encrypted on GitHub.

If you’re passionate about data privacy and AI, we’d love to hear from you.