# Secret Sharing Explained

## The primitive behind secure multi-party computation

Secure multi-party computation (MPC) is a cryptographic technique that lets multiple parties jointly and securely compute a function while keeping their inputs private. At Dropout Labs, we’re using this technique among others to build secure, privacy-preserving machine learning.

The primitive that powers MPC is the ability to split a piece of data into multiple encoded parts known as *secret shares*. On their own, the shares reveal nothing about the original data. However, if two parties perform the same operation on a set of shares and then recombine them, it is as if that operation was performed on the original data.

### How does it work?

The quick and easy explanation is that it involves splitting data into two pieces (or *shares)*, both parties do work with those shares, and then recombine those shares to get a result. Sounds simple? In practice, things get a little more complicated and often parties will need to communicate a few times during a computation, sometimes with a third party.

This post will focus on two common mathematical primitives needed in deep learning: addition and multiplication. More complex operations (e.g. division and comparison) are possible but more complicated. For example, trying to compare which number is bigger requires each party to communicate several times. While addition and multiplication seem trivial, you can use them to construct convolutional, pooling, and dense layers. We can even create various activation layers using approximations. This is enough to perform many computer vision tasks. There is even research being done into using 1-dimensional convolutions for Natural Language Processing (NLP) tasks, which could open up another branch of AI to privacy-preserving computation.

Let’s first look at this on a small scale, first a little notation and context:

- We will have two parties (
*P0*,*P1*) and one helper (*P2*). - Secret shares will be shown with some variable name and then a number signifying which half they are. e.g.
*a_0*and*a_1*are shares of some variable*a,*or*b_0*and*b_1 are shares of b.*

### Creating Shares

Before we can do anything with our shares, we need to know how to make them. It turns out that this is quite easy! To generate secret shares, you simply split the number you want to turn into two values. For example, a *5 *could be split into a *3* and a *2*, or an *8* and -*3.* This is done by generating a cryptographically secure random number, and then subtracting what you are trying to share.

### Using the Shares

Each party will exchange half of their shares with the other. They will then operate on their shares and exchange their result, then recombine their shares into a final answer.

In a very simple example, let’s say that *P0* has the number *5 *and *P1* has the number *8*. *P0* can create shares *a_0 = 3*, *a_1 = 2*. *P_1* will make shares *b_0 = 9* and *b_1 = -1*. Next, each party needs to swap half of their shares. *P0* will send *a_1 *to *P1*, and *P1* sends *b_0* to *P0*.

Each party now has the following:

As you can see, *P0* has no way of figuring out the value of *b *as they don’t have access to *b_1. b_0 *is the number 9, but the other half could be any number.

### Addition

An addition is the simplest operation we can perform with secret shares. Each party adds their shares and then exchange results. Using the above example, *P0* would sum their shares and get *12*, while *P1* would get *1*. Recall that the original numbers were *5* and *8*. *12 + 1 = 13*, *5 + 8 = 13*.

More formally, addition can be described as follows:

a + b = (a_0 + a_1) + (b_0 + b_1)

We can rearrange this formula like so

a + b = (a_0 + b_0) + (a_1 + b_1)

*P0* will solve *a_0 *+* b_0 *and *P1* will solve *a_1 + b_1*. This ensures that *P0* only gets a share of b, and *P1* only gets one share of a.

### Multiplication

Multiplication is more interesting and challenging because the parties need to communicate during the computation. We can use the same notation as above to define a multiplication using secret shares.

a * b = (a_0 + a_1) * (b_0 + b_1)

Recall from math class that the equation will expand to:

= (a_0 * b_0) + (a_0 * b_1) + (a_1 * b_0) + (a_1 * b_1)

We can see that *P0* can be responsible for *a_0 * b_0*, and *P1* can be responsible for *a_1 * b_1*.

The middle (*(a_0 * b_1) + (a_1 * b_0)*), however, presents a problem. Neither party can execute this term securely because it would require each party to have the other share. e.g. if *P0* wanted to solve for *a_0 * b_1 *they* *would need *b_1*, but they already have *b_0*, this would give them access to the value of *b*. We want to keep *b* secret from *P0*.

### Masking to the Rescue

The solution to this problem is something called *masking*. When we mask a share, we are introducing a new unknown number to each party that will disappear when the shares are recombined. To maintain privacy, we need a third party to generate these unknown numbers which will be used to obscure the data they don’t want to share with the other party. This means we will mask *b_1* from *P0* and *a_0* from *P1*. We refer to these masks as *s* and *t*, and *alpha* and *beta* as the masked values.

The multiplication of *a * b* performed by P0 becomes:

z_0 = st_0 + (s_0 * beta) + (alpha * t_0) + (alpha * beta)

While the multiplication performed by P1 becomes:

z_1 = st_1 + (s_1 * beta) + (alpha * t_1)

We will start by recruiting our third party (P2) to create some values for masking. P2 generates three new values and then splits them up into shares. The first two numbers will be random, and the third is the product of those two numbers. For our above example, let’s say P2 made values 7 and 11. This would make our third number 77. To recap, each party has the following data:

The way to utilize these values is to subtract them from the original data

alpha = (a_0-s_0) + (a_1-s_1) = -2

beta = (b_0-t_0) + (b_1-t_1) = -3

Note that *P2* will send the *s_0 *and* t_0 *values to *P0* and the *s_1* and *t_1* values to *P1*. *P0* then creates the left side of al*p*ha (*a_0-s_0*) and *beta (b_0-t_0)*, and *P1* the right side (*a_1-s_1*) and (*b_1-t_1*), respectively.

Next, *P0* and *P1* can exchange their alpha and beta shares without revealing any information about *a* or *b*. This is because the true values of a and b are hidden by the values given by P2. Now we are ready to plug these values into the above formula.

P0 calculates:

z_0 = st_0 + (s_0 * beta) + (alpha * t_0) + (alpha * beta)

z_0 = 44 + (4 * -3) + (-2 * 5) + (-2 * -3)

z_0 = 28

and P1 calculates:

z_1 = st_1 + (s_1 * beta) + (alpha * t_1)

z_1 = 33 + (3 * -3) + (-2 * 6)

z_1 = 12

Then we combine the results:

z_0 + z_1 = 28 + 12 = 40

Recall we wanted to calculate *5 * 8*. Voila!

### Conclusion

With the building blocks of addition and multiplication, you can support lots of machine learning models. Layers such as dense, convolution, pooling and approximated activation functions are all possible with these two primitives. This is enough for lots of computer vision algorithms, linear or logistic regression, and more!

If you are interested in exploring this area of research further, I highly recommend checking out tf-encrypted, which puts these concepts into practice to add a secure layer on top of TensorFlow. Morten Dahl has also made some great notebooks that you can use to play around with secret sharing in a more relaxed setting.

### About Dropout Labs

We’re a team of machine learning engineers, software engineers, and cryptographers spread across the United States, France, and Canada. We’re working on secure computation to enable training, validation, and prediction over encrypted data. We see a near future where individuals and organizations will maintain control over their data, while still benefiting from cloud-based machine intelligence.

Follow Dropout Labs on Twitter and star tf-encrypted on GitHub.

If you’re passionate about data privacy and AI, we’d love to hear from you.