# How Practical is Somewhat Homomorphic Encryption Today?

## by Maeva Benoit and Morten Dahl @ Snips

The results presented here follow a series of experiments made at Snips around using various ways to privately compute a simple but essential building block in many statistics and machine learning systems: summing vectors. Being able to do so for vectors coming from several users not only allows us to do basic analytics but also to generate distributions used for Bayesian inference as needed in e.g. predicting transit patterns or making personalised suggestions for new places to discover.

For this specific blog post the focus is on solving this problem using a technology that has received a lot of attention and progress in recent years, namely that of somewhat homomorphic encryption (or SHE for short).

### Homomorphic Encryption

At its most basic, a homomorphic encryption scheme is like any other encryption scheme in that it allows everyone to encrypt data by using the public encryption key, while only allowing those who know the private decryption key to decrypt.

However, unlike traditional encryption, it also allows anyone to transform the encrypted values, even without knowing the decryption key. Specifically, using operations Add and Mult, it is possible to compute a new encryption of respectively the sum and the multiplication of two encrypted values without learning anything.

While this in principle is enough to compute any function on encrypted values, in practice there are some limitations. Most prominently is the fact that with the current schemes, any addition or multiplication operation adds a bit of noise to the resulting encryption, so that after a certain number of operations it is no longer possible to decrypt correctly. Multiplications in general add more noise than additions, but as we will see, we will also be limited by additions if we are not clever.

For our experiments we used the state-of-the-art SEAL library from Microsoft Research, which not only implements a homomorphic encryption scheme but also provides tools for optimising its performance through parameter suggestion and various encodings.

### Summing Vectors using Homomorphic Encryption

Homomorphic encryption seems like the perfect match for our application: each user encrypts the values in his vector individually, and sends this new vector of encryptions to the server. In turn the server uses the Add operation to pointwise sum the vectors from all user, ending up with a single vector of encryptions. This vector is then decrypted, revealing only the intended output to the data scientist.

But before looking more closely at the performance of this solution, one problem we need to fix is that of who is holding the decryption key. Specifically, we are going to assume a setup where all users encrypt their vectors under a single public encryption key, and that only a trusted third party different from the server knows the corresponding decryption key.

This means that despite having access to all encryptions, the server cannot learn any individual data since it doesn’t know the decryption key. And despite having the decryption key, the trusted party cannot learn any individual data since the only encryption it sees is the one with the final aggregated result.

### Experiments

Concrete numbers reported below is for summing just one element of the vectors. We assume that the time needed to sum all elements of the vectors scales linearly with the dimension of the vectors and hence can be deduced from the single element case. The computations were all performed on a 8x4 core 3.5 GHz Xeon machine with 32 GB of memory.

Note that the performance of the SEAL library depends on determining optimised parameters for the actual computation one wants to perform. One of these parameters is called poly_modulus in the library, and it determines the degree of the polynomial modulus of the space we will work in: R = Z[x]/ X^(poly_modulus) + 1. We will see that it has a huge impact on the performances we get, not only in terms of computation time but also in terms of the bit size of encryptions (whenever the degree doubles, the bit size quadruples).

#### First Method

As mentioned above, one way to do the summing is for the users to simply encrypt all elements of their vectors individually. They then send this vector of encrypted elements to the server, who can do the pointwise n term summations simply by considering each index in turn, where n is the number of users.

The thing is, with the homomorphic encryption scheme we use, the computation depth has a huge impact on noise growth and hence parameter sizes, which in turn increases the size of encryption and the computation time. Basically, we don’t want to compute a depth n-1 addition each time as this quickly results in increased parameter sizes (going from a 1024 to 2048 poly_modulus already at 24 users). As a result, we use a tree-like recursive method to reduce the depth, allowing us to switch from depth n-1 to log(n). This way, in the tests we conducted, we never needed to use the highest poly_modulus parameter in the library, which is 16,384.

The graph below shows the evolution of computation time versus the number of users, where a value of maximum 20 is associated to each user and using parameters corresponding to a poly_modulus of 1024.

We see that computing the sum for up to 8000 users, i.e. a summation of 8000 terms, takes less than one hour. Moreover, the required upload for each user is 12 KB of data (given by 2 * (1024+1) * 48 bits).

We also see that after around 9000 users the time is growing much faster than before. This matches the moment where the library, based on its simulation tools, suggests us to switch to larger encryption parameters (poly_modulus goes up to 2048). Thus, using these heavier parameters is slowing down the computation and increasing the upload size to 48 KB, but is necessary in order to be able to correctly decrypt the encryptions afterwards.

#### Second Method

One downside of the first method is that it forces users to also send any non-significant values they may have to the server. Indeed, if the dimension of the vectors is large then there may be a high probability that some of the elements are zero (or otherwise non-significant). In the first method those elements are still encrypted and sent by the users, and they are still added together by the server, which of course can’t tell that it is only adding zeros.

This is why in our second method we will try to avoid the useless data transmission and computation that comes with encrypting zeros: we want to encrypt and send only meaningful data. But if we just send it in a vector like before, the server won’t be able to know which encryptions to add together across users.

To fix this, we associate a label to each element, corresponding to the indexes used before. But still, we don’t want the server to know which labels are missing, since it might reveal some information about the users. So we also encrypt the labels, bit by bit, and each user now has to send both the encrypted label and the encrypted element for each (nonzero) entry of his vector. To compute the final output vector, the server goes through all possible labels in turn, and for each one computes the summation of the elements that have that label associated with them.

In order to filter elements for each specific label, an indicator function is applied which will map an encrypted label to an encryption of 1 when there is a match, and to an encryption of 0 otherwise. This Ind function operates on the encrypted binary decomposition enc_label of the encrypted label, and uses the unencrypted binary decomposition plain_label of the unencrypted label we want to match:

where

This means that to compute the resulting sum for one specific label, we have to evaluate a summation over all encryptions received from all users, with each term being a product of size_label factors and the encrypted value. To simplify matters, we consider only the case where each user sends a single pair, in which case the summation for each label simplifies to

As before, to optimise the computation we arrange both the product and the summation in a tree of depth respectively log(size_label) and log(n).

To see how this second version performs, we fix the label size to 3 (so that the final output vector will have dimension 8) and vary the number of users. We then again report on the efficiency of computing one of the elements in the final vector since this scales linearly with the dimension.

While in the first method, doing this computation only took a few seconds for 450 users, here it goes up to nearly 5 hours. Moreover, while doing the multiplications needed to filter the labels is very costly in terms of computation, the library also gives us larger parameters to use, which slows down the whole computation even more.

We can also try to vary a factor which is not relevant in the first version: the size of labels. Here is a table which shows how the computation time evolves for three users:

Again we can see that each time the poly_modulus parameter is increased, it makes the computation around 15 times longer. When it is not, it only grows approximately 3 times bigger from one label size to the next. This is really a huge factor in the time our computation will take, and it mostly depends on the amount and depth of multiplications we are going to do.

Finally, the upload size of course also goes up. Each user now has to encrypt the labels bit by bit, so if each label has size_label bits, we have size_label + 1 encryptions to send per value. Note that for both methods, the parameters varies depending on the total amount of data there is to compute. In a very simple case with 10 users and 8 values of which only one is nonzero, the first method makes each user send 8 * 12 KB = 96 KB of data, while the second method already requires him to send 4 * 192 KB = 768 KB since the larger parameters needed by multiplication makes the size of encryptions increase as well.

### Conclusion

It seems that it will always be better to use the first method presented above without multiplications: although we may send useless encryptions of zero, it will most certainly be faster. At the same time, the first method reaches performance that is not completely crazy, up to at least 8,000 users sending, say, at most 100 values each: users will have to upload around 1.2 MB each and the total computation will be around 4 days. By picking an encoding that packs several values together this would be improved.

Unfortunately, while the performance of the SEAL library, and SHE schemes in general, will likely improve over time and open up for very interesting applications, at this point in time it doesn’t beat other methods for secure computation (at least for this specific application).

As for the SEAL library itself, it allows us to use a good SHE scheme easily. Many functions to optimise our calculations are available, including the simulation functions we used here to choose accurate encryption parameters. Moreover, there are many interesting features: for example, we can follow the evolution of noise in encryptions, encode rational numbers, or use a CRT method to encrypt big numbers with smaller parameters. Overall, the use of the library simplified a lot the work we wanted to do here.

Thanks to Brieuc, Lounes and Ludovic for their help in this project.

If you enjoyed this article, it would really help if you hit recommend below and try our apps!