Current state of MPC in privacy-preserving ML

Ivan Kamakin · Published in Going Byzantine · Sep 25, 2019

Motivation

Privacy-preserving computation is one of the hottest current topics in cryptography. It has become especially sought after in recent years, as numerous leaks, even from big, well-established organizations, have endangered users' privacy and even their safety (especially where geopositional data is concerned).

Companies that receive and store user data in plaintext run the risk of being breached and having that data leak, but there is also a growing concern that FAANG (and possible future tech giants) could use the data accumulated over the years to sway public opinion on a massive scale and, e.g., manipulate elections.

In most cases, user data is collected by businesses (at least initially) for the purposes of targeted advertising and recommendation. For such tasks, companies use data processing pipelines built from various preprocessing tools and machine learning models. This leads us to believe that PPC tools tailored to machine learning would let companies fulfill the overwhelming majority of their business goals in a way that not only protects user data better, but eliminates the possibility of a breach altogether, as at no point would the companies hold the user's data in plaintext.

While there are several other approaches to privacy-preserving ML, such as homomorphic encryption or functional encryption, multi-party computation gets the most attention in state-of-the-art research and, especially, in implementations. The reason is that MPC is currently the most efficient privacy-preserving computation technique (although it has efficiency struggles of its own, which we outline below). Moreover, MPC protocols are built from much simpler cryptographic primitives than their analogues, relying mostly on finite field additions and multiplications, with very limited use of more advanced protocols in certain parts, while, e.g., homomorphic encryption rests on the relatively recent and complex R-LWE assumption.

One of our recent projects involved extensive research into various modes of privacy-preserving machine learning, and we decided to share some of our findings for those who want to start their own research into this field.

Multi-party computation protocols

Generally, MPC boils down to computing arithmetic circuits (i.e., circuits whose gates are additions and multiplications and whose wires hold numbers) on shares of some secret data: a data point can only be reconstructed by collecting a sufficient subset of all its shares. While in machine learning these circuits may become quite large, the base procedure remains the same (with the exception of non-polynomial functions, which require additional tricks). Therefore, an MPC protocol can be fully described by its procedures for computing additions and multiplications on data that is shared among several peers. Addition is usually simple (most secret sharing schemes are additive, so addition on shares comes naturally), but multiplication is complex and requires communication among participants.

Generally, MPC communication complexity rises sharply with the number of computing parties (all other parameters, such as the level of security and the adversarial model, being equal), so it is preferable to have as few participants as possible. For example, the widely used protocols SPDZ and SecureNN are 2PC and 3PC, respectively. Without loss of generality, we will focus on the 2PC case to keep the math concise.

We will describe a basic MPC protocol, and then outline some modifications made to it for various purposes.

Basic MPC scheme

Suppose that we have an additive secret sharing scheme such that for any data value x and finite field modulus q there are shares $x_1, x_2$ with

$$x \equiv x_1 + x_2 \pmod{q}.$$

In practice this is usually done by sampling the first share randomly and computing the second share as $x_2 = x - x_1 \bmod q$. The shares can then be distributed between two computing parties, and as long as they do not collude, the initial data cannot be recovered. In this setting, computation of linear combinations imposes no additional overhead, as each party operates on its own shares locally:

$$\alpha x + \beta y \equiv (\alpha x_1 + \beta y_1) + (\alpha x_2 + \beta y_2) \pmod{q}.$$
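To make the sharing concrete, here is a minimal Python sketch (a single-process illustration, not a networked implementation; the modulus Q is an arbitrary illustrative choice):

```python
import secrets

Q = 2**61 - 1  # illustrative field modulus (a Mersenne prime)

def share(x):
    """Split x into two additive shares modulo Q."""
    x1 = secrets.randbelow(Q)   # first share is uniformly random
    x2 = (x - x1) % Q           # second share completes the sum
    return x1, x2

def reconstruct(x1, x2):
    return (x1 + x2) % Q

# Linear combinations are computed locally, share by share:
x1, x2 = share(7)
y1, y2 = share(5)
z1 = (3 * x1 + 4 * y1) % Q      # party 1, no communication
z2 = (3 * x2 + 4 * y2) % Q      # party 2, no communication
assert reconstruct(z1, z2) == 3 * 7 + 4 * 5
```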

Multiplication is more challenging. Suppose that we need to get additive shares of a product:

$$xy \equiv (x_1 + x_2)(y_1 + y_2) \equiv x_1 y_1 + x_1 y_2 + x_2 y_1 + x_2 y_2 \pmod{q}.$$

While the first and fourth terms can be computed by the parties locally, the second and third term mix shares of different parties.

To securely compute the shares of the product, the so-called Beaver's trick is used. Beaver's trick defines an oracle that outputs shares for a triple

$$(a, b, c), \qquad c \equiv ab \pmod{q}$$

(i.e., shares of two random numbers and their product). Then, the parties compute

$$\delta_i = x_i - a_i, \qquad \epsilon_i = y_i - b_i$$

and combine the computed shares to reconstruct (“open”)

$$\delta = x - a, \qquad \epsilon = y - b.$$

Since a and b were sampled randomly, these new values perfectly mask x and y, but they can be used to compute the shares of the product:

$$z_1 = c_1 + \delta b_1 + \epsilon a_1 + \delta\epsilon, \qquad z_2 = c_2 + \delta b_2 + \epsilon a_2,$$

where the $\delta\epsilon$ term is added by only one of the parties.

One can easily check that they are indeed additive shares:

$$z_1 + z_2 \equiv c + \delta b + \epsilon a + \delta\epsilon \equiv ab + (x - a)b + (y - b)a + (x - a)(y - b) \equiv xy \pmod{q}.$$

Note that opening masked values necessarily requires communication between computing parties.
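Continuing the sketch above (reusing Q, share, and reconstruct), the whole multiplication protocol fits in a few lines; here a local helper stands in for the triple oracle, which in a real protocol would be a separate preprocessing phase:

```python
import secrets

def beaver_triple():
    """Stand-in for the triple oracle: shares of random a, b and of c = a*b."""
    a, b = secrets.randbelow(Q), secrets.randbelow(Q)
    return share(a), share(b), share((a * b) % Q)

def beaver_mul(x_sh, y_sh):
    (a1, a2), (b1, b2), (c1, c2) = beaver_triple()
    x1, x2 = x_sh
    y1, y2 = y_sh
    # Each party masks its shares locally...
    d1, e1 = (x1 - a1) % Q, (y1 - b1) % Q
    d2, e2 = (x2 - a2) % Q, (y2 - b2) % Q
    # ...and the masked values are opened (this is the communication round):
    delta = (d1 + d2) % Q   # x - a
    eps = (e1 + e2) % Q     # y - b
    z1 = (c1 + delta * b1 + eps * a1 + delta * eps) % Q  # delta*eps added once
    z2 = (c2 + delta * b2 + eps * a2) % Q
    return z1, z2

assert reconstruct(*beaver_mul(share(7), share(5))) == 35
```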

Additionally, each multiplication gate consumes a separate Beaver triple (which is a triple of shared secrets). In some cases, the differences between MPC protocols boil down to the protocol for generating these triples. For example, the SPDZ protocol uses a homomorphic encryption scheme to compute triples, while MASCOT uses oblivious transfer for this purpose. Finally, (Keller et al, 2017) propose to forego triples entirely and instead use semi-homomorphic encryption (i.e., encryption that allows adding encrypted values) to compute the mixed terms.

Improvements

Actual practical protocols build on top of this foundation to gain various useful properties.

Most protocols used in practice add MACs to ensure that values are not corrupted by an active adversary when they are opened (which happens, e.g., during multiplication). To avoid excessive overhead, the protocols use a single MAC key, which is only revealed at the end. Commitments ensure that incorrect openings are detectable at the end of the protocol: although the computation result will be incorrect, the responsible party will be discovered.
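As a rough illustration of the idea (not the actual SPDZ MAC check, which batches many openings and uses commitments), each opened value can be verified against a shared MAC under a shared key; reusing share and Q from the sketches above:

```python
import secrets

alpha = secrets.randbelow(Q)        # global MAC key (itself shared in practice)
alpha1, alpha2 = share(alpha)

x = 42
x1, x2 = share(x)
m1, m2 = share((alpha * x) % Q)     # shared MAC of x

x_open = (x1 + x2) % Q              # the (possibly corrupted) opening

# Each party computes a check share locally; commitments would normally be
# exchanged before these are revealed. The check shares must sum to zero.
sigma1 = (m1 - alpha1 * x_open) % Q
sigma2 = (m2 - alpha2 * x_open) % Q
assert (sigma1 + sigma2) % Q == 0
```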

Machine learning-specific improvements are often introduced as well.

As machine learning mostly involves vector computations, arithmetic on shared secrets is vectorized in a natural way: instead of numbers, computing nodes hold shares of tensors, and instead of triples of numbers, triples of tensors are computed and used in the same way as Beaver triples in the protocol above.
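The same sharing works elementwise on tensors; a minimal NumPy sketch (with a smaller modulus so that products of shares fit into int64):

```python
import numpy as np

Q = 2**31 - 1
rng = np.random.default_rng()

def share_tensor(t):
    """Additively share an integer tensor elementwise modulo Q."""
    t1 = rng.integers(0, Q, size=t.shape, dtype=np.int64)
    t2 = (t - t1) % Q
    return t1, t2

w = np.arange(6, dtype=np.int64).reshape(2, 3)
w1, w2 = share_tensor(w)
assert ((w1 + w2) % Q == w).all()
```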

To implement fixed-point arithmetic (as model weights are typically floating-point numbers), the numbers are scaled by a fixed factor and represented as integers modulo a large prime; after multiplications, the result is truncated to bring it back to the original scale.
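A local sketch of the encoding (in an actual protocol, truncation after multiplication is performed on shares and is itself a non-trivial subprotocol, e.g., the probabilistic truncation used in SecureML):

```python
Q = 2**61 - 1
SCALE = 2**16                 # 16 fractional bits of precision

def encode(x):
    """Embed a float as a fixed-point integer modulo Q."""
    return round(x * SCALE) % Q

def decode(n):
    """Map back to a float, treating residues above Q//2 as negative."""
    if n > Q // 2:
        n -= Q
    return n / SCALE

def truncate(n):
    """A product of two encoded values carries SCALE**2; divide one out."""
    return n // SCALE if n <= Q // 2 else Q - (Q - n) // SCALE

p = truncate(encode(1.5) * encode(-2.25) % Q)
assert abs(decode(p) - 1.5 * -2.25) < 1e-3
```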

Finally, computing models such as neural networks requires computing non-linear activations. While it is in principle possible to express activations directly through addition and multiplication gates, in practice this is extremely inefficient. Instead, authors propose dedicated protocols for activation functions: for example, (Mohassel and Zhang, 2017) propose using garbled circuits to evaluate ReLU activations on private inputs, as well as the f(x)=x² activation, which can be computed with a single multiplication of shared values.
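The square activation in particular maps directly onto the Beaver protocol above: squaring a shared value is one shared multiplication. Reusing share, reconstruct, and beaver_mul from the earlier sketches:

```python
h = share(3)            # shares of a pre-activation value
s = beaver_mul(h, h)    # f(x) = x^2 in one shared multiplication
assert reconstruct(*s) == 9
```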

Implementations

While for homomorphic encryption only solutions for general binary and arithmetic circuits are available (so any complex computation must be implemented manually), for MPC there already exist comparatively mature integrations with popular ML/DL frameworks.

TF Encrypted allows for a seamless transition between standard TF and the privacy-preserving version: the only difference is that instead of the usual tf.Tensor, TFE uses PrivateTensor. Tensor additions work essentially the same, while tensor multiplications require the computing servers (typically two) to exchange special “masked tensors” (an extension of Beaver's trick to tensors). So, while the data exchange during instantiation of private tensors and during multiplications has to be set up manually, other methods and utilities from TF work just as they do in standard local computation.
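A sketch of what this looks like, adapted from the TFE examples of the time (API names vary between versions, so treat the exact identifiers here as assumptions):

```python
import tensorflow as tf
import tf_encrypted as tfe

def provide_input():
    # Runs on the input provider; returns a plaintext tensor to be shared
    return tf.constant([[1.0, 2.0], [3.0, 4.0]])

w = tfe.define_private_variable(tf.ones(shape=(2, 2)))     # shared weights
x = tfe.define_private_input('input-provider', provide_input)

y = tfe.matmul(x, w)   # the masked-tensor exchange happens under the hood

with tfe.Session() as sess:
    sess.run(tfe.global_variables_initializer())
    print(sess.run(y.reveal()))   # shares are only recombined at the end
```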

PySyft is similar to TFE, but also allows using PyTorch as a backend. Unlike TFE, it also provides out-of-the-box tools for differential privacy and federated learning. These are not privacy-preserving computation techniques per se, but rather techniques for aggregating user data that are anonymous by design. Although this is out of scope for this article, this Google blogpost provides a quick overview with a number of in-depth sources.
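For comparison, a PySyft sketch in the style of its PyTorch-backend API of the time (the VirtualWorker names are arbitrary; the crypto provider is the party that deals Beaver triples):

```python
import torch
import syft as sy

hook = sy.TorchHook(torch)                    # patch torch with PySyft ops
alice = sy.VirtualWorker(hook, id="alice")    # simulated computing parties
bob = sy.VirtualWorker(hook, id="bob")
crypto = sy.VirtualWorker(hook, id="crypto")  # deals the Beaver triples

x = torch.tensor([1, 2, 3]).share(alice, bob, crypto_provider=crypto)
y = torch.tensor([4, 5, 6]).share(alice, bob, crypto_provider=crypto)

z = x * y            # multiplication on shares, one communication round
print(z.get())       # reconstruct: tensor([ 4, 10, 18])
```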

Security considerations

MPC could potentially present an alternative to homomorphic encryption for privacy-preserving machine learning computation. A significant advantage of the MPC approach compared to HE is the existence of developed toolkits specifically tailored to machine learning (in contrast to HE, where existing implementations only provide a low-level interface). Additionally, performance tests suggest that, while the computational overhead is still quite large, it is several orders of magnitude lower than for HE.

However, there are challenges that complicate practical use of SPDZ and similar protocols for ML:

  1. Current MPC protocols have severe communication bottlenecks: essentially, each multiplication gate requires a round of communication between computation nodes. Considering that communication performance tests (presented, e.g., here) show significant slowdown even when the parties run on different sockets of the same machine (i.e., without network latency), adding latency for each multiplication gate would likely make the system impractical. Although this can be somewhat remedied by applying multiplication to whole tensors (as is often done in practice), the problem remains, only on a smaller scale.
  2. Current MPC protocols employ strong security assumptions within the so-called “semi-honest” (or “honest-but-curious”) adversary model, which assumes that (in the concrete case of 2PC) at most one of the computing nodes is corrupted at any time. This is highly restrictive in any setting, especially in fully trustless networks: one of the parties in an ML computation (the data owner or the model owner) can simply bribe both computing nodes and recover their counterparty's data/model weights from the shares. Although there are MPC protocols that assume stronger adversaries, they are mostly concerned with ensuring computation correctness and do not protect against data leakage.
  3. Most MPC protocols only support addition and multiplication circuits, and more complex non-polynomial operations (such as equality checks and comparisons) require additional techniques which increase overhead.

Conclusion

MPC protocols for privacy-preserving ML address a very relevant modern problem, and R&D in this sphere is accordingly extremely active. In the last two years alone there have been a number of significant publications and released implementations.

Although the protocols and their implementations are (as of yet) unfit for fully trustless environments due to their strong assumptions about the lack of collusion, they can already be used in certain limited scenarios.

We believe that any researcher in cryptography or DLT should at least be aware of this field, as the current rate of R&D implies that practical solutions will appear soon, opening the possibility of more ethical and transparent practices in the handling of user data.

References

  1. Gentle introduction to MPC
  2. Original SPDZ paper
  3. SecureNN paper
  4. MASCOT paper
  5. Overdrive: an improved SPDZ variant
  6. Oblivious transfer primer
  7. Garbled circuits primer
  8. SecureML: A System for Scalable Privacy-Preserving Machine Learning
  9. Private Machine Learning in TensorFlow using Secure Computation
  10. TFE repository
  11. PySyft repository
