Benchmarking Paillier Encryption

and an open source Rust library

Published in

Snips Blog

8 min readFeb 2, 2017

with Mario Cornejo and Mathieu Poumeyrol at Snips.

Today we’re open sourcing our experimental implementation of the Paillier homomorphic encryption scheme, written in Rust by our small team at Snips working on various privacy enhancing technologies.

Testing its performance we are not only interested in the concrete numbers we can achieve, but also in the price we pay for using a modern language (spoiler: Rust performed as well as C in all tests).

In the name of reproducibility we also share all our benchmarking code, including instruction for how to launch it on a Google GCE instance.

Security disclaimer: this library is experimental and have not received the same security treatment as for instance OpenSSL or Sodium. Like all libraries tested here, no special attention has been paid towards hardened it against side-channel attacks, and doing so might change the numbers reported.

Private aggregation

The context of this work is secure computation, more specifically private aggregation. The basic idea is that we want to aggregate data from the mobile devices of a large set of users in such a way that their privacy is respected, by for instance guaranteeing that nothing is revealed about any individual.

We’ve already written a bit about that, and will go into more details in a future blog post, but for now just mention a few applications. One widespread use-case is analytics, which we may now do with strong guarantees of privacy. Another is machine learning, where a global model can be trained on user data without learning anything about the individuals.

And how does the Paillier scheme come into the picture? For the same reasons that we talked about secret sharing schemes earlier: its homomorphic properties allow us to compute on encrypted data! Specifically, given a set of encrypted values we can compute their encrypted sum without decrypting, and hence without learning anything about the individual values. But for this to be feasible we need encryption and decryption to be efficient on mobile devices, especially since some applications may require many values to be encrypted.

Why Rust?

Before looking at the benchmarks it is also worth spending a moment on why we chose Rust as the implementation language for most of our work on privacy enhancing technologies (PETs).

As a company focusing on both Android and iOS we have a growing list of languages to choose from, including Java, C#, Swift, Go, Rust, and C. However, supporting cross-device development as a small team made us opt for a single implementation language, in turn narrowing the list down to Swift, Go, Rust, and C. Wanting a modern language with safety features and a good build system further reduced it to Swift, Go, and Rust.

A bit arbitrarily we went with Rust, influenced to some extent by its reputation as being suited for embedded devices. But the question of course is then how much we pay in terms of performance for this choice.

Homomorphic encryption

To understand the benchmarks it is instructive to understand the computations the libraries have to perform. Without going into the details of the Paillier scheme, it can be seen as defining a mapping encrypt from pairs (m, r) to values c, as well as its inverse decrypt; here m is the message to encrypt, r is randomness used to hide the message, and c is the ciphertext.

Focusing on encrypt, which depends on a public key consisting of a generator g and a modulus n, a pair (m, r) is mapped to c as g^m * r^n mod n². Some implementations compute exactly that, using two modular exponentiations and one modular multiplication. Others take advantage of an optimisation that can be done by fixing g as n + 1: in this particular setting, the g^m component may be computed as n*m+1, meaning one modular exponentiation and two modular multiplications instead.

In a similar vein, the computation needed for decrypt comes in two algorithmic flavours: one natural and one optimised using the Chinese Remaindering Theorem.

For those preferring a more mathematical treatment, encrypt is an efficient isomorphism between Zn x Zn* and Zm* with m = n², which may be inverted efficiently given extra information contained in the decryption key. Considering the additive group Zn in the direct product we see the homomorphic properties enjoyed by the scheme, namely addition and scalar multiplication.

Arbitrary precision arithmetic

For the Paillier scheme to be secure, n in the above must be around 2048 bits long, meaning most operations will be on numbers of roughly twice that size due to the n² modulus. While some languages support such large integers natively (e.g. Java and Go), most use third-party libraries.

Unsurprisingly, we will see below that the choice of arbitrary precision library plays a dominant part in the overall performance, yet choosing one is more complicated than that. For instance, while GMP comes out as the overall top performer, its use also implies a more involved build process on devices, not to mention the fact that it comes under a GPL license that may make it impractical for inclusion in mobile applications.

Paillier libraries

In addition to our own Rust library running on top of either GMP or RAMP, we also picked a handful of open source libraries matching the languages mentioned earlier:

C: libpaillier from ACSC using GMP
Go: go-go-gadget-paillier using the native BigInt
Java: javallier from NICTA using the native BigInteger
Python: phe from NICTA using GMP

And, as a potential contester for quickly prototyping cryptographic implementations, we wrote a sketch implementation in Julia, using the native BigInt type (wrapping GMP).

However, some of these libraries implement the algorithmic optimisations and some do not, and as the following graphs shows this can have a big impact on performance numbers.

So, to get a fair competition, we implemented the optimisations for the libraries that didn’t have them already (see libpailler and go-go-gadget-paillier) and these are the ones we will be using below.

Performance on Google GCE

While our main focus is mobile devices, and despite the fact that we’re working hard to change this with Dinghy, it is arguably still more convenient to perform benchmarks on ordinary computers than on mobile devices. So that’s what we’ll do first; more specifically, we’ll do so on a Google GCE instance (an n1-standard-4 to be precise) to make the results easier to reproduce for anyone.

From the figure we see that our choices may be divided into three groups. Java is by far the slowest and would incur a non-negligible overhead if many encryptions are to be performed. Rust with RAMP and Go with its native BigInt give roughly the same performance, with Go taking a small lead. And finally, using GMP we get similar performance from Rust and Julia, with Python adding some noticeable overhead.

Moreover, if we zoom in and add the C library using GMP, we see that it doesn’t seem to matter whether you use C or Rust, as the latter was actually measured slightly faster. One explanation for them being so equivalent, beyond most of the computation being done in GMP, might lie in the fact that both compile to native code via the same toolchain (LLVM).

Performance on Raspberry Pi

Towards the goal of measuring performance on phones we next bench on a Raspberry Pi 3, allowing us to easily compare the native libraries with GMP.

We see that GMP is again the top-performer, with Rust having a small lead over C, and both allowing us to do around 4 encryptions per second per core when using the typical key size of 2048 bits. And when it comes to native libraries, Rust is now slightly faster than Go, the former allowing us to do 1.4 encryptions per second per core.

And roughly the same pattern repeats with respect to decryption, Rust with RAMP now appearing slightly closer to Rust with GMP.

The question of course is how well this translates to phones. For now we’ll leave the compilation and use of GMP for phones as future work and simply focus on Rust with RAMP; if the ratios do indeed transfer comparably to phones then this means a factor 2.8 drop in performance.

Performance on phones

Focusing exclusively on Rust with RAMP we finally use Dinghy to benchmark a Samsung Galaxy S7 and an iPhone 6.

We see that the performance on the two phones is significantly better than on the Raspberry Pi, allowing respectively 6 and 8 encryptions per second per core.

Conclusion

As for implementing this cross-device library, Rust has certainly given us not only good performance, but also portability as well as an easy build and test system. Moreover, we measured no performance loss compare to C.

As for Paillier encryption on phones, the numbers show that it’s doable but not as fast as we wanted it to be. Luckily, our system for private aggregation allows us to mitigate this to some extent, by for instance allowing us to pack several values into one encryption. Moreover, by precomputing the heavy operations in an offline phase (essentially computing encryptions of zero while in the background) we can reduce the online computation to a simple multiplication and make the experience much smoother.

Finally, more tools such as Dinghy are clearly needed in order to do proper testing and benching on mobile phones; if you know any we’d be more than happy to hear about them!