Fast distributed RSA key generation against malicious adversaries

In this blog post I’ll describe a new result of Tore Fredriksen, Yehuda Lindell, Valery Osheter and myself. The full paper is available here and will be presented at the Crypto 2018 conference.

We show a protocol which enables two untrusting parties to jointly generate an RSA key, such that the private RSA key is shared between the parties. No single party knows the private key. The protocol is secure against malicious behavior, meaning that it is secure even if the parties arbitrarily deviate from the protocol. We implemented the protocol and showed that its performance is an order of magnitude better than that of the best previous protocol for distributed RSA key generation. This is even more impressive since the previous protocol only provided weaker semi-honest security (namely, that protocol was secure only as long as all participants followed it, but not if some party decided to deviate from the protocol).

RSA and threshold RSA

Rivest, Shamir and Adleman, the inventors of the RSA cryptosystem.

RSA is the oldest publicly known public key cryptosystem. (If you want to learn about public key cryptosystems you can start with the Wikipedia article.) The public key of RSA is composed of a large composite number N, and a public exponent e (which is typically set to be equal to 2¹⁶+1). N is computed as the multiplication of two large prime numbers p and q. The private key contains p and q, and an exponent d which can be computed from p, q and e. Current levels of security require that p and q are each at least 1024 bits long, and preferably longer.

When one wants to generate an RSA key, he or she first chooses long random primes p and q, computes N as the multiplication of p and q, and then computes the private exponent d.

RSA is also widely used as a primitive in more complex cryptographic constructions such as distributed signature schemes, (homomorphic) threshold cryptosystems and even general secure multi-party computation (MPC). These complex applications are not in the client-server setting, but in the setting of several distrusting parties, and thus require the private key to be secretly shared between the parties. Namely, while the public key is known to everyone, no one knows the private key. Rather, two or more parties have shares of the key, and these parties can run together a distributed protocol for computing RSA decryptions or signatures. At no point in time is the private key known to a single party. This is useful if we do not trust any single party, or are afraid that attackers might compromise some parties, but not all parties.

A bottleneck in using threshold RSA is generating the key in a way which prevents any single party from knowing the private key. (Note that this precludes the simple RSA key generation that we described earlier.)

Distributed generation of RSA keys

Dan Boneh and Matt Franklin described in 1997 a protocol for distributed RSA key generation. There were followup works, and some implementations, but most of them only worked in the semi-honest model.

Distributed key generation is no easy feat. Even assuming the parties act semi-honestly, and thus follow the prescribed protocol, it is a slow procedure as the fastest known implementation takes 15 minutes for 2048 bit keys. For the malicious setting we are unaware of any previous implementation. Our new protocol, on the other hand, generates RSA keys in the malicious setting in an average time of 134 seconds with one core, and 39 seconds with four cores.

Our goal in this work is to enable two parties to efficiently generate an RSA key so that the private key is shared between the parties. If we denote the parties as Alice and Bob, then Alice learns shares p¹ and q¹, and Bob learns shares p² and q², such that p=p¹+p² and q=q¹+q² are primes, and N=pq. None of the parties has any other information about the shares of the other party.

The basic key generation protocol

The distributed key generation protocol of Boneh and Franklin, and all subsequent work, are based on the following steps:

  1. Alice and Bob choose respective random shares p¹ and p². (To ensure that p¹+p² is odd, Alice chooses an odd p¹ while Bob chooses an even p².) Each party keeps her or his share secret.
  2. Alice and Bob run secure computations of distributed trial divisions, to verify that p¹+p² is not divisible by small primes. Namely, for each prime number d smaller than some agreed upon threshold, they run a protocol to check if p¹+p² is divisible by d. If this is the case then they delete this candidate pair. (The first test with d=3 weeds out 1/3 of the candidates. The second test, with d=5, filters another 1/5 of the candidates, and so on.)
  3. Alice and Bob take two candidate pairs that passed the trial division test, (p¹,p²) and (q¹,q²), and run a secure distributed multiplication protocol computing N=(p¹+p²)(q¹+q²). They both learn the value of N, but no information about the shares of the other party.
  4. The parties run a secure distributed protocol which runs a biprimality test. That is, this protocol checks if N is a product of two primes. If this is the case then obviously p and q are the prime factors of N. If N does not pass the test then the parties keep choosing new shares, until they find two pairs of shares whose multiplication passes this test.

5. Alice and Bob run a short protocol for computing shares d¹, d² of the decryption exponent.

The multiplication protocol of Step 3 ends up being very computation intensive, and therefore it makes sense to run many trial divisions before computing it, in order to filter out pairs that will definitely not pass the test. On the other hand, although the trial divisions are cheap they provide diminishing returns, and therefore it typically makes sense to run these tests only with respect to the first few hundreds primes.

Our improved protocol

Our first improvement comes from observing that the methods that were previously used for doing the distributed trial divisions were based on computing exponentiations. We replaced these methods with a simple trial division protocol based on oblivious transfer (OT), since OT can be very efficiently be implemented using oblivious transfer extension. In addition, we replaced the multiplication protocol, which was also based on exponentiations, with the multiplication protocol of Gilboa, which can be implemented using efficient oblivious transfer extension.

With regards to malicious security, we note that (roughly) the only part of the protocol where a malicious adversary can cause damage to the correctness of the protocol is in the biprimality testing. This is in contrast to cheating in the protocols for trial division or in multiplication: cheating in those protocols might slow down the execution, but no incorrect results will pass the biprimality testing if that protocol is run securely.

Given this observation we can achieve malicious security in the following way:

  • Implement the trial division protocols so that they only ensure privacy, but not necessarily correctness of the result. (Therefore, we can implement these protocols more efficiently.)
  • Implement the multiplication protocol so that it ensures privacy against malicious adversaries. Here, we designed a new protocol, based on noisy encodings, to prevent selective failure attacks against the Gilboa protocol.
  • With regards to the biprimality test, we first run a very efficient protocol for this task which ensures privacy but not necessarily correctness against malicious adversaries. The rational is that most of the times the number that is input to the test is not the multiplication of two primes, and therefore the test will fail. Only if the test outputs that the input is biprime, do we need to verify the correctness of the result.
  • When a number passes the biprimality test, we run a slower version of the test which also ensures correctness. The parties must also prove in zero-knowledge the correctness of their operations in the protocol, and consistency with commitments to their inputs which they were asked to provide in the beginning of the protocol.

In terms of run time, the multiplication protocol proved to be the most computation intensive, due to the additional measures that are used to prevent a selective failure attack. The secure version of the biprimality test, and the zero-knowledge proofs, are efficiently implemented using rather large garbled binary circuits.