Prime Number Generation in Rust
Pretty Cool Stuff
Here at Snips, we like to play with homomorphic encryption, and we like to do pretty cool stuff with secure computation in general. But before computing products and additions on ciphertexts, a prerequisite in a public-key system is the efficient generation of the public-key parameters. For instance, the Paillier cryptosystem requires prime numbers p and q to form an admissible RSA modulus n = pq.
People interested in privacy and security have probably heard about prime numbers and their importance in cryptography. In general, the security of many public-key cryptosystems relies on the intractability of computational problems, that is, problems that are very difficult to solve (or at least, that is what we think).
A Hard Problem
Many cryptosystems are based on trapdoor one-way functions. A trapdoor one-way function is a mathematical function that is easy to compute in one direction but requires a secret piece of information to compute its inverse efficiently. We can construct trapdoor one-way functions from hard number-theoretic problems such as integer factorization, or simply factoring. The formal definition of factoring is:
Given a composite integer N, the factoring problem is to find integers p, q > 1 such that pq = N.
As an example of a trapdoor for humans, consider the numbers p = 48611 and q = 53993. It is easy to compute their product (2624653723), but given only the product, it is hard to recover each factor with just paper and pencil. While this particular instance (10 digits) is easy for computers, had we chosen a much larger p and q, it would be hard for computers as well.
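The asymmetry can be sketched in a few lines of Rust: one multiplication for the easy direction, and a brute-force trial division for the hard one (this toy uses u64, so it only works for small instances like the one above):

```rust
// Brute-force inverse of multiplication: find the smallest factor of n
// by trying every divisor up to sqrt(n). Feasible for 10-digit numbers,
// hopeless for cryptographic sizes.
fn trial_factor(n: u64) -> Option<(u64, u64)> {
    let mut d = 2u64;
    while d * d <= n {
        if n % d == 0 {
            return Some((d, n / d)); // smaller factor found
        }
        d += 1;
    }
    None // n is prime (or 1): no non-trivial factorization
}
```

Here `trial_factor(48_611 * 53_993)` recovers `(48611, 53993)` in tens of thousands of iterations; for a 2048-bit modulus, the same loop would need more iterations than there are atoms in the universe.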
So far, the largest such number factored is 232 digits long (768 bits), an effort that took hundreds of computers two years. On a single home computer (a 2 GHz Opteron CPU), it would have taken about 2000 years. Finding the prime factors of numbers the size of those used in cryptography today (2048 bits) is approximately four billion times harder than the 768-bit case.
Arbitrary-precision arithmetic
Before generating prime numbers, we need to select an arbitrary-precision arithmetic library. This is required because the numbers we are going to manipulate can be pretty big: a 1024-bit number is an integer of 309 decimal digits. A special data structure is needed to support such big integers. A machine integer typically holds 32 bits, i.e. values up to 2³² − 1 (4,294,967,295), which is enough for many applications; to achieve cryptographic security, however, we need much bigger numbers. And we do not just need to store them: we need to compute operations such as multiplications and exponentiations on them.
Since Rust does not provide a BigInteger type out of the box, we use Ramp as our big-integer library. So far, Ramp is the most performant library we have found.
Prime numbers in the wild
Prime numbers have been studied for a long time, and yet there is no formula for generating them. The naive approach is to generate a random number and check whether it is prime. This approach can be improved by imposing two constraints on the random number:
- Set the least significant bit (LSB) to 1 to make the number odd (2 is the only even prime).
- Set the most significant bit (MSB) to 1 to force the output to be an integer of exactly n bits.
It is possible to prove that Algorithm 1 is efficient: we can generate a prime number in time polynomial in n, thanks to the density of prime numbers and the efficiency of primality testing. Generating random numbers and testing them for primality is the correct approach; the wrong way to find prime numbers would be to generate random numbers and try to factor them.
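The candidate-generation step can be sketched as follows. This is a toy at 64 bits: a real implementation would draw the bits from a cryptographically secure RNG and store them in a big integer (e.g. Ramp's type); the xorshift generator below is a stand-in, not suitable for production:

```rust
// Toy PRNG (xorshift64) standing in for a CSPRNG. State must be nonzero.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

// Produce a prime candidate: set the MSB so the number has exactly
// 64 bits, and the LSB so it is odd.
fn random_candidate(state: &mut u64) -> u64 {
    xorshift64(state) | (1u64 << 63) | 1
}
```

Every output is therefore an odd integer in the full 64-bit range, ready to be handed to the primality tests below.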
Primality Test
The schoolbook method to test whether a number is prime is to iterate through all the smaller numbers and check whether any of them divides the candidate. We can immediately optimize this first attempt: we do not need to iterate through all the numbers, only up to the square root of the candidate (and, since the candidate is forced to be odd, we can skip even divisors).
Even so, iterating through all those numbers is very slow. We can take a shortcut by exploiting the fact that a random number is more likely to be divisible by a small number than by a big one. Using a table of the first 2000 prime numbers is a computationally cheap way to discard composite numbers. A number passing these tests has a probability of less than 0.5% of being divisible by a prime greater than 17863. That sounds good, but it is not enough for cryptographic applications; fortunately, we still have room to improve.
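A trial-division filter against a small-prime table looks like this. For brevity the table below holds only the first ten primes; a real implementation would embed the first 2000, as described above:

```rust
// Cheap first filter: divide the candidate by a table of small primes.
// Short table here for illustration; real code would use ~2000 primes.
const SMALL_PRIMES: [u64; 10] = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29];

fn passes_trial_division(candidate: u64) -> bool {
    for &p in SMALL_PRIMES.iter() {
        if candidate == p {
            return true; // the candidate is itself a small prime
        }
        if candidate % p == 0 {
            return false; // small factor found: certainly composite
        }
    }
    true // no small factor; still only *probably* prime
}
```

Note the asymmetry: `false` is a proof of compositeness, while `true` only means the candidate survived this filter and must go on to the probabilistic tests.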
A better approach is to use a probabilistic algorithm, which can determine whether a number is prime with a given degree of confidence. Probabilistic algorithms employ randomness to reduce the complexity or the expected running time.
A fast way to discard a composite number is to use Fermat's little theorem. Loosely speaking, the idea is that the nonzero values less than a prime p form a group of order p − 1, so a^(p−1) ≡ 1 (mod p) for every a not divisible by p. If the output of the test is false, the number is certainly composite; on the other hand, if the output is true, there is still a chance that the candidate is not prime.
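A minimal sketch of the Fermat check, again on u64 for illustration (big integers in real code). The modular exponentiation is done in u128 so the intermediate products cannot overflow:

```rust
// Square-and-multiply modular exponentiation: base^exp mod modulus.
fn mod_pow(mut base: u128, mut exp: u128, modulus: u128) -> u128 {
    let mut result = 1u128;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    result
}

// Fermat test with base a: if the candidate is prime, a^(candidate-1) ≡ 1.
// Returning false proves compositeness; true is only probable primality.
fn fermat_test(candidate: u64, a: u64) -> bool {
    mod_pow(a as u128, (candidate - 1) as u128, candidate as u128) == 1
}
```

For example, `fermat_test(91, 2)` is false, proving 91 = 7 × 13 composite without ever finding a factor; beware, though, that some composites (Fermat pseudoprimes) pass the test for particular bases.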
The best-known probabilistic algorithm used in practice is the Miller-Rabin test. It determines whether a candidate number is prime, with a confidence that depends on the number of iterations. Three iterations lead to a failure probability of one in 2⁸⁰, which is considered a secure implementation. To put this in perspective, that failure probability is 800 trillion times smaller than the odds of winning the EuroMillions jackpot.
Putting it all together
Finally, we put everything together in one function that executes the three tests: first, trial division, an inexpensive way to discard composite numbers with a small factor; second, Fermat's little theorem, to discard composite numbers with big factors; and finally, the Miller-Rabin test, to catch the possible failures of the previous methods.
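The combined pipeline, sketched once more on u64 (a production version would operate on Ramp big integers, use the first 2000 primes, and draw random Miller-Rabin bases):

```rust
// Short small-prime table for illustration (~2000 primes in real code).
const SMALL_PRIMES: [u64; 10] = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29];

// Square-and-multiply modular exponentiation: base^exp mod m.
fn mod_pow(mut base: u128, mut exp: u128, m: u128) -> u128 {
    let mut result = 1u128;
    base %= m;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * base % m;
        }
        base = base * base % m;
        exp >>= 1;
    }
    result
}

// The three-stage test: trial division, then Fermat, then Miller-Rabin.
fn is_probably_prime(n: u64) -> bool {
    if n < 2 {
        return false;
    }
    // 1. Trial division: discard composites with a small factor.
    for &p in SMALL_PRIMES.iter() {
        if n == p {
            return true;
        }
        if n % p == 0 {
            return false;
        }
    }
    // 2. Fermat check with base 2: discard composites with big factors.
    if mod_pow(2, (n - 1) as u128, n as u128) != 1 {
        return false;
    }
    // 3. Miller-Rabin with a few bases (fixed here; random in real code).
    let (mut d, mut s) = (n - 1, 0u32);
    while d % 2 == 0 {
        d /= 2;
        s += 1;
    }
    let nm1 = (n - 1) as u128;
    'bases: for &a in &[2u64, 3, 5] {
        let mut x = mod_pow(a as u128, d as u128, n as u128);
        if x == 1 || x == nm1 {
            continue;
        }
        for _ in 1..s {
            x = x * x % (n as u128);
            if x == nm1 {
                continue 'bases;
            }
        }
        return false; // witness found: composite
    }
    true
}
```

Prime generation is then a loop: draw an odd candidate with the MSB set, run `is_probably_prime`, and repeat until it returns true.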
Real-World Implementations
Let’s discuss the implementations of some of the big players in the market. They are all based on the same idea: generate a large odd random number, then test whether it is prime. Even though the principle is the same for all the libraries, each one has its own way of doing the tests.
Apple’s Security framework and Common Crypto rely on the corecrypto library. It is open source, and its prime generation is focused on RSA, following the rules of ANSI X9.31. To eliminate small prime factors, Apple’s implementation tries the first 250 small primes and then runs 16 iterations of Miller-Rabin.
BoringSSL is Google’s “fork” of OpenSSL. They started from scratch, and it is now used in almost all of Google’s projects. Their implementation follows a strategy similar to the one presented in this post, following recommendations from academia.
Java has its own class for managing arbitrary-precision integers: BigInteger, which can generate a big random prime at instantiation. This library computes a “cheap” pre-test with the first 12 prime numbers before trying Miller-Rabin.
A different approach is taken by GNU Crypto: it tries the first 1000 small primes and, instead of Fermat’s test, uses the Euler criterion.
It is worth pointing out that the Miller-Rabin primality test is the important one. The recommended method for prime generation is specified by NIST in Federal Information Processing Standard (FIPS) 186.
Standards and Recommendations
Cryptography is a very hot topic, and different countries have guidelines on how to use it to protect their data.
Regarding the minimum number of Miller-Rabin rounds when generating primes for RSA, NIST recommends 5 rounds for 512-bit and 1024-bit primes, and 4 rounds for 1536-bit primes.
The current recommendations from the different national agencies for the sizes of the prime numbers are more or less the same: the composite product should be at least 3072 bits.
For instance, in the US, the National Security Agency (NSA), in its Suite B Cryptography, recommends a minimum 3072-bit modulus for RSA to protect classified information up to Top Secret. The French Agence nationale de la sécurité des systèmes d’information (ANSSI) states that the minimum size should be 2048 bits, but recommends at least 3072 bits. The German Bundesamt für Sicherheit in der Informationstechnik (BSI) recommends a 2048-bit modulus for 2016; from 2017 to 2021, the recommended minimum is a 3072-bit modulus.