Towards Shorter Encryption Keys

This article goes into technical details concerning Elliptic Curve encryption.

Public and private key cryptography is a recent invention from the late seventies that did not occur to our creative predecessors. Instead, they relied on symmetric encryption schemes which requires both sides to know the encryption key.

The earliest of such schemes is Scytale, a device first mentioned in 7th century BCE by the greek poet Archilochus. In this scheme the secret was in the width of the stick on which a strip with text is rolled on.

Scytale (image from Wikipedia)

Since those times there was a series of innovations in the complexity of encryption, culminating in encryption schemes such as that of Enigma which was used in WW2 by the Germans.

However, symmetric encryption schemes required to physically send the encryption key by a human courier who must not be intercepted. This changed in late seventies, when several researches began working on what become known as public key cryptography. In 1978, three computer scientists Rivest, Shamir, and Adelman (RSA) announced an invention that allowed to split the encryption key into a public and private parts. The public key part was used to encrypt, and the private key part to decrypt.

The innovation was that even if someone manages to get a glance at the public key, it will not help him to decrypt secret messages. The idea is that I would tell my public key to all my friends, and if they want to send me a confidential message, they would use that key to encrypt the text.

At first glance, the dual public-private key idea appears to be an ideal solution for private communication. However, there is still a problem of how to communicate the key.

The public keys devised by the RSA scheme are long numbers that can be represented by a mix of numbers and letters in a slightly shorter space. But, even after that they are still too long. For comparison, a credit card number is 16 digits long which is already annoyingly long to read off. But, an RSA public key is at least ten times longer.

Here’s an example a public key that is stored in the .ssh/id_rsa.pub file on my MacBook:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDZdYqaK6msN2pRrr1tMioVqnqSI5P4+o+74Db2eLTenaeCXm/TJERgYBWV1c0EU8cGVfvH/tRBoJWVuPXs6ml1WnDvcgWpCrR+BZ7mga7as8t0U+264YsQe9NpdnFjpTW9RmyY+RDCAPkF7qtzdGyCQY0PmsRESsJ9tp1amrEL6vyhZvdZr/wOXpPzgQ00/9heZESuFlC4lLFsEBkPhBPdalYL2vjdEbunYjFqFlp4yuyBAV8BIfP+YTjPhyEXu03Ue0sDI/FD4ayx92Kozt7XtAI7W54iG7ycW0jiihOLag9vRCLXOSFCF1cDfHW74xdcwws8lxnJ2B2zoiCAOISH

Since you can not read off the public key, nor type it, it must be transferred either physically through a USB key, or some electronic communication method such as email, IM, or Dropbox. Another way is to publish it on your website, Twitter, or Facebook. (That is the approach taken by Keybase.io.)

Although these are potential solutions, there is still a courier who must be entrusted to deliver a public key. Now the courier is not a human, “he” is in an internet server, a website, or an email gateway. But because the courier stands between you and all your friends, the courier is in the position to perform a “man in the middle” attack on you. He delivers his own public key to all your friends, and they wouldn’t know the difference.

An innovation was needed to allow for shorter keys so that it would be easier to communicate them. Such innovation came in the form of Elliptic Curve Encryption scheme which can offer the same level of confidentiality with much shorter keys. Unlike RSA which alternates numbers laid out on a large circle, this scheme alternates points arranged on a 3-dimensional donut or torus with (x,y) surface coordinates subject to a certain equation. This extra geometry and constraints make it computationally harder to deduce keys.

Now, instead of a key being a long number, the key is a point (x,y) with integer coordinates. These coordinates satisfy the elliptic curve equation of the form “y² = x³ + ax + b”. Note that all operations are computed “mod p”.

(Working “mod p” means that if a number is greater than p, then you can divide it by p, and use the remainder instead of the original number. It can be shown that all important algebraic properties are preserved in this case, provided that p is a prime number.)

The length of a key is the space required to represent these (x,y) coordinates. Since the scheme would work with different choices of parameters “a”, “b”, and “p”, several standardized parameter sets have been devised. The parameter set known by the name of “P-256” is as follows. The coefficient “a” is the number of negative 3. The number “b” is a long integer:

41058363725152142129326129780047268409114441015993725554835256314039467401291

The modulus “p”, which must be a prime, is a long number that can be computed using the formula 2²⁵⁶ - 2²²⁴ - 2¹⁹² - 2⁹⁶ - 1. It’s computed value is:

115792089210356248762697446949407573530086143415290314195533631308867097853951

That prime number has a special property that the number consequent to it is divisible by 4. In other words, (p+1)/4 is an integer. The importance of this fact will become apparent shortly. (Note that because all prime numbers are odd, a consequent number would be even and would be divisible by 2, but not necessarily 4).


Let’s continue with the project of estimating the length of a public key. In the “P-256” elliptic curve scheme all coordinates can be represented using just 32-bytes of space. Because each byte can have 256 values but the English language has less than 256 letters, we would need to represent it with more than 32 characters. The Base64 URL encoding allows to represent 64 possibilities with a single character. (Note that base64 URL encoding is not the same as base64 encoding.)

Using Base64 URL character encoding, the 32-bytes of a key can be represented as 43 characters. The JSON Web Key (JWK) standard into which you can export a public key, does just that:

Key (x,y) in P-256 elliptic curve, base64-url encoded

Since each coordinate is represented as 43 characters, the full (x,y) coordinate requires 86 characters. Although this is a dramatic improvement over RSA, it is possible to halve the space required. The trick involves a mathematical computation. Theoretically, knowing one of the coordinates allows to compute the other because both satisfy the elliptic curve equation y²=x³-3x+b.

To make such computation quick we calculate y given x, and not vice versa. However, plugging into the equation a value of a known x computes the value of y², not y which we need. So, how do we get the “square root” of y²? It’s not so simple, because, remember, all computations are done “mod p”.


Let’s take a little interlude into the math of modular arithmetic. Modular arithmetic involves integers that are arranged on a circle. Working in it can make you feel that you are in a bizarro world.

As an illustration, let’s take a circle with 11 positions, which corresponds to “mod 11” computation. (Imagine a grandfather clock with only 11 hours, instead of 12).

The square roots of 3 mod 11 are 5 and 6. This simply means that if you start at position 1 and go around the circle 25 steps, you will end in position 3. The same will happen if you go around the circle 36 steps.

Here are all the “square” numbers in a “mod 11” setup:

p = 11   
for (var i=0; i<p; i++) {
console.log(i+"x"+i, i*i % p)
}
0x0 0
1x1 1
2x2 4
3x3 9
4x4 5
5x5 3
6x6 3
7x7 5
8x8 9
9x9 4
10x10 1

Thefore, getting back to your “mod p” computation, we need a strategy to calculate y from y² which involves understanding cyclical behaviour. Luckily, some great minds already worked on this, and we get to use the results for free.

To easily calculate the square root of y², we use a theorem by the great mathematician Euler. This theorem, called Euler’s Criterion, states that raising a value to power (p-1)/2 equals either 1 or -1 mod p. Whenever we have in math a formula that can reduce some quantity to a unit, we try to split whatever quantity we have into parts on which the formula can be applied.

In this case, we use the fact that our chosen prime also satisfies that (p+1)/4 is an integer. Let’s see what happens when you raise y² to the power (p+1)/4. The resulting overall power becomes 2 (p+1)/4 which simplidies to (p+1)/2. Furthermore, that expression can be written in terms of (p-1) like this: 1 + (p-1)/2. Combined with Euler’s criterion, we get y¹, the square root that we were seeking.

So, the way to find that modular square root, is to simply raise the number to the power of (p+1)/4 in mod p arithmetic.


From what I have presented as Euler’s criterion, the above computation would give a second solution of -y. This is not entirely accurate, because the part I didn’t mention was that for values that have a square root mod p — such numbers are called quadratic residues in modular arithmetic and correspond to square numbers in regular non-modular arithmetic —the criterion formula guarantees to give the value of 1, and not -1.

On the other hand, we are solving a quadratic equation here, and there are indeed two solutions for factoring y², one positive and the other negative. In other words: -y * -y = y² and y * y = y². Therefore, given only the x-coordinate and a missing y-coordinate, we can’t determine which y-coordinate we are missing.

However, in modular arithmetic every number that is negative can be represented as an equivalent positive number. For example, -5 mod 11 is the same as 6 mod 11. (That is why, in earlier “mod 11” example we have observed that both 5 and 6 are square roots of 3 mod 11. Actually, it is 5 and -5 that are square roots of 3.) A negative number just means going backward around a circle; the same position can be reaching going forward a complement amount of steps.

This seeming complication of duality between positive and negative numbers is actually to our advantage. In computing, we would prefer to work only with positive integers.

To go from -y to it’s positive equivalent just requires to add p to it, so that it becomes (-y + p). Written in reverse it is the complement (p-y). What is interesting to notice is that one of the valid solutions must be even and the other odd. For example, if y is odd, then (p-y) must be even, because p is odd. Conversely, if y is even, then (p-y) is odd.

Thus the difference in sign translates to the difference in parity if we insist to have both solutions to the elliptic curve equation as positive numbers. The parity can be encoded as a boolean bit, and packaged together with the value of “x” which needs 32-bytes of space.

Since we are working with bytes, it will require an extra byte. The overall public key package with only x-coordinate has received a standardized name of a “compressed elliptic curve coordinate”. The exact format is that the first byte can store the value 2 or 3 and is followed by 32 bytes of the x-coordinate value.

Now that we understand the theory, let’s look at some code. A Javascript code sample to recover the y-coordinate given a “compressed” x-coordinate is given below (code from a Stack Overflow post):

const 
two = new bigInt(2),
prime = two.pow(256).sub( two.pow(224) )
.add( two.pow(192) ).add( two.pow(96) ).sub(1),
euler_exponent = prime.add(1).divide(4);
function decompress_coordinate( package ){
const
y_parity = package[0] - 2, // first byte must be 2 or 3
x_bytes = package.subarray(1),
x = new bigInt( x );

// y^2 = x^3 - 3x + b
var y = x.pow(3).sub( x.multiply(3) ).add( b )
.modPow( euler_exponent, prime );
  // If the parity doesn't match it's the *other* root
if( y.mod(2) !== y_parity ) {
y = prime.sub( y ); // y = prime - y
}
return {
x: x_bytes,
y: y.toUint8Array()
};
}

In summary, to communicate a public key, we must only send to another person a base64-url encoded string like this:

even,LdK0I_FQu8S3CKKzR6TuSj_BMiXyYyheMlAOvzgdbiw

It is still cryptic and is much longer than a credit card number, but we are closer to solving the public key communication problem. Where to go from here?

One of the approaches is to encode the coordinate as a list of words, instead of using a base64 URL encoding. The space will be longer, but it will not be cryptic.

Each byte of the 32-byte long coordinate value can have 256 possibilities, so the key can be communicated as 32 words, each out of a dictionary (or word list) with 256 words.

We can use a larger dictionary, which would require less words. A pair of bytes can have 65,536 possibilities, which would require a fairly large dictionary with words that are not commonly used. You can take a look at such dictionary word list, published by Electronic Frontier Foundation (EFF), which contains 66,666 words. Here are a few words from it:

angrily
angriness
anguished
angular
animal
animate
animating
animation
animator
anime
animosity
ankle
annex
annotate
announcer
annoying
annually
annuity
anointer
another
answering

You can see that the words “angriness”, “anguished” and “anointer” are not that good for our purposes. They are easy to misremember and the word “anointer” is not that common, easy to misspell.

In order to work with a smaller word list featuring only common words, we need to divide the key into smaller groups of bits than 16. In total we have 32*8 or 256 bits. If we divide the 256 bits into groups of 10 bits each, then we would need only 26 words, with each out of word list with 2¹⁰ or 1024 words.

We can still do better, by increasing the amount of bits until we reach a realistic size dictionary. With 13 bits per word we require a word list with 8192 words of which we would need to select 20 words.

The EFF website offers a dictionary word list with 6,666 words which is slightly too small. We could mix-in some words from the longer list, or we can look for an existing larger dictionary word list. The challenge for communicating a public key easily is, therefore, to create a word list that has common words.

This also means that the list must be tailored to the language spoken by users. No English words are common words to a person who does not speak English. For example, a Russian speaker should use only Russian words.

If two people are Russian native speakers and know English as a second language, then they can use a word list with an appropriate mix of Russian and English words.

Interestingly, dual language speakers have approximately twice as large a vocabulary, and therefore, they can use less words to communicate a public key. For example, a word list with 20,000 cross-language common words would allow to represent the public key with 19 words, instead of 20.

The company Peerio has an online passphrase generator that picks words randomly from a list of about 11,000 words. Here is a list of 20 random words that it generated: “beards film feat cutter swarm bouquet marker doorstep nabbed bites armour nibble scuba atheist wad stomp surfer ferret widely nab.”

Would you be able to communicate such a list on the phone, as easily as a credit card number?

Note that words can be used to encode a public key, only provided that both sides can agree on which large word list (dictionary) to use. (To convert words back to the key, simply replace each word with it’s index in the large word list).

In summary, we have seen that a public key can be the x-coordinate and a parity of y-coordinate, which collectively can be represented as words or as a Base64 URL encoded string. Can you think of another method? The book is not closed yet on this topic.