The Diffie-Hellman Key Exchange Protocol, simplified

… or: how to share a secret.

Shaaz Ahmed
The Software Firehose
7 min readMar 15, 2018

--

The Diffie-Hellman family of protocols is widely used to make insecure channels secure

The Diffie-Hellman key exchange has been receiving a lot more attention since its use for implementing end-to-end encryption on WhatsApp, using the Signal Protocol. One of the components of the Signal Protocol is a slightly more advanced version of the Diffie-Hellman concept we will discuss below, called the X3DH or Triple-Extended Diffie-Hellman. It will be significantly easier to understand X3DH in the next post once you’ve understood the basic underlying concepts as described below.

Many schemes of encryption of data between two parties is based on the idea that both parties know a shared secret key that no one else knows, and both of them can use this key to encrypt messages before sending them and decrypt messages from the other party. Such algorithms which use the same key for encryption of plain-text and the decryption of cipher-text (i.e. the encrypted text) are called symmetric-key algorithms in cryptography.

However, something about this puzzled me initially. How did these two parties get the shared key in the first place? Surely, they can’t just send it to each other over the network — the first time one of them sends it to the other, the message is not encrypted and hence anyone listening in to the network can intercept that message and get hold of the key. This looks like a chicken-and-egg problem — how can you use a symmetric-key encryption scheme in a network that’s not already encrypted? This is where the Diffie-Hellman exchange comes into play.

Before discussing the Diffie-Hellman exchange, I’ll summarize the approach this post takes. This post will be split into two parts — in the first part, we will discuss the basic idea behind the Diffie-Hellman key exchange in an extremely simple manner that is accessible to people without any previous knowledge of cryptography (e.g.: me when I started writing this). In the second part, we will discuss more practical examples, including the more sophisticated Triple-Extended Diffie-Hellman (X3DH) protocol.

A Simplified Explanation: Creating > Sharing

The Diffie-Hellman exchange is a key agreement protocol — a protocol that is specifically designed to help you create a secret key that both parties have, but don’t need to send across the network to share it. How is that possible? Thanks to a few neat mathematical tricks, both parties can generate a secret key using their own secrets without ever explicitly sending it to each other over the network.

Example

Let’s take a trivial example that outlines the idea behind the protocol, with many practical details omitted.

You want to send a message to your friend Alice. However, you don’t want anyone apart from Alice to know what the contents of the message are. You’re using the Caesar cipher, a simple encryption technique in which each letter in your address is replaced by a letter that is a fixed number of positions down in the alphabet. This fixed number of positions is the shared key — if Alice has this key, she can decrypt your message. (Note: the Caesar cipher is an outdated encryption scheme, and can be broken very easily — we only consider it as a simple example)

Instead of sending a key over to Alice so that you two have a shared key, follow the following steps.

Step 1: Pick two prime numbers and send them to Alice
First, you pick two prime numbers g and p, and send it to Alice. For example, g = 3, and p = 19.

Step 2: Alice computes A using her own secret key, and sends A to you
To create the first part of the key, Alice picks her own secret number a (say, a = 4) and does the following computation:

g^a mod p = 3⁴ mod 19= 81 mod 19= 5

Alice then sends this number A (= 5) across to you. (we chose to label it A because it’s derived from Alice’s secret a)

What is the mod operator? In the operation a mod b, the modulo operator finds the remainder after division of a by b. E.g. 10 mod 3 = 1.

Step 3: You compute B using your own secret key, and send B to Alice
Meanwhile, you pick your own secret number b (say, b =13), and do a similar computation:

g^b mod p = 3¹³ mod 19 = 1594323 mod 19 = 14

You then send this number B (= 14) across to Alice.

Now, you have the number 5 that Alice sent you, and Alice has the number 14 that you sent her.

Step 4: Alice computes the shared key

Alice now uses her secret a (= 4) and the number you sent her B (= 14)to compute the following:

14^a mod p = 14⁴ mod 19 = 38416 mod 19 = 17

Step 5: You compute the shared key

Similarly, you use your secret b (=13) and the number Alice sent you A (= 5) to compute:

5^b mod p = 5¹³ mod 19 = 1220703125 mod 19 = 17

Wow, both of you now have the same key — 17! And none of you had to send it to each other explicitly. Now, both of you can shift all the letters in your message by 17 letters (since we’re using the Caesar cipher) and encrypt your communication with each other, and shift it back by 17 letters to decipher it.

Now, there are probably two burning questions in your head right now:

i) How did this happen? Will both numbers always be the same?
ii) Why couldn’t someone intercept the numbers that we sent across the network (g, p, A and B) and simply use them to obtain the key?

Modular Exponentiation: how did this happen?

Let’s tackle the first question: how did this happen? The mod (modulo) operation has a nifty little property when it comes to exponentiation:

where a and b are the ‘secrets’ we used earlier. What does this imply? This implies that:

a and b are individual secrets, but the final result is a shared secret

Hence, you don’t need to know what a is and Alice doesn’t need to know what b is, but both of you know the value of the number g^(a*b) mod p.

The Discrete Logarithm: why is this difficult to break?

Now, let’s get to the more important question: why can’t someone simply intercept the numbers we send across the network and use it compute the final key? The difficulty in reverse engineering the shared secret key lies in the difficulty to compute the discrete logarithm. What is the discrete logarithm?

The regular logarithm in mathematics is the inverse operation of exponentiation — if 3² = 9, then the log of 9 to the base 3, is 2. The discrete logarithm is very close to this logarithm. When we compute 4² mod 3 = 1, how do we find the value 1 (the shared ‘secret’), if we only knew the values of 4 and 3 (which are the only two numbers we can get from snooping on the communication)? I.e., how do we find y in this equation:

Remember, the modulo operator is the remainder after division of 4^x by 3. So, if n was the quotient of that division,

Now, an attacker may find out y by listening in to our communications (remember, Alice sends A to you and you send B to Alice — these are equivalent to y). Even if we assumed that n = 1, there’s a range of possibilities for the values of x and y that will satisfy this equation. However, we require x to be an integer — i.e. a discrete number. The logarithm which has a discrete value (an integer, and not a decimal number) is called a discrete logarithm, and this makes the problem a lot harder, except in some specific cases. If we carefully avoid the special cases where it is easy to compute the discrete logarithm, there is no general and efficient way to compute them and makes them suited for use in cryptography.

Additional notes

The above explained scheme has many weaknesses, some of which we will discuss while introducing the X3DH protocol in Part 2. Note that, X3DH and more modern implementations don’t use the modulo operator, but operations that are even more difficult to compute the inverse of — such as the elliptic curves Curve25519 and Curve448. Also note that, although we’ve used g and p as two prime numbers, they are usually prime generators, or functions that can generate prime numbers.

Bibliography

[1] The X3DH Key Agreement Protocol, Signal
[2] “Diffie Hellman Key Exchange” in plain English, Stackexchange
[3] MathJax Display Engine
[4] The Double Ratchet Algorithm, Signal

--

--

Shaaz Ahmed
The Software Firehose

Every reader should ask himself periodically “Toward what end, toward what end?” — but do not ask it too often lest you pass up the fun of programming. — Perlis