A Quick Look into Diffie-Hellman Key Exchange

Edwin Tunggawan
Cermati Group Tech Blog
10 min readMay 17, 2018

As an engineer at Cermati, I have a lot of freedom to explore different technical topics — and share it with my coworkers on casual office conversations. At Cermati, Diffie-Hellman implementations can be found in our infrastructure — OpenVPN and Nginx have implementations of Diffie-Hellman key exchange protocol inside.

I have been spending my time reading some mathematics articles and books recently, and I got interested to make my own version of an article explaining how Diffie-Hellman key exchange works — which is why I created this article.

Diffie-Hellman key exchange was introduced by Whitfield Diffie and Martin Hellman to solve the problem of securely determining a shared key between two communicating parties over an insecure network.

Martin Hellman (left) and Whitfield Diffie (right), image owned by Stanford News Service.

The algorithm utilizes some basic mathematical concepts such as prime numbers and modulo operations, which we’ll explain briefly.

Prime Numbers

Prime numbers is one of the basic concepts of mathematics we should have been familiar with. They, by definition, are natural numbers larger than 1 that can only be a product of 1 and its own value.

n is a prime number if and only if there’s no natural numbers i and j where
1 < i < n and 1 < j < n that fulfills the condition where i × j = n.

The set of prime numbers contains 2, 3, 5, 7, 11, 13, and so on.

2 is a prime number because there’s no natural numbers between 1 and 2, so there’s no value i and j between 1 and 2 that can fulfill the condition where
i × j = 2.

3 is a prime number. The only natural number between 1 and 3 is 2, therefore the only possible value for both i and j is 2. Given i = 2, j = 2, and n = 3, it doesn’t fulfill the condition i × j = n since 2 × 2 ≠ 3.

4 is not a prime number, as it has natural numbers i and j where 1 < i < n and 1 < j < n that fulfills the condition where i × j = n. Between 1 and n given
n = 4 we have natural numbers 2 and 3, and as for i = 2 and j = 2 we have the condition i × j = n fulfilled.

1 is not a prime number, as it doesn’t fulfill the requirements to be considered as one. Given n = 1, it’s impossible for i and j to be greater than 1 but less than n. 1 < i < n and 1 < j < n where n = 1 would imply that somehow 1 < i < 1 and 1 < j < 1 is true, which we can translate to 1 < 1 is true. It is a contradiction in the statement, hence the statement is false.

Therefore, the number n is a prime number if and only if n > 1 and n can be factored with number(s) other than 1 and n. If n > 1 and n can be factored with number(s) other than 1 and n, it’s a composite number.

The properties of prime and composite numbers are taken into considerations on the Diffie-Hellman key exchange to choose the value for the variables used in the calculations.

Modulo Operation

Modulo operation is an operation where the result is the remainder of the division operation performed with two given integers as operands. To make it easier, here are a few examples. Please note we’re using only integers in the division, and the remainders aren’t added to the results.

13 / 5 = 2
13 mod 5 = 3

5 / 2 = 2
5 mod 2 = 1

3 / 6 = 0
3 mod 6 = 3

9 / 6 = 1
9 mod 6 = 3

From the examples above, we can see the modulo operations’ results equal to the remainder of the division operations performed with the same operands.

An example of the modulo operation we perform every day is clock reading.

A clock has the numbers 1 to 12 positioned in a circular manner, and we can tell to which number’s direction the shorthand on the clock should be pointing given the hour of the day.

An illustration on the positioning of numbers on a clock’s surface.

If it’s 13:00, the shorthand should be pointing to the direction of number 13 mod 12 = 1 at the clock. If it’s 19:00, the shorthand should be pointing to the direction of number 19 mod 12 = 7 on the clock.

If it’s 12:00 or 24:00 (or 00:00), the shorthand should be pointing to the direction of number 12, as 0 mod 12 = 12 mod 12 = 24 mod 12 = 0.

Cryptographic Algorithm

It’s easy for people owning the network devices our data passes through to intercept our communication and tamper with the data. Cryptographic algorithms are designed to prevent people from spying on us and to keep the integrity of our communication system. Yet, cryptographic algorithms would require a piece of information shared between the communicating parties for them to be able to understand each other’s encrypted messages.

The shared information — which we will refer as cryptographic key or just key — should be decided by the communicating parties. The key is required for encryption and decryption process on the message to result in the correct ciphertext or plaintext.

An example of cryptographic algorithm is the Caesar’s cipher, which was supposed to be used by the ancient Roman army under Julius Caesar. It’s a simple algorithm that builds a circular structure using characters in the alphabetic system.

A B C D E F G H I J … U V W X Y Z

The next character from A is B, the next character from B is C, and so on all the way until Z. The next character of Z is back to A, and the cycle continues.

Caesar’s cipher works by shifting the characters in a message to the left or right by n during the encryption process, and shifting it back by n to the opposite side during the decryption process.

For example, we have the plaintext message “MEET ME AT LUNCHTIME” with key n = 8 that we’re going to encrypt using Caesar’s cipher.

MEET ME AT LUNCHTIME
n = 8
M -> U
E -> M
E -> M
T -> B
M -> U
E -> M
A -> I
T -> B
L -> T
U -> C
N -> V
C -> K
H -> P
T -> B
I -> Q
M -> U
E -> M
Plaintext : MEET ME AT LUNCHTIME
Ciphertext : UMMB UM IB TCVKPBQUM

To decrypt the message, we need to reverse the character shifting process by using the key n = 8.

By the way, did you notice that by using Caesar’s cipher with 26 letters of the alphabet we’re practically performing modulo operations with 26 points in the cycle instead of 12 as in the clock reading case?

An illustration on the positioning of 26 letters in alphabet on a circular disk’s surface.

Since the letters A to Z can be represented with numbers 1 to 26, we can formulate the character shifting in modulo operation as follows.

c = (t + n) mod 26

t would be the number that represents the plaintext character, while n is the key. c is the ciphertext character as the result of the calculation performed.

Caesar’s cipher is pretty simple to break by using brute force, so we’re only using it as an example to demonstrate how things work in the big picture.

Diffie-Hellman Key Exchange

As explained, a cryptographic algorithm requires a shared key between the sender and the receiver. The sender uses the key to encrypt the message, while the receiver needs the key to decrypt it. Without the shared key, they wouldn’t be able to understand each other’s messages.

Diffie-Hellman key exchange can be used together to securely exchange keys over an insecure network. It works using the concepts of prime numbers and modulo operations.

Let’s say Alice and Bob, the two communicating parties, are using Diffie-Hellman key exchange. Alice and Bob publicly decided values g and p to perform modulo operation. p must be a prime number decided by the two communicating parties, while g can be any number agreed by both parties.

The formula of Diffie-Hellman key exchange.

p must be a prime number to minimize the possibility of the dividend and the divisor having the same common factor, which reduces the number of possible resulting values from the modulo operation. If p is a composite number and has common factors with the dividend, the possible resulting value set size from the modulo operation wouldn’t be less than p. It means less possible values to try using brute force attacks on the algorithm

The following rules regarding exponentiation and modulo operations are used during the key exchange steps we’re going to perform.

A rule on exponentiation.
A rule on modulo operation with exponent dividend.

In our example, we’re going to use a small prime number value to make it easier for explaining the process. To be secure in real life cases, the key exchange should use a really large prime number value.

p = 13
g = 22

Both of them, without telling each other what their secrets are, decided their own secrets. a and b are the secrets for Alice and Bob respectively.

a = 23
b = 31

Alice then performs the following calculation and send the result to Bob.

i = 22²³ mod 13 = 3

While Bob performs the following calculation and send the result to Alice.

j = 22³¹ mod 13 = 9

Using Bob’s j, Alice performs the following calculation.

n = 9²³ mod 13 = 3

Using Alice’s i, Bob performs the following calculation.

n = 3³¹ mod 13 = 3

They now both have the shared key n = 3, which they can use as the key for the encryption algorithm they agreed to use beforehand. Someone who’s eavesdropping their communication will be able to get the numbers g = 22, p = 13, i = 3, and j = 9. But if the eavesdropper doesn’t know the values of a or b, it is hard to guess the value of n. The larger the value of p, the larger the range of numbers to be brute-forced by the eavesdropper to be able to decrypt the message they intercepted.

A Complete Flow

Let’s take a look at the complete flow of the communication between Alice and Bob. Alice wants to send Bob the message “HELLO” over insecure network, so Alice and Bob would perform the key exchange to get a shared key. We’ll go with g = 22, p = 13, a = 23, and b = 31 as we use in the example before. From there, we get n = 3.

Alice proceed to encrypt the plaintext “HELLO” using Caesar’s cipher with n = 3 into the ciphertext “KHOOR” and send it to Bob. Bob, upon receiving the message, understands how the ciphertext was supposed to be decrypted. He reversed the Caesar’s cipher encryption process using n = 3, and transformed the ciphertext “KHOOR” back to the plaintext “HELLO” before reading it.

Bob knows how to decrypt the message because he knows the key. But an eavesdropper tapping their communication would need to guess the encryption algorithm and the key used to decrypt and understand the message.

Depending on the protocol agreed by both Alice and Bob, the Diffie-Hellman key exchange can be performed multiple times during the communication so it’s possible that for every message sent a different key exists to decrypt it. This, while computationally more expensive than using the same key for the whole communication, forces the eavesdropper to perform brute-force attacks on every single messages sent during the communication flow since the key from the earlier message doesn’t apply to the current one.

Conclusion

We’ve taken a quick look into the basic concepts regarding the Diffie-Hellman key exchange and how it’s all working together to form the security protocol. This article is intended to be introductory in nature, so a lot of parts regarding the mathematical and computational complexity is not present here.

To better understand the key exchange protocol, we can look deeper into the mathematics of exponentiation and modulo operation rules to see the computational complexity of the operations performed.

--

--

Edwin Tunggawan
Cermati Group Tech Blog

If I’m not writing code, I might be reading some random stuff.