# AES: How the Most Advanced Encryption Actually Works

Governments, militaries, banks and corporations rely on it. It’s responsible for securing most, if not all of your personal and financial data. There are special CPU instructions for it. It is the only cryptographic algorithm approved by the NSA.

We’re of course talking about the Advanced Encryption Standard (AES), the most commonplace and secure symmetric encryption algorithm yet developed.

This will be the longest article we write for the time being 😊. We’re going to take a deep dive into how this algorithm works. Modern cryptographic algorithms are by no means trivial; they use concepts from advanced mathematics to achieve a high level of security. But we don’t expect you to be a mathematician.

While there are many articles about AES, most of them are either too technical or leave out important information. With this article we hope to strike the perfect balance between comprehensiveness and accessibility. However, we must admit that this article will require some level of comfort with math, as mathematical concepts are introduced here.

It must be stated that you should not try to implement AES by yourself in a production application, or risk a side channel attack. Use the many free cryptographic libraries that offer tested implementations of whatever algorithms you need.

# The Basics

First, a brief refresher on what AES is.

AES is not the name of the algorithm itself, but a title awarded by the National Institute of Standards and Technology (NIST) to the algorithm they deemed to be the de facto standard for encryption. The actual name of the algorithm is *Rijndael*, and it was selected by the NIST over a number of algorithms to replace the former standard, known as DES (Data Encryption Standard). Rijndael was approved by the NIST in 2001 and adopted by the US government in 2002. It remains the standard cipher used by the US government and institutions across the world.

AES is a *symmetric cipher*, which means that a single key is used to encrypt *and* decrypt the same data. AES can be performed with the following key sizes: **128 bits, 196 bits and 256 bits**. Generally, increasing the key size also increases the level of security. Rijndael works for any key size that is a multiple of 32 bits, but the NIST chose specific sizes (and other parameters) that balance performance and security.

It is also a *block cipher*, meaning that the data is divided into blocks before encryption. AES divides plaintext into blocks of 16 bytes (128 bits).

# Algorithm Overview

The gist of AES is this: we arrange each block of the plaintext into a 4x4 matrix and repeatedly perform a set of operations on it. We call each iteration a *round*, and we perform **10, 12 or 14 rounds** depending on the key length (this is another parameter chosen by NIST):

- 10 rounds for a 128-bit key
- 12 rounds for a 196-bit key
- 14 rounds for a 256 bit key

For each round, we generate a *round key *from the main key using the *Rijndael Key Schedule*.

There are four operations on the 4x4 matrix that we will define:

- subBytes()
- shiftRows()
- mixColumns()
- addRoundKey()

Not every round of operations is the same; for the first round, we only add the round key, and for the last round we omit the mixColumns() step. So the pseudocode for the AES algorithm might look something like this:

function AESencrypt(plaintext, key) {

blocks := divideIntoBlocks(plaintext);

roundKeys = getRoundKeys(key) for (block in blocks) { //first round

addRoundKey(roundKeys[0], block); //intermediate rounds

for (8, 10 or 12 rounds) {

subBytes(block);

shiftRows(block);

mixColumns(block);

addRoundKey(roundKeys[..], block);

} //last round

subBytes(block);

shiftRows(block);

addRoundKey(roundKeys[numRounds - 1], block); } ciphertext := reassemble(blocks);

return ciphertext;}

# Mathematical Background

This section may appear rather esoteric. We told you we would explain AES in detail and we weren’t lying. Luckily, we believe that the concepts in this section can be understood with only high-school level math and basic programming experience. Don’t let the fancy words throw you off!

Let’s introduce a concept in abstract algebra called a *Galois field* or *finite field*. A *field* is a set (meaning a collection of objects) with operations that act on the set and behave similarly to addition, subtraction, multiplication and division. In other words, the operations satisfy a number of properties that also hold true for addition/subtraction/multiplication/division over the rational numbers. In fact, the rational and real numbers with these four operations qualify as a field. The *finite* qualifier just means that the set has a finite number of elements.

We won’t go into the specific definition of a field. Just think of it as a set of numbers where addition, subtraction, multiplication and division are redefined.

AES uses a specific Galois field, which we will call *Rijndael’s finite field*, to perform many essential operations. In particular, it uses GF(2⁸) with irreducible polynomial *x*⁸ +* x*⁴ + *x*³ + *x* + 1. You’ll understand what this means in a bit.

The Galois field GF(*pⁿ*), where *p* is prime and *n* is a positive integer, denotes the field with *pⁿ *elements*. *For example, the field GF(8) (or GF(2³)) contains all integers from 0 to 7. An important property of Galois fields is that the elements of the field GF(*pⁿ*) are **all polynomials of degree less than n with non-negative coefficients, evaluated at p***. p* is called the *characteristic* of the field.

Let’s look at GF(8), or GF(2³), again. GF(2³) contains {0, 1, 2, 3, 4, 5, 6, 7}, which can be equivalently represented as {0, 1, 2, 2 + 1, 2², 2²+ 1, 2²+ 2, 2²+ 2 + 1}, or {0, 1, *x*, *x*+ 1, *x*², *x*² + 1, *x*² + *x*, *x*² + *x*+ 1} where *x* = 2.

With this notation out of the way, we can now explain how addition/subtraction/multiplication/division works with Galois fields.

**Addition/Subtraction**

Suppose we want to add elements *a* and *b* of GF(*pⁿ*). First, convert them to their polynomial forms, that is, write them as sums of powers of *p*. Then we add the polynomials together, but with a caveat: for each coefficient *a_k* in *a(p)*, *b_k* in *b(p)*, the resulting coefficient *c_k *is equal to *a_k* + *b_k* mod *p*. For subtraction, the formula is *c_k* = *a_k* — *b_k* mod *p*.

We apologize for Medium’s lack of support for mathematical expressions. Here are the formulas again in LaTeX:

“mod” of course stands for modulo, which in most programming languages is written as “%”. Let’s look at an example:

74 + 26 in GF(3⁴) = (2 * 3³ + 2 * 3² + 2 * 1) + (2 * 3² + 2 * 3 + 2 * 1)

= (2 * 3³ + 4 * 3² + 2 * 3 + 4 * 1) = (2 * 3³ + 3² + 6 + 1) = **70**

Notice that 4 * 3² became 3² and 4 * 1 became 1 because of the modulo operation on the coefficients.

The Rijndael field GF(2⁸), as well as all fields with characteristic 2, share a property that makes them well suited for computers: **addition and subtraction are equivalent to the bitwise exclusive or (XOR/⊕) operation.** This works because **each term in the polynomial represents a bit.** Here is another example to demonstrate this fact:

15 + 12 in GF(2⁴) = (2³ + 2² + 2 + 1) + (2³ + 2²)

= (2 * 2³ + 2 * 2² + 2 + 1) = (2 + 1) = **3**

15 ⊕ 12 = 0b1111 ⊕ 0b1100 = 0b0011 = **3**

Addition in Galois fields is often referred to as “carryless addition.” We add each set of digits independently modulo *p*. Note that in the example above, the first set of digits adds to 2 mod 2 = 0, while the second set adds to 7 mod 2 = 3.

**Multiplication**

In order to multiply two polynomials *a(p)* and *b(p)* in GF(*pⁿ*), we need to chose a third polynomial *m(p) *that *cannot be factored* and has a degree of at least *n*. We call m(p) the irreducible polynomial.

To multiply *a(p)* and *b(p) in *GF(*pⁿ*) we perform the following steps:

- Multiply
*a(p)*and*b(p)*like normal - Reduce the coefficients of the resulting polynomial modulo p.
- Reduce the entire polynomial mod
*m(p).*This is so our final answer stays less than*pⁿ.*

The final step can be performed using polynomial long division. **However, this is no ordinary division . Every arithmetic operation we perform during the process must be the finite field’s version.** In the case of GF(2⁸), every time we perform “subtraction”, we actually perform the XOR operation. Representing the polynomials as binary strings will be helpful for this step.

Let’s perform 193 * 56 in GF(2⁸) and with *m(p)* = *x*⁸ +* x*⁴ + *x*³ + *x* + 1 (Rijndael’s field):

If multiplication in finite fields seems complicated, don’t worry! With Rijndael’s field, the multiplication algorithm can be greatly optimized beyond what is described above. Below is a pseudocode implementation:

function gmul(a : byte, b : byte) { p : byte = 0x00;

for (8 rounds) { //if low bit of b is set

if ((b & 1) != 0) {

p = p ⊕ a;

}

//true if the high bit of a is set

h : bool = (a & 0x80) != 0; a = a << 1; //shift left

if (h) {

a = a ⊕ 0x1B; //value of the polynomial m(p) (0x11B) with the high bit removed

} b = b >> 1; //right shift return p; }

}

## The Multiplicative Inverse/Division

The multiplicative inverse of a polynomial a(p) is the polynomial b(p) such that a(p) * b(p) mod m(p) = 1. Multiplicative inverses can be found by applying the inverse of the algorithm above. Division is simply a matter of multiplying the first operand by the multiplicative inverse of the second.

**What’s the point?**

Why does AES borrow concepts from finite field theory. The main reason is performance. Remember that “addition” in GF(2⁸) is the same as the XOR operation. Also consider that there is no need to worry about overflow/underflow, because the inputs and outputs of the operations are restricted to the numbers 0–255.

Multiplication is less complicated than it looks. To put things in perspective, consider how computers perform normal multiplication at the lowest level: through repeated bit shifts and additions (of course, we don’t have special circuitry to perform GF(2⁸) arithmetic). Finally, multiplication by 2 and 3 is very easy to optimize. This will become important later.

//multiply by 2 in GF(2^8)

function gmul2(a : byte) { h : byte = a & 0x80; //high bit b : byte = a << 1; if (h == 0x80) b = b ⊕ 0x1b; return b; }//multiply by 3 in GF(2^8)

function gmul3(a : byte) { return a ⊕ gmul2(a);}

# Key Expansion

We’re done with background! Now it’s time to look into the nitty-gritty of what AES actually does.

Before any encryption takes place, separate 128-bit keys must be generated for each round. Rijndael uses a specific algorithm to generate round keys.

The Rijndael key schedule performs a number of operations:

**rotate():** rotates a 32-bit (4 byte) word 8 bits to the left. As an example: rotate(0xab157c9e) = 0x157c9eab.

**rcon():** exponentiates (repeatedly multiplies) 2 to a specified value in Rijndael’s finite field. This operation can be described by the pseudocode below. However, we will only need to use a maximum of 11 rcon() values, so a lookup table is sufficient.

function rcon(value : unsigned int) { c : byte = 1;

if (value == 0) return 0;

while (value != 1) {

c = gmul(c, 2);

value--;

} return c;}

We have not yet explained how exponentiation works for finite fields. In short, it works very similarly to normal mathematics. We encourage curious readers to do their own research if they want to learn more.

**sbox():** another operation in Rijndael’s field, usually implemented with the following lookup table:

Mathematically, sbox() is a two part operation. First, we take the multiplicative inverse of the input. The result is represented as an eight-element vector (one for each bit) and undergoes the following transformation:

The entire process is performed with Rijndael’s finite field arithmetic. So at the end, for example, the byte 0b11000110 is not added, but **XOR’ed** with the matrix-vector product. This transformation is more succinctly represented as a series of XORs and left bitwise rotations like so:

The sbox() operation heavily contributes to the security of AES as a whole. It is resistant to **linear cryptoanalysis** in that it is hard to approximate with a linear transformation. It is also resistant to **differential cryptoanalysis** in that there seems to be no correlation between how the input changes and how the output changes. Finally, sbox() is special in that there are no **fixed points**, i.e. there are no situations where the input equals the output.

The key expansion algorithm works mainly on 32-bit words instead of bytes. For the future, we will define **subWord() **as the application of sbox() on each byte of a word.

## The Key Expansion Algorithm

Onto the actual key-generation algorithm. Let’s define some constants before we take a look at the steps:

- Let K[0] thru K[N-1] represent the 32-bit words of the original key.
- Let N equal the length of the original key in 32-bit words
**(4, 6, or 8 for a 128, 192 or 256-bit key respectively).** - Let R equal the number of rounds
**(10, 12, or 15 for a 128, 192 or 256-bit key respectively).** - Let W[0] thru W[4R-1] represent the 32-bit words comprising all of the round keys. Let’s call this the expanded key.

The key expansion algorithm iterates through all of the 32-bit words W[0] thru W[4R-1]. When we want a specific round key, we combine the four words (128 bits) from of the expanded key corresponding to that round. For example, the Round 3 Key comprises of W[8] thru W[11].

The algorithm goes like this:

for (i from 0 to 4R - 1) { //The first words (W[0] thru W[N-1]) are equal to the words of the original key.

//For a 128-bit key, this means that the first round key is equal to the original key.

if (0 < i < N) W[i] = K[i]; //Perform operations on the last word of each N-length cycle before XOR-ing.

else if (i >= N and i == 0 mod N)

W[i] = W[i-N]⊕subWord(rotate(W[i-1]))⊕rcon(i/N); //For a 256-bit key length only.

else if (i >= N and N == 8 and i == 4 mod N)

W[i] = W[i-N]⊕subWord(W[i-1]);//Typical case

else W[i] = W[i-N]⊕W[i-1];}

The gist of the algorithm is this: generally, each word is the previous word XOR’ed with the word N places behind it. But every N words, we perform various GF(2⁸) operations on the word N places behind before the XOR-ing takes place.

# An AES Round, Step-by-Step

In this section we will take you through an entire AES round. Remember that the first round only contains the addRoundKey() step. We will be going over every step, so imagine that the first round has already passed and that we are now on the second round.

Let’s say we want to encrypt the following message:

The quick brown fox jumped over the lazy dog

We will look only how the first block is encrypted, containing “The quick brown ”. Rearranged into a 4x4 matrix, the block looks like this:

And here it is again in hexadecimal:

## subBytes()

This is the first step in the AES round. We perform the sbox() operation on each byte in the matrix (see the Key Expansion section for details).

## shiftRows()

For this step, we rotate each row to the left a number of spaces corresponding to the row number. The first row is shifted zero spaces, the second is shifted one space, and so on.

## mixColumns()

Now we multiply each column by the following matrix using GF(2⁸) arithmetic (gmul() and XOR instead of normal addition and multiplication):

The entire transformation looks like this:

Notice that this transformation involves many multiplications by 2 and by 3. This is where the optimizations mentioned at the end of the “Mathematical Background” section come into play.

In our example applying the transformation to each column looks like this:

Both the shiftRows() and mixColumns() steps add **diffusion** to the cipher in that they allow small changes in the plaintext to affect the entirety of the ciphertext.

## addRoundKey()

This is the easiest step. We XOR each byte in the block with its respective byte in the round key. Let’s assume that the round key for this round is:

abcdefghijklmnop

In hexadecimal this is:

61 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70

Keep in mind that this is a very unrealistic scenario. The round key should be pseudorandom, as well as the cipher key.

The result of addRoundKey() in our example looks like this:

Converted back to ASCII the result of a single AES round looks like this:

1SFÒ½Ë%Èm¸P«¯

As you can see, the original message is already ridiculously scrambled. Magnify this by as much as fourteen times and make sure the keys are random, and you have a level of security that not even the best computers can crack.

# Decryption

So we have our encrypted message. How to we reverse the long and complicated process that is AES? One of the great things about *AES is *that* every action has its inverse.*

Decryption works like this: we generate the round keys using the same process. Then we perform the encryption algorithm in *reverse* using the inverses of the various operations.

The addKeys() step is it’s own inverse. XOR-ing string *A* with string *B* twice simply results in *A*. We just have to remember to use the final encryption round key as the first decryption round key, the second-to-last as the second, and so on.

For the mixColums() step, we apply the *inverse* of the matrix described above to all of the columns. The transformation looks like this:

For the shiftRows() step, we simply rotate the rows in the opposite direction. Or alternatively, rotate each row in the same direction a different number of spaces. If a byte is in row two, it will be rotated 1 space to the left during encryption, and 1 space to the right/3 spaces to the left during decryption.

Finally, for the subBytes() step, we apply the *inverse* of the sbox() operation. Remember that we usually use a lookup table for sbox(). We can also use a lookup table for the inverse, which looks like this:

One last thing: we need to remember that not all of the rounds are the same. Because we perform just addRoundKey() during the first round of encryption, we do the same for the final round of decryption. Because we omit mixColumns() from the final round of encryption, we must do so for the first round of decryption.

The pseudocode for the entire decryption process looks like this:

function AESdecrypt(ciphertext, key) {

blocks := divideIntoBlocks(ciphertext);

roundKeys = getRoundKeys(key) for (block in blocks) { //first round

addRoundKey(roundKeys[numRounds - 1], block);

shiftRowsInv(block);

subBytesInv(block); //intermediate rounds

for (8, 10 or 12 rounds) {

addRoundKey(roundKeys[..], block);

mixColumnsInv(block);

shiftRowsInv(block);

subBytesInv(block);

} //last round

addRoundKey(roundKeys[0], block);} plaintext := reassemble(blocks);

return plaintext;}

# SafeHouse

We’re done! We’ve explained basically everything. With this knowledge you should be able to implement AES yourself, although we highly discourage it.

We at SafeHouse believe that cybersecurity is for everyone, so we’re glad to present info like this in an accessible manner. The cybersec industry is ignoring small businesses and we want to take a stand.

If you found what you read informative, consider checking us out at https://safehouse.dev/.