HTTP Desync Attacks with Python and AWS
Tales of a preoccupied developer
A couple of months ago, I was at work, waiting patiently for some documentation to go live about a new type of attack against modern web applications called HTTP Desync attacks. And there it was! I remember thinking it would be as huge as Heartbleed (in terms of media coverage), but it turns out I was wrong. We barely heard anything from vendors and the community until October. A few days ago, the recorded talk from DEF CON was released, and it grabbed my interest again. Enough so that I wondered whether my own stack was affected.
Let’s back up a bit and first explain what these attacks are. For that, we need a bit of HTTP/1.1 history. One problem with the original HTTP specification was that you needed to open a new TCP connection for each request made to the server. As we know, this can become expensive if the client requests a lot of resources. With HTTP/1.1, you can send multiple requests over the same TCP connection. This becomes potentially problematic when you introduce a proxy between the server and the clients. Consider the following simplified architecture:
Here, the connection between the ALB and the EC2 server will potentially be reused for different clients. That means that if a malicious client can leave some data in the receive buffer of the server, the next client will have their request modified. It turns out this can be done by exploiting a combination of two headers: Content-Length and Transfer-Encoding. In a regular application, you will typically either specify a fixed Content-Length when you know the size of the payload, or a chunked Transfer-Encoding when you don’t (and you want to stream it piece by piece). But what if you send both?
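To make the ambiguity concrete, here is a hypothetical request (not taken from any real exploit) that frames its body two different ways, along with what each kind of parser would do with it:

```python
# A hypothetical request whose body is framed two different ways.
raw = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 9\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"0\r\n"
    b"\r\n"
    b"GET "
)

head, body = raw.split(b"\r\n\r\n", 1)

# A proxy honouring Content-Length forwards all 9 body bytes...
assert len(body) == 9
# ...but a server honouring Transfer-Encoding stops at the empty "0"
# chunk, leaving b"GET " in its receive buffer, where it gets prepended
# to the next request on the same connection.
leftover = body.split(b"0\r\n\r\n", 1)[1]
print(leftover)  # b'GET '
```

The leftover bytes are the whole trick: they become the prefix of whatever request the next client sends over the reused connection.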
The proxy might decide to use the Transfer-Encoding and the server, the Content-Length, or vice versa. In any case, this causes a desynchronization between the proxy and the server, hence the name of the attack. I won’t go into more detail here, but you should definitely read the original paper and the follow-up blog post by the original author for the full picture. I also highly suggest reading the HAProxy HTTP request smuggling blog post.
What about Python?
If you are reading this blog post, you probably already know that most Python code in production runs under the umbrella of a WSGI server. Those include uWSGI, Apache (with mod_wsgi), Waitress and gunicorn. My stack uses gunicorn, so the rest of the blog post will target this server, but you should verify whether your server of choice is also vulnerable.
If you are reading this and you are not on gunicorn ≥ 19.10 or ≥ 20.0.2, please stop reading and go upgrade!
Going back to my story: when I watched the DEF CON talk, I immediately checked gunicorn’s GitHub to see if someone had opened an issue for this attack and… I found one 🤩. I got excited and started testing it for myself. I created a basic application and an ECS-based infrastructure to conduct my tests. It has only one endpoint, which returns the headers of the request in the body.
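The test endpoint looked roughly like this (a minimal sketch as a plain WSGI callable; my real application and its names may differ):

```python
def app(environ, start_response):
    """Echo the request headers back in the response body."""
    # WSGI exposes request headers as HTTP_* keys in the environ dict.
    headers = {
        key[5:].replace("_", "-").title(): value
        for key, value in environ.items()
        if key.startswith("HTTP_")
    }
    body = "\n".join(
        f"{name}: {value}" for name, value in sorted(headers.items())
    ).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Run under gunicorn (e.g. `gunicorn module:app`), this makes any smuggled header or mangled request line immediately visible in the response.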
From there, the first step was to use the excellent Burp plugin HTTP Request Smuggler, combined with the Flow plugin, to test the application. This gave us a first baseline of potential vulnerabilities.
What you want to hunt for in this step are mostly 502, 404 and 405 responses; luckily, we got one. The other status codes are not very useful, and here is why:
- 200: Even if you sent multiple requests, they were all OK, so the proxy and the server were in sync
- 400: The ALB blocked the request as invalid
- 501: The ALB blocked a bad Transfer-Encoding as not implemented
The moral here is that the ALB is your ally: it blocks a LOT of stuff, and it would have prevented even more attacks if I had simply enabled the setting routing.http.drop_invalid_header_fields.enabled. Some (not too security-savvy) people complained on Twitter when AWS rolled out this fix, and unfortunately AWS decided to make it disabled by default. So make sure it’s enabled in your infrastructure.
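If you use the AWS CLI, the setting can be enabled with something like the following (the `$ALB_ARN` variable is a placeholder for your load balancer’s ARN):

```shell
# Make the ALB drop header fields with invalid (smuggling-friendly) names.
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn "$ALB_ARN" \
  --attributes Key=routing.http.drop_invalid_header_fields.enabled,Value=true
```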
Now that we have an attack surface, let’s try to dig a bit deeper. The attack we found is the following:
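A reconstructed sketch of the probe (the original post showed a screenshot, so the exact bytes here are my approximation):

```python
# Note the space before the colon in "Transfer-Encoding : chunked".
probe = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 6\r\n"
    b"Transfer-Encoding : chunked\r\n"
    b"\r\n"
    b"0\r\n"
    b"\r\n"
    b"X"
)

# The ALB does not recognise "Transfer-Encoding " (with the space) as a
# Transfer-Encoding header, so it frames the request with Content-Length
# and forwards all 6 body bytes. gunicorn used to strip the space, honour
# the chunked encoding, stop at the empty chunk, and keep the final b"X"
# in its buffer, ready to be glued onto the next request.
body = probe.split(b"\r\n\r\n", 1)[1]
assert len(body) == 6          # what the ALB forwards
assert body.endswith(b"X")     # what gunicorn leaves behind
```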
Notice the space between the end of the TE (Transfer-Encoding) header name and the colon. The 405 means that the second time we sent the payload, the final X of the previous request was prepended to the next request. This resulted in the method XPOST, which does not exist, hence the 405. But can we do better?
In this attack, we completely override the next request. Even though the client did a POST, from the server’s point of view they did a GET on /404. Ouch! This is bad 😞. In both cases, we exploited a CL-TE attack, because the proxy used the Content-Length while the server used the Transfer-Encoding. This worked because, even though RFC 7230 states that no whitespace is allowed between the header field name and the colon, it used to be common for applications to normalize header names.
The fix was simply to remove the rstrip at line 6 (we actually added a setting, so if you really need this behaviour you can still operate, but you will be vulnerable). This is also the fix Go decided to use.
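The idea of the fix can be sketched like this (simplified hypothetical parsing code, not gunicorn’s actual implementation):

```python
# Old behaviour: the rstrip() turns "Transfer-Encoding " into
# "Transfer-Encoding", honouring a header that the proxy ignored.
def parse_header_old(line: str):
    name, value = line.split(":", 1)
    return name.rstrip().upper(), value.strip()

# New behaviour: a header name with trailing whitespace is rejected,
# so proxy and server agree on which headers exist.
def parse_header_new(line: str):
    name, value = line.split(":", 1)
    if name != name.rstrip():
        raise ValueError("invalid header name: trailing whitespace")
    return name.upper(), value.strip()

print(parse_header_old("Transfer-Encoding : chunked"))
# ('TRANSFER-ENCODING', 'chunked') -- smuggling-friendly normalization
```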
And here I thought life was easy
As I was looking through the code for this first patch, I found an interesting bit of code that I thought could be exploited:
Here, we iterate over all the headers and set some values that are then used to configure the body reader correctly. What is wrong with it, you might ask? Well, what happens if you have a duplicate header? Only the last value is used, and that might cause a desync if your proxy uses the first value!
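The last-value-wins problem boils down to this (a hypothetical parser loop, not gunicorn’s exact code):

```python
# Two Content-Length headers on the same request.
headers = [
    ("CONTENT-LENGTH", "13"),
    ("CONTENT-LENGTH", "0"),
]

content_length = None
for name, value in headers:
    if name == "CONTENT-LENGTH":
        content_length = value  # the second value silently overwrites the first

print(content_length)  # prints 0 -- while a proxy using the first value sees 13
```

Two components reading the same request now disagree on how many body bytes it contains, which is exactly the precondition for a desync.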
So first, let’s try to fix the Content-Length handling. I can’t show you an exploit for this, because the ALB already protects us from this attack. Some proxies might not, though, which is why it’s important to fix it in gunicorn. According to RFC 7230, you can do two things in case of a duplicate CL:
The recipient MUST either reject the message as invalid or replace the duplicated field-values with a single valid Content-Length field containing that decimal value prior to determining the message body length or forwarding the message.
So the easy and safe fix is to reject the message. Node servers do that, and that’s also the new behavior of gunicorn.
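A minimal sketch of that safe behaviour (hypothetical helper, not gunicorn’s actual code): conflicting Content-Length values get rejected, while exact duplicates collapse to a single value, as the RFC allows.

```python
def body_length(headers):
    """Return the declared body length, rejecting conflicting duplicates."""
    # A set collapses duplicates that agree; disagreeing values survive.
    values = {v.strip() for n, v in headers if n.lower() == "content-length"}
    if len(values) > 1:
        raise ValueError("400 Bad Request: conflicting Content-Length")
    return int(values.pop()) if values else 0

print(body_length([("Content-Length", "13")]))  # 13
# body_length([("Content-Length", "13"), ("Content-Length", "0")])
# -> raises ValueError
```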
The particular case of Transfer-Encoding
TE is a weird header. Reading more about it, you discover that it’s a hop-by-hop header, meaning that each node can decide to change the TE along the route. You also learn that you can have multiple transfer codings, which you must handle in the given order. Looking back at the code, we are quite far from the spec!
Nowadays, the most widely used transfer coding is chunked (for large payloads that don’t fit in one frame). Even though other transfer codings exist (mainly compress, deflate and gzip), the ALB does not accept them and returns a 501. The only other one that is accepted is identity, which basically tells the server to do nothing with the payload (useless, I know 😂). But because of it, we can induce a TE-CL attack with a payload like:
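A reconstructed sketch of the mechanism (the original post showed a screenshot; this is my approximation of the interesting part, the duplicate Transfer-Encoding headers):

```python
# Duplicate Transfer-Encoding headers on the same request.
raw_headers = [
    ("Content-Length", "4"),
    ("Transfer-Encoding", "chunked"),
    ("Transfer-Encoding", "identity"),
]

chunked = False
for name, value in raw_headers:  # last-value-wins, as in the old loop
    if name.lower() == "transfer-encoding":
        chunked = value.lower() == "chunked"

print(chunked)  # False -- the server falls back to Content-Length,
                # while a proxy honouring the first TE uses chunked framing
```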
This is because the second TE overrides the chunked property in gunicorn, which then falls back on the CL to parse the body. The ALB seems to protect us from this attack by not forwarding the CL to the server, but it could potentially work with other proxies. To mitigate the attack, we might be tempted to simply deny all TE values except chunked and process the payload inside the WSGI server, but then we would break compatibility with existing applications, and that’s a big NO-NO in Python. Furthermore, even though PEP 3333 says:
WSGI servers must handle any supported inbound “hop-by-hop” headers on their own, such as by decoding any inbound Transfer-Encoding, including chunked encoding if applicable.
The reality is that an unofficial flag (wsgi.input_terminated) exists to tell the WSGI server to transfer chunked data to the application (and vice versa). This is all a big mess, if you want my opinion, and I am pretty sure that someone will find more desync attacks due to that feature, simply because the WSGI server acts like another proxy layer in that scenario.
As for RFC 7230, it says the following:
If any transfer coding other than chunked is applied to a request payload body, the sender MUST apply chunked as the final transfer coding to ensure that the message is properly framed. If any transfer coding other than chunked is applied to a response payload body, the sender MUST either apply chunked as the final transfer coding or terminate the message by closing the connection.
But if you look at the behaviour of the ALB, it will accept the TE values (in order) chunked, identity, which is in theory not compliant. So for now, the new behaviour of gunicorn is that if any TE value equals chunked, it will consider the message chunked, even if it’s not the last transfer coding. If you are running behind an ALB, you will be protected; I cannot guarantee the same for other proxies.
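The new behaviour can be sketched as follows (a hypothetical helper, not gunicorn’s exact implementation): if chunked appears anywhere among the Transfer-Encoding values, the message is treated as chunked.

```python
def is_chunked(te_headers):
    """True if "chunked" appears anywhere in the Transfer-Encoding values."""
    # Each header value may itself be a comma-separated list of codings.
    codings = [c.strip().lower()
               for value in te_headers
               for c in value.split(",")]
    return "chunked" in codings

print(is_chunked(["chunked", "identity"]))  # True
print(is_chunked(["identity"]))             # False
```

Checking every coding instead of only the last one keeps gunicorn’s framing decision aligned with a front end like the ALB, which commits to chunked framing as soon as it sees the coding.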
Phew, this has been a tough ride 😅! I hope I didn’t lose you along the way. The key points of this blog post are that HTTP Desync attacks are fresh, a lot of people are vulnerable, and they are really hard to patch properly. We, as developers, should be more concerned about them than we currently are, and we need to understand the impact of our infrastructure choices on the security of our applications. As for gunicorn, all the fixes explained above have been merged and released!
I hope you enjoyed the read, and I will see you in the next blog post!