What is httpoxy?
An explanation for non-technical audiences
Hi! I’ve been telling the open source community about a security vulnerability that was recently rediscovered lurking in a bunch of software. This is an attempt at a simple explanation of the problem, for people who don’t write or deploy web applications.
If you’re looking for technical details and mitigation instructions, you can head to httpoxy.org. And if you’re looking for a story about the discovery and disclosure process, you can check out my other Medium story on httpoxy.
Also, I’d like to point out that, because we found this, we were able to prevent it from ever affecting Vend or the retailers who use us (perhaps the best advantage of being the original reporter!). Phew, right? OK, on with the story.
Web applications like to make web requests
The first thing I should probably explain is why a web application would make a web request.
Normally, you send your own request to a web application, it processes the data you give it, and it sends you back a response containing the page you asked for. (When all goes well!)
Sometimes, a web application will want to ask other software what to do. It’ll do so using an “API”: a piece of software that tells the application on the server how to talk to other applications. In the past few decades, we’ve seen the rise of APIs that are based on the same technology as the web itself. This is really nice for web application programmers, because it means servers only need to deal with one way of communicating (i.e. via HTTP).
So, these days, when you make your request to an application, the server might decide to go send its own request off to somewhere else. In a “microservice architecture”, a fashionable way to structure modern apps, this can happen quite often; sometimes even multiple times on every request.
Web applications like environment variables
The next thing to cover is what “environment variables” are (you’ll often see these called “env vars”).
Environment variables are a facility for storing simple fragments of text alongside the application. They’re mostly used for configuration because:
- Environment variables are really easy to use, and almost everything supports them, from software that will help test your application to complete deployment solutions that will ship your code to production
- They let you keep important configuration data separate from the actual code of your application (for various Good Reasons, web developers try to keep these two separate; a big one is it lets them be much more flexible with how they deploy their app.)
So, the important thing for our purposes is that env vars are usually set by the person deploying or developing the application.
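For the curious, here is a minimal sketch in Python of an application reading its configuration from env vars. The variable names DATABASE_HOST and APP_DEBUG, and the hostname, are made up for illustration; a real application would pick its own.

```python
import os


def load_settings(environ=os.environ):
    """Read configuration from environment variables, with fallbacks."""
    return {
        # Both variable names here are hypothetical examples.
        "database_host": environ.get("DATABASE_HOST", "localhost"),
        "debug": environ.get("APP_DEBUG", "0") == "1",
    }


# Pass a plain dict to stand in for the real environment.
print(load_settings({"DATABASE_HOST": "db.example.com"}))
```

The code itself never mentions db.example.com; whoever deploys the application decides that, which is exactly the separation described above.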
There’s an env var that configures a proxy
One environment variable in particular affects the way outgoing web requests are made: HTTP_PROXY (sometimes written http_proxy, a lower-case version of the same thing).
A proxy is a server that receives web requests from users, and passes them on to the actual application they’re trying to reach. You’ll often find proxies in corporate environments, where the IT team would like to be able to monitor the use of the web in their company. You’ll also find them used for performance reasons.
HTTP_PROXY is an environment variable that usually has an internet hostname in it; web clients see that, and automatically configure it as the proxy to use.
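As a rough sketch of what that auto-configuration might look like, here is a tiny Python function. The name pick_proxy and the example hostname are assumptions for illustration; real web clients each have their own logic, and they differ in which spelling they check and in what order.

```python
def pick_proxy(environ):
    """Return the proxy address a simple web client would use, or None.

    `environ` is a dict of environment variables (for a real process,
    this would be os.environ).
    """
    # This sketch checks the upper-case name first, then the lower-case one.
    for name in ("HTTP_PROXY", "http_proxy"):
        value = environ.get(name)
        if value:
            return value
    return None


# Hypothetical value, as an IT team might set it.
print(pick_proxy({"HTTP_PROXY": "proxy.example.com:3128"}))
# With nothing set, the client would connect directly.
print(pick_proxy({}))
```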
And, there are requests that can set environment variables
In the 1990s, a common way of deploying a web application was known as CGI: the common gateway interface. This is just a set of rules for developers to follow. By following them, you could deploy a web application written in any language behind the same web server. Useful!
Part of CGI specifies that a whole heap of useful information should be passed to the application in environment variables.
So, specifically, when you send a request to a web app, your browser includes some metadata in addition to the URL you’re hoping to visit — these are called request headers. They include things like the (human) language you’re looking for, and the types of content your browser can handle. All these things are ultimately passed to the application using environment variables.
When they are, CGI prefixes the name of each header with HTTP_, to show it came from an HTTP request.
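The renaming rule is simple enough to sketch in a few lines of Python. Besides adding the prefix, CGI also upper-cases the header name and swaps hyphens for underscores. The function name cgi_variable_name is made up for this example.

```python
def cgi_variable_name(header_name):
    """Turn a request header name into its CGI environment variable name."""
    # Upper-case the name, swap hyphens for underscores, add the prefix.
    return "HTTP_" + header_name.upper().replace("-", "_")


# Two headers of the kind mentioned above: preferred language and content types.
for header in ("Accept-Language", "Accept"):
    print(header, "->", cgi_variable_name(header))
```

So the Accept-Language header sent by your browser arrives at the application as an env var named HTTP_ACCEPT_LANGUAGE.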
httpoxy is just a combination of all of the above
So, to describe a typical attack:
- The attacker makes a specially crafted request that includes a special request header called Proxy
- CGI picks up that header, and turns it into an environment variable called HTTP_PROXY
- The web application notices the HTTP_PROXY env var, and configures its web client to use it as a proxy
- The web application makes a request using its client, and that request is sent to wherever the attacker wants, instead of the API the programmer wanted
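The steps above can be acted out in a small Python sketch. Nothing here touches the network: both functions are simplified stand-ins (not any real server's or client's code), and every hostname is made up.

```python
def build_cgi_environment(server_env, request_headers):
    """Copy the server's env vars, then add one HTTP_* variable per header."""
    env = dict(server_env)
    for name, value in request_headers.items():
        env["HTTP_" + name.upper().replace("-", "_")] = value
    return env


def choose_destination(env, intended_api):
    """A naive web client: send via HTTP_PROXY if it is set, else go direct."""
    return env.get("HTTP_PROXY") or intended_api


# Step 1: the attacker sends a request with a Proxy header.
attacker_headers = {"Proxy": "evil.example.net:8080"}

# Step 2: CGI turns that header into the HTTP_PROXY env var.
env = build_cgi_environment({"PATH": "/usr/bin"}, attacker_headers)

# Steps 3 and 4: the client trusts HTTP_PROXY and sends its request there.
print(choose_destination(env, "weather-api.example.com"))
```

The deployer never set HTTP_PROXY at all; the only thing that put it there was the attacker’s request.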
I hope that was a lot easier to follow than the jargon-filled technical disclosure documents we’ve published for web application developers today.
But what’s the damage?
That’s the bit that depends on the specific application that’s affected.
For example, the web application might make a request to a weather API; you give it a postcode, and it gives you back a five day forecast. In this case, the attacker could tweak the output of that API, and make it seem like it’s going to rain for the next few days straight.
So far, pretty innocuous, right?
But that request to the weather API will likely include credentials that the application is using to identify itself to the weather forecasting service. So, that’s now something the attacker has.
And, thinking of more serious scenarios, the exact same thing could happen with something like a billing backend. The web application might ask “has user X paid this month?”, and then the attacker might be able to make it say “yes, that user paid for a premium account” instead of “nope, that user is in arrears”!
Mitigating factors — the good news
There are a few things that make httpoxy much less of an issue than it might otherwise be:
- A lot of APIs require an encrypted (SSL/TLS/HTTPS) connection these days, and those requests cannot be attacked in the same way — httpoxy only affects unencrypted requests made by the server
- httpoxy is very easy to mitigate; you just stop the Proxy header from reaching the app altogether, and the problem is solved.
- CGI is much less widely deployed than it used to be; we have faster and better ways to serve web applications these days, and those tend to be used instead.
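To give a feel for how small the second fix is, here is a sketch in Python of dropping the Proxy header before the application ever sees it. The function name strip_proxy_header is hypothetical, and in practice this filtering is usually done in the web server’s configuration rather than in application code; httpoxy.org has the specific instructions.

```python
def strip_proxy_header(request_headers):
    """Return a copy of the headers with any Proxy header removed."""
    # Header names are case-insensitive, so compare in lower case.
    return {
        name: value
        for name, value in request_headers.items()
        if name.lower() != "proxy"
    }


# Made-up incoming request containing the attacker's header.
incoming = {"Accept-Language": "en-NZ", "Proxy": "evil.example.net:8080"}
print(strip_proxy_header(incoming))
```

With the header gone, there is nothing for CGI to turn into HTTP_PROXY, so the rest of the attack has nothing to work with.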
So, httpoxy is embarrassingly simple to exploit, it can be done remotely, and it can have fairly bad consequences. But, it’s also very easy to fix. We expect that anyone affected will be able to fix their web application within a very short timeframe.
What do I do?
If you’re a web application developer, or you look after a deployed site, you should check httpoxy.org to see whether you’re affected, and apply the fast, easy mitigation as soon as you can.
But if you’re an end-user of a web application, or just reading along for fun, then there’s not really anything for you to do other than enjoy the schadenfreude.