Every time we get a new project from an external team, we try to analyse it for potential security vulnerabilities. There are many commonly known issues like injections, XSS (Cross-site Scripting) and XXE (XML External Entity Processing). Although the issue I am about to describe is a type of injection attack (the category that tops summaries like the OWASP Top 10 document), it seems to be overlooked often.
An HTTP request is a very simple concept on the surface. You specify a URI, add query parameters and/or a body and make the request. Unfortunately, the RFC which defines it does not elaborate on the edge cases. Do you know what will happen if you pass two parameters with the same key? None of the documents say. Combined with subpar input validation in your backend, this can lead to weird, non-intuitive behaviour and even vulnerabilities. The problem was first presented at OWASP AppSec 2009 as HTTP Parameter Pollution.
HTTP Parameter Pollution
As OWASP stated in their amazing presentation:
HPP attacks can be defined as the feasibility to override or add HTTP GET/POST parameters by injecting query string delimiters
Adding extra GET/POST parameters can result in performing an action that the user did not intend. As the parameters can be carried by the frontend framework between different routes, an injection in one place might result in a vulnerability in a completely different module.
Overriding parameters can be even more dangerous. Because behavior differs between languages and frameworks, a duplicated parameter can be overridden, put in an array or malformed. This can even lead to bypassing WAF rules and breaking validation mechanisms.
Adding parameters in Yahoo Mail
In 2009 Yahoo Mail had a serious bug related to HPP. Because Yahoo’s API allowed firing destructive actions through GET query parameters, an attacker was able to generate a link which, once clicked, deleted all of the victim’s emails and emptied their trash. A description of the attack, with a demo video, is available here.
Yahoo did a lot for security (CSRF protection, the works), and HPP still got them!
HTTP Parameter Pollution seems even more dangerous when you realize how much of the communication between web services is done over HTTP nowadays. Tutorials like Microsoft’s Communication in a microservice architecture encourage the use of HTTP for internal communication between services. Also, with the rise of IaaS solutions (like S3, for example) which can be managed over HTTP, the consequences can be staggering: very serious problems, including personal data leaks.
To prove this thesis, I built a very simple application. It consists of a RESTful API, authorized by a token, and an identity server which is hidden behind a private network. The API has just one endpoint, GET /me, which returns the user’s data (including personal information). This endpoint accepts a token and, optionally, a list of fields we’d like to include in the response. The architecture has been kept simple to focus just on the security issue. The database on the identity server is seeded with two users, but only one of them has a valid token for the API server, so technically we shouldn’t be able to get the second user’s data by any means.
Below you can see the API backend implementation:
And here is the identity server:
Let’s think for a moment about how we can break our solution. Our goal is to get the data of a different user. The identity we’d like to obtain is specified by the id parameter sent to our identity server, which we don’t control. Instead, we can try to manipulate the fields parameter.
Let’s consider the following request:
Notice the weird fields argument. We are injecting the encoded “first_name&id=2” string which, in the request to the identity server, will be treated as a part of the URL:
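To make the mechanics concrete, here is a minimal Python sketch of the same idea. It is not the code of the demo application: the identity server address, the token value and the build_identity_url helper are all hypothetical, and the only thing it assumes is that the API pastes the decoded fields value into the outgoing URL by hand.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical internal address of the identity server (an assumption).
IDENTITY_URL = "http://identity.internal/user"


def build_identity_url(user_id, fields):
    """Vulnerable on purpose: concatenates the decoded `fields` value as-is."""
    return f"{IDENTITY_URL}?id={user_id}&fields={fields}"


# Hypothetical request to the public API; %26 is "&" and %3D is "=".
incoming = "/me?token=secret&fields=first_name%26id%3D2"

# The web framework decodes the query string before the handler sees it.
fields = parse_qs(urlsplit(incoming).query)["fields"][0]
print(fields)    # first_name&id=2

# The token resolved to user 1, but the outgoing URL now carries two ids.
outgoing = build_identity_url(1, fields)
print(outgoing)  # http://identity.internal/user?id=1&fields=first_name&id=2
```

The percent-encoding protects the delimiters only on the first hop. Once the value is decoded and concatenated, the identity server has no way of telling the injected id from the legitimate one.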
Depending on the server technology, the value of the id parameter will differ. Some frameworks will pass id=2, some will pass id=1, and some will pass an array of values or something more bizarre (my favorite being 2~~1 in DBMan). It is entirely up to the framework or technology you are using, because RFC 1738, which defined URL schemes, is not very elaborate when it comes to search strings (Section 3.3). OWASP, which catalogues this vulnerability as OTG-INPVAL-004, confirms this. They point to two additional documents (RFC 3986 and RFC 2396) which define query strings but give no answer on how to handle repeated values. As a result, some servers will:
- include just the first occurrence (IBM HTTP Server, mod_perl on Apache, mod_wsgi on Apache),
- include just the last occurrence (PHP on Apache),
- include all occurrences as a comma-separated string (ASP and ASP.NET on IIS),
- include all occurrences in a language-specific list type (Zope or, as it turns out from my research, Express.js as well).
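The four strategies above can be emulated in a few lines of Python. This is only an illustration of the strategies, not the actual parsing code of any of the servers listed, and the helper names are made up for the example.

```python
from urllib.parse import parse_qsl

# The polluted query string as the identity server would receive it.
query = "id=1&fields=first_name&id=2"
pairs = parse_qsl(query)  # ordered list of (key, value) tuples


def first_occurrence(pairs, key):
    """Keep only the first value seen for the key."""
    return next(value for name, value in pairs if name == key)


def last_occurrence(pairs, key):
    """Keep only the last value: later duplicates overwrite earlier ones."""
    return dict(pairs)[key]


def comma_joined(pairs, key):
    """Concatenate every value for the key into one comma-separated string."""
    return ",".join(value for name, value in pairs if name == key)


def all_occurrences(pairs, key):
    """Return every value for the key as a list."""
    return [value for name, value in pairs if name == key]


print(first_occurrence(pairs, "id"))  # 1
print(last_occurrence(pairs, "id"))   # 2
print(comma_joined(pairs, "id"))      # 1,2
print(all_occurrences(pairs, "id"))   # ['1', '2']
```

The same bytes on the wire yield four different answers to "which user is this?", which is exactly why a validation layer and a backend that disagree on the strategy can be played against each other.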
What to do?
Unfortunately, information on the topic is quite sparse. Many resources are outdated and many original materials are long gone. Using the web archive I was able to access the first paper on this topic, from NDSS 2010. I highly recommend reading it if you want more details.
From my experience, there are several techniques which can improve security related to this topic:
- Validate everything
No matter if the data comes from the user, from your internal microservice or from the database, always validate it. You can never be sure that the data has not been polluted. Also, never build query parameters or any other validated strings by hand. In this example, we could easily have used the params option of the HTTP client library’s API (described in the very first example in its documentation). And, in general, do not write validators yourself. Many standards which seem easy are much more complicated under the surface, and well-established validators take that into account.
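As a rough Python sketch of both pieces of advice: the field names in the allow-list, the comma-separated format of fields and the identity server address are assumptions made for the example, while urlencode is the standard library helper that takes care of escaping.

```python
from urllib.parse import urlencode

# Hypothetical allow-list; the real field names depend on your user model.
ALLOWED_FIELDS = {"first_name", "last_name", "email"}


def validate_fields(raw):
    """Split a comma-separated list and reject anything not on the allow-list."""
    requested = [name for name in raw.split(",") if name]
    unknown = [name for name in requested if name not in ALLOWED_FIELDS]
    if unknown:
        raise ValueError(f"unsupported fields: {unknown}")
    return requested


def build_identity_url(user_id, fields):
    """Let the library encode each value instead of concatenating strings."""
    query = urlencode({"id": user_id, "fields": ",".join(fields)})
    return f"http://identity.internal/user?{query}"


# Even if validation were skipped, the delimiters stay escaped.
print(build_identity_url(1, ["first_name&id=2"]))
# http://identity.internal/user?id=1&fields=first_name%26id%3D2
```

Either measure alone would stop the attack described here. Using both means that a mistake in one of them does not immediately reopen the hole.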
- Log everything
In this scenario, it is very likely that the hole would stay in the app for months or even years. The only place where you could spot it would probably be the logs of the API server or the identity server. It is important to keep a long-lasting log archive so that you can estimate the extent of a breach if one happens.
- Test everything
Include such examples in your test scenarios or run fuzz tests. You can use the OWASP ZAP HPP passive/active scanners, which you can easily plug into your CI solution.
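A regression test for this class of bug can be very small. The sketch below is in Python and uses made-up helper names: it assumes the code under test exposes a function that builds the query string for the identity server, and it checks that no payload in fields can introduce a second id.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlencode


def duplicated_keys(query):
    """Return the sorted list of keys that appear more than once in a query string."""
    counts = Counter(key for key, _ in parse_qsl(query, keep_blank_values=True))
    return sorted(key for key, count in counts.items() if count > 1)


def build_identity_query(user_id, fields):
    """Stand-in for the code under test (hypothetical)."""
    return urlencode({"id": user_id, "fields": fields})


# A few pollution payloads; extend the list or generate them with a fuzzer.
POLLUTED_INPUTS = ["first_name&id=2", "first_name&id=2&id=3", "&id=2", "first_name;id=2"]


def test_fields_cannot_pollute_id():
    for payload in POLLUTED_INPUTS:
        query = build_identity_query(1, payload)
        assert duplicated_keys(query) == [], payload
        assert dict(parse_qsl(query))["id"] == "1", payload
```

A test runner such as pytest would collect test_fields_cannot_pollute_id automatically, and the same duplicated_keys check could be pointed at recorded outgoing requests.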
- Always consider the environment you are in
As shown in this example, sometimes the environment makes certain types of bugs easier to exploit.
- Always question your solutions
Always try to think outside the box and try to break your application in as many ways as you can come up with. There are many techniques to automate the process but human creativity is invaluable.
If you want to try the example yourself, the source code is available here.