The case of the broken HTTP header…

Last week, monitoring on one of our clients' staging sites reported that the site had suddenly stopped working during some last-minute testing before going into production. The site communicates with a remote API that sits behind a large international proxy (think CDN, WAF, etc.). Every request made through the front-end was failing, but the error message presented to the client was somewhat misleading and pointed to a data integrity issue: for example, sign-in details not being recognised for a user that definitely existed.

The site is fairly legacy, built on ASP.NET some years ago, so we proceeded into debugging with caution, bearing in mind that things may have evolved since then. After a while we narrowed the problem down to a slightly broken HTTP response from the remote API, and in the debugger it looked as though line endings were the issue. Think something like this:

HTTP/1.1 200 OK\r\n
Content-Type: application/json\r\n
X-Some-Custom-Header: some data here\n
Content-Length: 123\r\n

Ignoring the header names (for the moment!), notice the difference in line endings? The third header ends with a bare \n rather than the \r\n that HTTP requires. The debugging pointed towards this being the cause of the breakage, as we hadn't experienced it previously, and indeed adding this snippet to the web.config file…

<configuration>
  <system.net>
    <settings>
      <httpWebRequest useUnsafeHeaderParsing="true" />
    </settings>
  </system.net>
</configuration>

…(detailed nicely here) resolved the problem, albeit temporarily. We weren't entirely happy with that solution, but time pressures dictated otherwise! The misleading error message turned out to be a consequence of the response failing to parse: no parsed response meant no successful user authentication, and hence the "user data is incorrect" message. Unsurprisingly, better exception logging has been added to the backlog.

Fast forward a week, and a companion microservice (Node.js running on AWS Lambda) was reported as failing now that its testing had commenced. The error was a pretty obscure HPE_INVALID_HEADER_TOKEN from deep in Node's HTTP parsing internals, which manifested itself as a ParseError exception from request-promise. Hold on, we've seen something like this before…
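To see what the parser is actually objecting to in a case like this, it helps to surface the underlying error code rather than just the wrapper exception. A rough sketch of logging that does this (the URL is a placeholder, and exactly where request-promise stashes the original error is an assumption on our part):

const rp = require('request-promise');

rp({ uri: 'https://upstream-api.example.com/session', json: true })
  .then((body) => {
    // normal handling...
  })
  .catch((err) => {
    // dig out the low-level cause (e.g. code === 'HPE_INVALID_HEADER_TOKEN')
    // rather than logging only the wrapper exception's name
    const cause = err.cause || err.error || err;
    console.error('upstream request failed:', err.name, cause.code, cause.message);
  });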

After further debugging, it turned out that the line ending above wasn't the only issue. Consider the following (actual) HTTP response:

HTTP/1.1 200 OK\r\n
Content-Type: application/json\r\n
#['client-correlation-id']: some data here\n
Content-Length: 123\r\n

You guessed it: the middle header was the issue. It turns out that the [ and ] characters aren't valid in a header field name according to RFC 7230 (field names must consist only of "token" characters), whereas everything else in that field name, including the leading #, is permitted. The header was being added somewhere between the remote API and the response we received (we're presuming at the upstream proxy, but others are also investigating), and it was causing parsing errors in some of the libraries that tried to consume it.
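To convince ourselves it really was the parser rejecting the field name, rather than anything in our own code, a throwaway reproduction along these lines (the server, port and body are made up purely for illustration) triggers the same error against Node's default parser:

const net = require('net');
const http = require('http');

// throwaway TCP server that replays a response containing the offending header name
const server = net.createServer((socket) => {
  socket.end(
    'HTTP/1.1 200 OK\r\n' +
    'Content-Type: application/json\r\n' +
    "#['client-correlation-id']: some data here\n" +
    'Content-Length: 2\r\n' +
    '\r\n' +
    '{}'
  );
});

server.listen(0, () => {
  const { port } = server.address();
  http.get({ port }, (res) => {
    console.log('parsed OK:', res.headers);
    server.close();
  }).on('error', (err) => {
    // with Node's default (strict) parser this fires,
    // with err.code === 'HPE_INVALID_HEADER_TOKEN'
    console.error(err.code, err.message);
    server.close();
  });
});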

For the Lambda microservice we briefly looked at adding http-parser-js, which is a bit less strict than the default parser, but decided to wait until the header is amended or removed rather than putting in more workarounds. Once that has happened we can back out the ASP.NET change as well, and we'll be good to go…
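For reference, the opt-in we briefly considered is the monkey-patch described in the http-parser-js README. A minimal sketch, assuming an older Node.js runtime where process.binding('http_parser') is still available, and noting that it has to run before any HTTP traffic:

// swap Node's built-in HTTP parser for the more lenient http-parser-js
// NOTE: relies on process.binding('http_parser'), which is deprecated in
// newer Node.js versions; run this before any requests are made
process.binding('http_parser').HTTPParser = require('http-parser-js').HTTPParser;

const rp = require('request-promise');
// subsequent requests go through the lenient parser, which tolerates
// header field names that the default parser would reject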