Malformed cookie handling with document.cookie — who’s responsible?
For the last 6 months we have been receiving strange bug reports from users saying that ListenOnRepeat was fundamentally broken (no site functionality would work) or that login requests were failing, leaving them permanently stuck in an “anonymous user” state.
Even stranger, these bug reports seemed to arrive almost arbitrarily, were completely unreproducible, and were not triggering errors anywhere in our stack.
Last night, our engineers identified the root cause of the issue: malformed cookies being set by a third-party advertiser who bought inventory on our website. While advertisers clearly should not be setting malformed cookies (we’re currently working with them to address the issue), this problem points to a far bigger conceptual concern: who is responsible for handling malformed cookies?
Although various cookie ‘standards’ have been implemented over the years, RFC 6265 prohibits double quotes inside cookie values; the only permitted use is a single pair of double quotes wrapping the entire value.
Mozilla’s documentation for the Set-Cookie header similarly prohibits double quotes:
A <cookie-value> can optionally be set in double quotes and any US-ASCII characters excluding CTLs, whitespace, double quotes, comma, semicolon, and backslash are allowed.
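To make the grammar concrete, here is a minimal Python sketch of a validator for the cookie-value rule in RFC 6265 section 4.1.1. The function name is_valid_cookie_value is our own illustrative choice and is not part of any library.

```python
import re

# cookie-octet (RFC 6265, section 4.1.1): printable US-ASCII excluding
# whitespace, double quote, comma, semicolon, and backslash.
_COOKIE_OCTET = r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]"

# cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
_COOKIE_VALUE = re.compile(r'(?:{o}*|"{o}*")'.format(o=_COOKIE_OCTET))


def is_valid_cookie_value(value):
    """Return True if `value` matches the RFC 6265 cookie-value grammar."""
    return _COOKIE_VALUE.fullmatch(value) is not None
```

Under this check, b and "b" are accepted, while a value such as "b"c (a quoted string followed by stray characters) is rejected.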
However, although these standards are adhered to for Set-Cookie headers, values assigned through document.cookie are not validated as strictly and double quotes are not prohibited, so the following is accepted as entirely valid:
document.cookie = 'a="b"c'
This results in the correspondingly ‘invalid’ cookie being set by the browser and added to any future HTTP requests. In our case, a similar cookie was being set by an advertiser due to an oversight in their JS syntax, which caused a malformed cookie to be appended to every browser request.
This resulted in the cookie being stored in the following way:
a=b; c=[MALFORMED]; userID=[valid];
a=b; c=[MALFORMED]; csrf_token=[valid];
Further complicating the matter, our front-end is powered by Python and Django, so these cookies are handled by the SimpleCookie() class in cookie.py. Python handles malformed cookies in an unusual way because of a security fix: the cookie parser ignores everything from the malformed pair onwards and returns a valid cookie containing only the pairs that came before it, without throwing an error.
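>>> from Cookie import SimpleCookie  # Python 2; on Python 3: from http.cookies import SimpleCookie
>>> dd = SimpleCookie()
>>> dd.load('a=b; c="d"e; f=g;')  # returns silently, no exception is raised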
In our case, using the above examples, this means Python processes the malformed cookies as a valid cookie containing just:
a=b;
As a result, our CSRF tokens were stripped from AJAX requests, or the signed userID cookie was omitted, kicking an authorised user back to anonymous mode, all without any errors being thrown.
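The truncation can be reproduced on Python 3 as well, where the same class lives in http.cookies. This is a small sketch rather than our production code: the header string is the illustrative one used above, and the behaviour is what we would expect from the CPython releases we are aware of, so it is worth verifying on your own version.

```python
from http.cookies import SimpleCookie

# Illustrative Cookie header: the second pair has stray text after the
# closing quote, which makes it malformed.
header = 'a=b; c="d"e; f=g;'

cookie = SimpleCookie()
cookie.load(header)  # no exception is raised for the malformed pair

# Collect whatever the parser kept as a plain dict.
parsed = {name: morsel.value for name, morsel in cookie.items()}
print(parsed)
```

On the versions we have checked against, only a survives: the perfectly valid f=g pair that follows the malformed one is silently dropped along with it.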
It is clear that advertisers (or any third-party script) should not be setting malformed cookies. However, in modern web development we feel it is also unreasonable to expect developers to vet every revision of every third-party script, and in the case of advertiser scripts this is effectively impossible for the majority of small-to-medium ad-reliant startups.
As third-party scripts and advertisers are commonplace, we believe it is reasonable to expect the underlying frameworks to handle malformed cookies gracefully, without requiring a custom-written cookie parser.
As far as we’re aware, this issue will affect any service utilising any Python-based web framework without custom-built cookie parsers. At the time of writing, we have not investigated other web frameworks to see if malformed cookies are handled in a similar fashion but there may well be similar issues affecting other frameworks.
We will update this post once we have determined our own fix for the bug (most likely in the form of a custom cookie parser for Django) and we welcome any discussion on how to better mitigate this issue for the huge number of other potentially affected products.
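As a starting point for that discussion, here is a rough sketch of what a more tolerant parser could look like. It is not the fix described above and has not been tested against Django; parse_cookie_header is a hypothetical name, and the approach (drop only the offending pair, keep everything else) is an assumption on our part.

```python
import re

# cookie-octet and cookie-value as defined in RFC 6265, section 4.1.1.
_OCTET = r"[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]"
_VALID_VALUE = re.compile(r'(?:{o}*|"{o}*")'.format(o=_OCTET))
# cookie-name is an HTTP token: no control characters, spaces, or separators.
_VALID_NAME = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def parse_cookie_header(header):
    """Parse a Cookie header, skipping malformed pairs instead of stopping.

    Hypothetical helper: returns a dict of name -> value (last one wins).
    """
    cookies = {}
    for chunk in header.split(";"):
        name, sep, value = chunk.strip().partition("=")
        if not sep:
            continue  # empty chunk or a bare token with no '='
        name, value = name.strip(), value.strip()
        if not _VALID_NAME.fullmatch(name) or not _VALID_VALUE.fullmatch(value):
            continue  # drop only the offending pair, keep parsing the rest
        # Strip the optional surrounding double quotes.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        cookies[name] = value
    return cookies
```

For a header such as a=b; c="d"e; userID=abc123; this keeps a and userID and discards only c. Note the trade-off: the stricter stop-at-first-error behaviour in Python was introduced as a security fix, so anyone adopting a skip-and-continue approach should consider whether it reopens the injection scenarios that fix was meant to close.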