Lessons Learned Deploying a Generic CSRF Solution

Joe Rozner
Aug 22, 2018


The summer of 2017 saw the culmination of a substantial research and development effort: a generic solution to CSRF that could be easily applied to most web applications after they have been deployed, without requiring changes to the codebase. The monster of a blog post that goes through the design and implementation is available here in excruciating detail. Now, roughly a year later (at time of writing), we’ve had the opportunity to deploy this solution to a large number of web properties across many organizations, web technologies, and architectures, and we’ve been able to see how the assumptions made and the design chosen have held up.

So far it’s been pretty good. The deployment hasn’t been without issues, but overall the strategy has held up well and proven effective. The following are some of the lessons we’ve learned while maintaining and deploying it.

The JavaScript payload originally proposed has undergone a number of changes since the initial discussion, though its functionality is still largely the same. Almost immediately we learned that we had missed some elements that developers use to submit forms. The most interesting, unknown to me until recently, was the input type “image”, which allows specifying an image that functions as a submit button. Luckily, most web properties we’ve deployed to have been pretty sane with respect to using arbitrary elements as buttons via JavaScript, but it’s not unlikely we’ll see additional tags that must be supported in the future, especially with the addition of custom elements in HTML 5.
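
For illustration, a minimal sketch of what detecting submission elements might look like once input type “image” is accounted for (the helper name is mine, not the actual payload’s):

```javascript
// Sketch only: classify an element as one that can submit a form.
function isSubmissionElement(el) {
  var tag = el.tagName ? el.tagName.toLowerCase() : '';
  var type = ((el.getAttribute && el.getAttribute('type')) || '').toLowerCase();
  if (tag === 'button') {
    // <button> defaults to type="submit" when no type attribute is given.
    return type === '' || type === 'submit';
  }
  if (tag === 'input') {
    // input type="image" renders an image that behaves as a submit button.
    return type === 'submit' || type === 'image';
  }
  return false;
}
```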

Despite the fairly sane use of HTML elements to submit forms, one thing we have noticed is a large amount of markup that breaks specification and convention. One issue seen repeatedly is submission elements existing in the DOM as siblings of their forms rather than as children. This is clearly a violation of the specification and only works because JavaScript is used to force the form submission.
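
A contrived illustration of the pattern (the element IDs are hypothetical):

```javascript
// The markup places the button next to the form rather than inside it:
//   <form id="checkout" action="/pay" method="POST"> ... </form>
//   <button id="pay">Pay</button>
// Submission only works because script wires the sibling button to the form.
document.getElementById('pay').addEventListener('click', function () {
  document.getElementById('checkout').submit();
});
```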

The original implementation of the client side JavaScript bound an event listener to document for “click”. It then checked whether the clicked element was a child of, or was itself, a known submission element. If so, it walked up the DOM from there looking for the containing form and inserted/updated the token in the specific form containing the submit element before allowing the form to submit. This behavior broke when submission elements were siblings of their forms, and a change had to be made to instead update all forms in the DOM, because there is no reliable way to identify only the intended form.
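
A rough sketch of the revised behavior, reusing the isSubmissionElement helper from above (setOrUpdateToken is a hypothetical stand-in for the token insertion logic, and the modern addEventListener form is shown for brevity; a real payload would also need attachEvent for IE8):

```javascript
// Sketch only: on any click that originates from a submission element,
// refresh the token in every form, since the containing form can no longer
// be found reliably by walking up from the clicked element.
document.addEventListener('click', function (event) {
  var el = event.target;
  while (el && el !== document) {
    if (isSubmissionElement(el)) {
      var forms = document.getElementsByTagName('form');
      for (var i = 0; i < forms.length; i++) {
        setOrUpdateToken(forms[i]); // hypothetical: insert/update the hidden token field
      }
      break;
    }
    el = el.parentNode;
  }
}, true); // capture phase, so tokens are in place before submission fires
```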

In addition to the page structure issues, no discussion of JavaScript flaws would be complete without mentioning Internet Explorer compatibility. Internet Explorer provides a compatibility mode that allows newer versions to masquerade as older ones. This is primarily used when web applications were designed to work on a specific version and may or may not have been written with cross-platform or standards-compliant JavaScript. It was suggested in the original blog post that User Agent strings could be used to disable the protection for older, unsupported versions of Internet Explorer (pre-8). While this solution still stands, it’s not quite as simple as originally designed. In compatibility mode, Internet Explorer always reports its version as Internet Explorer 7 via the MSIE value. This means such browsers trigger the failure case of an unsupported browser despite being fully functional in many, if not all, cases. Luckily, the User Agent string is nice enough to throw in the Trident version, which can be mapped to the real version of Internet Explorer being run. Any code using the original suggestion should take this into account.
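
A sketch of that mapping (the Trident token was introduced with IE8, and its major version plus four gives the real IE version: Trident/4.0 is IE8, Trident/7.0 is IE11):

```javascript
// Sketch only: recover the real IE version even in compatibility mode,
// where the MSIE value is pinned to 7.0.
function realIEVersion(userAgent) {
  var trident = /Trident\/(\d+)\./.exec(userAgent);
  if (trident) {
    return parseInt(trident[1], 10) + 4; // Trident/6.0 -> IE10, etc.
  }
  var msie = /MSIE (\d+)\./.exec(userAgent);
  return msie ? parseInt(msie[1], 10) : null; // null: not IE at all
}
```

A browser reporting “MSIE 7.0” alongside “Trident/6.0” is really IE10 in compatibility mode, so the protection should remain enabled rather than falling back to the unsupported-browser path.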

Lastly for Internet Explorer, there was one huge oversight around the compatibility of one of the APIs used, which leads directly into the rest of the discussion. In the cross origin validation code, the URL object was used to parse the request URL. This API is not supported by Internet Explorer, but the anchor tag URL parsing hack can be used instead to pull out the necessary information.
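
The hack is simple: assigning a URL to an anchor element’s href makes the browser parse it, exposing the same components the URL constructor would provide.

```javascript
// Sketch only: parse a URL without the URL constructor.
function parseUrl(url) {
  var a = document.createElement('a');
  a.href = url; // the browser parses the URL on assignment
  return {
    protocol: a.protocol, // e.g. "https:"
    hostname: a.hostname, // e.g. "example.com"
    port: a.port,         // "" when the default port is used
    pathname: a.pathname
  };
}
```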

The rest of the issues deal primarily with cross origin requests and origin verification. First and foremost, the originally described way to generate the expected origin had a mistake in it. My original suggestion was to simply take the Host header and use that. This misses one key piece of data: an origin also contains the protocol, meaning this approach would allow mixed content to be incorrectly validated. The Host header is still sufficient to get the domain and port, but the protocol must be identified through some other mechanism to derive the actual origin. There are a handful of ways to do this: the framework or language may simply provide it via its API, or the “X-Forwarded-Proto” header (or one of the other non-standard equivalents) can be consulted if it is set. There are likely other mechanisms, but these cover most situations.
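
A minimal Node.js-flavored sketch, assuming a trusted proxy sets X-Forwarded-Proto:

```javascript
// Sketch only: build the expected origin from the Host header plus a
// protocol hint, instead of from Host alone.
function expectedOrigin(req) {
  var host = req.headers['host']; // domain plus any non-default port
  // Prefer what the server knows directly (TLS socket), then the proxy hint.
  var proto = req.socket.encrypted
    ? 'https'
    : (req.headers['x-forwarded-proto'] || 'http');
  return proto + '://' + host;
}
```

The Origin header (or an origin derived from the Referer) on an unsafe request can then be compared against this value, so that https://example.com is no longer conflated with http://example.com.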

The original blog post discussed several driving factors for performing origin verification in the solution: it can be a complete answer to CSRF in some situations, it supports multi-origin website interaction (which is essentially intended CSRF), and it supports CORS. One deployment scenario we’ve seen is a static web server serving a landing page where JavaScript renders everything via additional requests. This results in the first request to the application essentially being a POST (or other unsafe) request, which fails because the JavaScript isn’t being injected and an initial token isn’t being seeded. There are solutions to this, including adding the JavaScript manually or putting both servers behind the same load balancer/reverse proxy and integrating there. Sharing and rotating keys for token generation and validation is a logistics problem that needs to be solved for both CORS and standard multi-origin applications. These concerns mostly depend on your deployment infrastructure and automation; they are not strictly limitations of the design, but implementation details that need to be solved for this sort of deployment.

Ultimately, the conclusion I’ve come to is that supporting cross origin requests and verification of origin is hard. While it does exist, CORS is really not all that common in the wild, at least where we have been deployed. This has led me to question the necessity of origin verification in a generic solution. Origin verification is definitely valid and, in many situations, can be a complete solution to CSRF, as explained in the original blog post. The functionality was added for completeness’ sake, but tokens already do a good job and can be less error prone, given the failure conditions where origin verification can’t be performed (missing Origin header, missing Referer, etc.). When used correctly, origin verification is a good solution, but for a generic solution, sticking to just tokens (and eventually SameSite) might be sufficient and easier to get right and deploy. In the end this comes down to whether or not you’re deploying into a multi-origin environment. If so, some support is likely necessary; if not, it is simply a risk acceptance question of whether you’re willing to drop one of the layers of protection, trusting the others to hold.
