Optimizely’s Content-Security-Policy Journey

Ola Nordstrom
Engineers @ Optimizely
Aug 31, 2017

Our main application, https://app.optimizely.com, is now protected by a Content Security Policy (CSP). With this change, our users are protected from cross-site scripting (XSS) attacks, which OWASP calls “the most prevalent web application security flaw”; for an in-depth overview, check out Mike West’s An Introduction to CSP. It took a little over a year from when we first deployed CSP in Report-Only mode to when we were able to enforce the policy. If you are considering deploying a CSP on your piece of the web we hope that this post provides some useful information to help you plan your deployment.

Before getting into details, a bit of context: our application relies on JavaScript heavily. It loads third-party sites in a WYSIWYG editor that allows non-technical users to create variations of their sites. In addition to that, there are dozens of other workflows to facilitate web experimentation for non-technical and technical users alike. Stopping cross-site scripting and content injection is of utmost importance to us.

Reporting

The first necessity when setting up a CSP for a non-trivial application is to configure reporting. Reporting is crucial and must be enabled because it allows you to see CSP failures reported by browsers as they interact with your site. Without reporting it is considerably more difficult to trace broken functionality back to a CSP.

An important feature related to reporting is CSP's Report-Only mode. In this mode the header name changes from Content-Security-Policy to Content-Security-Policy-Report-Only, which instructs the browser to report policy violations without enforcing the policy.
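As a minimal sketch of the difference between the two modes: the policy string stays the same and only the header name changes. The function name csp_header and the example policy below are illustrative assumptions, not code from our application.

```python
def csp_header(policy: str, report_only: bool) -> tuple[str, str]:
    """Return the (name, value) pair for the CSP response header.

    In report-only mode the browser reports violations but does not block them.
    """
    name = (
        "Content-Security-Policy-Report-Only"
        if report_only
        else "Content-Security-Policy"
    )
    return name, policy


# Hypothetical policy, used only for illustration.
example_policy = "default-src 'self'; report-uri /csp-report"
print(csp_header(example_policy, report_only=True))
```

Keeping the policy identical in both modes means the reports collected in report-only mode describe what enforcement would actually block.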

When the browser does encounter a policy violation it creates a violation report and encodes it in JSON. In the example error report below, copied from the CSP Level 2 specification, an image load from evil.example.com was attempted and blocked due to a default-src directive (which extends to img-src, if no img-src directive is specified).

{
  "csp-report": {
    "document-uri": "http://example.org/page.html",
    "referrer": "http://evil.example.com/haxor.html",
    "blocked-uri": "http://evil.example.com/image.png",
    "violated-directive": "default-src 'self'",
    "effective-directive": "img-src",
    "original-policy": "default-src 'self'; report-uri http://example.org/csp-report.cgi"
  }
}

The W3C CSP specification describes a report-uri directive that names the endpoint that receives these reports whenever the website's policy is violated. The latest iteration of the CSP specification defines a report-to directive that is intended to replace report-uri. The report-to format is defined in the Reporting API, currently a W3C draft, which is intended to provide a generic error reporting framework for CSP, network errors, and other forthcoming kinds of error conditions. I've yet to come across a web property that uses report-to; like many standards it will take a while to make its way onto the web, especially since CSP Level 3 and the Reporting API are still in draft status.

We wrote our own CSP reporting backend since there were no available solutions that met our needs. This backend ingests JSON payloads like the above, but with some additional metadata:

  • UserAgent
  • Path
  • Browser Name
  • Browser Version
  • OS
  • Timestamp

It is possible to “smuggle” in even more information from the reporting browser by adding fields to the report-uri query string as suggested by Twitter in their report collector design, but we did not find this necessary. The most important information to us was knowing which path, browser and OS experienced the policy violation.
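To make the ingestion step concrete, here is a minimal sketch of how a collector might parse a report and attach metadata. It is not our backend: the function name enrich_report and the record layout are assumptions, and deriving the path from document-uri is one possible choice. Parsing browser name, version, and OS out of the User-Agent string is left out.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlsplit


def enrich_report(body: bytes, user_agent: str) -> dict:
    """Parse a CSP violation report and attach request metadata."""
    payload = json.loads(body)
    report = payload.get("csp-report") if isinstance(payload, dict) else None
    if not isinstance(report, dict):
        raise ValueError("missing csp-report object")
    return {
        "report": report,
        # Assumption: the page path is taken from the reported document-uri.
        "path": urlsplit(report.get("document-uri", "")).path,
        "user_agent": user_agent,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Hypothetical request body and User-Agent, used only for illustration.
sample = json.dumps({
    "csp-report": {
        "document-uri": "http://example.org/page.html",
        "blocked-uri": "http://evil.example.com/image.png",
    }
}).encode()
print(enrich_report(sample, "ExampleBrowser/1.0")["path"])  # /page.html
```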

Note that there is now a startup, www.templarbit.com, that provides CSP reporting backends so you may not have to write your own.

Level Two

Originally we deployed a CSP Level 2 compliant policy. This policy whitelisted every approved origin via the following directives:

  • script-src
  • style-src
  • img-src
  • font-src

This policy was difficult to keep up to date because the set of approved origins may evolve with any code change, and we deploy daily.

With a policy based on whitelisted origins, we were constantly playing catch-up with changes the engineering team had made. Indeed, not only is a whitelist policy cumbersome to maintain, but it turns out it is fiendishly difficult to write a secure one. Why is that?

Enter strict-dynamic

In late 2016 Research at Google published CSP Is Dead, Long Live CSP! On the Insecurity of Whitelists and the Future of Content Security Policy. In it, the authors reach a dire conclusion regarding whitelist-based CSPs:

In total, we find that 94.68% of policies that attempt to limit script execution are ineffective, and that 99.34% of hosts with CSP use policies that offer no benefit against XSS.

After analyzing 26,011 unique CSP whitelists it goes on to state:

Based on the results of our study, we conclude that maintaining a secure whitelist for a complex application is infeasible in practice; hence, we propose changes to the way CSP is used.

The key result from the Google researchers was that their proposed strict-dynamic keyword could enable simpler CSPs for the majority of websites.

Since their introduction in CSP Level 2, cryptographic nonces have offered an intriguing alternative to whitelisting individual origins: a given <script> element’s contents are authorized to execute if the element includes an attribute nonce whose value matches a nonce given in the HTTP response that served the page. If the nonces do not match (or are not both present) the script will not execute.

The biggest problem with the CSP Level 2 nonce approach is that dynamically generated scripts added at runtime would fail to execute. All such scripts would have to be refactored and moved off into defined external scripts so that the nonce could be included during page creation.

For sites with a large JavaScript codebase this is a huge, blocking barrier to adoption. CSP Level 3 solves this with the strict-dynamic keyword, which lets dynamically generated scripts inherit trust from the nonce-bearing script that created them. Attackers who do find an injection vulnerability will have their injected script blocked since they do not know the correct nonce. The strict-dynamic keyword substantially reduces the amount of JavaScript refactoring necessary and enables sites with a large existing codebase, such as ours, to adopt CSP.
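The reasoning above can be summarized as a small decision function. This is a deliberately simplified model, not the browser's actual algorithm: it ignores hashes and origin lists, and the function and parameter names are made up for illustration. It relies on the fact that only a script that is already trusted gets to run and create further scripts through the DOM.

```python
from typing import Optional


def script_allowed(
    policy_nonce: str,
    script_nonce: Optional[str],
    parser_inserted: bool,
    strict_dynamic: bool,
) -> bool:
    """Simplified model of how a nonce-based script-src treats one script."""
    # A script carrying the correct nonce is trusted.
    if script_nonce is not None and script_nonce == policy_nonce:
        return True
    # With 'strict-dynamic', a script that was not inserted by the HTML
    # parser (i.e. created at runtime by code that is already running)
    # is trusted as well.
    if strict_dynamic and not parser_inserted:
        return True
    return False


# Markup injected by an attacker: parsed from HTML, no valid nonce.
print(script_allowed("abc", None, parser_inserted=True, strict_dynamic=True))    # False
# Script element created at runtime by a trusted script.
print(script_allowed("abc", None, parser_inserted=False, strict_dynamic=True))   # True
# The same runtime-created script under a Level 2 style nonce-only policy.
print(script_allowed("abc", None, parser_inserted=False, strict_dynamic=False))  # False
```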

Refactoring

Adding the CSP header and adding a nonce to all <script> tags is straightforward. Simply add <script nonce="0123456789"> where 0123456789 is the random nonce that corresponds to a nonce in the CSP HTTP header. The difficulty lies in tracking down features that are not often used and lack full test automation coverage. This is where the reporting infrastructure is crucial.
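A minimal sketch of the server side of this, assuming a fresh nonce is generated for every HTTP response and used in both the header and the markup. The helper names and the /static/app.js path are hypothetical, and the 16-byte nonce length is an assumption rather than a requirement stated in this post.

```python
import base64
import secrets
from html import escape


def new_nonce() -> str:
    """Return a fresh, unpredictable nonce; generate one per HTTP response."""
    return base64.b64encode(secrets.token_bytes(16)).decode("ascii")


def script_src_directive(nonce: str) -> str:
    """Build the script-src directive that authorizes the given nonce."""
    return f"script-src 'nonce-{nonce}' 'strict-dynamic'"


def script_tag(src: str, nonce: str) -> str:
    """Render a <script> element carrying the same nonce as the header."""
    return (
        f'<script nonce="{escape(nonce, quote=True)}" '
        f'src="{escape(src, quote=True)}"></script>'
    )


nonce = new_nonce()
print(script_src_directive(nonce))
print(script_tag("/static/app.js", nonce))  # hypothetical script path
```

The nonce must never be reused across responses; a predictable or cached nonce would let an attacker supply it in injected markup.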

Line number one

Some of the most frustrating CSP violations to track down are those where the source of the script error is reported as /SomeResource:1 or set to self.

In one such case, the CSP violation stemmed from a clickable element that didn't respond properly when clicked. However, clicking on the error source, 8519510174:1, in the browser console simply jumps to the top of the page's HTML, which is not particularly helpful. Seeing this error in the backend via the CSP error reporting mechanism is equally frustrating since it isn't possible to deduce what caused the error.

All you have to go on is that something, some workflow, on that page triggered an error.

For some issues document-uri is set to “data:text/html,chromewebdata” in Chrome. How helpful.

Other errors, when finally tracked down, were attributed to older jQuery versions which can give some very opaque errors.

Each new warning must be investigated and reproduced. Only then can one determine whether it is a false positive or not. There will also be issues caused by client-side software which cannot be fixed but have to be accepted. In desktop environments it is common for both end-user and corporate-managed machines to have software that inspects program and network behavior. That can impact browsers and their interaction with your site. For example:

  • blocked-uri = https://gc.kis.v2.scr.kaspersky-labs.com/6BBC944A-119C-4842-B955-11A953814FFE/main.js. This means that Kaspersky, a provider of desktop anti-virus software, is injecting its own JavaScript into every page the user browses to. Even though we deliver our site exclusively over HTTPS, we have no control over client-side software that may intercept and modify our HTTP responses before they are ultimately consumed by an end user's browser. Cases like this, where client-side software rewrites our page, can only be rectified on the client, which is something we do not control.
  • source-file = “safari-extension://com.lastpass.lpsafariextension-n24rep3bmn”. This highlights another issue: browser plugins can generate warnings, in this case the LastPass Safari extension. Like local software that inspects and sometimes rewrites our pages, some browser plugins modify page content as well.
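Reports like the two above can be set aside automatically before triage. The following is a rough sketch under stated assumptions: the function name, the list of extension schemes, and the host suffix list are illustrative (only safari-extension and the Kaspersky host come from the examples above) and would need to be extended from your own report data.

```python
from urllib.parse import urlsplit

# Assumed lists; extend them as new kinds of client-side noise show up.
EXTENSION_SCHEMES = {"safari-extension", "chrome-extension", "moz-extension"}
INJECTOR_HOST_SUFFIXES = ("kaspersky-labs.com",)


def is_client_side_noise(report: dict) -> bool:
    """Return True if a report looks like it was caused by client software."""
    for field in ("blocked-uri", "source-file"):
        value = report.get(field) or ""
        try:
            parts = urlsplit(value)
        except ValueError:
            continue  # not a parseable URL; leave it for manual review
        if parts.scheme in EXTENSION_SCHEMES:
            return True
        host = parts.hostname or ""
        if any(host == s or host.endswith("." + s) for s in INJECTOR_HOST_SUFFIXES):
            return True
    return False


print(is_client_side_noise(
    {"source-file": "safari-extension://com.lastpass.lpsafariextension-n24rep3bmn"}
))  # True
```

Filtering should only hide reports from the triage queue, not discard them, since a new pattern may turn out to be a real problem.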

There is a GitHub page where the maintainer collects strange CSP warnings. It has proven helpful to cross reference reported issues.

Testing

Optimizely relies on automated tests that are executed with every code change to prevent regressions. To prevent front-end changes from causing regressions we rely on BrowserStack to execute our front-end test suite and report errors back to our builds. Older versions of the Selenium drivers used by BrowserStack and similar testing vendors disabled CSP altogether; fortunately, the newest Chrome Selenium drivers support CSP. You can adapt the following test to determine whether or not a given Selenium driver supports CSP:

https://github.com/ooola/selenium-csp

Unfortunately, browser test automation vendors do a poor job of documenting which HTTP security features they disable or tweak.

Third-party JavaScript

If your site contains third-party JavaScript, as ours does, you may need to work with the vendors to ensure their scripts are compatible with your policy. Specifically, their inline event handlers will have to be refactored if they are not written in a CSP nonce-compatible way. For additional information check out csp.withgoogle.com.

Our Policy

Our Content-Security-Policy header is as follows:

script-src http: https: 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-I8yxR7k/AuBxOXbiMJSF7g=='; img-src data: http: https:; frame-ancestors https://app.optimizely.com http://localhost:8000 https://app.experimentengine.com https://app-staging.experimentengine.com https://demo.experimentengine.com https://teams.optimizely.com; plugin-types application/x-shockwave-flash application/pdf; object-src https://app.optimizely.com/static/includes/swf/ZeroClipboard.swf; base-uri 'self'; report-uri https://cspreporter.optimizely.com/report/999f2de5-b04d-4544-95c6-39705e57da35;

Plugging this policy into Google's CSP Evaluator will highlight the 'unsafe-eval' keyword, which is something our site editing tools require. Likewise it will disapprove of loading ZeroClipboard, which is necessary to allow copy/paste between the Optimizely web application and a user's desktop. The frame-ancestors directive allows our site to be framed by our recent acquisition, Experiment Engine.
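If you want to inspect a policy like this in your own tooling, a header value can be split into directives with a few lines of code. This is a minimal sketch rather than a full parser; parse_policy is a made-up name and the shortened policy below uses a placeholder nonce and report endpoint.

```python
def parse_policy(header_value: str) -> dict[str, list[str]]:
    """Split a CSP header value into {directive name: [source expressions]}."""
    directives: dict[str, list[str]] = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # skip empty segments such as the one after a trailing ';'
        name = tokens[0].lower()
        # When a directive is repeated, only the first occurrence is kept.
        directives.setdefault(name, tokens[1:])
    return directives


# Shortened, hypothetical policy in the same shape as the one above.
policy = (
    "script-src https: 'unsafe-eval' 'strict-dynamic' 'nonce-abc'; "
    "base-uri 'self'; report-uri https://example.org/report;"
)
parsed = parse_policy(policy)
print("'unsafe-eval'" in parsed["script-src"])  # True
print(parsed["base-uri"])                       # ["'self'"]
```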

This policy is very similar to other CSP Level 3-based policies that rely on the strict-dynamic keyword. For another example, take a look at the CSP header set by console.cloud.google.com.

Rolling it out

Implementing a CSP Level 3-compatible policy is straightforward, even one that covers a large JavaScript application that utilizes less-commonly used parts of the language and web APIs. Investigating the reported issues and rewriting the affected JavaScript to be policy compatible can be done one issue at a time. For us it took the first half of this year to do so.

Initially, we enabled CSP for authenticated Optimizely employees as a precautionary measure so that no customers would be affected. As of a few weeks ago we enabled the policy and removed report-only from the CSP HTTP header for all users.

Conclusion

Preventing content injection and XSS becomes ever more important as more critical business functionality moves to the web. If you are considering enabling a CSP on your web application, we strongly encourage you to start experimenting with a simple report-only, Level 3 compatible policy. With report-only you can fix the policy violations as you see them, on your own time. Before too long you too will be in a position to enforce your policy and make your web app far less susceptible to XSS.

Further Reading

If you have time, I recommend reading CSP is Dead, Long Live CSP — it is well worth it!

We’re hiring!

-Ola
