Hacking the Same-Origin Policy

How attackers bypass the fundamental Internet safeguard to read confidential data.

Vickie Li
The Startup
6 min read · Nov 19, 2019

The Same-Origin Policy is one of the fundamental defenses deployed in modern web applications. It restricts how a script from one origin can interact with the resources of a different origin. It is critical in preventing a number of common web vulnerabilities.

However, since the SOP is quite strict and inflexible, most websites utilize methods of relaxing the SOP. And this is often where things go wrong. Today, we are going to dive into the details of SOP, how it impacts web applications, and how attackers exploit features that relax SOP.

What is the Same-Origin Policy (SOP)?

In one sentence, the Same-Origin Policy is this: A script from page A can only access data from page B if they are of the same origin.

Who has the same origin?

Two URLs are said to have the same origin if they share the same protocol, hostname and port number. Let’s say that page A is at

https://medium.com/@vickieli (HTTPS is on port 443 by default)

Which of the following pages would be considered “same-origin” according to the Same-Origin Policy?

  • https://medium.com/ (same origin: same protocol, hostname and port number)
  • http://medium.com/ (different origin, because the protocol differs)
  • https://twitter.com/@vickieli (different origin, because the hostname differs)
  • https://medium.com:8080/@vickieli (different origin, because the port number differs)
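The comparison above can be sketched with the standard URL API, whose origin property combines protocol, hostname and port. The helper name isSameOrigin is made up for this illustration; it is a rough model of the rule, not how a browser implements it.

```javascript
// Hypothetical helper: two URLs are same-origin when their protocol,
// hostname and port all match.
function isSameOrigin(urlA, urlB) {
  // URL.origin serializes scheme + host + port and drops default ports
  // (for example 443 for https), so the strings can be compared directly.
  return new URL(urlA).origin === new URL(urlB).origin;
}

const pageA = "https://medium.com/@vickieli";

console.log(isSameOrigin(pageA, "https://medium.com/"));               // true
console.log(isSameOrigin(pageA, "http://medium.com/"));                // false
console.log(isSameOrigin(pageA, "https://twitter.com/@vickieli"));     // false
console.log(isSameOrigin(pageA, "https://medium.com:8080/@vickieli")); // false
```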

What does SOP limit?

The SOP does not allow a script from page A to access data on page B if they don’t share the same origin. This is meant to prevent a malicious script on page A from obtaining sensitive information embedded on page B’s DOM.

  • Side note: SOP limits data access only. Embedded resources such as images, CSS and scripts are not restricted and can be accessed and loaded across different origins.

Modern web applications often base their authentication on HTTP cookies, and servers take action based on the cookies included automatically by the browser. This makes SOP especially important.

Imagine that you are logged in to onlinebank.com and, at the same time, you are visiting attacker.com in the same browser. If SOP didn’t exist, a script hosted on attacker.com would be free to access your information on onlinebank.com, since your browser automatically includes your onlinebank.com cookies in every request sent to onlinebank.com (even if the request is a malicious one generated by a script hosted on attacker.com).

attacker.com could do something like this:

1. Issue a GET request to onlinebank.com/personal_info using a script. (Since you are logged in to onlinebank.com, the server could send back the HTML page containing your personal info.)
2. Receive and parse the returned HTML page.
3. Retrieve the CSRF tokens, private email addresses, addresses and banking information parsed from the page.

This is where SOP comes into play: SOP will prevent the malicious script hosted on attacker.com from reading the HTML data returned from onlinebank.com.

Relaxing the SOP

In practice, SOP is often too restrictive for modern web applications. For example, multiple subdomains or multiple domains of the same large website would not be able to share information with each other. To get around these issues, many ways of relaxing or working around the SOP were invented.

Setting document.domain

Pages on different subdomains can share resources if they each set document.domain to the same parent domain. For example, pages on a.domain.com and b.domain.com can both set document.domain to domain.com so that they can interact.

  • Side note: Doing this will set the port number to null, which might be interpreted differently by different browsers. In the above example, https://a.domain.com might not be able to interact with https://domain.com since their port numbers differ (Null versus 443).

Cross-origin resource sharing (CORS)

You can also use Cross-origin resource sharing (CORS) to relax the SOP. CORS protects the data of the requested server. It allows servers to explicitly specify the list of origins that are allowed via the Access-Control-Allow-Origin header. The origin of the page sending the request is then checked against this list of allowed origins.
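As a minimal sketch of that check, a server could compare the request’s Origin header against an allowlist and only then echo it back in Access-Control-Allow-Origin. The allowlist contents and the function name corsHeadersFor are assumptions made for this example, and a real server would wire this into its own framework.

```javascript
// Hypothetical allowlist; these origins are placeholders for the example.
const ALLOWED_ORIGINS = new Set([
  "https://a.domain.com",
  "https://b.domain.com",
]);

// Returns the CORS response headers for the given request Origin value.
// An origin that is not on the list gets no CORS headers at all, so the
// browser will not let the requesting page read the response.
function corsHeadersFor(requestOrigin) {
  if (!ALLOWED_ORIGINS.has(requestOrigin)) {
    return {};
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    // The response depends on the Origin header, so caches must key on it.
    "Vary": "Origin",
  };
}

console.log(corsHeadersFor("https://a.domain.com"));
console.log(corsHeadersFor("https://attacker.com"));
```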

Cross-domain messaging (postMessage)

PostMessage is a way of working around SOP. This technique allows pages to send text-based messages to other pages by calling the postMessage() method on a window. The receiving window then handles the message using an onmessage event handler.

Since using postMessage requires the sender to obtain the window object of the receiver, messages can only be sent between a window and its iframes or popups.

JSON with Padding (JSONP)

JSONP is another technique that works around SOP. It allows the sender to send JSON data within a callback function that gets evaluated as JS. Then a script located at a different origin can read the JSON data by processing the function.

Since the HTML <script> tag is allowed to load JS code regardless of origin, an easy way of sharing data across origins is through loading it as a part of a <script> tag. JSONP wraps JSON data in a callback function in order for the data to be interpreted as valid JS code.

For example, let’s say we’re trying to pass the following JSON blob across origins:

// data located at https://medium.com/get_user_articles
{"username": "vickieli", "num_articles": "39"}

This data block cannot be loaded directly as a script since it’s in JSON format:

<script src="https://medium.com/get_user_articles"></script>

This would fail since the JSON blob above is not valid JavaScript, and a JS syntax error would be thrown. JSONP works around this issue by wrapping the data in a JS function call:

parse({"username": "vickieli", "num_articles": "39"})

The page that receives the data can then extract the data from the JSONP payload by processing the function.
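The round trip can be sketched as follows. The helper toJsonp and the variable received are invented for this illustration, and new Function stands in for the <script> tag, which is what would evaluate the payload in a real page.

```javascript
// Server side (the site providing the data): wrap the JSON in a call to the
// callback name, producing text that is valid JS.
function toJsonp(callbackName, data) {
  return `${callbackName}(${JSON.stringify(data)})`;
}

// Client side (the site receiving the data): define the callback that the
// payload will call once it is evaluated.
let received = null;
function parse(data) {
  received = data;
}

const payload = toJsonp("parse", { username: "vickieli", num_articles: "39" });
console.log(payload);

// In a browser, a <script src="..."> tag would evaluate the payload.
// Here, new Function simulates that step and passes in our callback.
new Function("parse", payload)(parse);

console.log(received.username); // prints vickieli
```

Note that the receiving page evaluates whatever the other site returns, which is exactly why JSONP requires trusting the data provider.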

The issue with JSONP is that site A has to trust site B completely, because it is including arbitrary JavaScript from site B. Now that CORS is an option, JSONP is used less often.

Attacking the SOP

Besides the controlled and intended SOP bypasses mentioned above, there are ways for an attacker to manipulate cross-origin communication. These exploits are often caused by a faulty implementation of one of the SOP-relaxing techniques.

An attacker who can bypass SOP, or the relaxation policy the developers intended, can cause private information to leak. This often leads to further vulnerabilities such as authentication bypass, account takeover and large data breaches.

Let’s talk about a few ways attackers achieve this and how these techniques work.

XSS

XSS is essentially a full SOP bypass, because JavaScript that runs on page A operates under the security context of page A. This means that if an attacker is able to get a malicious script executed on the victim page, the script can access the page’s resources and data.

Exploiting CORS

Misconfigured CORS is another thing that attackers can exploit to mess with cross-origin communication.

One exploitable misconfiguration is a weak regex used to validate origins. For example, if the policy only checks whether an origin starts with www.site.com, it can be bypassed with a subdomain that the attacker controls. If the attacker owns the domain attacker.com, she can add a wildcard DNS entry so that *.attacker.com resolves to her own server. She can then send requests from www.site.com.attacker.com, an origin that passes the check, to bypass SOP.
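To make the bypass concrete, here is a small sketch comparing a prefix-only regex with an exact allowlist comparison. The regex, the allowlist and the function names are assumptions chosen for this example, not taken from any particular site.

```javascript
// Weak check: only confirms that the origin STARTS with the trusted host.
const weakCheck = (origin) => /^https:\/\/www\.site\.com/.test(origin);

// Stricter check: the origin must exactly equal an allowlisted value.
const TRUSTED_ORIGINS = new Set(["https://www.site.com"]);
const strictCheck = (origin) => TRUSTED_ORIGINS.has(origin);

const attackerOrigin = "https://www.site.com.attacker.com";

console.log(weakCheck(attackerOrigin));           // true  (bypassed)
console.log(strictCheck(attackerOrigin));         // false
console.log(strictCheck("https://www.site.com")); // true
```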

Another common CORS misconfiguration is setting the allowed origin to null, or to an attacker-controlled origin such as attacker.com (for example, by echoing back whatever origin the request supplies). This basically defeats the purpose of SOP and removes the limits on cross-origin communication.

An interesting configuration that is not exploitable is setting the allowed origins to the wildcard “*”. This is not exploitable because CORS does not allow credentials to be sent with these requests, and so no private information could be leaked.

Misconfiguration in CORS is also not exploitable when custom headers are used for authentication, or when there are random, unguessable keys placed in the request or the URI.

Exploiting postMessage

When using postMessage, both the sender and the receiver of the message should verify the origin of the other side. Vulnerabilities happen when pages perform poor origin checks (weak regex, for example), or lack origin checks altogether.

If the sender page does not enforce the targetOrigin of the receiver or uses a wildcard targetOrigin, it becomes possible to leak information to other sites using postMessage. (The targetOrigin can be specified in the postMessage function as a parameter.)

To exploit this issue, an attacker can create a malicious HTML page that listens for events coming from the vulnerable page. She can then trick victims into triggering the postMessage utilizing a malicious link or fake image and make the victim page send data to the attacker’s page.
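On the sending side, the fix is to always pass an explicit targetOrigin, so the browser only delivers the message if the receiving window is actually at that origin. The wrapper below is a hypothetical sketch; sendSecret and its parameters are names invented for this example.

```javascript
// Hypothetical wrapper around postMessage that refuses wildcard targets.
// targetWindow is the receiver's window object (an iframe's contentWindow,
// a popup returned by window.open, and so on).
function sendSecret(targetWindow, secret, targetOrigin) {
  if (!targetOrigin || targetOrigin === "*") {
    throw new Error("refusing to send sensitive data without an explicit targetOrigin");
  }
  // The browser drops the message if targetWindow is not at targetOrigin.
  targetWindow.postMessage(secret, targetOrigin);
}

// Usage in a browser (the origin is a placeholder):
// sendSecret(iframe.contentWindow, { token: "..." }, "https://a.domain.com");
```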

On the other hand, if the message receiver does not validate where the postMessage is coming from, it becomes possible for attackers to send arbitrary data to the website and trigger unwanted actions on the victim’s behalf.

To do that, the attacker can embed or open the victim page from her own HTML page to obtain its window reference. Then, she is free to send arbitrary postMessages to that window. When the victim visits this malicious HTML page, the malicious postMessage is fired, and any action it triggers on the victim page runs within the victim’s logged-in session.
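On the receiving side, the defense is to check event.origin before acting on a message. The sketch below assumes a single trusted sender; TRUSTED_SENDERS, handleMessage and the callback are hypothetical names for this illustration.

```javascript
// Hypothetical allowlist of origins this page accepts messages from.
const TRUSTED_SENDERS = new Set(["https://a.domain.com"]);

// Runs onTrusted only for messages from a trusted origin.
// Returns true if the message was accepted, false if it was ignored.
function handleMessage(event, onTrusted) {
  // event.origin is set by the browser and cannot be forged by the sender.
  if (!TRUSTED_SENDERS.has(event.origin)) {
    return false;
  }
  onTrusted(event.data);
  return true;
}

// Usage in a browser:
// window.addEventListener("message", (e) => handleMessage(e, (data) => { /* act on data */ }));
```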

Thanks for reading.
