Leaky Images in OAuth

The internet security world is full of all kinds of attacks and information leaks, and even well-designed security protocols are susceptible to them. Antonio Sanso covered a lot of these in the middle chapters of our book, OAuth 2 in Action, and his advice is spot on and well worth reading. There’s one subtle security issue that we didn’t cover explicitly in the book, but I’ve seen it crop up in a few implementations lately, so I wanted to call it out here.

The Setup

Take a look at this authorization screen. It’s pretty simple, but it’s functional, and in fact it’s one that I wrote for the exercises in OAuth 2 in Action.

A typical OAuth authorization screen

It shows the user all kinds of good and useful information about the transaction that they’re being asked to decide on. It has the client’s display name, a list of scopes the client is asking for, and a clear indicator of the URL the user will go to. And it even has a picture to make it look nice!

The Problem

That picture is exactly where the trouble lies. The picture is loaded from a URL provided by the client developer, a URL that is hosted somewhere else. The user’s browser is handed an HTML page and fetches that image from the given URL. That’s all well and good, as it gets the picture where it needs to go, but there’s something built into the HTTP protocol that can leak information.

When the HTML page above is rendered, it includes an img tag with the URL given by the client developer. This tells the browser to go make another HTTP request to grab the image.
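As a rough illustration of how that tag ends up in the page, here is a minimal sketch of an authorization server building its logo markup from the client's registration. The function name `render_logo_tag`, the field names `client_name` and `logo_uri`, and the example URL are assumptions made for this sketch, not code from the book's exercises.

```python
from html import escape

def render_logo_tag(client: dict) -> str:
    """Build the <img> tag for the authorization page from client registration data."""
    # The logo URL comes straight from the client developer's registration,
    # so the browser will fetch it from whatever host the developer chose.
    logo_uri = escape(client["logo_uri"], quote=True)
    name = escape(client["client_name"], quote=True)
    return '<img src="{}" alt="{} logo">'.format(logo_uri, name)

# Hypothetical client registration used only for illustration.
client = {
    "client_name": "Example Client",
    "logo_uri": "https://images.example.net/logo.png",
}
print(render_logo_tag(client))
```

Nothing about this markup is unusual, which is the point: the page looks entirely ordinary, yet the browser is being told to contact a host that the authorization server does not control.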

Browser making a request for the image embedded in the page

On its own this isn’t too bad, and this is how the web is meant to work: you separate media content like images from textual content like web pages, and serve them through separate requests. However, this can cause a few issues in practice, especially when the URL for the image is sourced from someone else — in this case, the client developer.

First, remember that it’s the user’s browser that’s making the request to an outside party. This tells the outside party that a user is engaging with that resource — in this case, fetching the image embedded in a page indicates that someone is probably interacting with that page. The request logs for the image server can tell the client developer information about the user’s browser, OS, location, network, and other items. For a web-based client, this isn’t anything they don’t know already. For a native client, this is all something that the developer would have never seen otherwise. An attacker could use this information vector to determine which of their phishing attempts led to the users clicking on an email and landing on an authorization page. Even if the users don’t authorize the malicious client, the fact that the image was requested is proof that the authorization page was shown.

Second, the image isn’t necessarily hosted by the client developer themselves. In our example here, the image is hosted from a third party site, not from the client application code. All that information we just said could leak to the developer? Now it’s leaking to the image host.

Third, that image URL might not contain an image at all. A malicious developer (or whoever’s hosting the URL) could inject drive-by malware into the download for a user. At first, this sounds very scary, but isn’t this the same risk people encounter when browsing the web? That’s partially true, but this situation is different. The AS is a site that the user trusts. Items included on a page served by the AS implicitly carry that same trust, even though the URL for the client’s logo was provided by another party, the client developer. The user’s trust in the AS is therefore erroneously conferred on these items, making users more likely to click through confirmations or ignore warnings. After all, why would their AS betray them?

Fourth, and perhaps most surprisingly to many developers, the browser automatically sends a Referer header with its image request. This is used in HTTP to tell a server how you got to the link you’re now requesting. But OAuth, you’ll recall, uses front-channel communication, and front-channel communication uses query parameters to convey important information. Take a look at the Referer header here for our request above:

Referer header, second from bottom

The Referer URL contains the entire request to the authorization server. The server hosting the image now knows not only the client’s ID, but also which grant type the client is using and the scopes it’s asking for. More importantly, it knows the client’s state parameter and potentially other sensitive security components. These could be used to attack the client through a code injection attack or other vectors.
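To make the leak concrete, here is a small sketch of what an image host could do with a logged Referer value, assuming the browser sends the full URL as in the capture above. The authorization endpoint, client ID, state value, and redirect URI are all made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical Referer value as it might appear in the image host's request log.
referer = (
    "https://as.example.com/authorize?response_type=code"
    "&client_id=example-client&scope=read+write"
    "&state=af0ifjsldkj"
    "&redirect_uri=https%3A%2F%2Fclient.example.org%2Fcallback"
)

def leaked_parameters(referer_value: str) -> dict:
    """Return the query parameters of the page that embedded the image."""
    query = urlsplit(referer_value).query
    # parse_qs maps each name to a list of values; keep the first of each.
    return {name: values[0] for name, values in parse_qs(query).items()}

leaked = leaked_parameters(referer)
for name, value in leaked.items():
    print(f"{name} = {value}")
```

A few lines of standard-library parsing are enough to recover the client ID, the requested scopes, and the state value from a single log entry.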

The real problem here is that most developers don’t even notice that this information is being willingly sent around the web.

A Solution

When we realized this was a potential problem, we implemented a fairly simple solution in MITREid Connect: a server-side cache for logo images. Instead of presenting the user with the logo URL and having their browser download it, the server makes the request to the logo URL on the user’s behalf. The user is then served a URL hosted at the server, which serves the cached image data. A few other authorization servers out in the wild do the same thing.

This approach has a handful of advantages. First, all the requests are coming from the server, not the user’s browser. This makes it impossible for the image host to gather information directly from the user’s browser. Furthermore, no Referer information is sent because this request is not coming from an embedded webpage. Additionally, the server caches the image data, so that all subsequent requests for the image are served from the cache and cause no additional queries to the image host. With this, developers can no longer infer that their app is being used just by watching the request logs for the image. And finally, the server can act as a watchdog for users by scanning any incoming content for known viruses and malware. In the case of MITREid Connect, we leave such scans as an extension point for developers to hook into their own systems.
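The idea can be sketched in a few lines. This is not MITREid Connect's code (that project is written in Java); it is a minimal Python illustration under stated assumptions. The class name `LogoCache`, the local path layout, and the `scan` hook signature are all hypothetical, and the fetcher is injectable so the example runs without touching the network.

```python
from typing import Callable, Dict, List, Optional
from urllib.parse import quote
import urllib.request

Fetcher = Callable[[str], bytes]   # downloads the bytes behind a URL
Scanner = Callable[[bytes], bool]  # returns True if the content looks safe

def http_fetch(url: str) -> bytes:
    """Fetch the logo from the server side, so the user's browser never contacts the host."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read()

class LogoCache:
    """In-memory logo cache keyed by client ID (hypothetical sketch)."""

    def __init__(self, fetch: Fetcher = http_fetch, scan: Optional[Scanner] = None):
        self._fetch = fetch
        self._scan = scan  # optional extension point for malware scanning
        self._images: Dict[str, bytes] = {}

    def local_path(self, client_id: str) -> str:
        # Path on the authorization server to put in the <img> tag
        # instead of the developer-supplied URL (layout is an assumption).
        return "/api/clients/{}/logo".format(quote(client_id, safe=""))

    def get_logo(self, client_id: str, logo_uri: str) -> Optional[bytes]:
        # Serve from the cache when possible, so the image host sees
        # no request at all for repeat views of the authorization page.
        if client_id in self._images:
            return self._images[client_id]
        data = self._fetch(logo_uri)
        if self._scan is not None and not self._scan(data):
            return None  # rejected content is neither cached nor served
        self._images[client_id] = data
        return data

# Usage with a fake fetcher that records how often the image host is contacted.
calls: List[str] = []

def fake_fetch(url: str) -> bytes:
    calls.append(url)
    return b"fake-png-bytes"

cache = LogoCache(fetch=fake_fetch)
first = cache.get_logo("example-client", "https://images.example.net/logo.png")
second = cache.get_logo("example-client", "https://images.example.net/logo.png")
print(len(calls))  # the host was contacted once, by the server only
```

A real deployment would likely also want cache expiry and some limits on which hosts, sizes, and content types the server is willing to fetch, none of which are shown here.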

In conclusion, with any web-based protocol, make sure you’re fully aware of all of the information you might be leaking or causing your users to leak. The web’s open nature can create some surprising behaviors, and it’s up to us as developers to make sure people don’t fall victim to them.