Veil: Private Browsing Semantics Without Browser-side Assistance

Frank Wang
Frankly speaking
Published in
4 min readNov 20, 2018

This is part of a week(-ish) blog series where I deep-dive on a cool technology. I am an investor at Dell Technologies Capital in Silicon Valley, and occasionally reminisce about my previous life in academia. Follow me on Twitter and LinkedIn.

Today, I’m writing about one of my first PhD projects. At the time, surprisingly, very few people cared about privacy. Incognito modes were novel concepts and much less prevalent than they are now. Ever since, there’s been rapid growth in the usage of search engines like DuckDuckGo and anonymous browsers like Tor. Today, every major browser has a version of incognito mode, and in fact, many users browse privately by default. To say the very least, people care a lot more about their privacy.

In private browsing mode, web pages shouldn’t leave identifiable, persistent client-side state. However, research has shown they are very leaky. For example, data can leak from the DNS cache, and databases can be polluted. Current instantiations of private browsing leave RAM artifacts in page swap and hibernation files. As a result, forensic tools can easily recover this data and fingerprint activity.

The problem is that private browsing is hard to implement with only client-side support. Browsers are complex and constantly adding new features, and they lack a priori knowledge of sensitive content. For example, to prevent RAM from paging to disk, the browser has to use mlock() to pin all the memory because it doesn’t know which part of a webpage will contain sensitive information. Even transmission of web content to a user can pollute in-memory and on-disk regions.

What if developers can implement private browsing semantics? The goal is to protect greppable content from a post-session attacker. A post-session attacker is an attacker that logs onto the same computer as a user after her session ends.

Veil is the first platform that allows developers to private browsing semantics to a user. The insight is that web services control 1) the content they deliver and 2) the servers that deliver this content. Below shows an overview of the Veil platform.

Veil architecture overview. A developer compiles a website’s objects, e.g. HTML, CSS, Javascript, and images, using the Veil compiler. This functionally rewrites the webpage. Then, these rewritten objects are uploaded to the blinding servers that store and mutate the content. Finally, when a user visits a webpage, she first goes to the Veil bootstrap page, and the Veil library helps display the page but hides page-specific URLs.

Below, I show how the Veil compiler works. A developer runs the root HTML file through the Veil compiler. The output is a rewritten HTML page with veilFetch instead of the actual object, such as CSS or image. Later, when the page loads, veilFetch (from the Veil library) will dynamically issue an XHR request to obtain the object.

The output of the Veil compiler is stored on the blinding servers. These servers are trusted and can be run by volunteers. These servers are a key source of security and privacy in Veil. They help mutate the content and also help hide the actual source of the object (e.g. hides Javascript source or source of JQuery, etc.).

In order to protect RAM artifacts, we use the following techniques:

  • heap walking: reduces likelihood of swap of objects passed into the markAsSensitive() function
  • content mutation: not leak site-specific content

More details on the specific techniques are available in the paper.

There is also a more secure browsing mode called DOM hiding mode, which tries to hide complex DOM structures. The idea is the user’s browser is just a thin client, and a remote server (e.g. the blinding servers or content provider) loads the real page, displays the screenshot on the browser, applies user GUI events, and returns the screenshot of the updated page.

DOM-hiding mode. The content provider loads the page on a remote server using a headless browser, and the client sees the screenshot of foo.com.
When the user does a GUI event like a click, the event is sent to the remote server, which loads the updated page and sends back the screenshot of the page after the event.

For more technical details on either of the modes and how we handle specific web edge cases, such as websites that have dynamic content, I refer you to the paper.

Below are the page load times for six test web pages with and without Veil. Veil added modest overhead to the page load time.

Page load times with and without Veil

To conclude, traditional private browsing modes still leak data! Veil allows developers to directly improve the privacy semantic of their pages without having to wait for Chrome or Firefox to modify their browsers.

The main motivation for this project was to change who controls trust on the web. Traditionally, web browsers have had the responsibility of dealing with security and privacy. This platform provides a way to shift the burden to developers.

If you have questions, comments, future topic suggestions, or just want to say hi, please send me a note at frank.y.wang@dell.com.

--

--

Frank Wang
Frankly speaking

Investor at Dell Technologies Capital, MIT Ph.D in computer security and Stanford undergrad, @cybersecfactory founder, former @roughdraftvc