Tracking prevention in modern browsers

An overview of cross-domain tracking prevention in the modern browser, specifically of Intelligent Tracking Prevention (ITP) in WebKit.

Jonathan Merlevede
datamindedbe
6 min read · Jul 10, 2020

All modern browsers include mechanisms preventing cross-domain tracking. This story summarizes how WebKit deals with cross-domain tracking and tracking cookies. We’ll first go over the need for tracking prevention, and then look at the limitations WebKit imposes on third-party and client-side cookies.

Not this kind of tracking. Source: pxfuel.

The component preventing third-party tracking is called Intelligent Tracking Prevention (ITP) in WebKit (Safari, iOS apps, …), Enhanced Tracking Protection in Firefox, and simply Tracking Prevention in Edge. Chrome currently has the least developed tracking prevention functionality, but has already announced that it will completely phase out third-party cookies by 2022.

As far as I know, tracking prevention is not currently standardized in any way. All mainstream browsers (Edge, Chrome, Firefox, Safari) limit themselves to preventing tracking by third parties. Within the group of mainstream browsers, WebKit’s tracking prevention is the most stringent.

If you’re a bit fuzzy on what a primary domain is, or on the difference between a first-party and a third-party cookie, consider refreshing your memory by reading my other story on cookies before moving on.

Motivation and overview

Without tracking prevention, if you embed any element from some domain d, that element can set cookies on d. These cookies are then readable and writable by embeds on any page, on any domain, that also embeds something hosted on d. Popular embeds such as Facebook’s like button therefore make it possible for their owners to track people’s browsing behavior across vast numbers of sites: if you visit any site that embeds a “like” button, Facebook knows that you visited that site, and can easily connect this information to your Facebook profile.
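To make this concrete, here is a toy model of a browser without tracking prevention. It is only a sketch: the domain names, the `loadEmbed` function and the ID format are made up for illustration and say nothing about how any real browser is implemented.

```typescript
// Toy model: one cookie jar per browser, keyed by the domain that set the
// cookie and shared by every page that embeds content from that domain.
// All names are hypothetical.
type CookieJar = Record<string, string>; // embed domain -> visitor ID

const visitsSeenByEmbed: Array<{ visitorId: string; site: string }> = [];
let nextId = 1;

// Called whenever a page on `site` loads an embed hosted on `embedDomain`.
function loadEmbed(jar: CookieJar, site: string, embedDomain: string): string {
  let visitorId: string | undefined = jar[embedDomain];
  if (visitorId === undefined) {
    // First load anywhere: the embed sets a cookie on its own domain.
    visitorId = `visitor-${nextId++}`;
    jar[embedDomain] = visitorId;
  }
  // The embed's server receives the cookie along with the embedding site.
  visitsSeenByEmbed.push({ visitorId, site });
  return visitorId;
}

const sharedJar: CookieJar = {};
const idOnNews = loadEmbed(sharedJar, "news.example", "social.example");
const idOnShop = loadEmbed(sharedJar, "shop.example", "social.example");
console.log(idOnNews === idOnShop); // true: the same ID shows up on both sites
```

Because the jar is keyed by the embed’s domain rather than by the site being visited, the owner of social.example ends up with a list of every embedding site a given visitor has loaded.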

The goal of ITP is to prevent such cross-domain tracking. To do this, WebKit puts restrictions on all third-party cookies. WebKit also puts restrictions on all cookies that are set client-side, from JavaScript. I’ll discuss the restrictions on third-party cookies and on client-side cookies in separate sections.

Why WebKit?

I chose to discuss tracking prevention in WebKit because WebKit has good documentation and because it is the leader when it comes to tracking prevention, with other browsers lagging behind a bit but ultimately following suit.

WebKit is a leader in tracking prevention, with other browsers lagging behind but ultimately following suit.

Restrictions on third-party cookies

ITP 1.0, ITP 1.1 and ITP 2.0 restrict the usefulness of third-party cookies by partitioning and purging cookies set on domains that are used for tracking purposes. I’ll save you from an introduction to what partitioned cookies are, because ITP 2.1 gets rid of the concept and simply disallows setting third-party tracking cookies in most (or all) circumstances.

ITP currently (July 2020) works as follows:

  • A machine learning classification model classifies domains as tracking or non-tracking domains. Tracking domains are domains used only for cross-domain tracking. Non-tracking domains are domains with which users primarily interact as a primary domain. Classification happens on your machine at the level of second-level domains, and is based on your own browsing history.
  • Since March 2020, WebKit has implemented full third-party cookie blocking, making it impossible to use third-party cookies except through the Storage Access API (see below). Before March 2020, this was only impossible if the cookie’s domain (say, company.com) was classified as a tracking domain. Interestingly, WebKit went for full blocking in part because whether or not it blocked a cookie could reveal information about a user’s browsing history.
  • If a domain such as company.com is classified as a tracking domain and the user has not interacted with it as the primary domain for over 30 days, then all its old cookies are purged. This happens regardless of the cookies’ expiration dates.

Since March 2020, WebKit has implemented full third-party cookie blocking, regardless of how a domain is classified.
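The rules listed above can be written down as two small predicates. This is a sketch of my summary, not of WebKit’s actual code: the `DomainState` type, the function names and the `PURGE_AFTER_DAYS` constant are hypothetical.

```typescript
// Sketch of the third-party cookie rules as summarized above (July 2020).
// Types and names are hypothetical, not WebKit internals.
interface DomainState {
  isTrackingDomain: boolean;               // result of the on-device classifier
  daysSinceFirstPartyInteraction: number;  // days since last use as primary domain
}

const PURGE_AFTER_DAYS = 30;

// Third-party cookies are blocked for every domain, classified or not,
// unless access was granted through the Storage Access API.
function canUseThirdPartyCookies(hasStorageAccessGrant: boolean): boolean {
  return hasStorageAccessGrant;
}

// Cookies of a tracking domain are purged once the user has not interacted
// with it as a primary domain for over 30 days, whatever their expiry date.
function shouldPurgeCookies(state: DomainState): boolean {
  return state.isTrackingDomain &&
    state.daysSinceFirstPartyInteraction > PURGE_AFTER_DAYS;
}
```

Note that the classifier’s verdict no longer appears in the first function at all; it only matters for purging.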

If you really want to, you can think of ITP as an AI judging your cookies. Image source: Wikimedia.

Storage access API

To support certain features that require third-party cookies, WebKit introduced the Storage Access API (sometime after ITP 1.0 but before ITP 1.1). This API enables developers to read and write the cookies of one domain, for example video.example, from sites with a different primary domain, for example news.example. This is possible regardless of whether video.example is classified as a tracking domain. Requesting access to these cookies is only possible when the user interacts with an embedded element loaded from video.example, for instance by clicking a like button or playing an embedded video. Since ITP 2.0, users then still have to give explicit consent by pressing “allow” in a pop-up dialog. Interacting with video.example in this way also resets the 30-day interaction timer for video.example. The Storage Access API is not yet a standard, but is already supported in most modern browsers.
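As a rough sketch, an embed served from video.example could request access like this. The two methods, `hasStorageAccess()` and `requestStorageAccess()`, are the ones the browser exposes on `document`; the `StorageAccessDocument` interface and the `ensureStorageAccess` helper are my own names, introduced so that the snippet does not depend on a browser environment.

```typescript
// The subset of the Storage Access API used here. In a browser, the global
// `document` provides these two methods.
interface StorageAccessDocument {
  hasStorageAccess(): Promise<boolean>;
  requestStorageAccess(): Promise<void>;
}

// Meant to run inside the embedded iframe, in response to a user gesture
// such as a click. Resolves to true if the embed can use its cookies.
async function ensureStorageAccess(doc: StorageAccessDocument): Promise<boolean> {
  if (await doc.hasStorageAccess()) {
    return true; // access was already granted earlier
  }
  try {
    // May show the browser's consent prompt. The promise rejects if the
    // user declines or if the call was not triggered by user interaction.
    await doc.requestStorageAccess();
    return true;
  } catch {
    return false;
  }
}
```

Inside the iframe you would call something like `ensureStorageAccess(document)` from a click handler, and fall back to a cookieless experience when it resolves to false.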

ITP 2.0 consent message, taken from the WebKit blog on ITP 2.0.

Restrictions on first-party client-side cookies

Given that ITP makes it impossible to use cookies from tracking domains as third-party cookies without user interaction and an annoying pop-up, how do services like Google Analytics and the Snowplow tracker work? Well, they set tracking cookies on the site’s own domain client-side, through JavaScript’s document.cookie. Such use of client-side cookies is legitimate.
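For illustration, this is roughly what setting such a first-party cookie from a script looks like. The cookie name, the ID and the `buildFirstPartyCookie` helper are made up; real trackers use their own names and formats.

```typescript
// Builds the string a tracker script could assign to document.cookie to
// store a visitor ID on the site's own (first-party) domain.
// Cookie name and value are hypothetical.
function buildFirstPartyCookie(name: string, value: string, maxAgeDays: number): string {
  const maxAgeSeconds = Math.round(maxAgeDays * 24 * 60 * 60);
  return `${encodeURIComponent(name)}=${encodeURIComponent(value)}` +
    `; Max-Age=${maxAgeSeconds}; Path=/; SameSite=Lax`;
}

const cookieString = buildFirstPartyCookie("visitor_id", "abc123", 365);
console.log(cookieString);
// visitor_id=abc123; Max-Age=31536000; Path=/; SameSite=Lax

// In a browser, the script would then run:
//   document.cookie = cookieString;
```

The script asks for a one-year lifetime here; whether the browser honors that is a separate matter.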

The problem is that browsing sessions belonging to different domains can be stitched together using various methods, e.g. by embedding the tracking ID of a user on one domain in links to another domain. Trackers were using such techniques to circumvent the restrictions imposed on third-party cookies. ITP 2.1, 2.2 and 2.3 therefore limit the power of client-side cookies, and impose restrictions on link decoration, bounce trackers and the information embedded in the Referer header to combat their use for cross-domain tracking.
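Link decoration is simple enough to sketch in a few lines. The parameter name `vid` and both function names are hypothetical; the point is only that an ID can travel from one domain to another in the URL rather than in a cookie.

```typescript
// On the source site: append the visitor ID to an outgoing link.
// The parameter name "vid" is made up for this example.
function decorateLink(href: string, visitorId: string): string {
  const url = new URL(href);
  url.searchParams.set("vid", visitorId);
  return url.toString();
}

// On the destination site: read the ID back from the landing URL.
function readDecoration(landingUrl: string): string | null {
  return new URL(landingUrl).searchParams.get("vid");
}

const decorated = decorateLink("https://shop.example/cart", "abc123");
console.log(decorated);                 // https://shop.example/cart?vid=abc123
console.log(readDecoration(decorated)); // abc123
```

Once the destination site has read the ID, it can store it in its own first-party cookie, which links the two browsing sessions.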

WebKit’s behavior when it comes to enforcing client-side restrictions depends on whether a domain is classified as a tracking domain or not. Although there is a bit more to it, we can summarize ITP’s current (ITP 2.3) behavior regarding first-party cookies set from JavaScript as follows:

  • Since ITP 2.1, client-side set cookies have a maximum time to live (TTL) of seven days. If your website uses JavaScript to store user information in cookies, then a user returning to your page after eight days of inactivity will appear as a new user rather than a returning user.
  • If a site was reached from a domain classified as a tracking domain, through a link containing a query string or fragment identifier (link decoration), ITP 2.2 further reduces the maximum TTL to a single day. This prevents advertisers from stitching together ITP 2.1’s seven-day sessions.
  • ITP 2.3 also removes non-cookie site-writable data, such as local storage, after seven days, and truncates document.referrer to eTLD+1 when the user arrives from a tracking domain through a decorated link.

In WebKit, client-side set cookies have a maximum time to live of seven days or, in some cases, a single day.
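The effect of these caps on a cookie’s lifetime can be sketched as a simple minimum. I am assuming here that the one-day cap applies only when both conditions hold (arrival from a tracking domain and a decorated landing URL), which is how WebKit’s ITP 2.2 announcement describes it. The `NavigationContext` type and the function name are hypothetical.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// How the user arrived on the page that sets the cookie (hypothetical type).
interface NavigationContext {
  referrerIsTrackingDomain: boolean; // navigated here from a classified tracking domain
  landingUrlIsDecorated: boolean;    // landing URL has a query string or fragment
}

// Lifetime left to a cookie set through document.cookie, given the lifetime
// the script asked for. Assumes the one-day cap needs both conditions.
function cappedLifetimeMs(requestedMs: number, ctx: NavigationContext): number {
  const oneDayCap = ctx.referrerIsTrackingDomain && ctx.landingUrlIsDecorated;
  const capDays = oneDayCap ? 1 : 7;
  return Math.min(requestedMs, capDays * DAY_MS);
}
```

For example, a script asking for a one-year cookie ends up with seven days on a normal visit, and with one day when the visitor arrived through a decorated link from a tracking domain. Cookies with a shorter requested lifetime are left alone.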

Conclusion

I found it interesting to read up on tracking prevention and its history. Because third-party cookies were (and are) widespread and have some useful, bona fide applications, it looks like browsers were initially hesitant to fully block them. Partial restrictions were meant to keep those applications working while stopping trackers. Despite good intentions, the opposite seems to have happened: the restrictions put on third-party cookies severely limited their usefulness, while “malicious actors” found ways to circumvent anything short of full blocking. Now, we are rapidly moving towards full blocking.

I wrote this story in the context of setting up a Snowplow tracker to track users on multiple domains. Although ITP is meant to prevent cross-domain tracking, it also affects how you should set up tracking on a single domain. I’ll release an article on how to set up tracking with minimal impact from ITP shortly, so be sure to follow me if you don’t want to miss it!

I work at Data Minded, an independent data engineering and data analytics consultancy based in Leuven, Belgium. If you need help setting up trackers, feel free to contact us!
