Tracking people across multiple domains — when cookies just aren’t enough

Document analytics & Contently analytics — a story of two codebases

Evan Carothers
Building Contently
Jul 8, 2016

--

Over the last few years, marketers, advertisers, and analytics companies have been increasingly challenged by the technical complexity of tracking users and engagement across multiple web properties. With Contently’s acquisition of Docalytics, a startup of which I was a co-founder and engineer, integrating the two products required us to face this challenge head-on. This post explores the existing technologies that try to solve this problem, their shortcomings, and the solution we developed to allow us to share data across domains and publications.

Our initial challenge was integrating the analytics captured by the hosted document viewer from Docalytics with the robust tracking Contently provides via our content analytics product. Now that the products are combined, our customers have blogs, publications, and documents served across multiple top-level domains and subdomains. The data and engagement for the people viewing this content should follow them across publications, so we can properly track things like uniqueness and combined attention time. Getting to the root of this problem means solving for the base-level need behind this functionality: how can we provide a standards-compliant, stateful, resilient mechanism to store data for a browsing session that is shared across domains?

So… why not just use cookies??

Cookies are the go-to method for tracking user information in a web client. First-party cookies (cookies set on the domain you are currently browsing) can track data on a single domain and its subdomains, but they will not work across top-level domains. 3rd-party cookies (cookies set on a domain other than the one you are browsing) would let us share and persist a cookie across domains. However, they are heavily associated with advertisers, and many users either have 3rd-party cookies disabled in their browser or run plugins and programs designed to stop advertising that block or delete them. Based on recent research, as many as 40% of 3rd-party cookies are never set due to content/privacy settings, or are deleted by ad-block and anti-spyware programs!
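To make the first-party limitation concrete, here is a minimal sketch of the domain-match rule browsers apply when deciding whether a cookie belongs to a host. It is a simplified reading of RFC 6265 (section 5.1.3), and the function name `domainMatches` is ours, purely for illustration. Real browsers also handle IP addresses and public suffixes, which this sketch leaves out.

```typescript
// Simplified cookie domain-match check, loosely following RFC 6265 section 5.1.3.
// Assumption: hosts are plain host names (no IP addresses, no public-suffix handling).
function domainMatches(host: string, cookieDomain: string): boolean {
  const h = host.toLowerCase();
  // A leading dot on the cookie's Domain attribute is ignored.
  const d = cookieDomain.toLowerCase().replace(/^\./, "");
  // Match when the host is the domain itself or a subdomain of it.
  return h === d || h.endsWith("." + d);
}

// A cookie scoped to foo.com is visible on its subdomains...
const visibleOnSubdomain = domainMatches("blog.foo.com", "foo.com"); // true
// ...but never on a different top-level domain.
const visibleOnOtherDomain = domainMatches("www.bar.com", "foo.com"); // false
```

No choice of cookie domain makes a first-party cookie visible on both foo.com and bar.com, which is the gap the rest of this post is about.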

As an alternative, we also considered combining client- and server-side identification with fingerprint factors such as IP address, user agent, and device, but this method leaves too much room for error. User agents change as browsers upgrade, multiple machines behind a proxy can appear to share one IP address, and a device’s IP address changes as it moves between networks. We needed a reliable way to identify a user 100% of the time across networks, one that tolerates all of these factors. Additionally, we wanted the solution to be completely client-side: as a third-party script provider, when our clients’ traffic increases so does the number of responses we serve for our tracking script, and we didn’t want to rely on scaling a complicated server-side solution.

Enter cross-domain cookie tracking

As a solution, I developed contently/xdomain-cookies. It takes care of all the needs described above and provides a simple, scalable, easily-deployed solution to extend the functionality of cookies across top-level domains.

The solution combines a third-party JavaScript file that you include on the webpage with a hidden iframe that’s placed on the page. The iframe is hosted on a domain that you control; it is a raw HTML page with no server-side dependencies, so it can be hosted easily on a service like Amazon S3. The library leverages postMessage for communication between the page and the iframe, and cookies are set and persisted on the iframe domain. You can then use the iframe and library across any TLDs, pulling the iframe from the same domain you control, and get access to the persisted cookies.

Consider the following example: you want to share a piece of information (such as a user ID) across two domains, www.foo.com and www.bar.com. With xdomain-cookies, you can host the iframe HTML at a domain you control (such as xdomain.example.com), then place the xdomain-cookie script on foo.com and bar.com. When the script is loaded on either of those pages, it inserts a hidden iframe that loads the file xdomain_cookie.html from your domain, xdomain.example.com. You can then use the APIs the xdomain script exposes for getting and setting cookies to read and write that user ID cookie, which persists and is accessible across both foo.com and bar.com.

Reading & setting cross-domain cookies

The xdomain-cookie library supports a very simple async API for working with cross-domain cookies: a set function and a get function that let you query for existing cross-domain cookies and set their values. Cookies are stored as JSON-serialized payloads, with a local cache of the cookie (on the parent TLD) that can help speed up lookups. The library resolves each .get() request either from the local cookie cache, when a value is available, or through a round trip to the iframe, and it also handles ‘touching’ the cookie to verify persistence.
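The cache-first resolution described above can be sketched roughly as follows. This is not the library’s actual code or API: the class `CrossDomainCookieSketch`, its method names, and the `outbox` array are all hypothetical, and in-memory maps stand in for the real cookies and the real iframe so the flow can be followed without a browser.

```typescript
// Hypothetical sketch; these names are NOT the xdomain-cookies API.
type JsonValue =
  | string | number | boolean | null
  | JsonValue[]
  | { [key: string]: JsonValue };

type CookieCallback = (value: JsonValue | undefined) => void;

class CrossDomainCookieSketch {
  // Stand-in for the first-party cache cookie on the page's own domain.
  private localCache = new Map<string, string>();
  // Stand-in for cookies on the iframe's domain; null until the iframe reports in.
  private iframeCookies: Map<string, string> | null = null;
  // get() calls waiting for the iframe to load.
  private pending: Array<{ name: string; callback: CookieCallback }> = [];
  // Messages that a real implementation would postMessage to the iframe.
  readonly outbox: Array<{ name: string; payload: string }> = [];

  get(name: string, callback: CookieCallback): void {
    const cached = this.localCache.get(name);
    if (cached !== undefined) {
      callback(JSON.parse(cached) as JsonValue); // answered from the local cache
    } else if (this.iframeCookies !== null) {
      callback(this.readFromIframe(name)); // iframe data already arrived
    } else {
      this.pending.push({ name, callback }); // wait for the round trip
    }
  }

  set(name: string, value: JsonValue): void {
    const payload = JSON.stringify(value); // values are stored JSON-serialized
    this.localCache.set(name, payload);
    this.outbox.push({ name, payload });
  }

  // Called when the iframe posts its cookie data back to the page.
  receiveIframeData(cookies: ReadonlyMap<string, string>): void {
    this.iframeCookies = new Map(cookies);
    const waiting = this.pending;
    this.pending = [];
    for (const { name, callback } of waiting) {
      callback(this.readFromIframe(name));
    }
  }

  private readFromIframe(name: string): JsonValue | undefined {
    const raw = this.iframeCookies === null ? undefined : this.iframeCookies.get(name);
    return raw === undefined ? undefined : (JSON.parse(raw) as JsonValue);
  }
}

const cookies = new CrossDomainCookieSketch();
cookies.set("user_id", "abc123");
cookies.get("user_id", (value) => {
  // value is "abc123" here, served from the local cache without waiting on the iframe.
});
```

The callback style is what lets a cached lookup and an iframe round trip share one interface: the caller never needs to know which path answered.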

Under the hood, the script loads the iframe (with a namespace to prevent collisions between postMessage calls), which then reads all cookie data and posts it back to the parent page script as a key/value hash. The local script can return a cookie value immediately upon loading if a locally cached value is stored; otherwise it responds asynchronously once the iframe has loaded and passed its cookie data to the page script. Setting a cookie writes the local cookie cache and sends a postMessage to the iframe to update the persistent cross-domain cookie value.
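As a rough illustration of the iframe’s half of that exchange, the sketch below turns a cookie string (in the `name=value; name2=value2` form that `document.cookie` returns) into a key/value hash and wraps it in a namespaced message. The message shape, the `msgType` value, and both function names are assumptions made for this example, not the library’s wire format.

```typescript
// Hypothetical message shape; the real library's payload may differ.
interface CookieDataMessage {
  namespace: string;
  msgType: "cookie_data";
  cookies: Record<string, string>;
}

// Parse a "name=value; other=value" cookie string into a key/value hash.
// Note: decodeURIComponent throws on malformed escapes; a production
// version would want to guard against that.
function parseCookieString(raw: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const part of raw.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq < 0) continue; // skip empty or malformed fragments
    const name = trimmed.slice(0, eq);
    result[name] = decodeURIComponent(trimmed.slice(eq + 1));
  }
  return result;
}

// Build the message the iframe would post back to the parent page.
function buildIframeReply(namespace: string, rawCookies: string): CookieDataMessage {
  return { namespace, msgType: "cookie_data", cookies: parseCookieString(rawCookies) };
}

// In a browser, rawCookies would come from document.cookie inside the iframe.
const reply = buildIframeReply("my_namespace", "user_id=%22abc123%22; theme=dark");
// reply.cookies holds the JSON-serialized user ID and the plain "dark" value.
```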

Security & implementation concerns

This mechanism provides a great way to share information across domains. However, it’s important to note that the model does not provide any security beyond what normal client-side cookies offer, and sensitive user data should never be stored via xdomain-cookies (which is true for all client-side cookies as well).

Additionally, since the script and iframe communicate via postMessage, the library creates a small amount of chatter on that channel, which can sometimes cause issues if other scripts are listening for or responding to postMessage events. The xdomain-cookie library enforces a pre-known data structure and namespacing for its postMessage calls, ensuring that it only acts when messaged by the library directly.
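A filter along those lines might look like the sketch below. The field names (`namespace`, `msgType`, `payload`) and the function `isNamespacedMessage` are assumptions for illustration; the point is only that anything not matching the expected namespace and structure gets ignored.

```typescript
// Hypothetical message structure; field names are illustrative only.
interface NamespacedMessage {
  namespace: string;
  msgType: string;
  payload: object;
}

// Accept a postMessage payload only if it has the expected namespace and shape.
function isNamespacedMessage(
  data: unknown,
  expectedNamespace: string
): data is NamespacedMessage {
  if (typeof data !== "object" || data === null) return false;
  const candidate = data as Record<string, unknown>;
  return (
    candidate["namespace"] === expectedNamespace &&
    typeof candidate["msgType"] === "string" &&
    typeof candidate["payload"] === "object" &&
    candidate["payload"] !== null
  );
}

// Messages from unrelated scripts on the same page fall through untouched.
const fromLibrary = isNamespacedMessage(
  { namespace: "xdc", msgType: "cookie_data", payload: {} },
  "xdc"
); // true
const fromOtherScript = isNamespacedMessage({ type: "resize", height: 400 }, "xdc"); // false
```

In a real message handler it is also worth comparing `event.origin` against the iframe’s known origin before trusting the data, since any window can post messages to your page.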

Use cases

There are many reasons you might want client-side data to persist and be readable across domains. At Contently, we leverage the xdomain-cookie library for a few tasks it is well suited for. First, we use it to maintain a shared user identifier that correlates analytics tracking data across a single customer’s publications, which can be hosted on different TLDs. The same approach applies if you write 3rd-party tools that use on-page website widgets, have customers that install them across multiple TLDs, and want to track a user across all those domains.

Additionally, we leverage a cross-domain cookie to prevent analytics tracking of our logged-in users on their own publications. When a user logs into Contently, we set a do-not-track cookie that contains info about the publications they belong to. We can then read that cookie on the publications themselves to know that the user is viewing their own content and should not be tracked.
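The check on the publication side could be as small as the sketch below. The cookie’s shape (a JSON object with a `publicationIds` array) and the function name `shouldTrack` are assumptions for this example; the post does not spell out the real format.

```typescript
// Decide whether to record analytics for the current publication.
// Assumption: the do-not-track cookie is JSON like {"publicationIds": ["pub_1"]}.
function shouldTrack(doNotTrackCookie: string | null, publicationId: string): boolean {
  if (doNotTrackCookie === null) return true; // no cookie: an ordinary visitor
  try {
    const parsed: unknown = JSON.parse(doNotTrackCookie);
    if (typeof parsed !== "object" || parsed === null) return true;
    const ids = (parsed as { publicationIds?: unknown }).publicationIds;
    if (!Array.isArray(ids)) return true;
    // Skip tracking only when the viewer belongs to this publication.
    return ids.indexOf(publicationId) === -1;
  } catch {
    return true; // unreadable cookie: fall back to normal tracking
  }
}

const cookieValue = JSON.stringify({ publicationIds: ["pub_1", "pub_2"] });
const trackOwnPublication = shouldTrack(cookieValue, "pub_1"); // false
const trackOtherPublication = shouldTrack(cookieValue, "pub_3"); // true
```

Falling back to tracking when the cookie is missing or malformed is a design choice in this sketch: a broken cookie then costs a few extra data points rather than silently dropping real visitors.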

Feel free to contribute

The library is open source, has a fully-featured test suite built around it, and is proven and tested in production, so feel free to contribute and to use it in your own projects as needed. At Contently, we know that open-source software is the backbone of the tech world today and understand the value of sharing knowledge.

We are also committed to continuing to release some of our tools, utilities, and solutions to the wider development world, so look for more fun stuff to come in the future!
