What’s My Fingerprint?

useradd_deploy
12 min read · Dec 7, 2017


Now that I’m using a VPN that I trust, I’m turning my attention to better protecting my privacy and anonymity, starting with browser and device fingerprinting.

It doesn’t take much to identify us and our devices. In the real world, all it takes to uniquely identify most Americans is just three items: your birthdate, gender and zip code. Our devices reveal far more bits of information and that’s even without considering cookies.

To better understand fingerprinting, this post looks at a snapshot of my Chrome browser’s fingerprint on my Mac as it stands right now. (My iPhone generates even more unique identifiers.) I’ll start with test results from EFF’s Panopticlick. In future posts, I’ll look at additional test results from other similar tools and consider what we can do to defend against fingerprinting.

The EFF has been running Panopticlick since 2010. It tests if your system is uniquely configured — and thus identifiable — and analyzes how well your browser and add-ons protect you against online tracking.

According to a 2010 study by EFF’s Peter Eckersley, more than 80% of the browsers visiting Panopticlick had “an instantaneously unique fingerprint.” The typical browser revealed 18 bits of information, meaning that if we pick a browser at random, it’d likely be unique among more than a quarter million browsers (2¹⁸ = 262,144). That — combined with a device’s IP address (or even just the identity of the ISP) — is probably enough to identify and track the device.

When I run Panopticlick, it reports:

Your browser fingerprint appears to be unique among the 89X,XXX tested so far.

Currently, we estimate that your browser has a fingerprint that conveys at least 19.XX bits of identifying information.

Panopticlick tests fourteen items. Listed in order of greatest to fewest bits of information (for my results), these are:

  1. Hash of canvas fingerprint (10 bits)
  2. User Agent (9 bits)
  3. Browser Plugin Details (8 bits)
  4. Hash of WebGL fingerprint (7 bits)
  5. HTTP_ACCEPT Headers (6 bits)
  6. System Fonts (5 bits)
  7. Time Zone (4 bits)
  8. Screen Size and Color Depth (3 bits)
  9. Platform (3 bits)
  10. DNT Header Enabled? (1 bit)
  11. Language (0.9 bits)
  12. Touch Support (0.6 bits)
  13. Limited supercookie test (0.4 bits)
  14. Are Cookies Enabled? (0.2 bits)

Let’s look at these items more closely.

1. Hash of canvas fingerprint (10 bits)

Canvas fingerprinting is insidious. It doesn’t rely on cookies or anything stored on your computer. It’s not anything that an average user would ever anticipate. It’s hard for even knowledgeable users to detect or prevent. Yet it enables adversaries to identify and track us across the web.

The way canvas fingerprinting works is that a website makes your browser run a script that draws an image on a hidden HTML5 canvas element.

Due to differences in how devices render, smooth and anti-alias fonts and images, different devices will draw the same image differently. Because all this automatically takes place instantly and invisibly in the background when your browser loads a webpage, you’ll never know that this is happening (unless you’re a researcher and know how to search for these types of scripts).

When Panopticlick tests my browser, it reports a 32-character hexadecimal hash like this representing my browser’s canvas fingerprint.

37610605b77d567de7768ad12e288dea

According to Panopticlick, this hash is sufficiently rare that it conveys 10 bits of information. (For the sake of simplicity and anonymity, I’m using rounded numbers.)
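Why a tiny rendering difference produces a completely different hash can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 32-character hexadecimal string is the length of an MD5 digest, so the sketch assumes MD5 (Panopticlick’s actual hash function isn’t stated here), the pixel buffers are made up rather than read back from a real canvas, and `fingerprint` is a hypothetical helper name.

```python
import hashlib


def fingerprint(pixels: bytes) -> str:
    """Return a 32-character hex digest of raw pixel data (MD5 is an assumption)."""
    return hashlib.md5(pixels).hexdigest()


# Two made-up 4x4 RGBA buffers standing in for the same image as drawn
# by two different devices.
device_a = bytes([200, 200, 200, 255] * 16)

# The second device renders one channel of one pixel slightly differently,
# as a different anti-aliasing result might.
tweaked = bytearray(device_a)
tweaked[0] = 201
device_b = bytes(tweaked)

print(fingerprint(device_a))
print(fingerprint(device_b))
```

The two printed hashes share no obvious resemblance even though the inputs differ by a single byte, which is what makes the hash a compact label for one device’s way of drawing.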

Bits of information

To understand what 10 bits of information means, it’s helpful to understand a little information theory. In this context, bits of information refers to entropy, a measure of uncertainty about information. We can think of entropy as a value that captures how many possibilities a random variable can have. A fair coin that’s about to be flipped has two possibilities (heads or tails), so it has 1 bit of entropy. Rather than using the technical term entropy, we could say that a fair coin has 1 bit of information.

This concept of bits of information can be used to measure how much knowing certain items increases the likelihood of revealing someone’s identity. Let’s imagine we’re trying to pick out someone from a crowd when we’re armed with only a few items of information about the person.

In the same way that a fair coin (which has two possibilities) carries 1 bit of information, a person’s gender (which has two possibilities) also carries 1 bit of information. Since we’re trying to identify someone among a large group of people, knowing that person’s gender (let’s say female) allows us to rule out half of the population that’s male and focus only on the half of the population that’s female.

Let’s imagine that we know something about our mystery woman that has four possibilities. Perhaps it’s her hair color, and there are only four possibilities (black, blonde, brunette, redhead) that are equally and randomly distributed among the population. Gender (which has two possibilities) carries 1 bit of information. Hair color (which has four possibilities) is a more powerful tool and should carry more bits of information. So in this imaginary scenario, how many bits of information can be found in hair color?

Given the number of possibilities for a particular item, we can use this formula to calculate that item’s bits of information (assuming that each possibility is equally likely):

log2(number of possibilities) = bits of information

If you don’t have a scientific calculator lying around, just type log2(4) into DuckDuckGo and it will calculate the answer for you, which is 2. So knowing the mystery woman’s hair color gives us 2 bits of information. We now can focus on only women with her hair color (brunette) and exclude all other women.

Now let’s assume that we know something about this woman that has eight possibilities. Let’s imagine it’s her age, which we’ll divide into decades (0–9, 10–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70 and older) and assume for this scenario are equally and randomly distributed among the population. How many bits of information can be found in her age? Since log2(8) = 3, her age gives us 3 bits of information.

Now presume we also know the birthday of the woman we’re trying to pick out of the crowd. A birthday has 365 possibilities. (For the sake of simplicity, let’s ignore leap years and assume birthdays are equally and randomly distributed across the year.) Since we know the birthday of the woman we’re trying to identify (let’s say December 7), we can focus on only those people who share that birthday (1/365 or 0.27% of the crowd) and rule out everyone else (364/365 or 99.73%). How many bits of information does her birthday convey? Since log2(365) = 8.51175265377, knowing her birthday gives us another 8.5 bits of information.
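The four worked examples above can be reproduced with a short Python sketch. The function name `bits_of_information` is my own, and the calculation assumes, as the examples do, that every possibility is equally likely.

```python
import math


def bits_of_information(possibilities: int) -> float:
    """Bits conveyed by learning one of `possibilities` equally likely outcomes."""
    return math.log2(possibilities)


# The imaginary scenarios from the text: (item, number of possibilities).
examples = [("gender", 2), ("hair color", 4), ("age decade", 8), ("birthday", 365)]

for item, count in examples:
    print(f"{item}: {bits_of_information(count):.2f} bits")
```

Run as written, this prints 1.00, 2.00, 3.00 and 8.51 bits for the four items.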

With all that, let’s get back to canvas fingerprinting. Panopticlick says my canvas fingerprint represents 10 bits of information. Since 2¹⁰ = 1024, that means that canvas fingerprinting likely allows my browser to be picked out from a group of 1,024 browsers. Phrased differently, fewer than 0.1% of browsers share this canvas fingerprint. That goes a long way towards identifying and tracking my Chrome browser across the internet.
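Going the other way, from bits back to a crowd size, is just a power of two. Here is a minimal sketch; `crowd_size` and `share_percent` are hypothetical helper names, and the figures are the rounded ones used in this post.

```python
def crowd_size(bits: float) -> float:
    """Size of the group from which a value worth `bits` bits singles out one member."""
    return 2 ** bits


def share_percent(bits: float) -> float:
    """Percentage of browsers expected to share a value worth `bits` bits."""
    return 100 / crowd_size(bits)


for bits in (10, 18):
    print(f"{bits} bits: 1 in {crowd_size(bits):,.0f} "
          f"({share_percent(bits):.4f}% of browsers)")
```

For 10 bits this gives 1 in 1,024, or a little under 0.1% of browsers.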

Canvas fingerprinting is sufficiently powerful that I’ll revisit it in greater detail in a future post. For now, let’s move onto the next item.

2. User Agent (9 bits)

A browser’s User Agent is the string that the browser passes to the website to identify itself and its operating system.

According to Panopticlick, my browser’s User Agent conveys more than 9 bits of information. In other words, this item is enough to pick my browser out from more than 512 browsers (2⁹ = 512). Put differently, fewer than 0.2% of browsers transmit my particular User Agent.

So what is my User Agent? It’s nothing all that special.

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_1X_X) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.X.XXXX.XX Safari/537.36

Since I hadn’t paid any attention to User Agents before, I was wondering why my Chrome browser is telling websites that it’s Mozilla or Safari. There’s an interesting story behind it, which you can read for yourself. And if you want to check your browser’s User Agent and understand what its terms mean, just click here.

Basically my User Agent is telling websites two things: I’m using a particular version of Chrome and I’m using a particular version of macOS. The former isn’t that much of a big deal; Chrome makes up nearly 60% of desktop browsers in the U.S. The latter is more revealing; macOS runs on only roughly 20% of desktops.

According to one guy who tracks the User Agent values of browsers that visit his website, my precise User Agent represents about 2.5% of his visitors. Given that 20% of 60% equals 12% and that knowing my precise Chrome and macOS versions reduces this percentage even more, Panopticlick’s estimate strikes me as reasonable.
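To see how little work it takes for a website to read those two facts out of the string, here is a rough Python sketch. The version numbers in `EXAMPLE_UA` are illustrative stand-ins that I filled in so the code has something to parse; they are not my actual versions. `summarize_user_agent` is a hypothetical helper that only handles Chrome-on-Mac strings in this format.

```python
import re

# Illustrative User Agent in the same format as the one above. The macOS and
# Chrome version numbers are placeholders chosen for the example.
EXAMPLE_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/62.0.3202.94 Safari/537.36"
)


def summarize_user_agent(ua: str) -> dict:
    """Extract the macOS and Chrome versions from a Chrome-on-Mac User Agent."""
    os_match = re.search(r"Mac OS X (\d+(?:_\d+)*)", ua)
    chrome_match = re.search(r"Chrome/([\d.]+)", ua)
    return {
        # The OS version uses underscores in the UA; show it with dots.
        "macos": os_match.group(1).replace("_", ".") if os_match else None,
        "chrome": chrome_match.group(1) if chrome_match else None,
    }


print(summarize_user_agent(EXAMPLE_UA))
```

Two regular expressions are enough, which is part of why the User Agent is such a cheap fingerprinting signal.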

3. Browser Plugin Details (8 bits)

A browser plugin is a piece of software that interfaces with the browser. Examples from the recent past that you might recall include Flash, Java and Silverlight.

According to Panopticlick, my Browser Plugin Details are:

Plugin 0: Chrome PDF Plugin; Portable Document Format; internal-pdf-viewer; (Portable Document Format; application/x-google-chrome-pdf; pdf). Plugin 1: Chrome PDF Viewer; ; XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX; (; application/pdf; pdf). Plugin 2: Native Client; ; internal-nacl-plugin; (Native Client Executable; application/x-nacl; ) (Portable Native Client Executable; application/x-pnacl; ). Plugin 3: Widevine Content Decryption Module; Enables Widevine licenses for playback of HTML audio/video content. (version: 1.4.X.XXXX); widevinecdmadapter.plugin; (Widevine Content Decryption Module; application/x-ppapi-widevine-cdm; ).

Panopticlick says that my Browser Plugin Details represent 8 bits of information. That means that these details represent fewer than 0.4% (or 1 out of 256) of browsers.

Users used to be able to modify Chrome by turning plugins on and off. Because Google’s Chrome team decided that the NaCl and Widevine plugins are an integral part of the browser, Chrome removed the chrome://plugins/ page, meaning that users can no longer disable these plugins there. From the point of view that it’s better to look like everyone else than to stand out, it’s not necessarily a bad thing that the most popular browser locks its Browser Plugin Details. That makes it easier for a single user to blend in with the herd.

It’s an interesting question whether my Browser Plugin Details offer any more information than my User Agent already reveals. It appears to me that the information that my Browser Plugin Details convey may simply be subsumed within the broader and more granular information in the User Agent.

In that regard, my Browser Plugin Details include a 32-character hash for Plugin 1: Chrome PDF Viewer. While I Xed out that hash above in an abundance of caution, it appears to be the standard hash for the Chrome PDF Viewer. That suggests that this hash isn’t personalized and doesn’t present privacy issues.

4. Hash of WebGL fingerprint (7 bits)

WebGL (short for Web Graphics Library) is amazing. From a technical standpoint, it’s a JavaScript API based on the OpenGL ES 3D graphics standard, which gives JavaScript access to graphics hardware via the HTML5 canvas element. From an average user’s viewpoint, it’s what allows us to experience beautiful interactive 3D graphics on the web.

Just like canvas fingerprinting, WebGL fingerprinting can be used to identify and track devices. According to Panopticlick, a hash (7c274dcad6124177b45ac24b21a16147) of my WebGL fingerprint yields approximately 7 bits of information. That narrows my Mac down to 1 in 128 or fewer than 0.8% of browsers.

5. HTTP_ACCEPT Headers (6 bits)

The HTTP_ACCEPT headers that a browser offers are part of the content negotiation between the browser and the server. Per Panopticlick, my Chrome browser offers this header:

text/html, */*; q=0.01 gzip, deflate, br en-US,en;q=0.9

This string appears to combine three request headers. The first says that Chrome accepts HTML content (text/html) and, at a much lower preference (q=0.01), all other media types (*/*). The second lists three compression encoding schemes (gzip, deflate, br). The third says that Chrome prefers American English (en-US) and falls back to any other variety of English (en) at a slightly lower quality value (q=0.9). (For whatever reason, the content headers that Panopticlick reports differ somewhat from those reported by What Is My Browser.)
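How quality values attach to individual entries can be sketched with a small parser. This is a simplified illustration, not a full implementation of HTTP content negotiation: `parse_q_values` is a hypothetical helper, and it assumes the reported string has already been split into its separate headers.

```python
def parse_q_values(header: str) -> list[tuple[str, float]]:
    """Split an Accept-style header into (value, q) pairs, most preferred first."""
    parsed = []
    for part in header.split(","):
        value, *params = [piece.strip() for piece in part.split(";")]
        q = 1.0  # an entry with no explicit q parameter defaults to 1.0
        for param in params:
            name, _, number = param.partition("=")
            if name.strip() == "q":
                q = float(number)
        parsed.append((value, q))
    # sorted() is stable, so entries with equal q keep their original order.
    return sorted(parsed, key=lambda pair: pair[1], reverse=True)


print(parse_q_values("text/html, */*; q=0.01"))
print(parse_q_values("en-US,en;q=0.9"))
```

The output shows that each q value belongs only to the entry it follows: text/html and en-US carry the default of 1.0, while */* and en are ranked lower.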

According to Panopticlick, the HTTP_ACCEPT headers represent 6 bits of information, meaning that one in 64 browsers (or about 1.6%) shares the same headers as my browser.

6. System Fonts (5 bits)

Until I started looking into fingerprinting, I hadn’t realized that every website I visit can get a list of the fonts on my Mac. I also hadn’t paid attention to the fact that the fonts I had collected were enough on their own to uniquely identify me as I browsed the web.

The first time I used Panopticlick, it reported that my System Fonts represented roughly 20 bits of information, far more than any other item. Since 2²⁰ = 1,048,576, that meant that my fonts alone were sufficient to pick my Mac out of more than a million other devices.

As much as I appreciate typefaces, I value my privacy even more. To reduce the value of fonts in fingerprinting me, I restored my fonts to the macOS standard fonts. That was easy: open Font Book, select File and then Restore Standard Fonts. That command restores the standard system font configuration and removes from the system font database any nonstandard fonts not included in the macOS system install, which it places in a “Fonts (Removed)” folder next to the Fonts folder. Now whenever I want to use a nonstandard font, I install the font and then remove it when I’m done. This simple step makes it tougher to identify and track me online.

7. Time Zone (3-5 bits)

Because people and devices aren’t distributed randomly across the Earth’s 24 time zones, different time zones yield different bits of information. According to Panopticlick, the following time zones convey the following bits of information:

  • EST (UTC-5) = 3.85 bits
  • CST (UTC-6) = 5.45 bits
  • MST (UTC-7) = 3.91 bits
  • PST (UTC-8) = 3.58 bits

While I’m not going to be changing my system time, someone running a VPN with an IP address that’s in another time zone might change his or her system time to match the time zone associated with the VPN’s IP address.

8. Screen Size and Color Depth (3 bits)

Detecting the size of your device’s screen and how many colors it can display is basic to the functionality of many websites, which need this information to determine how to lay out webpages.

Panopticlick says that my Mac’s screen size and color depth provide about 3 bits of information. While neither time zone nor screen size and color depth offers all that much information alone, these items are additive as long as they are independent of one another. Since time zone fingerprinting (4 bits) narrows my Mac down to 1 in 16 and screen size and color depth fingerprinting (3 bits) narrows my Mac down to 1 in 8, combining these two items (7 bits) narrows my Mac down to 1 in 128 (2⁷ = 128). The more fingerprints an adversary can collect, the more easily the adversary can track someone on the web.
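The arithmetic of combining items can be sketched as follows. `combine` is a hypothetical helper, the figures are the rounded ones from my results, and simply adding bits assumes the items are statistically independent, which real fingerprint attributes often are not.

```python
def combine(bits_by_item: dict[str, float]) -> tuple[float, float]:
    """Return (total bits, crowd size), assuming the items are independent."""
    total = sum(bits_by_item.values())
    return total, 2 ** total


items = {"time zone": 4, "screen size and color depth": 3}

# Multiplying the individual "1 in N" figures gives the same crowd size
# as adding the bits first: 16 * 8 == 2 ** (4 + 3).
product = 1
for bits in items.values():
    product *= 2 ** bits

total_bits, crowd = combine(items)
print(f"{total_bits} bits -> 1 in {crowd} (product of individual odds: {product})")
```

Adding bits and multiplying odds are the same operation, which is why each extra attribute shrinks the crowd so quickly.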

9. Platform (3 bits)

Overall there are four broad possibilities for desktop platforms: 1) Windows (73%), 2) macOS (21%), 3) Chrome OS (4%), 4) Linux and all the rest (3%). Because it’s most common, knowing that a device runs Windows is least surprising and conveys the least information. Because they’re rarest, Linux, OpenBSD and other platforms are most surprising and provide the greatest information. macOS falls in between these extremes. According to Panopticlick, detecting a macOS device captures roughly 3 bits of information. Combining platform (3 bits) with time zone (4 bits) and screen size/color depth (3 bits) further narrows my Mac down to 1 in 1,024 (2¹⁰ = 1024).
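When the possibilities aren’t equally likely, the bits conveyed by one particular value depend on how common that value is: the rarer it is, the more bits. A minimal sketch using the rounded desktop shares quoted above (`surprisal_bits` is my own name for the helper):

```python
import math

# Desktop platform shares as quoted above (rounded, so they sum to 101%).
shares = {
    "Windows": 0.73,
    "macOS": 0.21,
    "Chrome OS": 0.04,
    "Linux and others": 0.03,
}


def surprisal_bits(share: float) -> float:
    """Bits conveyed by observing a value that a fraction `share` of devices has."""
    return -math.log2(share)


for platform, share in shares.items():
    print(f"{platform}: {surprisal_bits(share):.2f} bits")
```

With these market shares macOS works out to a bit over 2 bits, somewhat below the roughly 3 bits Panopticlick reports. One plausible reason is that Panopticlick measures against its own visitors rather than the overall desktop market.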

10. DNT Header Enabled? (1 bit)

Using your browser to send a Do Not Track signal seems like a good idea. It was proposed in 2009. After Apple, Microsoft and Mozilla added this feature to their browsers, Google finally added it to Chrome in 2012. Unfortunately, it doesn’t do anything besides telling advertisers that you don’t want to be tracked (which offers them another way of fingerprinting you).

According to Panopticlick, setting the Do Not Track header to True reveals 0.81 bits of information while setting it to False reveals 1.22 bits. That would suggest that the majority of users use Do Not Track; however, that is not the case. Only about 10–20% of browsers use DNT. Mozilla even appears to have discontinued its DNT Dashboard reporting DNT’s adoption among Firefox users (which was at 13% among desktop Firefox users in 2016), indicating that DNT adoption may have stalled out. So Panopticlick’s data suggesting that enabling DNT is better for privacy than disabling it is likely skewed. The reason is simple: Panopticlick reports your uniqueness relative to the population of users visiting Panopticlick (who are already concerned about their privacy and so are more likely to turn on DNT) rather than the population of all users (80–90% of whom leave DNT off).
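The skew is easy to see by converting Panopticlick’s two DNT figures back into shares of its visitor population. A quick sketch, where `share_from_bits` is a hypothetical helper and the inputs are the two figures quoted above:

```python
def share_from_bits(bits: float) -> float:
    """Fraction of the tested population sharing a value that conveys `bits` bits."""
    return 2 ** -bits


dnt_on = share_from_bits(0.81)   # DNT header set to True
dnt_off = share_from_bits(1.22)  # DNT header set to False

print(f"DNT on: {dnt_on:.0%}, DNT off: {dnt_off:.0%}")
```

This works out to roughly 57% of Panopticlick’s visitors sending DNT and 43% not, far from the 10–20% adoption seen among browsers in general.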

Either way, using the DNT setting adds about 1 bit of information. Combining that with time zone (4 bits), screen size/color depth (3 bits) and platform (3 bits) narrows my Mac down to 1 in 2,048 (2¹¹ = 2048).

11.-14. Language (0.9 bits), Touch Support (0.6 bits), Limited supercookie test (0.4 bits) and Are Cookies Enabled? (0.2 bits)

The four remaining items collectively account for approximately 2 bits. Combined with time zone (4 bits), screen size/color depth (3 bits), platform (3 bits) and DNT header (1 bit), these items narrow my Mac down to 1 in 8,192 (2¹³ = 8192).

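In conclusion, my current browser configuration reveals at least 19 bits of information. That is far less than the sum of the fourteen individual figures because many of the items overlap (my User Agent, for example, already implies my platform), so their bits can’t all simply be added together. Even so, it means that even without using cookies, my device likely can be picked out from a half million devices (2¹⁹ = 524,288).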

In my next post, we’ll look at another tool — AmIUnique? — that tests browser and device fingerprinting.

F-House No. 1, Statesville Correctional Center, Crest Hill, Illinois
