A Brief Introduction to Browser Fingerprint Tracking Technology

Frederic Guo
8 min read · Apr 1, 2022


Before we start, let’s check out these two scenarios that you may encounter on the Internet:

Scenario #1: You browse a product on an online shopping website (such as Amazon) and read about related products, but you do not place an order, or even log in. Two days later, when you use the same computer to visit other websites, you see many advertisements for similar products.

Scenario #2: You have multiple accounts on a blog platform, and you use them to boost the popularity of a certain post or to steer public opinion. Whenever you switch accounts, you delete your cookies and local caches, reboot your router, and even use a VPN. You are careful enough, but the site administrator may still work out that the same person is behind all of them, and your accounts get banned.

If you have encountered a scenario like either of these, you should consider whether browser fingerprinting played a role.

What is browser fingerprinting?

“Browser fingerprinting” is a method of tracking web browsers through the configuration and settings information that the browser makes visible to websites.

It is like the fingerprints on our hands, which can identify an individual. Each fingerprint has a unique texture formed by ridges in the skin, and the differences between these patterns are what make each person's fingerprints unique.

A browser fingerprint can identify a browser in the same way. It is built from information such as the User-Agent, time zone, geographic location, and language in use. The amount of information the browser exposes determines how accurate the fingerprint is.

For a website, a browser fingerprint has no practical value by itself. What is valuable is the user information associated with it. For example, on a content distribution website, user A likes to browse animation-related content. The site records this interest against the browser fingerprint, so on the next visit it can push animation-related content to user A even if the user does not log in.

Browser fingerprinting background

The browser fingerprint tracking technology has now entered the 2.5th generation:

The 1st generation is stateful, mainly relying on the user's cookies and evercookies, and requires the user to log in to obtain valid information.

The 2nd generation introduces the concept of browser fingerprinting, which makes users more distinguishable by continuously adding browser feature values, such as the User-Agent (UA) and browser plug-in information.

The 3rd generation sets its sights on the person behind the browser. By collecting users' behaviors and habits to build feature values, and even models, for each user, it aims to track the actual person. This part is currently more complicated and is still being explored.

We are currently in the 2.5th generation because the open problem today is how to identify the same fingerprint across different browsers. The results achieved in this area are introduced later.

Fingerprint collection

Information Entropy is the average amount of information contained in each received message. The higher the information entropy, the more information can be transmitted, and the lower the information entropy, the less information is transmitted.

A browser fingerprint is a combination of many pieces of browser feature information, and each feature value carries a different amount of information entropy. Fingerprints are therefore divided into basic fingerprints and advanced fingerprints.
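As a rough illustration of why different feature values carry different amounts of entropy, the sketch below estimates the Shannon entropy, H = -Σ p(x) log2 p(x), of two features across a small group of visitors. The function name entropy_bits and all sample values are made up for this example and are not measurements.

```python
import math
from collections import Counter


def entropy_bits(observations):
    """Estimate Shannon entropy (in bits) from a list of observed feature values."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample: eight visitors, one observed value per visitor.
time_zones = ["UTC+8"] * 4 + ["UTC+1"] * 2 + ["UTC-5"] * 2
canvas_hashes = ["a1", "b2", "c3", "d4", "e5", "f6", "g7", "h8"]

print("time zone  :", entropy_bits(time_zones), "bits")
print("canvas hash:", entropy_bits(canvas_hashes), "bits")
```

In this made-up sample the time zone splits the visitors into only three groups, while the canvas hash is different for every visitor, so the canvas hash yields the higher entropy and separates users better.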

Try them out

1. View your browser fingerprint ID and basic information:

https://fingerprintjs.com/demo/ (Demo of FingerprintJS)

2. View your HTTP header information:

https://httpbin.org/headers

This one is text-based and returns the headers as JSON.

3. View basic information about your browser:

https://www.whatismybrowser.com/

When I tried this one, I found out that my browser was out of date.

Two categories:

1. Basic fingerprint

Browser feature fingerprint acquisition:

System feature fingerprint acquisition:

Time zone feature fingerprint acquisition:

2. Advanced Fingerprint

1) Canvas fingerprint

Speaking of advanced fingerprints, we must mention Canvas fingerprints. Canvas is a dynamic drawing element in HTML5 that can be used to generate and even process complex images.

The principle of Canvas fingerprinting is roughly as follows:

The same HTMLCanvasElement drawing operation produces different image content on different operating systems and different browsers. At the image-format level, different browsers use different graphics processing engines, different image export options, and different default compression levels.

At the pixel level, each operating system uses different settings and algorithms for antialiasing and subpixel rendering.

As a result, even for the same drawing operation, the CRC checksum of the generated image data differs. Canvas is supported by almost all major browsers and is available on most PCs, tablets, and smartphones.
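To make the checksum point concrete, here is a minimal offline sketch. It does not draw on a real canvas: the render function is a hypothetical stand-in that builds a tiny RGBA pixel buffer, and the two "machines" are assumed to differ only in how one antialiased edge pixel is blended.

```python
import zlib

WIDTH, HEIGHT = 8, 2  # tiny made-up image size


def render(edge_alpha: int) -> bytes:
    """Simulated RGBA buffer: opaque black pixels with one antialiased edge pixel."""
    pixels = bytearray([0, 0, 0, 255] * (WIDTH * HEIGHT))
    pixels[4 * 3 + 3] = edge_alpha  # alpha channel of the fourth pixel
    return bytes(pixels)


# Assumption: the two machines blend the edge pixel slightly differently.
machine_a = render(edge_alpha=128)
machine_b = render(edge_alpha=127)

crc_a = zlib.crc32(machine_a)
crc_b = zlib.crc32(machine_b)
print(f"machine A: {crc_a:08x}")
print(f"machine B: {crc_b:08x}")
print("same checksum:", crc_a == crc_b)
```

A one-step difference in a single channel of a single pixel is invisible to the eye, but it is enough to give the two buffers different checksums, which is what makes the rendered image usable as an identifier.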

2) WebGL Fingerprint

The WebGL context (canvas.getContext("webgl")) can be obtained from an HTMLCanvasElement. Through it, a site can read the user's hardware information, such as the graphics card name, model, and manufacturer, for example: ANGLE (NVIDIA GeForce GTX 1050 Ti Direct3D11 vs_5_0 ps_5_0), Google Inc.

Hardware is generally not replaced at will, and some computers keep the same hardware from the day they are bought until they are scrapped. There are also many different kinds of computer hardware.

Although the collision rate is high, because many users share the same graphics hardware, this information can still be used as one part of the user's fingerprint. The more information is collected about the user, the more uniquely the combination identifies that user, so this signal should not be ignored.

3) AudioContext Fingerprint

The Audio API provided by HTML5 for JavaScript programming gives developers the ability to directly manipulate the original audio stream data in the code, and perform arbitrary generation, processing, and reconstruction of it, such as improving the timbre, changing the pitch, audio segmentation, and other operations. It might even be called the web version of Adobe Audition.

The principle of AudioContext fingerprinting is roughly as follows:

Method #1: Generate an audio signal (a triangle wave), apply an FFT to it, and compute a SHA hash of the result as the fingerprint.

Method #2: Generate an audio signal (a sine wave), apply dynamic range compression, and compute the MD5 hash of the result.

In both methods, the audio is discarded before it reaches the audio device, so the user is fingerprinted without even noticing it.

AudioContext fingerprint basic principle:

Minor differences in the hardware or software of the host or browser lead to differences in how audio signals are processed. The same browser on the same device produces the same audio output, while the output produced by different machines or different browsers varies.
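The pipeline in Method #1 can be modeled offline as a rough sketch. This is not how a browser does it: a real script would use the browser's audio stack, whereas here the wave generator, the naive DFT, and the tiny gain offset that stands in for "a different audio stack" are all assumptions made for illustration.

```python
import cmath
import hashlib
import math
import struct


def triangle_wave(num_samples: int, cycles: int, gain: float = 1.0) -> list:
    """Return samples of a triangle wave with `cycles` periods, in [-gain, gain]."""
    samples = []
    for n in range(num_samples):
        phase = (n * cycles / num_samples) % 1.0
        samples.append(gain * (4.0 * abs(phase - 0.5) - 1.0))
    return samples


def dft_magnitudes(samples: list) -> list:
    """Naive O(n^2) discrete Fourier transform; magnitudes of the first n/2 bins."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2)
    ]


def audio_fingerprint(samples: list) -> str:
    """Hash the spectrum bytes with SHA-256 and return the hex digest."""
    spectrum = dft_magnitudes(samples)
    payload = struct.pack(f"<{len(spectrum)}d", *spectrum)
    return hashlib.sha256(payload).hexdigest()


reference = audio_fingerprint(triangle_wave(64, cycles=4))
same_stack = audio_fingerprint(triangle_wave(64, cycles=4))
# Hypothetical second device whose processing differs by a minute gain error.
other_stack = audio_fingerprint(triangle_wave(64, cycles=4, gain=1.0 + 1e-7))

print("same stack matches :", reference == same_stack)
print("other stack matches:", reference == other_stack)
```

The same processing always gives the same hash, while a deviation far too small to hear gives a different one, which mirrors the principle described above.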

It can be seen from the above that the principles of AudioContext and Canvas fingerprints are very similar. Both exploit differences in hardware or software: the former generates audio, the latter generates images, and each then computes a hash of the output as an identifier.

4) WebRTC Fingerprint

WebRTC (Web Real-Time Communication) is a capability that enables browsers to exchange audio and video in real time. It provides three main APIs that allow JS to acquire and exchange audio and video data in real time: MediaStream, RTCPeerConnection, and RTCDataChannel.

Of course, to establish a WebRTC connection, the user's real IP address must be exposed (for NAT traversal). RTCPeerConnection provides an API for this, so the user's IP address can be obtained directly from JS. The user's intranet IP address also does not change in most cases, so it is another factor that can be used as part of a user's fingerprint.
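As a hedged sketch of what a collector might do with this data once it has been sent to a server, the snippet below pulls the address field out of ICE candidate lines and labels it as intranet or public. The two candidate strings are made-up samples in the standard candidate format, and the function names are hypothetical.

```python
import ipaddress


def candidate_address(candidate_line: str) -> str:
    """Return the connection-address field (5th token) of an ICE candidate line."""
    return candidate_line.split()[4]


def classify(candidate_line: str) -> str:
    """Label the candidate's address as 'intranet', 'public', or 'not an IP literal'."""
    try:
        address = ipaddress.ip_address(candidate_address(candidate_line))
    except ValueError:
        # Some candidates carry a hostname instead of an IP literal.
        return "not an IP literal"
    return "intranet" if address.is_private else "public"


# Made-up sample candidates; the addresses are placeholders.
host_candidate = "candidate:1 1 udp 2122260223 192.168.1.23 54321 typ host"
reflexive_candidate = (
    "candidate:2 1 udp 1686052607 8.8.8.8 40000 typ srflx "
    "raddr 192.168.1.23 rport 54321"
)

for line in (host_candidate, reflexive_candidate):
    print(candidate_address(line), "->", classify(line))
```

The intranet address is the more stable of the two in this scenario, since it usually survives a VPN or a change of public IP.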

Comprehensive fingerprint

The fingerprints described above are only part of the picture, not a complete list. Taken separately, these scattered pieces of fingerprint information cannot pin down a unique user and cannot represent a user's unique identity (a user fingerprint).

A comprehensive fingerprint is the combination of all of a user's browser information, which can locate and identify users with an accuracy of nearly 99%.

A comprehensive fingerprint is roughly made up of the following:

· Basic fingerprint (UserAgent, screen resolution, number of CPU cores, memory size, plugin information, language, etc.)

· Advanced fingerprint (Canvas fingerprint, WebGL fingerprint, AudioContext fingerprint, WebRTC fingerprint, font fingerprint, etc.)

· Geographic location, time zone, DNS, SSL certificate, and other information.

Combining the items above produces a comprehensive fingerprint (user fingerprint), which can single out a unique user in roughly 99% of cases.
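One simple way to combine the items listed above into a single identifier is to serialize them in a stable order and hash the result. The sketch below assumes the individual values have already been collected; the field names, the sample values, and the choice of SHA-256 are illustrative assumptions rather than the method of any particular library.

```python
import hashlib
import json


def comprehensive_fingerprint(features: dict) -> str:
    """Hash a dict of collected feature values into one hex identifier."""
    # Sorting the keys makes the result independent of collection order.
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values for one visitor.
visitor = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "screen": "1920x1080",
    "cpu_cores": 8,
    "language": "en-US",
    "time_zone": "UTC+8",
    "canvas_hash": "5f2b9c1e",
    "webgl_renderer": "ANGLE (NVIDIA GeForce GTX 1050 Ti)",
}

# Same machine, except the user has switched the browser language.
changed = dict(visitor, language="fr-FR")

print(comprehensive_fingerprint(visitor))
print(comprehensive_fingerprint(changed))
```

A plain hash like this changes completely when any single value changes, so a tracker that wants to survive small changes would need to compare the individual features instead of only the final hash.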

How to defend against fingerprinting?

1) Provide simplified fingerprint

Users can choose a web browser that minimizes the identifying information it exposes, such as browser fonts, device IDs, canvas element rendering, WebGL information, and local IP addresses.

As of 2017, Microsoft Edge was considered the most fingerprintable browser, followed by Firefox and Google Chrome, IE, and Safari. Among mobile browsers, Google Chrome and Opera Mini were the most fingerprintable, followed by mobile Firefox, mobile Edge, and mobile Safari.

Tor Browser disables fingerprintable features, such as the canvas and WebGL APIs, and notifies the user of fingerprinting attempts.

2) Provide fake fingerprints

Each time a site is visited, the exposed information is forged or perturbed in a different way to reduce the stability of the collected fingerprint.

One example is perturbing audio and canvas rendering with a small amount of random noise, a technique adopted by the Brave browser in 2020.
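The effect of such noise can be sketched offline. This is not Brave's implementation: the pixel buffer is a made-up stand-in for rendered canvas data, and flipping the lowest bit of a few random bytes is simply one assumed way to add a small perturbation.

```python
import hashlib
import random


def fingerprint(pixels: bytes) -> str:
    """Hash raw pixel data the way a tracking script might."""
    return hashlib.sha256(pixels).hexdigest()


def add_noise(pixels: bytes, rng: random.Random, flips: int = 4) -> bytes:
    """Flip the lowest bit of `flips` distinct, randomly chosen bytes."""
    noisy = bytearray(pixels)
    for index in rng.sample(range(len(noisy)), flips):
        noisy[index] ^= 1  # changes the byte value by exactly 1
    return bytes(noisy)


clean = bytes(range(256))  # stand-in for rendered canvas pixels

# Hypothetical per-visit seeds; a browser would choose its own.
visit_1 = add_noise(clean, random.Random(1))
visit_2 = add_noise(clean, random.Random(2))

print("clean  :", fingerprint(clean)[:16])
print("visit 1:", fingerprint(visit_1)[:16])
print("visit 2:", fingerprint(visit_2)[:16])
```

Each noisy buffer differs from the clean one by at most one step in a handful of bytes, yet its hash no longer matches, so the site cannot rely on that hash to recognize the browser.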

3) Block JavaScript scripts

Blindly blocking client-side scripts from third-party domains, and possibly also scripts from first-party domains (for example, by disabling JavaScript or using NoScript), can sometimes render a website unusable.

The preferred method is to block only those third-party domains that appear on a blacklist of known tracking domains. This is the approach taken by most ad-blocking plugins.
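A minimal sketch of that blacklist check is shown here. The domain names are placeholders, the function names are hypothetical, and the first-party test is deliberately simplified to a hostname comparison, whereas real blockers use much richer filter rules.

```python
from urllib.parse import urlsplit

# Placeholder blacklist entries, not real tracker domains.
TRACKER_DOMAINS = {"tracker.example", "ads.example"}


def host_of(url: str) -> str:
    """Return the lowercase hostname of a URL, or an empty string."""
    return (urlsplit(url).hostname or "").lower()


def matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def should_block(request_url: str, page_url: str) -> bool:
    """Block only third-party requests whose host is on the blacklist."""
    request_host = host_of(request_url)
    page_host = host_of(page_url)
    if matches(request_host, page_host):
        return False  # first-party request: leave it alone
    return any(matches(request_host, domain) for domain in TRACKER_DOMAINS)


page = "https://shop.example/product/42"
for url in (
    "https://cdn.tracker.example/fp.js",
    "https://shop.example/static/app.js",
    "https://nottracker.example/lib.js",
):
    print(url, "->", "block" if should_block(url, page) else "allow")
```

Matching on whole domain labels keeps a lookalike host such as nottracker.example from being blocked by the tracker.example entry.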

4) Switch between different browsers

Different browsers on the same machine usually have different fingerprints.

But if neither browser applies any fingerprinting countermeasures, the fingerprints generated by the two browsers may still be recognized as coming from the same machine.

