Meet Your Digital Shadow

Rajesh Narayanan
9 min read · Jun 21, 2018

You’ve been meaning to do some private research work. As you begin looking things up on Google, you recall that there are better alternatives when it comes to online privacy.

You notice a few interesting search results. But you’re also smart enough to notice that some of them are insecure. After avoiding these, you find a few legitimate ones and click on the first. You have now officially left the safe realm. But you’re no ordinary user. You recall reading about countermeasures against pesky online trackers.

You gather some information and move to the next source. After half a day of scouring articles you decide to take a break. Unbeknownst to you, you were not alone despite all your safeguards.

Two days later, when you’re on Facebook, you notice an advert for the exact item you were researching. Congratulations, you just met your Digital Shadow!

What is a Digital Fingerprint

Image Source: Shutterstock

Let me begin this section with the assumption that you are well aware that a fingerprint is a combination of ridges and bifurcations. Given the incredibly huge number of possible patterns and the variance in size and shape of human fingers, it is unlikely that any two subjects will EVER share the same fingerprint.

A Digital Fingerprint (DF) was initially just an attempt by us to play god in the realm of digital media. It was meant to be a tag for original digital creations.

A good DF solution could track copied digital content across the world wide web, and the advancements in this space in recent times have been substantial. So much so that new-age fingerprinting solutions are no longer limited to media tracking.

Of late, the subject being identified is something much more valuable to the big corporate entities. Mr. Zuckerberg could not have said it any better:

When Facebook was getting started, nothing used real identity — everything was anonymous or pseudonymous — and I thought that real identity should play a bigger part than it did (Reference: BrainyQuote)

Watermark vs Digital Fingerprint:

In the past, when the question being asked was “how do I protect my digital creation” someone took inspiration from the world of art and decided that leaving a visible & seemingly indelible signature (or text) was the best way forward. So everyone began watermarking their photos and videos with their URL or name and inserted a copyright tag in audio files and other such media.

But for each watermarking tool created, a better watermark removal solution sprang up. It did not take the internet too long to shun excessively watermarked media. In fact, such exuberance usually attracted the mischievous end of the audience, who gleefully took up the challenge of freeing the content!

Media Source: Petapixel & Petapixel

Needless to say, this did not sit well with the giant media firms, as it directly impacted their revenue. But the internet is too big an audience to simply ignore. So the question now being asked was “how do I track my digital creation”!

Digital Fingerprinting in Everyday Tech

Content Id by YouTube

Imagine your team took a video of a beautiful flash mob just moments ago. You upload it to YouTube for everybody to enjoy. You know for sure this will go viral. A day later you find a takedown notice from Google.

Image courtesy: EFF

Your content is essentially being quarantined, either because YouTube’s algorithms detected an audio copyright violation or because a rights-holder for the song in question (in your region) filed a complaint with YouTube. Behind the scenes, Google runs a Content ID tool that checks each upload against rights-protected content.

In the case of YouTube, the Digital Fingerprints are compact digital impressions extracted from the original content (audio or video) which represent the content’s characteristics and have enough detail to identify a variant of that content upon comparison (Source: Internet)

This is one of the several checks Google does to keep this world-class site free of spam while ensuring the creators are rewarded for their original content. That being said, this is a human-made algorithm and there is a very real chance of false positives.

Google, therefore, has take-down and dispute processes to keep such errors in check. While there is a chance that watchdogs may abuse this a bit, the intent of this concept is, in their eyes, apt for their chosen business model.
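To make the idea of a “compact digital impression” a little more concrete, here is a minimal sketch in Python. It is not how Content ID works internally (Google does not publish that); it uses a simple, well-known technique called an average hash, applied to a tiny made-up grayscale frame. The pixel values and the names `average_hash` and `hamming_distance` are assumptions chosen purely for illustration.

```python
def average_hash(pixels):
    """Return a bit string: 1 where a pixel is brighter than the frame's mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(a, b):
    """Count the positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Made-up 4x4 grayscale frame (0 = black, 255 = white).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 190, 220],
    [11, 21, 195, 225],
]
# A "variant": the same frame, uniformly brightened, as a re-encode might do.
brightened = [[min(255, v + 30) for v in row] for row in original]
# An unrelated frame with a different layout of light and dark areas.
unrelated = [
    [200, 210, 10, 20],
    [205, 215, 15, 25],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]

fp_original = average_hash(original)
fp_brightened = average_hash(brightened)
fp_unrelated = average_hash(unrelated)

# A small distance suggests "same content", a large one "different content".
print(hamming_distance(fp_original, fp_brightened))
print(hamming_distance(fp_original, fp_unrelated))
```

The brightened copy keeps the same fingerprint even though every pixel value changed, while the unrelated frame lands far away. That tolerance to small changes is what separates a fingerprint from an ordinary checksum, and it is also why false positives are possible.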

Acoustic Fingerprint by Shazam

Image Courtesy: Techhive.com

Many of us have enjoyed this lovely app that lets us identify any song from a few seconds of audio.

Just as a person can identify a song from the combination of album and song name, apps like Shazam map the frequencies in an audio file against time to create an acoustic fingerprint for that audio.

There is a whitepaper available for those keen on understanding the nuances of how it works. Let me paraphrase one critical point from it:

The database files and audio file are subjected to a fingerprint analysis. The fingerprints from the unknown sample are matched against a large set of fingerprints derived from a music database.

I believe their take is that no two songs will share the exact same fingerprint over a 10-second window. This lets computer programs look up the query audio incredibly fast.
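The matching step can be sketched in a few lines of Python. This is a toy version under loud assumptions: the “peaks” below are invented (time, frequency) pairs rather than values extracted from real audio, and the helper names `peak_hashes`, `build_index` and `identify` are hypothetical. The general idea of hashing pairs of peaks and voting on a consistent time offset follows the published whitepaper, but this is not Shazam’s actual code.

```python
from collections import Counter, defaultdict


def peak_hashes(peaks, fan_out=3):
    """Pair each peak with the next few peaks and yield (hash, anchor_time).

    `peaks` is a time-sorted list of (time, frequency) tuples.
    """
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            # The hash depends only on the two frequencies and their time gap,
            # so it is the same wherever the clip starts within the song.
            yield (f1, f2, t2 - t1), t1


def build_index(songs):
    """Map every hash to the (song, time) positions where it occurs."""
    index = defaultdict(list)
    for name, peaks in songs.items():
        for h, t in peak_hashes(peaks):
            index[h].append((name, t))
    return index


def identify(index, sample_peaks):
    """Vote for (song, offset) pairs; a true match piles up on one offset."""
    votes = Counter()
    for h, t_sample in peak_hashes(sample_peaks):
        for name, t_song in index.get(h, []):
            votes[(name, t_song - t_sample)] += 1
    if not votes:
        return None
    (name, _offset), _count = votes.most_common(1)[0]
    return name


# Invented peak data for two songs.
songs = {
    "song_a": [(0, 440), (1, 660), (2, 550), (3, 880), (4, 330), (5, 770)],
    "song_b": [(0, 300), (1, 450), (2, 900), (3, 600), (4, 750), (5, 500)],
}
index = build_index(songs)

# A clip "recorded" from the middle of song_a, with its clock restarted at 0.
clip = [(0, 550), (1, 880), (2, 330), (3, 770)]
print(identify(index, clip))
```

Because each lookup is a dictionary access rather than a comparison against every song, the search stays fast even as the database grows.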

Browser Fingerprint by Facebook

If you were concerned about the recent FB privacy issues and have not yet skimmed through Facebook’s detailed response to the US Senate, please do so. I do not want to argue whether what they are doing is good or bad.

It is well established that their business model involves monitoring and tracking users even after they have left the website. It’s now up to you to decide whether you wish to remain aligned with such a model.

It is not unusual for information sharing to happen in day-to-day browsing. Many companies do this and mention it somewhere in a lengthy terms and conditions statement. What companies like FB do is regularly pull information from their partners and build up an abstract person. They then connect you to that abstraction using some indelible browser footprints.

Pixel, also known as a web beacon or web tag, is a technology that allows the social media company to gather information for advertisers about what happens after a user clicks on a Facebook ad, meaning it is able to continue to track a user’s movement after he or she leaves Facebook (Source: Fortune)

They now rely on more than just IP addresses to track you online. They derive a browser fingerprint from a mishmash of hard-to-manipulate data such as the screen size and color depth combination, browser plugin details, time zone, system fonts and platform.

Image Credit: Internet
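Here is a rough sketch of how such attributes could be combined into a single identifier. It is an illustration only: the attribute values are invented, the function name `browser_fingerprint` is hypothetical, and real trackers collect these values with JavaScript in the browser and use many more signals than this.

```python
import hashlib
import json


def browser_fingerprint(attributes):
    """Hash a dict of browser attributes into a short, stable identifier."""
    # Sorting keys makes the result independent of the order of collection.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Invented values for one visitor.
visitor = {
    "screen": "1920x1080",
    "color_depth": 24,
    "plugins": ["PDF Viewer"],
    "timezone": "Asia/Kolkata",
    "fonts": ["Arial", "Noto Sans", "Ubuntu"],
    "platform": "Linux x86_64",
}
# The same visitor on a later day: same attributes, collected in another order.
returning_visitor = dict(reversed(list(visitor.items())))
# A different visitor whose setup differs only in installed fonts.
other_visitor = {**visitor, "fonts": ["Arial", "Helvetica"]}

print(browser_fingerprint(visitor) == browser_fingerprint(returning_visitor))
print(browser_fingerprint(visitor) == browser_fingerprint(other_visitor))
```

Notice that nothing here is stored on your machine. Clearing cookies does not change any of these attributes, so the same identifier comes back on your next visit.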

Device Fingerprinting and Unwarranted Spying

We have come a long way from trying to establish copyright on digital audio to selling personalized ads using the same technology. When ad companies can narrow you down to a zone of users, you can expect that government security agencies are miles ahead! In all likelihood, they don’t just use browsers, but perhaps your device itself as a tool to identify you. And yes, they would already be able to pinpoint you with military-grade accuracy, should the need ever arise!

Remember that something as simple as a User Agent string contains about 10.5 bits of identifying information, meaning that if you pick a random person’s browser, only one in 1,500 other Internet users will share their User Agent string (Source: EFF Technical Analysis)
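The arithmetic behind that figure is easy to check: n bits of information can distinguish 2^n equally likely values. The short Python sketch below works it out. The 10.5 bits come from the EFF quote above; the population of 4 billion internet users is my own round number, used only for illustration.

```python
import math


def one_in(bits):
    """Number of equally likely values that `bits` of information distinguish."""
    return 2 ** bits


def bits_needed(population):
    """Bits of identifying information needed to single out one member."""
    return math.log2(population)


user_agent_bits = 10.5  # figure quoted from the EFF analysis
user_agent_crowd = round(one_in(user_agent_bits))
print(user_agent_crowd)  # 1448, i.e. roughly "one in 1,500"

# Assumed population size, chosen only for illustration.
internet_users = 4_000_000_000
unique_bits = round(bits_needed(internet_users), 1)
print(unique_bits)  # 31.9
```

So under that assumption about 32 bits would be enough to single out one browser among everyone online, and the User Agent string alone already supplies roughly a third of them. Every extra attribute (fonts, screen size, time zone) adds more, although the contributions only add up neatly when the attributes are independent of each other.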

Imagine the repercussions this kind of unwarranted tracking will have on whistle-blowers or human rights activists in a military regime! Now consider the fact that fingerprinting is also possible through JavaScript APIs such as WebGL, or through the way pixels are rendered on your screen, using nothing more than the fonts you have installed and standard HTML5 features! It is no longer a questionable argument when I say “you are being tracked”!

While you may currently not feel like Enemy of the State material, the implications of such unchecked tracking are manifold. You may be a good tax-paying citizen who is up to date on bills. Yet scrutiny of you can begin for the flimsiest of reasons with little or (most likely) no warrant. You can rest assured that there won’t be any criminal or cyber laws aiding you whilst you scream at the top of your lungs that your Google search was innocuous and was only meant for research purposes!

Ridding your Shadow

If you consider the recent laws concerning cookie consent a big win, think again! Internet cookies don’t identify a user - they identify a computer and are usually stored locally, not on a server. GDPR is a step in the right direction: it has ensured, to an extent, that fingerprinting done for the purpose of tracking people constitutes “personal data processing”.

While GDPR has created awareness of browser fingerprinting, you still need to do more to keep yourself protected against your unwanted evil digital twin. Companies will continue to do whatever it takes to separate you from the crowd, so an admission of guilt will be quite hard to come by.

Here are a few recommended methods to safeguard yourself from browser fingerprinting (it is a start)!

Use browsers that aim to protect your privacy
  1. You may have heard that Tor browser goes the extra mile by restricting the fonts websites can use and by warning against potential Canvas fingerprinting. The caveat is that Tor browsing can be considerably slow. Try using my earlier recommendation instead.
  2. Alternatively, use Firefox with containers, which give each site a separate contextual identity and so help isolate websites from one another. Mozilla even came up with a dedicated Facebook container plugin!
  3. Install Privacy Badger. I like it because it uses a smart learning algorithm and its own vast database to identify tracking sources and protect you from them. It is built to get smarter over time, so give it a while.
  4. Use additional tracking protection plugins in conjunction with Privacy Badger. These should preferably be ones you are familiar with. If you are not aware of any, please check out essential ones like uBlock Origin or Ghostery.
  5. Consider disabling JavaScript. This is an extreme measure and might kill off usability of most sites.

You can try the web tracking test with EFF’s Panopticlick tool and, once you get the result, click on the “Show full results for fingerprinting” link for further details. Don’t take a good score as conclusive proof that your browsing is private!

A Case for Worry?

The initial premise of this article was not my imaginary overstretch. It is a genuine real world example. It is entirely up to you to define and decide to what extent you will continue permitting this.

In this connected world your trust isn’t your own to break. You can stop the social media giants from knowing just your name and age or continue sharing your entire address book with birth date and home address of all your friends!

Digital Fingerprint can come in handy for a wide variety of use cases.

Source: Internet

No concept is black or white. It is the shades of grey that we need to pay attention to.

Used wisely, it lets everyone enjoy the best that digital media can give. A little over-spill can substantially diminish the online experience that we have all come to take for granted. In the end it all boils down (once more) to how much you value your online privacy.

While the top media companies and ad agencies keep upping their game, you no longer have a reason not to protect yourself, given the recent increase in better, more affordable tools. The thin red line that kept ad firms from probing into your online privacy is slowly vanishing.

With the ever-growing dependency on the internet, it may be prudent to educate yourself frequently in this space. The prize is no longer being given for identifying and protecting original digital data. The rewards are way richer when the data identifies you!
