The Case For Metadata Privacy

Dr. Sebastian Bürgel
Published in HOPR
Apr 7, 2021

At HOPR our mission is to change data privacy for good by introducing a decentralised, privacy-preserving messaging network to the Internet. This network is designed to plug a glaring privacy hole baked into the fabric of cyberspace: one that allows observers to spy on Internet users simply by observing their metadata, and that works no matter what other security precautions users might take.

While we see this hole as a serious threat to personal and commercial privacy, this is a fairly technical subject. The average user may not be aware of what is happening behind the scenes as they surf the Internet, or how exposed they really are.

In this post, we take a deep dive into this problem of “metadata surveillance”. The goal is to raise awareness of the issue and also provide some background as to why we have built HOPR and perhaps some motivation for individuals and companies to join our network and thus become part of the solution.

(If you want more on this subject, as well as the rest of HOPR, we recommend reading our recently published Book of HOPR, which provides details on the issue as well as what we propose to do about it.)

What do you mean by a glaring privacy hole in the fabric of cyberspace?

Even if you are taking precautions, the Internet is spying on you. Photo: Christin Hume; unsplash.com

The problem is as follows.

Every time we use the Internet, either to visit a website or to communicate with someone or some entity, we need to send data to our destination and, usually, receive data back. To accomplish this, each data packet has to contain information about where it’s going.

While the analogy is not perfect, this can be thought of like a physical letter or package that needs to be placed in an envelope or box and addressed. While the contents of the envelope are meant to be confidential, the sender and recipient addresses on the package need to be public and easily readable by the postal service or private companies responsible for correct delivery.

The address part of an Internet data packet, the necessary information about the information we are sending, is referred to as metadata. Without it, the Internet wouldn’t work. The problem that most people are unaware of is that this metadata can reveal a shocking amount of information about us. This includes our IP address, who we are communicating with, when we are doing so, where we and the receiver are, and often how much data we are sending. In addition, an IP address provides a fairly unique identifier that can be linked across services, websites, and even devices. Ever wondered about that weird ad related to a search term that your roommate was looking up on the Internet? That’s an example of metadata on the Internet incorrectly linked to you.

There are two aspects of this that are particularly important to keep in mind. One, no matter how protected the data itself is, for example by using strong encryption, the metadata that comprises the digital envelope is easily readable by all who handle it. Otherwise, it would serve no purpose. Two, compared to the physical world of letters and packages, in the digital world far more entities are able to handle and observe these digital envelopes, including many who have nothing to do with delivering the contents. Worse, many of these are already in the business of collecting information on Internet users, either openly or in secret, and using it for their own commercial purposes.
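To make the envelope analogy concrete, here is a minimal sketch in Python. It is a toy model, not real networking code: the `Packet` class, the `toy_encrypt` placeholder cipher and the `observe` function are all names we made up for illustration, and the IP addresses come from ranges reserved for documentation. The point is only that an observer without the key still reads everything on the outside of the envelope.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified packet: an 'envelope' plus opaque contents."""
    src_ip: str      # sender address (metadata)
    dst_ip: str      # recipient address (metadata)
    sent_at: float   # time of sending, seconds since epoch (metadata)
    payload: bytes   # encrypted contents


def toy_encrypt(message: bytes, key: bytes) -> bytes:
    """Placeholder cipher: XOR with a SHA-256 keystream.

    NOT secure. It only stands in for 'strong encryption' in this sketch.
    Applying it twice with the same key returns the original message.
    """
    stream = b""
    counter = 0
    while len(stream) < len(message):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(m ^ s for m, s in zip(message, stream))


def observe(packet: Packet) -> dict:
    """What anyone handling the packet can read without knowing the key."""
    return {
        "who": packet.src_ip,
        "talks_to": packet.dst_ip,
        "when": packet.sent_at,
        "how_much": len(packet.payload),
    }


secret = b"my test results"
packet = Packet("203.0.113.7", "198.51.100.20", 1617787200.0,
                toy_encrypt(secret, b"key"))
seen = observe(packet)
```

In this sketch `seen` contains the sender, the recipient, the time and the message size, even though the payload itself is unreadable to the observer.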

Ok, but is this really that bad?

It is. Consider this.

You visit a website with an embedded YouTube video, something that most people probably do every day. When that happens Google, which owns YouTube, will know about the visit even if you don’t click through to the video. By itself, this information is not so interesting. But if you have a Google account, which is also very likely, then Google probably already knows your IP address, and so can easily include your visit in the dossier it has already been collecting on you. The same thing happens with most website plugins, from Facebook, or Medium, or any chat platform for example. Keep in mind too that this type of surveillance does not require cookies or any extra code. It is just how the Internet works.

It also cannot be stopped. We found this out ourselves when we were writing the privacy policy for our own website. As privacy advocates, we naturally wanted the HOPR site to be as privacy-preserving for visitors as possible. But because we use YouTube and Medium, we found out that we couldn’t stop them from collecting our website users’ data. And so we were forced to add this language to our privacy policy:

“Unfortunately, today’s web is ill-suited at protecting your connection metadata and therefore some personal metadata such as your IP address, from where you access our Websites, when and using what devices and software does leak to a range of third parties.”

That was a sobering lesson, and we are professionals in this business.
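The embedded-video example above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how any particular company's systems work: the function name, the `profiles` dictionary and all addresses and URLs are hypothetical. We also assume the browser tells the embed host at least the origin of the page it was loaded from, which depends on the referrer policy in effect.

```python
from urllib.parse import urlsplit


def embed_request_seen_by_host(client_ip: str, page_url: str, embed_url: str) -> dict:
    """What the host of an embedded resource can log when a page loads it.

    No cookies or scripts are involved: the visitor's browser simply has to
    fetch the embed, and that request carries this metadata with it.
    """
    page = urlsplit(page_url)
    return {
        "client_ip": client_ip,
        "requested": embed_url,
        # Assumption: only the origin of the embedding page is revealed.
        "referer_origin": f"{page.scheme}://{page.netloc}/",
    }


# Hypothetical dossier the embed host already keeps, keyed by IP address.
profiles = {"203.0.113.7": ["signed in as alice@example.com"]}

entry = embed_request_seen_by_host(
    client_ip="203.0.113.7",
    page_url="https://blog.example/posts/heart-health",
    embed_url="https://video.example/embed/abc123",
)

# The visit is linked to the existing profile through the IP address alone.
profiles.setdefault(entry["client_ip"], []).append(
    f"visited {entry['referer_origin']}"
)
```

After this runs, the profile for that IP address holds both the account and the visit, without the visitor ever clicking on the video.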

If that is not enough to convince you of the severity of the issue, perhaps the fact that metadata is considered highly valuable and revealing by many of those whose business it is to understand these things might do the trick. As Edward Snowden revealed, surveillance-oriented entities like the NSA think metadata is so valuable they do all they can to collect as much as possible. In a different context, but with similar thinking, the Court of Justice of the European Union has said that under certain circumstances IP addresses can be considered personal data because they can be correlated with other pieces of information to reveal who you are.

The latter is perhaps the most important point: metadata may seem innocuous, but it is often the skeleton key that can be used to unlock information that we want to protect because it is private. This is a major problem, and not just for individuals, but for companies and society too.

Why exactly am I, companies, and even society at risk?

Thanks to metadata, our Internet traffic is like an open book for many. At HOPR, we are providing a safe cover. Illustration: HOPR.

One of the things we hear most frequently when talking about data privacy is the argument, “well if I have nothing to hide, why should I care who is collecting information on me?”

Sure, you might not care if Google monitors your search queries and tailors ads to you. You may even like it. But that is just the online commercial part of your life. Would you really be so sanguine if you knew that Google (or anyone else) is also monitoring your private life, for example, the state of your health?

In the Book of HOPR we present a scenario in which, thanks to metadata, it is perfectly plausible under certain circumstances for an outside observer who has been collecting data on you to monitor in real-time your visit to a doctor for a consultation. Not only could your taxi ride to the doctor’s office be recorded, but your time in the office will also be noted and if your doctor is using a medical device that is connected to the cloud (rather likely these days), it is perfectly plausible that this same observer is watching that device. It not only knows what kind of a device it is (a heart monitor, for instance), it also knows when it is sending out data about you (because it can observe that the device is sending data at the same time that it knows you are in the office) and can therefore conclude, accurately or inaccurately, that you have an issue (such as heart disease).

The point is that, even without knowing what the underlying data is, an observer can use metadata to unearth a lot of very personal information. Make no mistake: thanks to these approaches, the intimate details of your life can be and are being spied on.
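The doctor’s office scenario comes down to matching timestamps. The sketch below is our own toy illustration in Python, with made-up record types (`Visit`, `Transmission`) and made-up data. It assumes the observer has already collected both kinds of records. Note that it never looks at what the device actually sent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    """Hypothetical record: a person was at a place during [start, end]."""
    person: str
    place: str
    start: float
    end: float


@dataclass(frozen=True)
class Transmission:
    """Hypothetical record: a device at a place sent data at a given time."""
    device_type: str
    place: str
    at: float


def infer_links(visits, transmissions):
    """Pair each person with devices that sent data while they were present.

    The result is a guess, not a fact: it may well be inaccurate.
    """
    guesses = []
    for visit in visits:
        for tx in transmissions:
            if tx.place == visit.place and visit.start <= tx.at <= visit.end:
                guesses.append((visit.person, tx.device_type))
    return guesses


visits = [Visit("you", "clinic", start=100.0, end=160.0)]
transmissions = [
    Transmission("heart monitor", "clinic", at=130.0),  # during the visit
    Transmission("heart monitor", "clinic", at=300.0),  # after you left
]
guesses = infer_links(visits, transmissions)
```

Here `guesses` links “you” to the heart monitor purely because the times and places overlap, which is exactly the kind of conclusion, right or wrong, that an observer can draw from metadata alone.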

This brings up another aspect that many people are not considering: it is not only humans who use the Internet, but an exploding number of IoT devices, many of which perform important functions for society.

As we also point out in the Book of HOPR, that means it’s not only people who are at risk. Companies that want to protect trade secrets can just as easily be spied on, making metadata a serious attack vector for corporate espionage. As we found out with our website, metadata surveillance also makes it difficult to comply with data protection regulations. Metadata surveillance is a compliance risk too.

Last but certainly not least, with so much of our critical infrastructure — think energy grids or air traffic control systems — relying on IoT devices that can also be spied on by terrorists or hostile governments, metadata surveillance is a security risk as well.

Yikes! I had no idea

Yes, we understand. This aspect of data privacy is currently under the radar for most of us.

But awareness is growing. And we believe over time individuals, companies and authorities are going to want a solution to this problem. At HOPR, we have jump-started that solution and done so in what we believe is a pragmatic, democratic, user-centric way.

There is nothing radical about what we are doing. As we hope we have made clear in this post, we are rather providing a way for people and organisations to protect their basic right to privacy and to protect our economy and society by creating a safe and secure Internet by and for us all.
