Our Framework for Thinking About Data Privacy Is Outdated

Anmol Parande
Dec 23, 2020 · 12 min read
Photo by ev on Unsplash

The first concrete notion of a “Right to Privacy” was detailed in 1890 by the lawyers Samuel Warren and Louis Brandeis in their Harvard Law Review essay of the same name. At the time, they were concerned about the proliferation of photographic images, particularly with respect to tabloids and gossip columns. Their essay was a reaction to what they saw as an encroachment of this new technology on the individual’s “right to be let alone,” that is, the right to have information about themselves known only to those to whom they have given it freely. In the 130 years since then, the need for this protection has only increased with the development of novel methods of data collection and storage, in addition to powerful analytical techniques.

As data collection has rapidly grown, so too has the legislation protecting the right to privacy. When the EU wrote the Charter of Fundamental Rights of the EU in 2000, Article 8 clearly stated that “everyone has the right to the protection of personal data concerning him or her”. More recently, the General Data Protection Regulation (GDPR) in the EU and the California Consumer Privacy Act (CCPA) in California have sought to establish stringent, legally binding, and enforceable requirements for how third parties handle the personal information that they have collected.

Though the idea of privacy has been around for 130 years, the legal framework which underpins it has remained largely unchanged. Even GDPR and CCPA, considered the most modern and comprehensive laws surrounding data privacy, build off the idea of “informed consent.” As Warren and Brandeis put it, the “Right to Privacy ceases upon the publication of facts by the individual, or with his consent”. More formally, informed consent is “an agreement to do something or allow something to happen, made with complete knowledge of all relevant facts, such as the risks involved or any available alternatives” (source). The justification for such a framework is that so long as individuals consent to be known and are fully aware of the consequences of that decision, then the law should not restrict how a third party uses the data it collects because it was intentionally given.

Hypothetically, this framework is an ethical way to balance privacy concerns with the need for companies to use data to innovate. In practice, however, informed consent is undermined by the scope and scale of data collection and sharing. In other words, modern data privacy regulations give informed consent an outsized role in the jurisprudence of the human right to privacy, even though informed consent cannot address the information asymmetries regarding data collection between individuals and third parties, or the technological developments in big data analytics that complicate its implementation.

What Rights Do GDPR and CCPA Define?

Both the GDPR and the CCPA break down the right to privacy into several components, which are either identical or analogous between the two regulations.

  1. The Right to Access: individuals can obtain any information about them that a third party uses for automated decision making.
  2. The Right to Data Portability: collected information is provided in a commonly-used and machine-readable format.
  3. The Right to Restrict Processing: consumers can put limits on what kinds of data companies can collect from them as well as what they can do with it.
  4. The Right to be Forgotten: consumers can request that third parties delete their personal data. This request can be refused only if the data in question is a financial transaction, is required for debugging the service, would violate free speech if removed, or raises other similar concerns.
  5. The Right to Know the Source of Data and the Business-Use Case for Collection.

Each of these rights seeks to guarantee that individuals control their data rather than the third parties that collect it. They also enforce transparency between data collectors and consumers because consumers know what data a third party has collected on them and what they are doing with it. Violating any of these rights constitutes harm for the consumer, so lawsuits regarding these rights have standing in the court system.

Just as the GDPR and the CCPA break down the right to privacy into different components, informed consent can similarly be decomposed into three main requirements.

  1. Complete knowledge of all relevant facts
  2. Knowledge of the risks involved
  3. Knowledge of available alternatives

Knowledge of the risks and knowledge of the alternatives are, in fact, subcategories of knowledge of all relevant facts but can be considered separately, leaving “all relevant facts” to mean anything else that might alter an individual’s decision to give consent. In the specific context of data privacy, examples of relevant facts might be where data is being stored, which third parties it is being shared with, and what a third party is doing with it. If satisfied, these three components presume that an individual will use their knowledge to act in their own best interest and will not be harmed by their decision to relinquish their data to a third party. Although the regulations themselves do not state the connection explicitly, each of the rights established by GDPR and CCPA is tightly connected with at least one of these three components.

Knowledge Of All The Relevant Facts

The right to know the data sources fits squarely into the most basic component of informed consent: knowledge of all the relevant facts. Businesses delineate what data they collect, how they use it, and why they collect it in their privacy policy, a document that most privacy regulations require to be in plain language and easily understandable. This was required of privacy policies even before GDPR and CCPA, and companies have been successfully penalized for violations in the past.

In 2019, French courts invalidated 38 clauses of the Google+ Privacy Policy, which shares clauses with Google’s overall privacy policy. Among the complaints were that “Google failed to inform its users adequately about purposes and recipients” of data collection, and that vague terms needed to be replaced with descriptions of data movement in a “clear and exhaustive” manner (source). While this seems like a victory for privacy, successful prosecution does not necessarily mean that data privacy is getting better. In fact, terms like “clear and exhaustive” present a real concern that, ironically, the passage of data privacy regulation can potentially make privacy policies even more complex.

Before GDPR, privacy policies required on average 18 minutes to read and a college-level education to understand; after GDPR, readability has improved to a high school level in some cases, but reading time has increased as a result (source). The college-level education requirement itself excludes 70% of the American population older than 25, disproportionately impacting minorities in the same age bracket (79% of Hispanics and 74% of African Americans) based on the latest US Census Data.
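To make “reading time” and “reading level” concrete, here is a minimal sketch of how such figures can be estimated. It is not the method of the study cited above, which may have used different measures. It assumes the widely used Flesch-Kincaid grade level formula, a crude vowel-group syllable counter, and a reading speed of 250 words per minute; the function names and sample sentences are hypothetical.

```python
import re


def count_syllables(word):
    # Rough heuristic (an assumption): one syllable per run of vowels, minimum one.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    # Flesch-Kincaid grade level:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def reading_minutes(text, words_per_minute=250):
    # 250 words per minute is an assumed average adult reading speed.
    return len(re.findall(r"[A-Za-z']+", text)) / words_per_minute


# Hypothetical sample sentences, written for this illustration.
plain = "We collect your email. We use it to send you news."
dense = ("The controller processes personal information pursuant to "
         "applicable regulatory obligations.")

print(flesch_kincaid_grade(plain), flesch_kincaid_grade(dense))
```

Under this formula, longer sentences and longer words both raise the grade level, which is why a policy can become more “exhaustive” and harder to read at the same time.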

Both the length and complexity of privacy policies put them at odds with the idea of “knowledge of all the relevant facts.” For many, there is not even an “ability to know the facts.” The implication is that making a knowledgeable decision about data privacy is reserved for those with the time, education, and motivation to understand what is happening with their data. This effectively excludes the many consumers without a college degree, to say nothing of how that requirement correlates with race and other demographic factors.

Length and complexity aside, the technical knowledge required to understand data privacy exceeds what most people have, further complicating the requirement of informed consent that consumers know the facts. A study conducted in the United Kingdom found that only 13% of its respondents believed they had full knowledge of cookies, a common online tracking mechanism, while the remaining 87% had not heard of them, did not know how they worked, or had only a limited understanding (see study). If consumers are not technically literate, they fundamentally lack the facts that could help them make an informed decision about their data.

The lack of knowledge extends beyond technical literacy to include a general unawareness of how data is used. For example, data brokerage, the practice of selling information to larger data aggregators which then distribute that information to other businesses, has long been shrouded in mystery. In 2014, the FTC published a report about data brokerage, highlighting the lack of transparency surrounding the practice. Notably, it found that “it may be virtually impossible for a consumer to determine the originator of a particular data element” (see report). Although this report was pre-GDPR, the fundamental structure behind how data brokerage works has not changed since the regulation was passed. Data brokers source data from the government, publicly available data such as social media, commercial services such as advertisers, and even other data brokers.

As a result, data travels across an interconnected web of third parties, which, as the FTC found, would make it “virtually impossible for a consumer to determine how a data broker obtained [their] data” since they would have to effectively trace through this web. Expecting this web to be described comprehensively in a privacy policy is not a remedy because it still requires a consumer to trace what is going on behind the scenes.

Moreover, it is an unreasonable demand to make of companies because of what Helen Nissenbaum, a professor of Information Science at Cornell, calls the “transparency paradox”. Namely, there is a fundamental tension between specificity and brevity, as well as between comprehensiveness and clarity, when it comes to privacy policies. The impossibility of describing what goes on behind the scenes can arguably lead companies to use broader language, which grants them greater flexibility while still adhering to the law. In essence, this paradox is what makes true knowledge of all the facts an ideal that cannot be achieved in the current technical landscape.

Knowledge Of All The Risks

Perhaps consumers would exercise their right to know the sources and usage of their data, their right to be forgotten, or their right to restrict processing if they believed their data could be used for anything malicious. That many consumers apparently assume otherwise is critical to understanding why data privacy regulations also fail to address the second key component of informed consent: knowledge of the risks. The common perception is that data is collected for the purpose of advertising and thus is related only to advertising. This, however, is not the case. Two notable examples that violate this assumption come from data collected by Facebook and Target.

In 2013, researchers found that they could use an individual’s history of Facebook Likes to predict sexual orientation and several other characteristics to a high degree of accuracy (read paper). This was a result of common patterns between individuals of identical sexual orientation, regardless of whether an individual was “closeted” or “out.” The fact that this is possible is a cause for concern because the ability to identify marginalized groups also means the ability to target them. For groups that have historically been discriminated against, this creates an additional danger for them that they might be unaware of. It also takes agency away from individuals who have something about themselves that they would rather keep hidden from those around them.

In 2011, Target was able to predict the pregnancy of a teenage girl before she had told her parents, based on the similarity of her shopping patterns to those of other women known to be pregnant (source). Just as in the case of sexual orientation, pregnancy is a hyper-sensitive piece of personal information that is at the discretion of the individual to disclose. While how legislation could possibly protect against this is a separate argument, the case shows that it is difficult to determine exactly what seemingly harmless data, such as what you like on Facebook or what you buy at Target, can predict.

These two cases are not isolated and are emblematic of a documented phenomenon in big data analytics. A study in 2010 found that as few as 20% of users revealing attributes about themselves could help build a model to infer global attributes about a population (read study). In other words, for privacy-minded individuals who do not consent to their data being collected, informed consent is a moot point since data aggregators can still predict attributes about them because of those they are affiliated with.
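The mechanism can be illustrated with a toy example. The sketch below is not the method used in the study; it assumes the simplest possible inference rule, a majority vote over the attributes disclosed by a person's connections. All names, groups, and the `infer_attribute` helper are hypothetical.

```python
from collections import Counter

# Hypothetical friendship graph: each user maps to the users they are connected to.
friends = {
    "ana": ["ben", "cho", "dev"],
    "ben": ["ana", "cho"],
    "cho": ["ana", "ben", "dev"],
    "dev": ["ana", "cho", "eli"],
    "eli": ["dev"],
}

# Only some users disclosed the attribute; "ana" and "eli" never shared it.
disclosed = {"ben": "group_a", "cho": "group_a", "dev": "group_b"}


def infer_attribute(user, friends, disclosed):
    """Guess a user's attribute as the most common value among friends who disclosed it."""
    votes = Counter(disclosed[f] for f in friends.get(user, []) if f in disclosed)
    if not votes:
        return None  # no disclosing friends, so nothing to infer from
    return votes.most_common(1)[0][0]


# Infer the attribute for everyone who did not disclose it themselves.
inferred = {u: infer_attribute(u, friends, disclosed) for u in friends if u not in disclosed}
print(inferred)  # {'ana': 'group_a', 'eli': 'group_b'}
```

Neither "ana" nor "eli" provided any data about themselves, yet both receive a guess purely from what the people around them disclosed. Real systems use far richer models, but the consent problem is the same.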

It is currently unclear how much data analytics can actually predict, but the cases that have made headlines demonstrate that it is more than most people think is possible. The right to be forgotten established by legislation is a useful remedy in these cases. However, exercising this right requires individuals to know that they are known by a third party, and, as a result, it may be out of reach for the average person.

Knowledge of Available Alternatives

The final piece of informed consent which modern data privacy legislation fails to address is the knowledge of available alternatives. The right to data portability and the right to restriction of processing on their face seem to address this concern. After all, if no competitor can penalize users for denying their data, then users can freely move between alternatives.

However, this neglects the fact that for most websites, trackers, cookies, and fingerprinting are default opt-in and non-negotiable. The EU Cookie Law required users to grant permission but never mandated that users be able to freely use a site without giving up that information (source). Most sites, when asking users to accept their tracking and cookie notice, make it impossible to use the site if the user does not wish to consent. This frequently takes the form of a large banner that blocks most of the webpage and whose only clickable button is “Accept,” or a checkbox agreeing to the privacy policy that must be ticked before creating an account. This, in essence, is retaliation because the alternative is denial of service in its entirety. Nevertheless, it is allowed under the law. This is true even if the data collected is not central to the usage of the service from the consumer’s perspective because companies still require that data for their revenue.

In this manner, the right to restriction of processing is, in a sense, a “false right” in that it theoretically exists but is impossible to exercise in practice. Moreover, while the presence of alternatives is not strictly necessary for informed consent, the lack of alternatives forces privacy to be a secondary concern, and hence informed consent to be a secondary concern, because the primary concern is access to services.

The law cannot directly create alternatives for consumers to use, but it can inhibit their creation by making compliance prohibitively expensive. It is estimated that a firm of 500 employees must spend $3 million USD to comply with GDPR. Time, personnel, compliance management software, legal fees, and data processing are all costs that both small and large businesses must shoulder (source). For small businesses, this is one additional burden they must bear as they try to scale. For them, finding the money to pay for the infrastructure required by the law is significantly harder than it is for larger companies with a wealth of resources already at their disposal. In this manner, expensive legislation like GDPR tends to further entrench the incumbent powers in the market, making available alternatives even scarcer.

Conclusion

None of these problems with an informed-consent-based data privacy framework are new or particular to the GDPR and the CCPA. In fact, many of the examples given as well as the technological research into privacy were published before the GDPR was drafted in 2016. Yet they are still relevant under the data privacy regimes created by GDPR and CCPA because these two regulations only build upon what already existed. Just because these regulations give governments an ability to prosecute does not necessarily mean that data privacy as a right is actually strengthened.

The facts show that the pre-conditions of informed consent are clearly being violated. As a result, the right to privacy afforded by modern privacy regulations is, in practice, a facade: individuals do not have the knowledge to make informed decisions and are beholden to the data demands of third-party data collectors in exchange for their services. Moreover, the large predictive capabilities of big data analytics call into question whether consent is still a useful framework in which to view privacy, particularly given that the aforementioned research has shown how a small percentage of people providing their data can enable accurate inference about those who have not consented to be known.

Most importantly, the jurisprudence surrounding data privacy, if it is ever to live up to its ideals, cannot operate under false assumptions as it does today. Whether this requires modifying the assumptions of informed consent, a shift in perspective on privacy rights, or an entirely new framework altogether, what is constant is that any new legislation needs to be flexible enough not to hinder innovation but also stringent enough that only those who choose to give up their privacy do so. Some of the necessary changes are outside the scope of the law. Beyond providing remedies for those who have been harmed and setting the rules of engagement between individuals and third parties, the law’s impact is limited.

In addition to the legal changes, the technology surrounding privacy requires a radical shift towards giving users truly granular control of the data they provide. It also necessitates a change in the culture surrounding technology and how people interact with it, one that demands a higher level of technical literacy and more wariness about relinquishing personal data to others. In other words, a true right to privacy is built not by the law alone, but rather by a combination of regulators, designers, and engineers working in harmony to develop systems that are useful, secure, private, and just as easy to use as the technology of today.

The Startup

Medium's largest active publication, followed by +754K people. Follow to join our community.

Anmol Parande

Written by

Student of Electrical Engineering and Computer Science at UC Berkeley
