The Ecosystem of Data Protection and Privacy

Abhinav Pathak
Published in The Startup
Feb 26, 2021



On a rainy day, Emma is sipping her tea, lost in her thoughts.

She will end up sharing some of those thoughts with her circle; a few will be researched further, a few written down, and a few acted upon. Most will remain private, in her memory.

In the few cases where she chooses to act on a thought (buy, play, engage, apply, register, write, share, upload, click, research), she ends up interacting with her environment.

She isn’t entirely aware of the data ecosystem, of which she is more a part today than she was yesterday.

Data flow and activities

The image shows the current ecosystem of data flow and activities. A user generates data by interacting with the surrounding environment: websites, search engines, government agencies, retail stores, banks, and so on. This data is collected from multiple sources, collated, and mapped to build a massive database of PII (personally identifiable information) along with behavioral, transactional, and demographic information. The database is then sold to companies and law enforcement agencies, and the same person is targeted, threatened, or surveilled. The person interacts again, and the cycle continues. Because the data is collected and stored by multiple agents, it is susceptible to attack at multiple points in the cycle, as the breaches at Acxiom (’03), Epsilon (’11), and Equifax (’17) show. In some cases the environment agent is also the de facto data collection agent: Facebook, Google, and Twitter use the data they collect to advertise for third parties.

The blue lines show information flow and usage that seems benign, whereas the red lines show the concerns around data protection and privacy.

How is the data collected?

Data Sources — WordCloud

Every day, 2.5 quintillion bytes of data are created globally. Most of this data is collected from our interactions with the environment, which includes websites, browsers, social media, phones, and government agencies such as the DMV and the marriage registrar. Agencies such as data brokers pool and map this data to create a holistic dataset with over 1,500 unique data points for each person.

Who is buying data?

Data buyers — WordCloud

Governments, private companies, and marketers buy data expecting to create more value. This data may be used for marketing, surveillance, winning elections, screening job candidates, sanctioning loans, and more. Marketers, in particular, have never before had access to the volume and variety of data assets available today. In 2018, $12 billion was spent on acquiring third-party data in the US. On the bright side, this data is also used by researchers to advance scientific knowledge and for the betterment of society.

What kind of data?

Data kind — WordCloud

A few data types can directly identify a person, whereas others identify a person only in combination with other data. For instance, a combination of date of birth, gender, and ZIP code can single out an individual. According to a study, 87% of Americans are identifiable this way. The recent California privacy act defines identifiability in terms of probabilistic identification.
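To make the idea of identification by combination concrete, here is a minimal sketch that counts how many records in a dataset are unique on date of birth, gender, and ZIP code taken together. The records, the field names, and the helper `unique_fraction` are all made up for illustration; they do not come from any real dataset or from the study mentioned above.

```python
from collections import Counter

# Hypothetical toy records; every value here is invented for illustration.
records = [
    {"dob": "1990-04-12", "gender": "F", "zip": "94110"},
    {"dob": "1990-04-12", "gender": "F", "zip": "94110"},
    {"dob": "1985-11-02", "gender": "M", "zip": "10027"},
    {"dob": "1978-06-30", "gender": "F", "zip": "60614"},
    {"dob": "1978-06-30", "gender": "M", "zip": "60614"},
]

# Fields that are harmless alone but identifying in combination.
QUASI_IDENTIFIERS = ("dob", "gender", "zip")


def unique_fraction(rows, keys=QUASI_IDENTIFIERS):
    """Return the share of rows whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    unique_rows = sum(
        1 for row in rows if combos[tuple(row[k] for k in keys)] == 1
    )
    return unique_rows / len(rows)


print(f"{unique_fraction(records):.0%} of records are unique on {QUASI_IDENTIFIERS}")
```

On this toy data, three of the five records are unique on the full combination, while none is unique on gender alone. The same pattern, at much larger scale, is what lets a broker link an "anonymous" row back to a person.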

Data has shaped our modern world. With this tremendous amount of free-flowing data, debates around data protection and privacy are gaining momentum. Enforcement of rules and regulations around the collection, access, and usage of data has become increasingly crucial in the last couple of years as many cases of misuse have surfaced.

A few cases of data misuse and privacy encroachment

PII is exploited to increase the effectiveness of advertising campaigns. Do “Custom Audiences” or “Tailored Audiences” ring a bell?

Mass surveillance by governments is another concern that poses a threat to individual rights. Mass surveillance is justified by citing terrorism and cybercrime.

Identity theft: Data breaches expose the PII (in addition to other data) of millions of people, leading to identity theft. 30% of identity theft comes from corporate breaches.

Access to behavioral, demographic, and psychographic data of citizens provides a political advantage, as evident from the Cambridge Analytica and Facebook scandal.

A data broker was fined over $800K by the FTC for providing details of job candidates to recruiters for screening.

Companies in certain countries are required to store data locally for a longer period, to be used by the government in the name of protection against cybercrime and terrorism. However, activists and journalists are targeted, undermining freedom of expression, the right to association, and free assembly.

Following the user online and displaying ads for the same product repeatedly is a priming technique, exploiting our cognitive biases

Credit Card Fraud and email spamming are rampant due to hacked/shared information

Location data was misused by a travel company (“God View”) to track the location of journalists and celebrities.

Discrimination in opportunities based on race, health, education, and income information. For example, facial recognition of ethnic minorities (“Uighur Alarm”) in China raised eyebrows across the globe. Another example is price discrimination on e-commerce sites based on location and other information.

Because of the disparity in power between an individual and corporations, strong rules and regulations are needed to prevent exploitation by corporations and by the government itself. Historically, rules and regulations have been formed case by case, as concerns are raised about a common practice.

Timeline of data misuses and shaping of rules and regulations

Public debate, general awareness, and new laws come about when people question and voice their opinions on exploitation, or when a debacle like a data breach shakes things up.

from the year 1890 to 1999
from the year 2000 to 2020

Hopefully, after Europe’s GDPR and California’s Consumer Privacy Act, other nations will follow soon.

What next?

What can users do?

Much of the data is volunteered by users themselves. A few don’t take this seriously, and others who do take it seriously don’t know what to do and feel a lack of control over their data. 6 in 10 US adults do not think it is possible to go through daily life without having their data collected. Users need to be more aware of how their data is collected, stored, and disseminated. Using private browsing mode, extensions like TrackMeNot (which adds noise to your searches), and search engines like DuckDuckGo helps to an extent.

People across the globe need to deeply question their growing lack of data privacy and control. Governments and businesses would feel the pressure as the voice of the people becomes stronger

What can companies do?

Privacy notices that appear on a website or app are too long and impractical for an average person to go through. For instance, an average person would take 23 minutes to read WhatsApp’s privacy statements of 6K words. Hardly anyone ends up reading them, which makes consent an illusion of choice rather than a true one. In addition to (or instead of) a huge privacy notice, a user could be shown a simpler message, for instance, “Did you know this app is using your data right now?”
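The reading-time figure above can be checked with simple arithmetic. The sketch below derives the reading speed implied by 6,000 words in 23 minutes and then reuses it to estimate the time for a notice of any length. The function names and the 1,500-word example are assumptions for illustration, and actual reading speeds vary from person to person.

```python
def implied_reading_speed(words: int, minutes: float) -> float:
    """Words per minute implied by reading `words` in `minutes`."""
    return words / minutes


def reading_time_minutes(words: int, words_per_minute: float) -> float:
    """Estimated minutes needed to read `words` at a given speed."""
    return words / words_per_minute


# Figures quoted in the article: about 6,000 words read in 23 minutes.
speed = implied_reading_speed(6000, 23)
print(f"Implied reading speed: {speed:.0f} words per minute")

# Hypothetical shorter notice of 1,500 words, read at the same speed.
print(f"A 1,500-word notice: {reading_time_minutes(1500, speed):.1f} minutes")
```

The implied speed is roughly 260 words per minute, so even a notice a quarter of the length would still take several minutes, which is longer than most people spend before tapping “Accept”.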

If the ad-based business model is to continue, we must discuss the option of users getting paid for their data.

Focus less on consent and more on stewardship; companies should be held responsible. Even in aggregate targeting, organizations need to ease off on extreme targeting. Plus, instead of an opt-out approach, implement an opt-in model for ad targeting, denying companies the advantage of status quo bias.

Use stronger encryption to protect user data from hackers. Organizations need to strengthen their data defenses and treat them as a strategic investment rather than an ongoing cost.

Human errors, such as weak passwords and unaware employees falling for phishing, are a major cause of data breaches. However, companies now have training programs in place to minimize this.

What can governments do?

Bring into effect a holistic privacy and security law like GDPR (easier said than done, I know).

Try to stay ahead of tech advancements (now IoT and AI) when amending legislation around data protection.

Ensure there is no discrimination in opportunities by imposing stricter fines, and make it easy for people to delete all their online traces of data: the “right to be forgotten”.

Internationally, we have to agree upon one common definition of privacy across the globe, as information flow and humanity have no boundaries.

Why should Emma have to worry about her Data being compromised? Why does she have to go out of her way and take a hundred steps and read ginormous documents to ensure her privacy is protected? Why can’t the process be as simple as a click? Why does she have to live under the fear of being watched?

Let Emma enjoy her day in peace