The Thing Most People Don’t Realize About Data
Facebook was recently ordered to stop collecting data on WhatsApp users in Germany.
As someone who learned how to code fairly late in life, I feel like I straddle the line between mainstream consumer of products and understander of how they actually work. Sometimes it’s disturbing (knowing “how the sausage is made” — so to speak). And I’ve learned a lot about data in the process that I think the average person is unaware of.
So when I hear about most people’s beliefs about the data that they generate, and the kind of privacy they expect or feel they have a right to, I realize most of them are unaware of how this data is created in the first place. I’m writing this essay as a sort of layman’s guide to data, so that we can at least be informed about what’s going on and then form a position from a place of thoughtfulness rather than fear.
We generate data everywhere we go — it’s impossible to prevent
Think about a physical store for a second.
I once had my backpack stolen in a Chipotle. It had my laptop and some textbooks in it. Over $1000 worth of stuff.
I had to file a police report and make a claim to insurance and all that. At some point I realized Chipotle has video cameras in every store! They probably have video of the guy or girl that took my backpack, and that should make it much easier to track them down and get my stuff back.
It turns out, most stores only record their cash registers for exactly this reason. They want to be able to track down robbers who steal money from the business itself, but they don’t really want to get involved in every theft between customers, so they don’t always record what’s generally going on in the store. That way, if the police come asking for tapes, they don’t have to waste time complying; they can just say, “Sorry, we don’t have a camera there.” So I was shit out of luck in this case.
But the point is, most of us implicitly accept that when we walk out of our own homes and into a restaurant or someone else’s store, we may be recorded, or data may be tracked about us.
We think it’s wrong for someone to look into our window, but as soon as we leave the protection of our front door, it’s not necessarily wrong, or at least it’s not illegal, for someone to look at you. You’re forfeiting some of your protections.
In the case of walking into a physical store and being recorded, we accept it as part of the “terms of use” of the store. If you walk onto someone else’s property, they have the right to collect information about you in order to protect themselves.
Of course, they don’t have carte blanche to collect everything — it would be fucked up if they did an infrared scan of every customer that walked in to see what they’re carrying on them — but we’re generally okay with the idea of being on tape. So there’s a line here, but the line has more to do with social conventions, what we’re comfortable with, and what’s socially prevalent. It’s obviously different on a country-by-country basis.
The thing is that when it comes to websites or apps, our brain tricks us into thinking that because we’re going to them from the comfort of our own homes, we have the same right to privacy as we would if someone came and looked through our windows. Unfortunately, that’s not the case.
Opening up a website or downloading and opening an app is the same thing as walking outside of your own home and going into someone else’s store or business. In a very literal sense, you’re using your computer as a portal to connect to someone else’s server. That they paid for. And accessing code and files that they wrote. And getting data from their database.
And they also track your behavior on their website or app.
Now there are many reasons why a website or app would track your behavior, or collect information about you. It’s not all evil, and it’s not all about advertising or manipulating you. There are perfectly legitimate reasons why it would make sense or even be necessary to collect this data.
Did you know that in order to log into a website, the website has to put a cookie on your computer? Did you know that that’s essentially what it means to “log in”? They put a little cookie with a session id on your computer, and that’s how the website knows who you are and that you are logged in.
We take for granted the ability to use a website easily, not realizing that a fundamental part of how the technology works involves tracking you. (Okay, like everything here, I’m simplifying. There are other ways to do this, but the one I’m describing is by far the most common.)
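To make the session cookie idea concrete, here is a minimal sketch in Python of what "logging in" might look like on the server side. Everything in it is an assumption for illustration: the `USERS` and `SESSIONS` dictionaries, the `log_in` and `who_is_this` functions, and the cookie name `session_id` are all made up, and a real site would store hashed passwords in a database rather than plain text in memory.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory stores, for illustration only.
USERS = {"alice": "correct-horse"}  # username -> password (never store plain text for real)
SESSIONS = {}                       # session id -> username


def log_in(username, password):
    """Return a Set-Cookie header value on success, or None on failure."""
    if USERS.get(username) != password:
        return None
    session_id = secrets.token_hex(16)  # random, hard-to-guess identifier
    SESSIONS[session_id] = username     # the server remembers who holds this id
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    return cookie["session_id"].OutputString()


def who_is_this(cookie_header):
    """Look up the user from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    return SESSIONS.get(morsel.value)
```

The browser stores the cookie and sends it back with every later request, and that is the only reason the site can tell your requests apart from anyone else's. In other words, being "logged in" amounts to the server recognizing an identifier it handed you earlier.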
You can, of course, use protections like connecting to websites through proxies, or using certain forms of encryption, but you’re not really preventing data from being created about you. You’re just changing the nature of the data and making it harder to associate with you as an individual. Making that association is definitely still possible, though.
Virtually all websites and apps also keep log files of what’s going on. They do this partly so that they can track errors, fix the errors, and provide you with a better experience. If they didn’t do this, every website or app you use would be constantly crashing, and that would be a horrible user experience. (Think about the 90s or early 2000s, when these products were notoriously finicky and breaking all the time. I think we can all agree no one wants that.)
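Here is a rough sketch of what that logging can look like, using Python's standard `logging` module. The `handle_request` function, the `/broken` path, and the log format are all hypothetical, and the in-memory `log_output` stands in for a real log file, which would typically also include timestamps.

```python
import io
import logging

log_output = io.StringIO()  # stands in for a log file on the server's disk
logger = logging.getLogger("example_site")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep these messages out of any other handlers
handler = logging.StreamHandler(log_output)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)


def handle_request(ip_address, path):
    """Serve a page and record the visit, including any error."""
    logger.info("ip=%s path=%s", ip_address, path)
    try:
        if path == "/broken":  # hypothetical page with a bug in it
            raise ValueError("page failed to render")
        return 200
    except ValueError:
        # Records the error plus a traceback so a developer can fix it later.
        logger.exception("ip=%s path=%s crashed", ip_address, path)
        return 500
```

Notice that the same lines that let a developer find and fix the crash also record which address visited which page. The debugging data and the tracking data are the same data.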
Of course, I’m not defending all forms of collection and usage of data. I’m willing to accept that in some, and possibly most, cases our data is being tracked and used in really bad ways. We obviously need to protect against bad practices like saving our sensitive information (such as passwords or credit card info) in a way that makes it really easy for a hacker to get. That’s where rules like PCI compliance come from: they govern how companies are allowed to store credit card data.
But I do think it’s important that people stop being naive about the fact that when you’re visiting a website or opening an app, you’re no longer in the comfort and protection of your own home. To a certain extent, you have to play by the rules of the business you’re walking into, or you have the option to not walk into their store.
How “Protectionist” Data Policies Could Kill Innovation
So, going back to the Facebook story: my fear is that the lawmakers who are creating regulations about how companies can and can’t use data have so little understanding of data itself and how it’s created that they will end up creating very “protectionist” policies.
I’m not saying that in some cases these laws and policies aren’t justified, but I am saying that we need to be careful. The more we demand of companies in terms of how they can and can’t use data, the more restrictions we create that affect the day-to-day of how companies create products, the harder and more onerous it may become for them to innovate.
The same law that might prevent a technology giant like Facebook from serving you personalized ads may also unintentionally make it basically impossible for a new startup to do something truly innovative.
At One Month (my education company), I’ve seen firsthand how the same laws and policies that were initially created to protect consumers from predatory for-profit universities with shady business tactics have made it incredibly difficult for legitimate smaller companies to even enter the space without huge amounts of wasted time and money. (Did I ever tell you about that time we were forced to put together a “Fire Escape Plan” for our online school because all the laws were created with in-person institutions in mind?)
The goal, I think, should be for people (and especially lawmakers) to have enough understanding of the technology that they’re using every day to know the implications of their behavior, and to be able to make their own informed decisions about what they do or don’t want to do — and therefore share with the world. Otherwise our behavior will continue to be guided by fear rather than thoughtfulness. By ignorance rather than knowledge.