Big Data is People!

Daniel Appelquist
5 min read · Jul 23, 2015

Note: This post was originally published on the Future Insights blog.

Big data has become a well-worn buzzword of the technology industry. We’ve been hearing about the benefits to big business from owning, processing, visualizing and basing decisions on big data.

But where does all this data come from? It doesn’t just spring into existence. The inconvenient truth is that this data is produced by people, by us, often without our knowledge or meaningful consent. In a real sense, big data is people.

As you walk down the street, your location is being tracked by your mobile phone network operator. This is not a conspiracy theory; it is simply how your operator connects an incoming call to you when someone dials your number, or delivers a text message to your phone. Knowing your location is necessary to provide you with phone service, and nobody should be surprised that this is happening.

In personal-data parlance, this is a "primary use." You implicitly allow your mobile network operator to know where you are so that you can receive texts and calls. Network operators, however, are now putting this data to another, secondary use: aggregating, homogenizing and anonymizing it, then selling it wholesale to companies and organizations that use it to inform decisions. This "big data" revenue stream is seen as one of the key growth areas for mobile operators across Europe, where ARPU (average revenue per user) has largely flatlined.
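What "aggregating and anonymizing" actually involves is opaque to the people being tracked, but the general idea can be sketched. The example below is hypothetical (the `aggregate_pings` helper and the data shape are invented, not any operator's real pipeline): bucket raw location pings by cell tower and hour, then suppress any bucket with too few distinct users so no individual stands out.

```python
def aggregate_pings(pings, k=5):
    """Aggregate raw (user_id, cell_id, hour) pings into per-cell,
    per-hour counts, suppressing any bucket seen by fewer than k
    distinct users (a crude small-count anonymization step)."""
    users_per_bucket = {}
    for user_id, cell_id, hour in pings:
        users_per_bucket.setdefault((cell_id, hour), set()).add(user_id)
    # Keep only buckets large enough that no single user is identifiable
    # from the count alone; the sold product is counts, not identities.
    return {bucket: len(users)
            for bucket, users in users_per_bucket.items()
            if len(users) >= k}
```

Even a scheme like this is weaker than it looks: coarse counts from enough cells and hours can often be re-linked to individuals, which is why "anonymized" location data deserves the scare quotes.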

But did anyone ever ask mobile phone users for their permission to use their data in this way?

Likewise, as you traverse the web, you are being tracked without your knowledge or meaningful consent. The sound you're hearing is the collective shrug of indifference from most of humanity about this common practice. True, most people seem happy enough to surrender their web comings and goings to advertising tracking networks. A single web search for washing machines can lead to a host of white goods following you around the web from site to site. Most users tolerate the practice and see it as harmless.

In return, you as a user get to use the Internet's "free" products: Google, Facebook, Twitter and the like. The phrase "if you don't pay for it, you are the product" has often been repeated in this context, but it misses a nuance. Yes, you pay for Google by giving the company information about yourself and by viewing (and sometimes clicking on) the ads it targets at you; that is the primary use of the collected data. But the information doesn't go away once those ads are viewed. It is retained, homogenized with data from millions of other users, mined in perpetuity for insights and analytics, and sold on in bulk to make more money or to cut costs. It is the gift that keeps on giving. Social media innovators like Facebook and Twitter have raised the exploitation of big data to an art form.

The term "big data" echoes "Big Oil," and the echo is apt: there is a rush on right now to survey, map, extract, refine and exploit data. Big data is big business, and the potential returns are extraordinary.

As reported in the New York Times last month, talks sponsored by the U.S. federal government on facial recognition recently broke down. At issue was whether "people should be able to walk down a public street without fear that companies they've never heard of are tracking their every movement — and identifying them by name…" Privacy-focused groups such as the Center for Democracy and Technology, the ACLU and the Electronic Frontier Foundation walked out of the talks over this issue.

Commercial interests constantly push the boundaries of acceptable data collection and use. I witnessed the collateral damage first-hand when the advertising industry gutted the nascent Do Not Track web standard at the World Wide Web Consortium. Work on that standard began as a good-faith effort to let people say no to web advertising tracking networks. Fast-forward five years, and what do we have? Thanks largely to obstructionist, blocking and delaying tactics (and sometimes outright hostility) from the advertising industry, we have no agreed W3C standard, no legislation bound to its use, and no advertising network willing to honor it.
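The bitter irony is how simple the mechanism itself was. Under the W3C Tracking Preference Expression draft, a browser expresses the preference with a `DNT: 1` request header, and honoring it server-side would have amounted to a one-line check. A minimal sketch of what a cooperating ad server could have done (the `tracking_permitted` helper is hypothetical):

```python
def tracking_permitted(headers):
    """Return False when the client has expressed a Do Not Track
    preference.  Per the W3C Tracking Preference Expression draft,
    a "DNT: 1" request header opts out of tracking, "0" signals
    explicit consent, and absence means no preference was expressed."""
    # HTTP header names are case-insensitive, so normalize keys first.
    normalized = {k.lower(): v.strip() for k, v in headers.items()}
    return normalized.get("dnt") != "1"
```

The technical cost of compliance was trivial; the standard failed over business incentives, not engineering.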

The 2002 film Minority Report envisioned a world where ubiquitous facial recognition would enable interruptive, personalized marketing at every turn. The same technology enabled a police state, allowing state security services to track every person as they went about their daily business. Marketers and police alike salivate over the possibilities such power would give them.

But big data collectors don’t need advanced facial recognition to make this dream a reality. They can already turn to readily available technologies to get the job done.

It was recently reported that London buses are to be fitted with so-called iBeacons in a trial to "push targeted offers" (adverts) to commuters. Of course, in order to push these (no doubt wonderful) offers to you, the system also collects data about you. Make no mistake: the advance of beacons is about more surreptitious data collection. Beacons are the fracking of big data. By injecting yet more ads into our everyday lives, companies hope to extract yet more data, collected without your knowledge or meaningful consent, that they can exploit for profit.
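It is worth being precise about where the tracking happens. An iBeacon itself broadcasts only an identifier, in Apple's published advertisement format: a 16-byte proximity UUID plus two short integers (major and minor) and a calibrated signal-strength byte. The surveillance occurs when an app on your phone hears that identifier and reports the sighting back to a server. A minimal sketch of parsing the standard payload (the concrete UUID in any example would be illustrative, not a real deployment's):

```python
import struct
import uuid

def parse_ibeacon(mfg_data: bytes):
    """Parse Apple's iBeacon manufacturer-specific BLE payload:
    company ID 0x004C (little-endian), type 0x02, length 0x15,
    a 16-byte proximity UUID, big-endian major and minor IDs, and
    a signed calibrated TX-power byte.  Returns None otherwise."""
    if len(mfg_data) != 25 or mfg_data[:4] != b"\x4c\x00\x02\x15":
        return None
    proximity_uuid = uuid.UUID(bytes=mfg_data[4:20])
    major, minor, tx_power = struct.unpack(">HHb", mfg_data[20:25])
    return {"uuid": str(proximity_uuid), "major": major,
            "minor": minor, "tx_power": tx_power}
```

The beacon, in other words, is dumb; the data trail is built by the listening app, which knows who you are, where the beacon is, and when you walked past it.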

Put away your tinfoil hats. I am not suggesting we should disengage from social media and the web, compost our smartphones and crawl back into caves. Obviously, there are innumerable advantages to all of these services; I'm a user of Facebook and Google myself. Nor is every use of "big data" a cash grab. Google Flu Trends famously uses data extracted from (geolocated) users' searches to estimate the spread of influenza.

Nevertheless, it's clear we desperately need a rethink about privacy, big data, primary versus secondary use, and transparency about how this data is collected and used. As reported in the Guardian last month, new Europe-wide data protection legislation is currently being drafted. This is certainly a step in the right direction, but more attention needs to be paid to big data and meaningful consent. As we move ever deeper into the world of data, we need to understand that the data we generate is becoming a currency, one we are currently giving away. We should start demanding an accounting of how our data is used and what we get back in return.


Daniel Appelquist

Open web curmudgeon; Open Source & Open Standards strategic boffin; co-chair of W3C TAG; Immigrant, Londoner & World Citizen; https://torgo.com