How your coffee shop could take over the world: Big Data & the Network Effect

The transition into the digital era was marked by a ubiquity of computers in all shapes and sizes. From personal computers to mobile phones to smart car consoles, technology intruded into our daily lives faster than we could measure the implications. The practical uses of these devices and their applications are obvious: we watch shows on Netflix, we use social media to connect with friends, and we use Teslas to drive without having to use our hands.

But sometimes the uses aren’t so obvious. Maybe we use social media to post photos, or share our locations with friends, or look at news feeds, or share our opinions, or all at the same time. The point is, modern technologies can be rather amorphous in their functions and purpose. As users, we have little sense of what data is actually being collected and operated on when we’re using technology. Perhaps Facebook is tracking the phrases we use the most in our private messages under the guise of optimizing user experience, or Netflix is analyzing the movies we watch in order to profile us psychologically. A number of privacy and transparency regulations, such as the European Union’s General Data Protection Regulation (GDPR), have been enacted in order to protect users from these issues. However, as technology evolves, it becomes more and more difficult to identify what data may be compromising and in what manner we are exposing ourselves.

The Rise of Artificial Intelligence

The crux of the issue is the development of advanced machine learning and the rise of artificial intelligence. In general, the AI horror story resembles something out of Terminator.

The reality is that the dangers of AI already pervade our culture. Terminator depicts artificial general intelligence (AGI), an artificially intelligent system that can perform any task a human can. AGI is still a long way away. Scary, but not relevant to our lives quite yet.

The other type of AI is called weak artificial intelligence (also known as narrow AI or specific AI), which is an artificially intelligent system designed to optimize for one specific task, such as voice or text recognition. This type triggers less visceral fear. Who would be scared of an AI that is simply really good at reading? The name, weak AI, isn’t very intimidating either. Nevertheless, this type of artificial intelligence has already pervaded our everyday lives. It can be beneficial in many cases, such as Google Maps optimizing our routes according to traffic patterns or spam filtering for emails. However, in a very real sense, nightmare scenarios can occur and have occurred.

The Nightmare Scenario

Consider a scenario in which there is a dataset with your basic identity, who you associate with, and what you browse on the Internet. With this information, a weak AI system could learn trends in our networks and analyze our personalities in order for a malicious actor to manipulate us. The immediate question is, how could we allow all of this data to be collected? The unfortunate answer is that it already has been.

In 2014, all it took was a personality survey on Facebook that scraped information from user profiles. Information on profiles, likes, and friends was collected and ultimately was provided to the firm Cambridge Analytica. With the help of AI, the data could be used to manipulate users in any number of ways, such as through targeted election ads.

Technology may collect data in an unpredictable variety of ways, and this data can come in many shapes and forms. With the help of AI systems, collections of data become much more powerful than the sum of their parts. This can be seen as a form of the network effect: large networks of users prove to be disproportionately more valuable than smaller networks.
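One rough way to put numbers on the network effect is to count the possible connections between users. The sketch below is only an illustration under an assumption this article does not make: it borrows the idea behind Metcalfe's law, that a network's value tracks the number of distinct user pairs, n(n - 1)/2. The function name pairwise_links is hypothetical.

```python
def pairwise_links(n: int) -> int:
    """Number of distinct user pairs in a network of n users: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Compare how the user count and the pair count grow together.
for users in (10, 100, 1000):
    print(f"{users:>5} users -> {pairwise_links(users):>7} possible pairs")
```

Under this assumed model, a network with 100 times as many users (1,000 instead of 10) has 11,100 times as many possible pairs (499,500 instead of 45). The growth is quadratic rather than literally exponential, but it shows why a large dataset can be worth far more than many small ones.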

Regulatory acts were passed in response to the Facebook data breach. The line of privacy becomes much more blurred, however, when users give consent to give up their data.

Consumer Consent and the Data-Selling Model

What if Spotify offered a 20% discount in exchange for access to your browsing history? Or your grocery store offered 10% off in exchange for your Facebook profile?

There is little regulation of the data that users give up with their consent, and each user has little incentive to withhold it, especially if the data is anonymized or not compromising. Because collections of data grow more valuable with the network effect and modern computing power, a new business model has arisen: firms can incentivize consumers to consent to share their data.

On my campus, there’s a shop named Shiru Cafe. Students love going to the shop for one reason: the coffee is free! This is possible because, upon entry, students swipe their student ID cards, consenting to give up basic information to Shiru. Shiru is sponsored by a number of corporations, which use the data to analyze their recruiting classes. Another example is MoviePass, which offers discounts on movie tickets in exchange for user data, which it then sells to (attempt to) turn a profit. Neither company is particularly successful yet, as both are still in their nascent stages, but both have gained major traction and attention in pioneering the data-selling model.

The use of data in these cases is not directly malicious, but it shows a clear apathy among consumers when they are asked to give out their data. In models like Shiru and MoviePass, users are incentivized to share their data because it is harmless and valueless to each individual, while companies are incentivized to collect it because aggregated data is much more valuable. Companies gain a competitive advantage from data network effects, allowing them to use capital more efficiently to improve their products.

The Nightmare Scenario Revisited

In the Facebook data breach, private data was taken without user consent. However, now that companies can offer direct incentives for users to share their data, we may see data sold freely and datasets accumulated by corporations at a rapid, unprecedented pace.

Generally, all companies can create user profiles and extract broad-scale industry trends. For example, Tesla could offer charging station discounts to users in exchange for location data, which would reveal broad-scale trends such as where and when consumers are driving. However, the accumulation of this data also creates secondary use cases if Tesla were to sell the location data or decide to use it in another way. An advertising firm could place its billboards to target specific drivers’ tendencies, or a retail corporation could choose its store locations by finding out where its target demographic generally drives during shopping hours.

Or, in the case of a bad actor or data breach, a hostile government or organization could plan an attack where peak traffic occurs. This is without a doubt a nightmare scenario, but in this case, it is one that is more difficult to pin on any regulatory failure or corporate negligence. Every user is responsible for having given up their data, even though at a basic level they may not have any direct incentive to not divulge location information.

As it becomes easier and easier to save money by being more liberal with our data and corporations become more capable of processing big data, we will observe a massive ascent of the data economy. Data is already being collected in everything we do. Artificially intelligent systems, through techniques such as deep learning, can find insights and trends in data that we wouldn’t be able to identify or understand otherwise.

It is up to us to measure and predict the implications of the data economy we’re heading towards so that this time around we can promote technological transparency, privacy, and consumer responsibility. So when your local cafe begins offering discounts in exchange for connecting to your social media, think again about taking that discounted latte and what it might cost.