Surveillance Capitalism: Regulation & Transparency

Surveillance Capitalist Firms: companies whose business model is to extract and accumulate data in order to “predict and modify human behavior as a means to produce revenue and market control.” (See Shoshana Zuboff’s paper for more details.)

The poster children of Surveillance Capitalism are Google and Facebook, but many big companies are following the same path (Amazon, Uber, traditional telco companies…), and it has become the de facto business model of many “AI first” startups (accumulate data and feed it to algorithms).

Historically, many of these companies have operated in an “it’s better to ask for forgiveness than to ask for permission” mode. They would use borderline ways of extracting data and deal with regulation or user complaints later (this is exactly what Google did in Europe with Google Street View: it collected data without permission and later dealt with several lawsuits).

However, it seems that we’re at a pivotal moment. The number of reported abuses is increasing, as is the scale of their impact, and bad press keeps piling up. I personally don’t think that unethical or borderline behavior is inherent to this business model. I think our society can benefit from the progress brought by data- and AI-powered products that respect user privacy and are transparent.

In the coming months and years, companies operating with this business model will feel the pressure to change their modus operandi, with pressure coming both from the inside (codes of conduct, a culture of transparency) and from the outside (new regulation, scrutiny from customers and users). I’ve listed below several areas where this pressure will be felt and what it implies.

Transparency Culture / Code of Ethics

If there’s one lesson to be drawn from the recent Facebook / Cambridge Analytica fiasco, it’s how opaque both the system and the communication around it are. It’s really unclear where our data goes and how it is used. I believe that Surveillance Capitalist firms will increasingly feel the pressure to be more transparent and to communicate better with their users.

Transparent and easy-to-read Terms and Privacy Policies. We’ve reached a point where “AI that reads privacy policies so that you don’t have to” products are being developed (which is insane when you think about it). These companies should keep their Terms of Service easy for humans to understand. I also believe that the “rating agency” model will become more important; see ToS;DR as an example.

Code of Conduct / Code of Ethics. In addition to their Terms of Service, I wouldn’t be surprised to see more companies drafting a Code of Ethics and sharing it with their users. As Aaron Levie tweeted: “We’re in the very early stages of a major shift in software. As more of the world goes digital, the responsibility of tech companies grows exponentially. The days of arguing that (and acting like) tech companies are merely platforms and pipes are behind us.”

More choices for users. Many services only offer a binary choice: “You can use our product only if you accept our ToS, and you have no say in what we do with your data. If you don’t accept that, just don’t use our product.” If there’s enough pressure from users, I think these service providers will be forced to offer more granular control over what we share.

Open and transparent communication. More transparent communication from these firms is clearly needed. They do a great job of communicating about their technical and downtime problems; they will probably have to do the same with user privacy changes.

Privacy Compatible Tech

There’s a lot to be done on the technical side to make these companies more accountable, and also to improve privacy without sacrificing the efficiency of data- and AI-powered products.

Decentralized model. One of the major problems when dealing with user data is the traditional centralized model: a service extracts data directly from its users and hosts it on its own servers. However, alternative approaches and protocols focused on privacy and anonymity exist (see Anonize, based on zero-knowledge proofs, or Google’s Federated Learning, which is collaborative machine learning without centralized training data). I’m personally very excited by this trend.
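To make the federated idea concrete, here is a minimal, illustrative sketch of federated averaging with a toy one-parameter model: each client improves the shared model on its own private data, and only the resulting parameters (never the raw data) go back to the server, which averages them. The client datasets and learning rate are made-up assumptions for the example.

```python
# Minimal sketch of federated averaging (FedAvg-style).
# The "model" is a single parameter fit by gradient descent on
# mean squared error; raw client data never leaves the client.

def local_update(theta, data, lr=0.1, steps=10):
    """Run gradient descent on the client's private data."""
    for _ in range(steps):
        grad = sum(2 * (theta - x) for x in data) / len(data)
        theta -= lr * grad
    return theta

def federated_round(theta, client_datasets):
    """Server-side step: average the locally trained parameters."""
    updates = [local_update(theta, data) for data in client_datasets]
    return sum(updates) / len(updates)

# Three clients, each keeping its data on-device.
clients = [[1.0, 2.0, 3.0], [2.0, 4.0], [3.0, 5.0, 7.0]]

theta = 0.0
for _ in range(20):
    theta = federated_round(theta, clients)
# theta approaches the average of the clients' local optima,
# even though the server never saw a single data point.
```

Real systems (such as Google’s Federated Learning work) add secure aggregation and differential privacy on top of this basic loop, but the core property is the same: the server only ever sees model updates.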

Result traceability. In many cases AI products are black boxes. The problem is worse with Deep Learning tools, as even the people who build them are often unable to explain how a prediction was produced. Being able to explain how a prediction was made is important because it can help fix biased results. I believe that the field of “AI auditing” will grow in the coming years (see this).

Algorithm testing. Algorithms are not neutral: the predictions they produce are influenced by the people who build them and by the data they are trained on (dressing this up as objective is called Mathwashing). Some people want to be able to “test” algorithms with different types of datasets (for example through an API) to see what results are produced. It might not be necessary for music recommendation algorithms, but it could be useful for tools used for public services, for example.
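A hedged sketch of what such testing could look like: probe a black-box decision function with datasets representing different groups and compare outcome rates. The toy approval rule, the field names, and the applicant data below are all illustrative assumptions, not any real model.

```python
# Probe a black-box decision rule with two datasets and compare
# approval rates; a ratio far below 1.0 flags potential bias
# (a simple "disparate impact" check).

def approve(applicant):
    """Toy black-box decision rule standing in for a real model."""
    return applicant["income"] + 5 * applicant["years_employed"] > 60

def approval_rate(dataset):
    decisions = [approve(a) for a in dataset]
    return sum(decisions) / len(decisions)

def disparate_impact(dataset_a, dataset_b):
    """Ratio of approval rates between two test datasets."""
    return approval_rate(dataset_a) / approval_rate(dataset_b)

group_a = [{"income": 50, "years_employed": 3},
           {"income": 40, "years_employed": 1},
           {"income": 70, "years_employed": 2}]
group_b = [{"income": 80, "years_employed": 5},
           {"income": 65, "years_employed": 4},
           {"income": 90, "years_employed": 6}]

ratio = disparate_impact(group_a, group_b)
print(f"approval-rate ratio: {ratio:.2f}")
```

The point is that the tester never needs to see inside the model: exposing the decision function through an API is enough to run this kind of audit.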

Data traceability. When users consent to share data with a service, they should be able to trace where it goes and how it’s used. From Gabriel Weinberg: “Presumably every Facebook API request is logged and those logs exist indefinitely. Therefore, it should be possible to go back and see how many times friends of friends data was downloaded and by whom, and further how many companies and who exactly scraped millions of records.”
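A small sketch of the kind of audit log Weinberg describes: if every API request that touches user data is recorded, the operator can later answer “who downloaded this data, and how often?”. The log schema, app names, and sample entries are illustrative assumptions, not any real API’s format.

```python
from collections import Counter

access_log = []  # in practice: an append-only, durable audit store

def log_access(requester, user_id, fields):
    """Record one API request that touched user data."""
    access_log.append({"requester": requester,
                       "user_id": user_id,
                       "fields": tuple(fields)})

# Simulated third-party apps pulling user data through the API.
log_access("quiz-app", "alice", ["profile", "friends"])
log_access("quiz-app", "bob", ["friends"])
log_access("weather-app", "alice", ["location"])

def downloads_of(field):
    """Count, per requester, how many logged requests touched a field."""
    return Counter(e["requester"] for e in access_log if field in e["fields"])

print(downloads_of("friends"))  # Counter({'quiz-app': 2})
```

With such a log, the retroactive question “how many times was friends data downloaded, and by whom?” becomes a simple query rather than an unanswerable one.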

Legal Environment

Regulation. The elephant in the room. Several states have started to roll out new laws specifically designed to protect user data (see GDPR in Europe). We can definitely expect more regulation on that front in the years to come. A very common analogy is with oil: many people think that data should be treated as a resource, like oil, and perhaps even nationalized.

Data auditing. Even non-public companies have to regularly report their financial status to the authorities, and for big private companies financial audits are even compulsory. I don’t think it’s far-fetched to expect such audits to be required of companies extracting a lot of user data as well. As in a financial audit, they would need to explain precisely what they do with the data they collect (and, as with financial audits, the results wouldn’t need to be shared publicly; they’d mainly serve compliance purposes).

User pressure. As people become better educated about these issues, I expect more pressure coming directly from users (class actions, pressure groups…).

Final words

It would be naive to think that all these requirements and actions will solve every problem and transform the big Surveillance Capitalist firms within the next couple of months. To give a more granular answer, I want to distinguish the Surveillance Capitalist firms that have reached hyperscale (Google, Facebook, Amazon…) from the smaller companies and startups whose business model is also to extract and accumulate data in order to “predict and modify human behavior.” They will be impacted differently by this new environment.

Hyperscale Companies

The big question is whether these companies believe that we’re just in a temporary situation — basically, that we are in the “ask for forgiveness” phase, but in a few months everyone will have forgotten about it, so they’ll continue to operate the way they used to — or whether this pressure will be strong enough to force them to change.

Another aspect to take into account is that, as I’ve explained, these companies got used to operating in the “better ask for forgiveness than for permission” / “move fast and break things” mode. They have a history of publicly promising to be more careful with user privacy, yet every time they tend to break things again. Changing such a company culture is extremely hard. Their business model (mostly advertising) also makes it harder for them to change direction completely.

That being said, I don’t see how they could avoid being impacted by the pressure coming from states and users. They certainly won’t be able to continue operating the way they have in past years. It remains to be seen exactly how they’ll embrace these changes.

Smaller Companies / Startups

I think the impact on smaller companies and startups using this business model will be different. Since they don’t have the scale of Google and Facebook, I believe they’ll need to adopt more ethical behavior or risk being put out of business:

  • Users have more power, as they are generally less “locked in” by these services. See how MoviePass stopped its creepy location-tracking feature within days.
  • They’ll also need to comply with new regulations, such as GDPR, but they have fewer resources to do so than a Google or a Facebook. They must be compliant from the start and build companies that respect user privacy from the ground up.
  • I also believe that adopting a transparent company culture around user privacy is increasingly becoming a must-have for emerging startups. The “move fast and break things” motto won’t be as popular when it comes to user privacy and ethics.

This is why I’m still optimistic and why I don’t think this business model is inherently “evil.” I hope a new generation of privacy-friendly products will emerge without sacrificing what data and AI can bring in terms of technology.