Some follow-up on my Cambridge Analytica/Facebook piece

Alexander Nix, ex-CEO of Cambridge Analytica, By SAM_7378, CC BY 2.0

The feedback I’ve received since publishing my article on the Cambridge Analytica story, and the issues I had with the coverage, has been overwhelmingly positive. When you write something, you usually hope that some people will read it, and on this occasion many did. I’m glad that a lot of people found something useful and informative in the piece and I still fundamentally agree with what I wrote.

However, I’m just as susceptible as everyone else to negativity bias, and so it is the critical responses, more than the positive ones, that I tended to notice. I was not expecting the piece to receive so much attention and, with hindsight, there are things (including specific wording) that I would probably change. I am also not sure the Star Wars and Jesus GIFs would have made it in. Then again, there is currently a 1-to-1 ratio between my article featuring a Mark Hamill GIF and it being widely spread.

Sorry, I couldn’t resist.

I’m writing this piece to clarify my positions and respond to the criticisms that I saw raised most often. In part, then, I will be repeating some of the things I have already said in the comments under the original article, so I apologise in advance to anyone who ends up reading me make the same point twice.

For ease of reference, I’m paraphrasing the common criticisms I saw and adding my response to them underneath.

1: The mainstream coverage was not that bad. I saw lots of nuanced articles that did not exaggerate the story like you claimed.

There was good coverage out there. In fact, part of my original motivation for writing the piece was the discrepancy between what I was seeing in mainstream articles and those written by tech journalists, who typically got most of the details right. That’s why I added the ‘(almost)’ in the title. There was also some decent coverage in the Guardian and NYT (I referenced quotes from this article, for example), but these pieces still typically did a poor job of contextualising the issues and linked to misleading articles. How many people recognised from such coverage that the data Cambridge Analytica collected was easily reproducible by tens of thousands of other companies who also used the friends permissions feature? From my reading, and the feedback I received on the previous article, it certainly did not seem to be the majority.

If you are someone who is involved in tech, or someone who just does a lot of research and has cultivated reliable sources, then you will likely have come across a lot of good coverage and ignored (or avoided) the sensationalism. My piece was not arguing that it was impossible to find good coverage. My argument was that a general audience, who would be likely to have read only a couple of mainstream articles and heard some pundits discussing the story, would come away with a misleading impression.

Here I invite anyone who is still sceptical to conduct their own experiment to test whether I am right that the coverage has been misleading, or wrong and it has provided most people with an accurate understanding. Ask someone you know who is not technically savvy or super invested in the topic to summarise the story and see what they say. I’ve tried this multiple times and invariably the summary I receive is similar to what I paraphrased at the start of the previous article.

To illustrate what I consider ‘good coverage’ here are three articles that I think get things right.

  1. The Big Data Panic by Felix Simon on Medium.
  2. The Noisy Fallacies of Psychographic Targeting by Antonio Garcia Martinez at Wired.
  3. Cloak and Data: The Real Story Behind Cambridge Analytica’s Rise and Fall by Andy Kroll at Mother Jones.

2: No-one called it a ‘hack’.

Source: Guardian article: ‘I made Steve Bannon’s psychological warfare tool’: meet the data war whistleblower.

Above is a screenshot from one of the first major pieces published by Carole Cadwalladr, the journalist who broke the story. This is where I first saw the term and I subsequently heard it mentioned on various political podcasts. The fact that it is so misleading is what made it stick with me but given the subsequent attention the word ‘hack’ received, I regret even mentioning it.

If I were rewriting the original article I would replace all instances of the word ‘hack’ with the word ‘breach’, which was much more widely used and is equally misleading. My underlying point, that Kogan made use of a feature provided by Facebook, not a bug or some ingenious workaround, remains the same regardless.

Some people have suggested that using the term breach/hack is appropriate if you take it to mean ‘using something in a way that is not intended’ but I would argue that is not how most people understand the term in the context of this story. When you talk about breaching and hacking in a story about Facebook data it gives the impression that Facebook’s security/data protection systems were circumvented. But they were not. Instead, the data was collected through a feature that Facebook provided to developers. Kogan’s collection of 50 million profiles from 270,000 permissions was not a ‘hack’; it was just him making use of the Facebook Graph API.
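To put rough numbers on how the friends permission amplified the collection, here is a back-of-the-envelope sketch in Python. It uses only the two figures quoted above; the function and variable names are my own illustrative choices, and the result is a simple average that ignores overlapping friend lists and any variation between users.

```python
# Figures quoted in this article; everything else here is illustrative.
CONSENTING_USERS = 270_000        # users who granted the app permission
PROFILES_COLLECTED = 50_000_000   # profiles reported as collected


def profiles_per_consent(profiles: int, consents: int) -> float:
    """Average number of profiles reached per user who granted permission."""
    return profiles / consents


ratio = profiles_per_consent(PROFILES_COLLECTED, CONSENTING_USERS)
print(f"About {ratio:.0f} profiles collected per permission granted")
```

On these figures each permission granted opened up roughly 185 profiles on average, which is the scale you would expect from a feature that exposes a user’s friends rather than from a security system being circumvented.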

Kogan’s sharing of the data with Cambridge Analytica was ‘in breach’ of Facebook’s data sharing policies but that is a very different thing from the company’s security systems being hacked/breached.

I understand that technically informed people already understood this distinction but I would argue that the majority of the public consuming the story would not and that the use of terms like breach and hack is almost guaranteed to mislead people.

3: You are being an apologist for Facebook.

This response has been quite surprising to me because I had thought the original piece made my view clear: “The real story then is … [that Kogan] used methods that were common place and permitted by Facebook prior to 2015” and “Facebook seems to have been altogether too cavalier with permitting developers to access its users’ private data.”

This seems to have been taken by some to mean that I endorse, or see no problem with, how Facebook treated user data. But that is precisely the opposite of what I meant.

To clarify: I think that Facebook’s *feature* that enabled developers to access information about a user’s friends, without their explicit consent, was a major problem. A problem that is being drowned out in coverage due to the amount of focus being poured on Cambridge Analytica and the ‘breach’. The most important breach in this story is Facebook’s breach of its users’ trust by not securing their data better. This is also an entirely separate issue from whether Cambridge Analytica could actually exploit the data as claimed. Even though I believe there is no good evidence that Cambridge Analytica was as effective as it claims, I still do not think that they — or any other company — should have been able to so easily access the data of so many users without their permission.

What I do give Facebook some credit for is having removed this feature two years ago. Their actions since then have been far from perfect but it at least demonstrates that they are willing to change their policies in the face of pressure and protest.

In the wake of the article I was asked by a journalist who I thought was to ‘blame’ for the breach of privacy. My answer was that there was plenty of blame to go around. Facebook abused users’ trust that their data would be kept secure; Kogan and Cambridge Analytica broke Facebook’s data sharing terms and sought to exploit the data they collected in ways that were not permitted; and Facebook users, then and now, did not take much care over their personal privacy, granting permissions with little consideration for what it signified.

But despite this smorgasbord of potential blame I do not think that the blame is evenly distributed between all groups. Cambridge Analytica was deliberately trying to exploit the data for pretty nefarious purposes. Kogan was putting profit, or at least a desire to access massive datasets, above academic ethics. And Facebook knows how users treat security; they know most people do not read the finer details of permissions, and they could have made sharing data with friends’ apps opt-in rather than opt-out.

So in summary, I do think we need to be concerned about how our data is being used by social media companies (and with attempts to manipulate us with targeted messaging) but I think that better understanding the relevant facts helps us to better identify where the real issues and problems reside and how best to address them.

4: Pro-Brexit/Pro-Trump people are using your piece to support their fake news narrative and dismiss the importance of the story.

That’s a shame. But just because some pro-Trump/pro-Brexit folk like the piece for partisan reasons does not make the piece itself partisan. On this occasion, the misleading coverage is mainly coming from left-wing sources and part of that seems to stem from a desire to explain away (and potentially invalidate) the undesirable election outcomes as the result of some shadowy psychological manipulation by an evil corporation.

The truth, however, seems much more mundane but equally depressing. Both the Brexit referendum and the US 2016 election were (marginally) won by right-wing populist movements doing what such movements have always done: whipping up xenophobia, offering simplistic solutions to complex problems, and claiming that they will destroy the corrupt system that has been holding ‘the people’ down. I reiterate my view that things like Vote Leave’s well-publicised lies about the EU and NHS funding and Trump’s endless barrage of tweets are likely to have had a much bigger impact on the outcome than anything Cambridge Analytica delivered.

Whether you agree or disagree with my political views is also immaterial to the argument I am making because people on the left and right (or wherever) should still care about actual facts, rather than focusing on what they want to be true or what is politically convenient. I know this is the era of fake news but that does not mean we have to accept that as the new standard.

5: You cannot infer Cambridge Analytica was ineffective just because Cruz lost.

I only referred to the Cruz example as an illustration that Cambridge Analytica were not as powerful as widely presented and that it is misleading to focus on the hits and ignore the misses. Indeed, Cambridge Analytica claimed the failed Cruz campaign as a hit: Alexander Nix explains during his Concordia Summit presentation that their methods made Cruz the second most viable candidate in a crowded field.

But again there are plenty of reasons to be sceptical of these self-promotional claims. Here, I would again recommend people read the Mother Jones article I linked to earlier, which offers an in-depth look at how effective/ineffective Cambridge Analytica were judged to be by past clients.

Below are some representative extracts:

A PAC, the Middle Resolution, had paid Nix’s company several hundred thousand dollars that year for a list of persuadable voters to help elect Republican Ken Cuccinelli, who was running for governor. Months passed, and the list never arrived. When the group’s founder, Bob Bailie, demanded the list, Nix asked for more money and Bailie cut bait. Another Virginia-based group, Americans for Limited Government, then paid SCL $100,000 to create a list of suburban female voters who traditionally supported Democrats but might be swayed to vote for Cuccinelli if shown the right message. Late in the race, the group’s canvassers took Nix’s list into the field and returned with a perplexing result: The people on it were already Cuccinelli supporters. The higher-ups at Americans for Limited Government asked another firm to analyze the list. It turned out SCL had handed them a roster of die-hard Republicans.
Cambridge Analytica’s reputation for spotty work had circulated widely among Democratic and Republican operatives, who were also put off by Nix’s grandstanding and self-promotion. Mark Jablonowski, a partner at the firm DSPolitical, told me that there was “basically a de facto blacklist” of the firm and “a consensus Cambridge Analytica had overhyped their supposed accomplishments.”

I am not arguing that Cambridge Analytica could not have had any impact on any results. Rather, I am arguing that there is no evidence that they had a bigger impact than any other political targeting campaign.

6: So you think Cambridge Analytica did nothing wrong?

This is somewhat covered in point 3 above but just to underline the point… No. I am not saying Cambridge Analytica are innocent.

At the very least they intentionally broke Facebook’s data sharing policies, but there have also been many revelations since the story broke that paint them as a deeply immoral company. Channel 4’s undercover filming caught Cambridge Analytica’s executives, including the CEO Alexander Nix, boasting to a potential client about their willingness to use, and experience with, various underhand techniques to win election campaigns, such as hiring prostitutes to collect compromising evidence on opponents or promoting fake news.

Regardless of whether the claims were true or just boasting, the fact that this is how Cambridge Analytica promoted its services reveals a lot about the company, including, crucially, that their much vaunted psychological targeting does not seem to be their main sales pitch. This somewhat undercuts the accounts that they could control minds and win elections with just a few clicks of a button.

As Alexander Nix himself stated in the undercover footage: “It sounds a dreadful thing to say, but these are things that don’t necessarily need to be true as long as they’re believed.”

This sentiment is precisely why people should be sceptical of uncritically accepting Cambridge Analytica’s accounts of how effective it was.

7: Facebook profile data is Facebook’s internal data.

In the original post I emphasised that it was important to distinguish between Facebook’s internal data and the data that third party developers collected on Facebook using the Graph API.

Some people objected that the information the developers collected like profile demographics, likes, location logins, and other material constituted internal data that Facebook provided to developers.

I agree with this and I was actually quite surprised to learn just how much information developers could collect, including access to a user’s private messages! See the table below and this article about the Graph API by Jonathan Albright for further information.

Source: Symeonidis, Tsormpatzoudi & Preneel (2017)

However, I still maintain that it is important to distinguish between the portion of user data that the Graph API granted developers access to and the much greater mountain of user information that Facebook holds on all users. For an idea of what that looks like in comparison I suggest reading any of the recent articles (like this one from Wired) in which journalists describe what they found after they downloaded their data from Facebook.

8: You unfairly maligned Carole Cadwalladr’s coverage.

I focused on Carole Cadwalladr’s coverage because she was the journalist who broke the story, her coverage was widely cited, and it reflected most of the issues I had with how the story was presented.

Since then I have not had much call to revise my opinion. I respect the work she has done and the role she has played in bringing very important issues about data privacy, social networks, and online political targeting to public attention. However, her coverage continues to display all of the things I previously complained about: a conspiracy-minded sensationalism, uncritical fawning over Chris Wylie’s narrative, and little concern with precise technical details.

Consider for instance, this recent tweet:

You might assume from this tweet that what she is sharing is a contract between Facebook and Cambridge Analytica, but it isn’t.

The first few lines of the document make this clear. Under the heading ‘Parties’ there are two companies listed, neither of which is Facebook.

This is a contract between Kogan’s company, Global Science Research, and SCL Elections, the parent company of Cambridge Analytica. Misrepresenting details like this matters because it provides fodder for those seeking to dismiss the story for political reasons.

In the interviews I have heard with her, Cadwalladr dismisses arguments about Cambridge Analytica likely exaggerating their abilities, pointing out that they have been paid lots of money for their services, including by the military. This is not exactly a slam dunk defence given that political campaigns often waste large amounts of money on ineffective strategies and that militaries have often invested in things like remote viewing programs and dowsing-based bomb detectors.

The end…

I hope this post has clarified some of my positions. If you think I’ve missed some important argument or am misrepresenting any of the criticisms then let me know in the comments below.

You can find me talking about similar stuff on Twitter: @C_Kavanagh