Answering the Three Big Questions Surrounding Cambridge Analytica

Mar 21, 2018

The Facebook outrage machine threw some gears this weekend when the New York Times reported on accusations of databuse on the social media platform. In a story that has been making the rounds, the authors suggest Facebook allowed one of Trump’s campaign advisors, Cambridge Analytica, to “exploit the private social media activity of a huge swath of the American electorate, developing techniques that underpinned its work on President Trump’s campaign in 2016.”

There’s a lot of context missing when it is noted that “the firm harvested private information from the Facebook profiles of more than 50 million users without their permission, according to former Cambridge employees, associates and documents, making it one of the largest data leaks in the social network’s history.”

First off, the technique used to collect information was well known if you worked with Facebook’s API. If you created an app before V1.0 of the API was shut down, you could get demographic data about a user’s friends, regardless of the kind of application. Permission wasn’t needed to collect demographic data; it was freely granted.

At the time, this method was heralded as revolutionary when the Obama campaign used it to build their own personality profiles. By the end of 2012, Obama For America boasted nearly 1 million users on Facebook, which meant that “our supporters/fans on Facebook were friends with roughly 98% of the population of the United States of America.” The campaign used that reach to target friends of friends with a nudge. Here is how the Washington Post described it in 2013:

[The Obama team] knew from other research that people who pay less attention to politics are more likely to listen to a message from a friend than from someone in the campaign. The team could supply people with information about their friends based on data it had independently gathered. The campaign knew who was and who wasn’t registered to vote. It knew who had a low propensity to vote. It knew who was solid for Obama and who needed more persuasion — and a gentle or not-so-gentle nudge to vote. Instead of asking someone to send a message to all of his or her Facebook friends, the campaign could present a handpicked list of the three or four or five people it believed would most benefit from personal encouragement.

By 2014, what the Obama campaign had pioneered had been deployed in apps from the Republican National Committee, the Democratic National Committee, and Americans for Prosperity. If anything, these get-out-the-vote apps were likely far more effective than the microtargeting of Cambridge Analytica because they enlisted the help of users to influence their friends.

Interestingly, Facebook conducted its own study on various techniques, comparing simple informational messages to personalized messages from friends like the ones the Obama campaign used. The difference in voting between those who received the social message and those who received an informational message was 0.39 percent. In a little-shared line from that report, the researchers call into question the effectiveness of informational messages when they explain, “turnout among those who received the informational message was identical to turnout among those in the control group.” In other words, the report suggests that some forms of advertising are probably ineffective.

But the big difference between Obama For America and Cambridge Analytica lay in transparency. Cambridge Analytica acknowledged that it had acquired Facebook data through Aleksandr Kogan, an academic who was working on psychographic modeling. In response to questioning about its use of Facebook data, CA said it had deleted the information as soon as it learned of the problem two years ago. That brings us to the second major issue with the NYT piece.

Was it a data breach? It doesn’t seem so.

As the Times noted, Cambridge paid to acquire the personal information through an outside researcher who, Facebook says, claimed to be collecting it for academic purposes. Had they simply collected the data themselves, they wouldn’t have had a problem. So, Facebook itself wasn’t breached, but the researcher did seem to pass on the demographic information to Cambridge, which is a breach of the Facebook Platform Policy.

There is no federal law defining a data breach, but the state-based breach notification laws tend to follow the same form. A breach requires that an unauthorized acquisition of data has occurred, and that the data is covered. This is where the term personally identifiable information comes in, as it is the general term for sensitive information. Facebook isn’t trucking in the kind of data that is covered under law, like Social Security numbers, health data or financial data. Arguably, Facebook could get caught up in Illinois’ expansive biometric definition, but still, sensitive data as defined by law doesn’t seem to have been shared here.
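For readers who think better in code, the two-part test described above can be sketched roughly as follows. This is only an illustration of the general form, not a statement of any particular state's statute: the `Incident` type, the category labels, and the `is_notifiable_breach` function are all hypothetical names made up for this example, and the covered categories are simply the ones named in the paragraph above.

```python
from dataclasses import dataclass

# Assumption: these labels stand in for the categories named in the text.
# Real statutes differ from state to state.
COVERED_CATEGORIES = frozenset({"social_security_number", "health", "financial"})


@dataclass(frozen=True)
class Incident:
    """Hypothetical record of a data acquisition."""
    authorized: bool        # was the acquisition authorized?
    categories: frozenset   # kinds of data acquired (illustrative labels)


def is_notifiable_breach(incident, covered=COVERED_CATEGORIES):
    """Return True only if both prongs hold:
    the acquisition was unauthorized AND it touched covered data."""
    unauthorized = not incident.authorized
    touches_covered = bool(incident.categories & covered)
    return unauthorized and touches_covered


# Purely demographic data fails the second prong in this sketch,
# even if the acquisition itself was unauthorized.
demographic_only = Incident(authorized=False, categories=frozenset({"demographic"}))
health_records = Incident(authorized=False, categories=frozenset({"health"}))
```

Under these assumptions, `is_notifiable_breach(demographic_only)` is false while `is_notifiable_breach(health_records)` is true, which is the distinction the argument here turns on.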

Lastly, we should talk about the issue of effectiveness. Did the company really have some secret sauce that it unleashed upon the 2016 electorate?

From almost every serious person I know in this business, the answer is no.

It is now being reported that Cambridge Analytica’s data was phased out of use within the Trump campaign by September 2016 when it became clear that the RNC was going to put its backing behind the campaign. Throughout the summer, wealthy donors and leaders within the Republican party were calling for the RNC to stay out of the election, and that put the Trump team in a bind. The Trump campaign had tested the RNC data, and found it to be far more accurate than what Cambridge Analytica had to offer. But Cambridge Analytica was a hedge in case the RNC wouldn’t share its data. So when an agreement was reached between the RNC and the Trump campaign, they dumped CA.

From the people I have talked to, CA data was known to be costly and unproven, which put it at a disadvantage in the competitive landscape of political data. Ad Age ran a report in the middle of the 2016 election season summarizing what I and countless others know:

As the firm sinks its sharp sales chops into New York, it leaves behind a mixed reputation in Washington, D.C. where several Republican strategists who have worked with or met with Cambridge in the past year see the company as a curiosity, an intellectually-advanced interloper that never really “got” American politics. Sources say the company bit off more than it could chew and failed to deliver some of the technology and analytics services it sold or meet crushing election-season deadlines.

Both Brad Parscale, the director of digital for Trump, and Cambridge’s chief product officer Matt Oczkowski have said time and again that they didn’t use psychographic targeting. “The RNC was the voter file of record for the campaign, but we were the intelligence on top of the voter file,” Oczkowski said last year. “Sometimes the sales pitch can be a bit inflated, and I think people can misconstrue that.”

In a piece from last year, the NYT also suggested that CA data had been dumped at some point in the campaign, and went further to detail the problems with the firm’s claims:

But a dozen Republican consultants and former Trump campaign aides, along with current and former Cambridge employees, say the company’s ability to exploit personality profiles — “our secret sauce,” Mr. Nix once called it — is exaggerated.
Cambridge executives now concede that the company never used psychographics in the Trump campaign. The technology — prominently featured in the firm’s sales materials and in media reports that cast Cambridge as a master of the dark campaign arts — remains unproved, according to former employees and Republicans familiar with the firm’s work.

But let’s set aside all of the reports suggesting CA was ineffective, along with the outside researchers who question the techniques of “psychographic” models. While we are at it, let’s set aside the vast literature on political ad effectiveness that finds their influence rapidly decays, and research finding that ads are often overlooked. Let’s also set aside every third-person effect study finding that people believe mass media messages have a greater effect on others than on themselves.

What is left is a subtext that isn’t all that different from the worries about fake news. I can almost hear the conversation: Those people — conservatives, right wingers, libertarians — they simply didn’t vote correctly. But we cannot blame them. They were unduly influenced by fake news and by mind-altering advertisements. If only they saw the right kind of information. If only the right kind of advertisements were served, then Trump wouldn’t be President. I cannot help but think this story is some kind of salvation narrative. And as with all good salvation stories, someone has to get crucified.


Written by Will Rinehart