Is Cambridge Analytica a Big Story? Part I
There’s a lot of bad information floating around right now about Cambridge Analytica. Unfortunately, this week shows that political journalists don’t appear to read tech news. Not only has most of this information been published by respected media outlets since 2015, but reporters are also making mistakes in how they describe Facebook and other technology issues. Those distinctions are important because they mean the difference between breaking U.S. law and practicing unethical but permissible business strategies. To be clear, there was no data breach, but Facebook’s business model allowed Cambridge Analytica to exploit profile data.
As I researched this topic and started writing what I knew,* it quickly grew beyond the reasonable length for one blog post. Four themes emerged:
- Part 1: Cambridge Analytica & the press
- Part 2: What prompted this drama?
- Part 3: Cambridge Analytica’s exaggerations about their work & fact-checking the media
- Part 4: What does Facebook do with the data & why does it matter?
Part 1: Cambridge Analytica & the Press
Cambridge Analytica is old news. And Facebook has known about it since 2015.
The story didn’t break this weekend. It’s been out in the open for a long time. This is why many people in the tech community, on both sides of the aisle, are unfazed. The Guardian’s “whistleblower” story didn’t reveal any new information. It merely confirmed what they and other publications such as Vice, The Intercept, Das Magazin and political sites such as Politico and National Review wrote about between 2015 and 2017.
It all started in 2015 when The Guardian profiled Ted Cruz about his campaign prospects in the 2016 election. He mentioned that his campaign was using the same data-driven techniques that helped Obama reach persuadable voters in 2012 but didn’t elaborate. Days later, The Guardian ran an exposé on how Cambridge Analytica worked and raised alarms about the ethical and privacy issues that have consumed the media since Facebook announced they were suspending the company last Friday:
Documents seen by the Guardian have uncovered longstanding ethical and privacy issues about the way academics hoovered up personal data by accessing a vast set of US Facebook profiles, in order to build sophisticated models of users’ personalities without their knowledge.
Facebook responded after the article was published on December 11, 2015, saying that all data had been deleted (this is central to the current saga):
After this article was published, Facebook said the company was “carefully investigating this situation” regarding the Cruz campaign.
“[M]isleading people or misusing their information is a direct violation of our policies and we will take swift action against companies that do, including banning those companies from Facebook and requiring them to destroy all improperly collected data,” a Facebook spokesman said in a statement to the Guardian.
Cambridge Analytica was the subject of other media investigations.
If the 2015 exposé from The Guardian had been the only investigation into Cambridge Analytica’s practices, the reaction of the media (and Facebook) might be justified. However, numerous respected news outlets covered the work of Cambridge Analytica and Facebook’s role in data mining.
November 2015: Bloomberg
While this is a whitewashed portrayal of Cambridge Analytica’s history, it remains one of the most detailed profiles of the company and its work prior to the Cruz and Trump campaigns.
Linking to The Guardian’s exposé, this article focuses on the laws surrounding the collection and appending of data about voters and cites both the 2008 and 2012 Obama campaigns and the 2016 Cruz campaign.
February 2016: NPR
All Things Considered delved into the Cruz campaign’s use of micro-targeting and Cambridge Analytica.
August 2016: Tablet Magazine
Tablet, worried about the anti-Semitism prevalent in the 2016 election, uncovered the mysterious origins of Cambridge Analytica’s parent company, SCL Group, Ltd.
October 2016: Channel 4
The UK channel sent a reporter to investigate how campaigns in America collected data on voters and interviewed Cambridge Analytica’s Alexander Nix, and Jim Messina from the 2012 Obama campaign.
November 2016: Bloomberg TV
In the immediate aftermath of the election, this Bloomberg clip admitted that Hillary ran a terrible campaign (something that the press is more and more reluctant to say) and examined how Cambridge Analytica claimed early success.
December 2016
The Irish Times: This brief article summarized how Cambridge Analytica worked and the theories behind how targeted messages were sent to voters based on their outlook and presumptions.
PBS: NovaNext examined how Facebook’s algorithms and ability to data mine are eroding US democracy. Cambridge Analytica was discussed at length.
January 2017: Das Magazin & Vice’s Motherboard
In December 2016, Das Magazin, a German-language Swiss publication, published a lengthy investigation of Cambridge Analytica. The following month it was translated and published by Vice. The article features Michal Kosinski, the Polish researcher who helped design the MyPersonality app at Cambridge University’s Psychometrics Centre. It also details how Aleksandr Kogan later formed Global Science Research. There he copied the MyPersonality app and hired people through Amazon’s Mechanical Turk to complete the assessment and provide access to Facebook data for SCL.
[If you are catching up on this story, the Vice article is a great place to start.]
March 2017: The Intercept
The Intercept article also focused on how Cambridge Analytica/SCL obtained the Facebook data, but it revealed two important points:
- Facebook went on the record and stated that they believed all data was deleted by SCL in 2016. This fact is the central reason they later suspended SCL on March 16, 2018.
- Facebook employed Joseph Chancellor, Aleksandr Kogan’s main collaborator and former co-owner of Global Science Research.
December 2017: Mother Jones
In this article, Mother Jones profiled David Carroll, a media professor at New York’s Parsons School of Design. Carroll and Paul-Olivier Dehaye, the co-founder of PersonalData.io, took advantage of a British law that allowed them to request Carroll’s data file from Cambridge Analytica. Because the data was processed in the UK, it falls under the 1998 British Data Protection Act. In April, Carroll and several other unnamed Americans retained a British solicitor to sue Cambridge Analytica over their failure to obtain “explicit consent” to gather personal information including “political opinions.”
Cambridge Analytica was open about their tactics.
Despite the use of adjectives like “secretive,” “shadowy,” and “mysterious” in the press, Cambridge Analytica spent at least two years on a huge sales pitch in America that included details on how they operated. For the entire year after the election, they gave keynote addresses at marketing and technology conferences outside of the U.S. about their work.
If an 11-minute video is too long, there was also a 2-minute slick promotional video that explained how they operated.
Even after Nix and Cambridge Analytica fell out of favor in the US (see Part II), he was still promoting their work. At the Online Marketing Rockstars event, Nix gave a 30-minute keynote address on March 10, 2017.
Later, Molly Schweickert from Cambridge Analytica gave a 40-minute keynote at d3con in Germany on May 12, 2017.
Cambridge Analytica received an astonishing amount of coverage explaining their tactics.
For a story that seems to have shocked most of the press, Cambridge Analytica received an astonishing amount of coverage over the years. In 2015, Sasha Issenberg wrote a glowing review of the company’s profiling capabilities for Bloomberg that goes into detail about how the company was founded and how it harvested information:
Of all the microtargeting profiles of myself I had seen, none had flattered my self-concept like this one. Its predictions already seemed more plausible than those of the Democratic data warehouse…
In November 2015, Yahoo! grasped what has shocked! reporters over the past few days:
Whether we like it or not, political campaigns know more and more about each and every one of us, and they’re using that data to craft increasingly specific advertising tailored to our lifestyles. Republicans, led by Karl Rove, pioneered the technique of political microtargeting in a presidential election in 2004, to get out the vote for George W. Bush. But Barack Obama’s campaign perfected the strategy in 2008 and 2012, with Republicans falling behind in their microtargeting prowess. Now Cambridge Analytica and other firms serving primarily Republican clients are trying to catch up. Nix says his database has between 4,000 and 5,000 data points on every registered voter in the U.S. — from where you shop to websites you’ve visited, cars you’ve driven, magazines you’ve subscribed to and your all-important voter registration history. This allows campaigns to better target Americans with TV or online ads, direct mail, texts and robo-calls.
In April 2016, Wired named Alexander Nix, the CEO of Cambridge Analytica, in their “25 Geniuses Who are Actually Making the Future Happen Now” list. In August 2016, TechRepublic summarized their business model:
Cambridge buys, cleans, and normalizes consumer and corporate data from large vendors, then amalgamates the external sets with their own massive in-house data stack, “enriched,” as Nix explained, with social media information and various “acquired” data sets. The final result is an uber-list of over 220 million North American consumer records.
When Cambridge Analytica was hired in 2016 by the Trump campaign, the news was featured in Business Insider, Politico, Daily Beast, and National Review.
Where did the media go wrong?
Regardless of where you fall on the political spectrum, there is no reason to be surprised by the headlines about Cambridge Analytica over the past few days. Very few political firms receive this much international attention. Many of the alarming headlines came from publications that had investigated the company in previous years. Furthermore, even more coverage was devoted to the similar methods used by the Obama campaign in 2012.
As technology and politics become entwined, it is critical for political reporters to have a solid grasp of the technology they write about. The initial headline on The Guardian’s article this weekend described this as a “data breach.”
It is not. TechTarget defines the term this way:
A data breach is a confirmed incident in which sensitive, confidential or otherwise protected data has been accessed and/or disclosed in an unauthorized fashion. Data breaches may involve personal health information (PHI), personally identifiable information (PII), trade secrets or intellectual property.
Not only is it unprofessional to use imprecise terms, but the use of improper technological terms and descriptions can set off a panic. Seeing “data breach” in a headline automatically invites comparison to Equifax, Target, Home Depot, and now Orbitz.
This situation was entirely different. The issue here is whether SCL/Cambridge deleted the data they legally obtained once it was discovered that they had violated Facebook’s policies on APIs.
When reporters are writing about technology, it is imperative that they ensure the correct words and explanations are used. Anyone with a working knowledge of the Cambridge Analytica situation could have told The Guardian, the New York Times and other newspapers of record that they were wrong in their categorization.
In Part II, learn what started this situation.
****************
*How do I know about Cambridge Analytica?
I’ve worked in political tech since 2006. Most of the time, I’m happily far behind the scenes. I’ve worked on a few innovative projects. Some you might have heard about, and many you probably haven’t. Fascinated by the potential that psychographic targeting offered for grassroots organizing, I’ve followed Cambridge Analytica since they came to Washington in 2015.