Listening to most of the analysis of Cambridge Analytica’s use of Facebook data, one would think that our deepest, darkest secrets were pilfered from Facebook’s servers and hand-delivered to Trump Tower and the Kremlin, which skillfully used them to exploit our fears and manipulate our emotions. One could be forgiven for thinking this when Cambridge waged a brilliant marketing campaign to convince you that their mastery of the dark arts — honed through waging information war in the third world — was the secret ingredient behind Donald Trump’s shocking victory.
Further revelations that Cambridge was not completely on the straight and narrow in how it handled data have added fuel to the conflagration over social media, fake news, and the Russian influence campaign. This firestorm has put political data collection practices in the crosshairs of Robert Mueller’s investigation, a British parliamentary inquiry, and government regulators in the US and the European Union.
As with many stories associated with the trigger words “Trump,” “Russia,” and “fake news,” the story is so bound up in the political passions of the moment that it’s difficult to discern what’s real and what isn’t. So what exactly was Cambridge Analytica trying to do?
By collecting data through an academic front, Cambridge Analytica violated Facebook rules. This has been described as a “data breach,” but it wasn’t one in the traditional sense. Countless developers, including the Obama 2012 campaign, accessed the same data under the rules, which were changed in 2014 and sunset in 2015. The data itself was vast in scale, but its contents were hardly earth-shattering, consisting of a list of names, hometowns, and the pages a user had liked, which most people on Facebook are public about.
The origins of this controversy can be traced back to Facebook’s launch of an application programming interface (or API) for app developers in 2008. With this platform, Facebook aimed to “lock in” users by organizing its users’ social lives not just on Facebook, but across a range of popular apps and services. To do this, Facebook granted developers liberal access to its user data so they could build social features in their own apps. Rather than requiring users to create an account and then painstakingly re-friend everyone on Spotify, or Pinterest, or Instagram, these services sped adoption by letting you quickly import your existing social graph from Facebook. If your friends weren’t yet on these sites, Facebook gave developers your friend list so you could invite them.
Along with your friend list, the API would return other information about your friends — specifically, the other pages they liked. This was done to help you find things you liked and friends with similar interests. In one API call, you could get an astonishing amount of data if you wanted it — your friend list and all the pages they liked, which amounted to potentially tens of thousands of data points per app authorization. Popular apps were able to recreate Facebook’s “social graph” — showing who was friends with whom — as well as the so-called “interest graph”, showing who liked what, and how these likes related to one another. This might let you see how one’s choice of liquor correlated with their choice of political candidate, for instance.
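To make the "social graph" and "interest graph" idea concrete, here is a minimal Python sketch. It does not call any real Facebook API: the `authorization` dictionary, its field names, and the page names are all hypothetical stand-ins for the kind of friend-and-likes data described above. It only illustrates how one authorization's worth of data could be turned into friendship edges and counts of which pages tend to be liked together.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical, simplified data from a single app authorization:
# the authorizing user, their friends, and the pages each friend liked.
# This structure is an assumption for illustration, not a real API response.
authorization = {
    "user": "alice",
    "friends": {
        "bob": ["Bourbon Co", "Candidate A"],
        "carol": ["Bourbon Co", "Candidate A", "Hiking"],
        "dave": ["Craft Gin", "Candidate B"],
    },
}


def build_graphs(auth):
    """Return (social_edges, likes_by_user) for one authorization."""
    social_edges = set()             # who is friends with whom
    likes_by_user = defaultdict(set)  # who liked what
    for friend, pages in auth["friends"].items():
        social_edges.add(frozenset((auth["user"], friend)))
        likes_by_user[friend].update(pages)
    return social_edges, likes_by_user


def page_cooccurrence(likes_by_user):
    """Count how many users liked each pair of pages together."""
    counts = Counter()
    for pages in likes_by_user.values():
        for a, b in combinations(sorted(pages), 2):
            counts[(a, b)] += 1
    return counts


social_edges, likes_by_user = build_graphs(authorization)
cooccurrence = page_cooccurrence(likes_by_user)
```

In this toy data, the pair `("Bourbon Co", "Candidate A")` appears for two friends, which is the kind of liquor-versus-candidate correlation the paragraph above describes. Real analysis would need far more data and proper statistical controls.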
Analytics companies which had spent millions of dollars to know exactly this — through massive surveys and purchasing consumer files from data brokers like Acxiom — now had free access to a comparable source of data. And for those who knew how to use it, Facebook data was potentially superior to the offline data. Pages that you had publicly liked were a high quality signal of evolving preferences and behaviors, while data about offline purchases contained in consumer databases were so sanitized as to be virtually useless. It was this opportunity — tied to its ideas about psychographic targeting — that Cambridge was looking to capitalize on.
It wasn’t just technology companies getting in on the Facebook data game, but political campaigns. When Barack Obama announced for re-election in April of 2011, he did so with an unobtrusive Facebook app that asked simply “Are you in?” Authorizing the app allowed you to register yourself with the campaign with one click of a button. In doing so, the campaign also received your friend list.
It wasn’t clear what the campaign would do with that list until late in the campaign season, when it unveiled the latest iteration of its Facebook app, known as targeted sharing. The app matched your Facebook friend list with voter files in battleground states. One day, out of the blue, Obama supporters were receiving emails with the names and faces of their Facebook friends, asking them to tell their friends to vote. More than five million people were contacted through the app, but the Obama campaign likely had a list of Facebook users numbering in the tens, if not hundreds, of millions.
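The matching step can be sketched in a few lines of Python. How the campaign actually matched records is not described here, so everything below is an assumption: the record fields, the exact-match rule on name plus city, and the illustrative set of states are hypothetical, and a real system would need fuzzier matching to handle nicknames, moves, and duplicate names.

```python
# Illustrative subset of states; an assumption, not the campaign's list.
BATTLEGROUND_STATES = {"OH", "FL", "VA"}


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())


def match_friends(friends, voter_file):
    """Naively match friends to battleground-state voters on (name, city)."""
    index = {}
    for voter in voter_file:
        if voter["state"] in BATTLEGROUND_STATES:
            key = (normalize(voter["name"]), normalize(voter["city"]))
            index[key] = voter

    matches = []
    for friend in friends:
        key = (normalize(friend["name"]), normalize(friend["hometown"]))
        if key in index:
            matches.append((friend["name"], index[key]["voter_id"]))
    return matches


# Hypothetical records for illustration only.
friends = [
    {"name": "Bob Smith", "hometown": "Columbus"},
    {"name": "Carol Jones", "hometown": "Austin"},
]
voter_file = [
    {"voter_id": "OH-001", "name": "bob  smith", "city": "columbus", "state": "OH"},
    {"voter_id": "TX-002", "name": "Carol Jones", "city": "Austin", "state": "TX"},
]

matches = match_friends(friends, voter_file)
```

Here only Bob Smith is matched, because Carol Jones's voter record is outside the illustrative battleground set. A supporter could then be asked to nudge just the matched friends.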
The Facebook app was the subject of one of the few embargoed post-election exclusives the Obama campaign gave detailing its technological and analytical achievements, with one official calling it “the most significant piece of technology developed for this campaign.”
In the fawning media coverage of the Obama campaign’s technological prowess, it did not occur to observers at the time to call this a startling invasion of privacy. And it wasn’t, or at a very minimum, the privacy risks were arguably outweighed by the benefits. A tool like this could be the future of politics: door-to-door canvassing for the digital age, a welcome antidote to impersonal broadcast TV ads, and an upgrade from getting a phone call from a stranger telling you to vote.
The conversation we had about analytics data following the 2012 campaign, which recognized the privacy tradeoffs but also the potential advances that might come from a more surgical and personalized approach to campaigning, is a far cry from the hysteria that reigns today. Today’s conversation reflects an anxiety that populist forces, specifically Donald Trump, have grown better at harnessing technology and social media. That exposes the purveyors of this technology, Facebook chief among them, to scrutiny and regulatory risk that didn’t exist when the tools were in the hands of people in line with the sensibilities of the media and political establishment. The idea that technology companies might let a candidate like Obama but not Trump get away with borderline privacy violations isn’t hyperbole. It was essentially confirmed by the Obama campaign’s product manager on the targeted sharing app.
Aspects of Cambridge’s use of the Facebook data — not to mention the growing revelations about the rest of its business — are troubling. It’s unclear exactly how the data was used, but we know two things: the Trump campaign was not among its users, and the end product Cambridge was using the dataset to build, personality-based targeting, has been universally and spectacularly panned by a range of ex-Cambridge clients.
This could mean that while Facebook’s data might be able to tell us what car you’ll buy or which candidate you’ll vote for, it still can’t divine your personality or tell your secrets.