2013 saw one of the world’s most publicized data breaches. A Target employee inadvertently clicked on a phishing email; soon after, the personal details of over a third of all American adults were released into the wild — to be used in ways that were certainly never intended when those adults gave the data to Target.
Target got slammed for it. People stopped shopping there that Christmas. Their stock price tanked. Their CEO got fired.
As bad as this was, it is absolutely child’s play compared to the racket that Facebook was running at exactly the same time.
There’s a very good reason that Facebook is the fastest growing advertising business in the world: it has the largest, most detailed and most granular user data on the planet. That data is also incredibly personal, and will reveal a lot about your life to anyone who has access to it. And yet all the way up until April 2015, Facebook was giving all that data away to developers who were using the Graph API.
If you were one of these developers, and you got a user to give you access to their Facebook account (say, to log in or to use your app), you got a data payout that is unlikely ever to be replicated in history. It wasn’t just the user’s data — but all the data of that user’s friends on Facebook.
Back in 2014, the median number of friends a Facebook user had was 200. It’s been widely reported that Cambridge Analytica only needed to get 270,000 users in order to snare a total of ~50M people’s data. That would suggest about a 1 to 185 ratio (which is in the ballpark of that 200 median friends number).
Now, assume that ratio is linear. (It almost certainly isn’t: some users are highly connected and others much less so, and there are likely to be “pockets” in countries and languages. But it’s a helpful assumption for doing the math.) How many unique users would you as a developer need to get, to reach every Facebook user on the platform?
1.5 billion users, at a 1:185 ratio = 8 million downloads. That’s right — 8 million downloads and you’ve got the entire platform.
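For anyone who wants to check that arithmetic, here is a minimal sketch in Python. It uses only the figures quoted above, and it keeps the same simplifying assumption that the ratio is linear, which it almost certainly isn’t. The variable names are mine, chosen for illustration.

```python
# Figures quoted in the text above; variable names are illustrative only.
seed_users = 270_000              # users reportedly recruited by Cambridge Analytica
profiles_reached = 50_000_000     # approximate number of profiles obtained
platform_users = 1_500_000_000    # approximate Facebook user base at the time

# Profiles obtained per consenting user (the "1 to 185" ratio).
ratio = profiles_reached / seed_users

# Installs needed to cover the whole platform, assuming the ratio scales
# linearly. Integer arithmetic avoids floating-point rounding noise.
installs_needed = platform_users * seed_users // profiles_reached

print(f"profiles per install: {ratio:.0f}")
print(f"installs needed: {installs_needed:,}")
```

The result lands just above 8 million, which is where the rounded figure comes from.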
But the kicker is, you don’t even need to have achieved that.
Facebook was so kind as to offer up each user’s unique Facebook User_ID when it returned these data requests. This means that all the data from all the different apps, quizzes and games can be immediately and instantly recombined into one massive database… just like Facebook’s!
In other words: one app developer didn’t even need to create the killer app that got 8M downloads to get the profiles of Facebook’s entire userbase. In fact, it’s entirely conceivable that you collected it all from others and used Facebook’s User_ID as the primary key to piece it all together (hence Parakilas’s “black market” comment above).
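To make the “primary key” point concrete, here is a rough sketch of how records collected by unrelated apps could be stitched together once they share a stable user ID. Everything in it is hypothetical: the field names, the sample records and the `merge_profiles` helper are invented for illustration and are not the actual shape of Graph API responses.

```python
from collections import defaultdict


def merge_profiles(*datasets):
    """Combine records from several sources into one profile per user ID.

    Each dataset is a list of dicts holding a "user_id" key (a hypothetical
    field name). The first value seen for any given field is kept.
    """
    merged = defaultdict(dict)
    for dataset in datasets:
        for record in dataset:
            uid = record["user_id"]
            for field, value in record.items():
                if field != "user_id":
                    merged[uid].setdefault(field, value)
    return dict(merged)


# Made-up sample data standing in for what two unrelated apps might hold.
quiz_app = [
    {"user_id": "1001", "name": "Alice", "hometown": "Austin"},
    {"user_id": "1002", "name": "Bob"},
]
game_app = [
    {"user_id": "1001", "likes": ["chess"]},
    {"user_id": "1003", "name": "Carol", "birthday": "01/02"},
]

combined = merge_profiles(quiz_app, game_app)
print(len(combined), "profiles;", combined["1001"])
```

Neither app on its own has Alice’s full record, but the shared ID lets the two partial records collapse into one. Multiply that by millions of apps and the pieces start to add up.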
And let’s not doubt how many versions of this data Facebook handed out. Do you remember all those annoying Facebook game requests you got from friends? They weren’t just annoying notifications; those friends of yours were (inadvertently) giving away your data, too.
To give a sense of how many apps were out there doing this: here’s an AdWeek article from 2012, quoting Facebook as saying there were 9 million apps and websites integrated with Facebook. And 2012 was three years before Facebook cut off API access to this kind of data.
To compound the problem further: a whole bunch of startups were built relying on access to graph data. When Facebook cut off API access… those companies died. This TechCrunch article names but a few: JobFusion, CareerSonar, Jobs With Friends, adzuna Connect.
Just as we worry about nukes going missing when states fail, how many of those companies or websites failed, and then sold or “lost” all the data that they had collected? (And that’s quite obviously putting aside the fact that it’s a lot easier to replicate data than it is to replicate a nuclear warhead.) This is not to accuse any of the companies named above of wrongdoing; I have no knowledge of what happened to the data when they shut down.
But that’s the point: none of us do. Certainly not Facebook, because it had “zero insight” into what was going on.
I’m going to go out on a limb and say that multiple entities have subsequently hit the data equivalent of winning the lottery: they’ve hoovered up the user profiles of something approaching 100% of people on Facebook at the time that Facebook finally turned off v1 of their Graph API, which was back in April 2015.
This equates to roughly 1.5 billion profiles.
And we’re obviously not just talking about your name and your email address. Think about the kind of damage someone with ill-intent could do to you if they had all of this:
Your name. Your location. All your friends. Your family. Your work history. Your schooling. Your birthday. Your checkins. Your events. Your hometown. Your likes, photos. Your relationships. Your religion and politics.
And not just for you, but for one and a half billion other people.
Target’s data breach isn’t even in the ballpark.
At least Target had the decency to attempt to secure their user data from those who wanted to use it in ways that were never intended when it was given to them. Facebook didn’t even bother. They just gave it away.
But this raises another question: why? Why on earth was Facebook giving away what amounts to the crown jewels for an advertising business: the incredibly valuable user data that allows advertisers to target? If you’re the fastest growing advertising business in the world, it makes no sense.
I don’t believe it was obliviousness to the impact that it might have — although Zuckerberg has demonstrated plenty of that over the years.
Nor do I think it was inept management — though people do forget how strategically inept Facebook was until it was dragged, kicking and screaming, into the mobile era.
The biggest reason?
For the longest time, Facebook was an advertising business that dreamed of being something other than an advertising business. It wanted to be a platform.
It was probably driven in part by the fact that, in tech, advertising is a pretty dirty business. And a platform? That’s the gold standard.
And if those are the grand illusions that you’ve got, it’s not your proprietary data that you view as the secret to your success (which you only need to advertise). Instead, it’s developers, and getting them to build on top of your precious platform.
And so began the great five-year Facebook data giveaway to developers:
If you build your apps on our platform, we’ll give you more user data than you could possibly imagine.
And that’s what happened. As Ben Thompson wrote on Stratechery as far back as 2013, Facebook was so focused on being a platform rather than being an advertising business that it almost missed the boat on mobile. The shift to mobile gave Facebook no choice but to abandon its platform pretensions, and effectively saved the company from itself.
And yet, it seems, Facebook hasn’t entirely escaped the ghost of these past missteps. The complaint that advertising is at the root of so many of the problems tech faces today is not without merit. Ironically enough, however, it might be Facebook’s past resistance to its advertising-based business model that speeds the tech industry towards its regulatory reckoning.
It’s easy to look at Target and bemoan an old-world company that was unprepared for the Internet. Facebook, though, shows that the issue is not the Internet per se; it’s the danger that comes from a company operating where it shouldn’t. For better or worse, Facebook is a killer advertising platform, and the fact that it tried not to be one resulted in an outcome far worse for users than the biggest security breach in history.