You could have seen this coming

In the days following Trump’s victory, many of my friends (especially the well-read sort who enjoy The New Yorker) expressed shock at the results. The mainstream media also appeared stunned. News networks like CNN were predictably hyperbolic, but even the more staid New York Times described the win as “one of the most stunning political upsets in the nation’s history.” I argue that the result was actually not particularly surprising, and not because of race, or gender, or populism, or any of the other postmortems proliferating in the past week. Rather, polling data, the best predictor for elections we have, made a Trump win appear entirely plausible.

A common scapegoat has been polling and the failure of “big data” analytics (one of the most misused terms today). This narrative claims that elitist, data-obsessed eggheads like Nate Silver totally missed the groundswell of populist support for Donald Trump, and were in just as much of a bubble as the rest of the mass media. For example, popular New York Daily News writer Shaun King asserted that “smug pundits like Nate Silver had no idea in the world what they were talking about regarding Donald Trump” (completely ignoring that Silver spent much of his time arguing that Trump had a decent chance and that people should take him seriously). On the contrary, I will argue that the election was quite predictable and reasonably well predicted, but that the people reading and interpreting the political reporting failed to understand what they were looking at. This is a sobering tale about the challenges of bringing mathematical models into the popular press: if journalists or the general public can’t understand your model or struggle with statistics, even “correct” models can mislead.

How could the polls be so wrong???

They weren’t.

OK, to expand on that: polling is hardly infallible. In fact, polling is now facing a crisis. The decline of landline ownership and rising non-response rates have made it much harder to reach a representative sample. This means that statistical adjustment is now central to modern polling. Some polls, like the much-maligned Los Angeles Times poll, were radically transparent about how they adjusted to this new world of polling. Most were not. That left a serious possibility of hidden methodological error in the polls, beyond the standard mathematical variance.
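To make the adjustment problem concrete, here is a minimal sketch of the kind of reweighting pollsters do. All numbers and categories are invented for illustration, not real polling data:

```python
# Toy illustration of poll weighting (post-stratification).
# All shares below are invented for illustration, not real polling data.

# Suppose college graduates are overrepresented among respondents
# relative to the electorate (a typical non-response problem).
population_share = {"college": 0.35, "no_college": 0.65}  # assumed electorate
sample_share = {"college": 0.55, "no_college": 0.45}      # who actually answered

# Hypothetical candidate support within each group of respondents.
support = {"college": 0.58, "no_college": 0.44}

# Unweighted estimate: average over the skewed sample.
unweighted = sum(sample_share[g] * support[g] for g in support)

# Weighted estimate: reweight each group to its assumed population share.
weighted = sum(population_share[g] * support[g] for g in support)

print(f"unweighted: {unweighted:.1%}")  # 51.7%
print(f"weighted:   {weighted:.1%}")    # 48.9%
```

The same respondents produce a swing of nearly three points depending on the weighting scheme alone. If the assumed electorate is wrong (say, because turnout differs from expectations), the “adjusted” number is wrong too, and that error never shows up in the reported margin of error.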

That said, the polls were still pretty accurate. On a national level they were more accurate than in 2012, and on a state level comparable. They did have a serious miss in many swing states, but even there they underestimated Trump’s support by just over 3 points, hardly a disastrous miss. The polls showed a close race and had known problems. So why were people so stunned when Clinton lost?
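For a sense of scale, consider what a shared three-point miss does to a map of narrow leads. The margins below are invented for illustration, not the actual 2016 polling averages:

```python
# Toy calculation: a small *systematic* polling miss flips narrow leads.
# Margins are invented for illustration.
polled_clinton_lead = {"State A": 2.5, "State B": 2.0, "State C": 1.5}
shared_miss = 3.2  # suppose polls understate Trump by the same amount everywhere

for state, lead in polled_clinton_lead.items():
    actual_margin = lead - shared_miss
    winner = "Clinton" if actual_margin > 0 else "Trump"
    print(f"{state}: polled Clinton +{lead:.1f}, actual {actual_margin:+.1f} ({winner})")
```

Because the error is shared across demographically similar states, narrow leads don’t fail one at a time; they all flip together, which is roughly what happened across the Midwest.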

Pundits and many data scientists were wildly overconfident

There has been a proliferation of models attempting to predict the presidential election. All of the major ones projected a Clinton win, which was logical, as virtually every poll showed her narrowly ahead. However, the confidence many of the models had in Clinton’s victory was astonishing. Most egregious was the Princeton Election Consortium model run by neuroscientist Sam Wang, which put Clinton’s odds at 99%! Wang was extremely confident in the power of his predictive model. When Nate Silver’s website, 538, was purchased by ESPN, Wang commented that he felt Silver wasn’t doing anything special with his statistical analytics, saying, “In my view, prediction is not hard either, except for the step of identifying true predictive factors.” Since polling is the most predictive information we have about elections, the implication is that predicting them is a relatively trivial task.

When pressed on how his confidence in his 2016 prediction could be so high, Wang responded with an in-depth look at the math behind his model. The math may well be reasonable, but it rests on numerous assumptions, and implicitly treating those assumptions as 100% certain is a recipe for disaster.
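One concrete example of such an assumption is whether polling errors in different states are treated as independent or as sharing a common national component. A rough Monte Carlo sketch, with all parameters invented, shows how much this single assumption moves the headline probability:

```python
import random

# Rough Monte Carlo sketch: how the correlated-vs-independent error
# assumption changes a forecast. All parameters are invented.
STATES = 10        # hypothetical swing states
POLL_LEAD = 2.0    # Clinton up 2 points in each
STATE_SD = 3.0     # idiosyncratic state polling error (std dev, points)
NATIONAL_SD = 2.0  # shared national polling error (std dev, points)
TRIALS = 100_000

def win_probability(correlated: bool) -> float:
    wins = 0
    for _ in range(TRIALS):
        shared = random.gauss(0, NATIONAL_SD) if correlated else 0.0
        states_won = sum(
            POLL_LEAD + shared + random.gauss(0, STATE_SD) > 0
            for _ in range(STATES)
        )
        if states_won > STATES // 2:  # needs a majority of the swing states
            wins += 1
    return wins / TRIALS

print(f"independent errors: {win_probability(False):.0%}")  # ~92%
print(f"correlated errors:  {win_probability(True):.0%}")   # ~75%
```

With these made-up numbers, assuming independence yields roughly a 92% win probability, while allowing a shared error component drops it to roughly 75%. Same polls, one changed assumption, and a very different level of confidence; treating the independence assumption as certain is exactly the kind of trap described above.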

This is the central topic of The Black Swan by Nassim Nicholas Taleb, which examines how predictions can go wildly awry. Taleb rose to prominence after the book was seen as prescient about the financial crisis. In the banking crisis, as in some election forecasting, increasingly sophisticated mathematical models gave the illusion of accuracy. One must always remember that non-mathematical error exists. If you assume it does not, your model will be wildly overconfident, and you will spend the day after the election wondering how your prediction went so wrong.

I’m using Sam Wang as a punching bag here because he was the most egregiously overconfident, but most pundits mistook mathematical precision for predictive power. The two concepts are related, but it’s worth noting that in the more reasonable 538 model, even a 10-point average national polling lead for Clinton (which would forecast the biggest blowout since Reagan won 49 states) still wouldn’t push her to 99% odds. Every model in the social sciences must make assumptions, and it is perilous to ignore the assumptions being made.

Basically, any prediction in the social sciences made with 99% confidence should be treated with extreme suspicion. 538’s final forecast gave Trump about a 30% chance to win. Clinton had narrow leads in many swing states, especially in the Midwest, and there were a lot of undecided voters. If the polls were off, or if those voters broke for Trump (neither of which was a crazy outcome), he would win.

Was this a “Black Swan” election?

Some have started referring to Trump’s electoral victory as a “black swan event.” I would argue that characterization is incorrect. But first, what is a black swan event? It is a low-probability event that is hard to predict and is rationalized after the fact. Such events often surprise people when models are misapplied, and there are several ways one can occur:

First, an event may simply be very improbable, and assumed impossible until observed. For example, say that only one in a million swans is black. You might conclude that “all swans are white” until you see a black swan and realize your assumption is incorrect. So maybe it’s not that a demagogue like Trump can’t get elected, but merely that it was an extremely low-probability event that happened to occur this year (I am dubious of this argument).
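As a quick sanity check on that one-in-a-million intuition (the rate and observation counts are invented for illustration):

```python
# If 1 in 1,000,000 swans is black, how likely is it that you've seen one?
p_black = 1e-6  # invented rate for illustration

for swans_observed in (1_000, 100_000, 1_000_000, 10_000_000):
    p_seen = 1 - (1 - p_black) ** swans_observed
    print(f"after {swans_observed:>10,} swans: P(at least one black) = {p_seen:.1%}")
```

Even after a million observations there is only about a 63% chance of having seen one, so “I’ve never seen it” is weak evidence that something can’t happen.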

Second, a black swan event can occur when you are in a brand-new situation and old ideas no longer apply. I think this is what happened during the Republican primary. The giant field of candidates and the abnormal discombobulation of the Republican establishment created a previously unseen environment that was hard to predict. Models built on assumptions from previous, very different nominating contests failed completely. For example, while the 538 models handled the relatively normal Democratic primary easily, they were skeptical of Trump even when polls showed him dominant. The primary was unlike any previous one, so a model designed around previous primaries was inadequate for predicting it (I was certainly guilty of this confusion as well).

The black swan view of the general election would hold either that Trump was extremely lucky and won despite low odds, or that the election was fundamentally different from previous ones due to some factor (Trump’s unusual rhetoric, the unpopularity of the candidates, the fact that he was running against a woman, etc.). The problem with these arguments is that reasonable models did give Trump pretty decent odds to win.

Wait, so if Trump had decent odds, why were pundits so sure he would lose?

Because they’re pundits. Predicting the future is very challenging for humans, but for some reason we think we’re great at it. One of my favorite quotes from Daniel Kahneman’s excellent Thinking, Fast and Slow describes this cognitive dissonance perfectly: “The idea that the future is unpredictable is undermined every day by the ease with which the past is explained.”

One problem is that expertise in a subject doesn’t help one predict the future. In a brilliant study, Philip Tetlock showed that political predictions by experts were barely better than guesses, and worse than even simple algorithms. Fame and media presence were actually negatively correlated with accuracy. How can this be? Tetlock argues that most people are not good at objective data analysis. Humans prefer to think in narratives rather than in data. Experts, with vast stores of knowledge, are good at squashing even incongruous facts into their existing narrative rather than scrapping a story they may have built up over years or even decades.

For example, upon seeing data suggesting Clinton had a decent chance of losing the election, Huffington Post Washington bureau chief Ryan Grim mounted an elaborate (and completely bullshit) counterargument rather than letting his narrative be challenged. Silver responded by asserting that his method had been shown empirically to work best, while Grim didn’t “actually give a shit about evidence and proof.” That might sound harsh, but it strikes at the root of the problem with punditry: when pundits are wrong, they are rarely called out and rarely try to learn from their mistakes (to his credit, Grim apologized). The lure of narrative over data remains strong.

Consider: In the days after the election I saw people write about how Trump’s win “proved”:

  • That liberal elites were out of touch and failed to realize the wave of populism sweeping America
  • That liberal elites were out of touch and failed to realize the mass frustration with political correctness
  • That liberal elites were out of touch and failed to realize how upset people were about Hillary’s corruption, as demonstrated by WikiLeaks
  • That liberal elites were out of touch and failed to realize the dramatic overreach of the Democratic Party in enacting an over-ambitious agenda
  • That liberal elites were out of touch and failed to realize just how racist/sexist much of America is
  • That the Comey investigation, the lapsing of Voting Rights Act protections, or numerous other factors that may or may not have mattered were decisive

The common thread of these opinions is that they were largely written by people who already subscribed to the narrative they were proposing. The people railing about how Hillary’s corruption handed the victory to Trump were the ones who had been constantly citing WikiLeaks dumps. The people claiming that oppressive political correctness did her in were the ones who had been yelling about how oppressive they found contemporary political correctness. They simply slotted the most recent data into their existing narrative, whether it fit or not.

Far more ridiculous were the mainstream pundits who said Trump had no chance, then, once he won, wrote about what his victory showed. These people had just been shown to be completely wrong about the election! Yet instead of reconsidering how they developed their ideas, they poured out articles offering new narratives about why Clinton lost. Instead of pausing humbly to figure out what had actually happened, these pundits discarded their discredited narrative and immediately adopted a new one. The failure of the old narrative apparently didn’t teach them to question a narrative before adopting it!

How can I avoid being fooled again?

Trust models, data, and, occasionally, people with strong empirical track records of prediction. Everything else is mostly bullshit. Look at what has been predictive in the past and ignore what hasn’t (for example, predictions from cable news guests or New York Times op-ed writers). Be willing to question whether your model fits a given situation. Ultimately, the golden rule is “trust your data.” Just ask the man behind the Los Angeles Times poll, which was repeatedly attacked from the left because it showed Trump winning: “When you look at pundits and their predictions, the correlation is zero. You have to trust the numbers… don’t get distracted by all the things you think about plausibility.”

Trump is an incoming president with little to no political track record who has so far acted unusually. Be skeptical of anyone confident they know what will happen during his presidency, and be very careful about dismissing possibilities entirely, especially ones you don’t want to transpire. Otherwise you may fall into the trap of the many pundits, friends, and family members who didn’t want Trump to win and blinded themselves to data showing his victory was a very real possibility.
