Prediction, Science, and 538

Adam Elkus
Strategies of the Artificial
May 8, 2015

By now, the failure of FiveThirtyEight to accurately predict the British elections has been widely dissected. British political scientist and historian Patrick Porter has been tweeting up a storm about it, noting that FiveThirtyEight headman Nate Silver and others tried to impose their vision of what voters would do on free agents with the capacity for autonomous action. I agree, to a point. But there’s more to it.

It might be observed, as Jay Ulfelder does, that access to comprehensive, reliable, machine-readable data is extremely uneven. And the more complex the model or the interactions it describes, the greater the challenge of verifying it. People who build agent-based models understand this quite intimately:

Robert Axtell, a computational social scientist at the Krasnow Institute for Advanced Study at George Mason University in Fairfax, Virginia, and a pioneer of agent-based modelling, argues that there simply aren’t enough accurate data to populate the models. “My personal feeling is that there is a large research programme to be done over the next 20 years, or even 100 years, for building good high-fidelity models of human behaviour and interactions,” he says.
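To make Axtell’s point concrete, here is a toy sketch of what “populating” an agent-based model with data means. Every number in it is invented for illustration: the model runs fine either way, but its forecast is only as trustworthy as our empirical estimates of quantities like the social-influence parameter, which we rarely have.

```python
import random

# Toy agent-based model of voter choice. All parameters are invented;
# the point is that values like `influence` would need empirical grounding.

class Voter:
    def __init__(self, lean):
        self.lean = lean  # probability of voting for party A

    def interact(self, other, influence=0.05):
        # Agents nudge each other's leanings toward their average.
        # How strong is social influence in reality? Unknown: exactly
        # the kind of quantity Axtell says we lack data to calibrate.
        mid = (self.lean + other.lean) / 2
        self.lean += influence * (mid - self.lean)
        other.lean += influence * (mid - other.lean)

def simulate(n_voters=1000, n_steps=5000, seed=42):
    rng = random.Random(seed)
    voters = [Voter(rng.random()) for _ in range(n_voters)]
    for _ in range(n_steps):
        a, b = rng.sample(voters, 2)
        a.interact(b)
    # "Forecast": expected vote share for party A
    return sum(v.lean for v in voters) / n_voters

print(round(simulate(), 3))
```

Change the invented `influence` parameter or the interaction rule and the dynamics change too; without data to pin those choices down, the simulation is an existence proof, not a prediction.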

And by all accounts, the British election not only lacked the reliable data that Silver was accustomed to but was also far more difficult to forecast regardless. And it’s worth noting that even within the US, using data to understand complex social outcomes is far from straightforward. David Auerbach recently had a great piece on how a much-vaunted study of prejudice amounted to “big anecdata.”

I’ve lately spent a long time immersed in the norms of a variety of disciplines outside the social sciences, thinking about how to structure my own evolving research interests. For example, one of my favorite recent books is a compilation on “artificial ethology” — the art of building robots to demonstrate, via existence proof, what might generate complex forms of animal behavior. That idea — that you might need a mechanistic understanding of a system before you can predict it — runs very much counter to the current enthusiasm for big data and predictive analytics that FiveThirtyEight embodies.

It’s worth noting that prediction is certainly a component of a mature science, but it is not the be-all and end-all. Explanation and understanding are valuable, and far more foundational, goals. Prediction is necessary to discipline science and help us adjudicate between competing models; it is far too easy to fit a model in-sample and call it a day. But prediction is not equivalent to science itself. The major lesson of Seeing Like A State is that not all knowledge of value is produced through statistical information or even my own brand of computational modeling. Sometimes knowledge about a desired set of circumstances can only be extracted through painstaking historical research or anthropological investigation. That kind of knowledge production does not scale as easily as statistical analysis or computer models, and it is hard to train people to do well. And the data collection needed to actually do good predictive science is itself tremendously costly and haphazard.

Much of the world is “illegible,” to use a term from Seeing Like A State. Extracting what is often tacit, distributed, and fragmented knowledge from it is hard. Sometimes we can usefully collect data in the form of statistical observations; other times the information loss from such compression negates the value of the enterprise. Still, it is nonetheless valuable to try to increase our understanding and, over time, adjust our priors. Perhaps another consequence of FiveThirtyEight’s popularity is the conflation of Bayesian science with prediction, which casts an extremely useful tool in a sadly narrow light.

Finally, I’ll also note that from the perspective of a decision-maker who might need a scientist’s applied analysis, there is a combinatorial explosion of decision problems relative to the scant amount of useful decision-relevant data, knowledge, or understanding. That is worth remembering before we embark on some venture applying what works in the lab to the real world.

--

PhD student in Computational Social Science. Fellow at New America Foundation (all content my own). Strategy, simulation, agents. Aspiring cyborg scientist.