How can we improve forecasting?

To understand the true nature of uncertainty, we must master the art of forecasting

By Dan Gardner

In November 1958, halfway through the sleepy Eisenhower era, Newsweek magazine ran on its cover images from 30 years earlier. Staring out from the page were Charles Lindbergh and the Spirit of St Louis, a grinning flapper dancing the Charleston and a jazz saxophone. “For Americans, 1958 has been a year of nostalgia,” the magazine reported. “Americans do not, ordinarily, look back wistfully to happier times — they are too busy with the future. But in 1958, with its anxieties and uncertainties, the ’20s suddenly have become a Golden Era.”

Those words could just as easily be written today. Plagued with anxieties and uncertainties, we readily cast wistful glances back to simpler times. While our nostalgia is not generally focused on a particular era — Donald Trump, for one, has never identified when he thinks America was last great — if we were asked to identify when life was ‘simpler’ many people would identify the long idyll of the Eisenhower era. In the UK, a new show called Blue Peter was delighting British children, there was peace and rising prosperity, and the most alarming development was Elvis Presley’s thrusting pelvis. Yet, here is a correspondent from that blessed time informing us that Americans were so troubled by “anxieties and uncertainties” that they were pining for the good old days.

This is a recurring theme through history. Dig into the contemporary records of almost any year and you will find people worrying about the future and looking back to more certain times. You will also occasionally find people claiming, in frightened tones, that this moment, unlike all others, is so afflicted by uncertainty that we live in nothing less than an ‘age of uncertainty’, or some variation of that phrase. Former Harvard economist John Kenneth Galbraith wrote a book and BBC series under that title in 1977. Thirty years earlier, poet WH Auden published The Age of Anxiety, the same phrase that New York Times journalist David Brooks used in August 2017 to describe the present era.

The cause of this curious phenomenon is a psychological misperception that I call the ‘uncertainty illusion’. When we look forward into the future and think carefully about how events could unfold, we see an enormous amount of uncertainty. Things could spin off in a vast array of directions, many of them quite unpleasant. The number of shocks and surprises is limited only by imagination. That is not the illusion; it is real. In fact, we tend to underestimate the immense variety of the possible futures arrayed before us.

The illusion begins when we look back for comparison and see far less uncertainty in the past. Typically, that perception is false and is the product of what psychologists call ‘hindsight bias’. Very simply, knowing that something did or did not happen skews our perception of how predictable it was.

We know there was no nuclear war in 1958, which makes it seem much more likely that there would be no nuclear war than it appeared at the time. So when we think of 1958, we think of Elvis Presley and Blue Peter, not fear of mushroom clouds, which makes it seem a blessed time. Were we to look back at history as it was experienced by people at the time, we could be comforted by the fact that humanity has always grappled with profound uncertainty. But instead, we see much less uncertainty in the past, and conclude the uncertainty we face is unusual, even unique, which makes it all the more alarming. That is the uncertainty illusion.

Reality check

Unfortunately, exaggerated perceptions of uncertainty can inflict far worse harms than occasional bouts of nostalgia. They may convince ordinary people and corporate executives to put savings in a pillow rather than invest. If enough of us hunker down in this way, the economy suffers. And it is not only a matter of money; invention, exploration, experimentation and creativity in its manifold forms all require a certain degree of confidence in the future that can be undermined by the uncertainty illusion.

However, current perceptions of high uncertainty are not entirely the result of a cognitive mirage. Uncertainty clearly fluctuates and there are good reasons to think we are in a period of at least heightened uncertainty. One is the current occupant of the world’s most powerful office. Donald Trump is a US president like no other and his impulsiveness and penchant for upsetting the status quo are sending waves of doubt around the world. But far more fundamentally, complexity theory teaches us that when tightly interlocking systems malfunction, they can generate unpredictable cascading effects, as we saw when the global financial system went into meltdown in 2008. Given the growth of information technology and globalisation, and the creation of vast numbers of large, tightly interlocking systems — along with too little consideration of how these systems would cope with what Yale sociologist Charles Perrow dubbed “normal accidents” — there are good reasons to think we face large and growing uncertainties. Combined, these points lead to a simple but critical conclusion: we are right to think harder about uncertainty, but we must be careful to proceed on the basis of rigorous analysis of uncertainty. We must understand its nature, extent and remedies, and not succumb to subjective perceptions. Perhaps that sounds like common sense, but it is far less common than it is sensible.

The best tool we have to probe and push back the veil of uncertainty is forecasting. If we know what is coming, we can prepare. But, here again, there are some common misconceptions. First, it is not true, as many often say, that the future cannot be predicted, nor that nothing important can be predicted. Successful prediction is a routine part of our lives that we could not function without. However, the opposite extreme is just as misguided. We know there are inherent limits to our ability to forecast thanks to chaos theory, complexity theory and painful experience, so the pundits who speak with granite certainty about energy, technology or the global economy in the second half of this century are overconfident.

Predictability is not binary; it falls on a continuum and varies radically by subject matter. I am quite comfortable forecasting the number of senior citizens a decade from now but I would need long odds before I bet money on the price of oil in even five years’ time. To reduce uncertainty, we first need to identify what exactly it is we wish to forecast and where it falls on the continuum of predictability. Then we can try to push it along that spectrum, reducing uncertainty as we achieve progress.

Meteorology is the gold standard. Much as we like to joke about weather forecasts, they are actually quite reliable, in most circumstances, 24 or 48 hours ahead. Three and four days in advance, their accuracy declines. Beyond a week, they are not much use. The reason we know how accurate weather forecasts are is the same reason they are good: meteorologists make a huge number of precisely expressed weather forecasts that are checked against outcomes, producing a constant flow of high-quality feedback. Models are continually tested and they are constantly adjusted in light of the testing, which makes them better. This approach is as rare as it is reasonable. Inspect most domains within business, finance, public policy or any other field and you will find that forecasting is both integral to decision-making and seldom subjected to anything like the analytical rigour it receives in meteorology.

A particularly unfortunate illustration comes from the world of intelligence analysis. Governments spend immense sums on agencies whose job, in large part, is to forecast geopolitical events and thus inform decisions of the utmost importance. How good is that forecasting? What are its limits? Lots of people have opinions but no one really knows because it is not tested for accuracy. As a result, feedback is sporadic and ambiguous. Lacking good feedback, intelligence forecasting is unlikely to be as good as it could be, and it is surely not improving to the greatest possible extent. Even more alarming, researchers have shown that when professionals repeatedly use a skill without receiving clear, prompt, accurate feedback, they may not get better at it, but they do get more confident. Flat-lining skill and growing confidence is a dangerous combination.

Several years ago, Bill Gates observed that,

“you can achieve incredible progress if you set a clear goal and find a measure that will drive progress toward that goal. This may seem basic but it is amazing how often it is not done and how hard it is to get right.”

That is painfully true of forecasting. One promising way out of this dead-end is predictive analytics, which, by its very nature, delivers precise forecasts whose accuracy can be tested and whose results can be used to improve performance. Netflix and Amazon measure how good they are at predicting which movies and books you want, and they use the results to get better, with a good degree of success.

But some of the buzz around predictive analytics seems distinctly unrealistic. Relying on historical data, predictive analytics is not likely to be much use in spotting outliers and low-probability or high-impact events. And there is a world of difference between finding a film I like and forecasting the next economic recession or move by the Kremlin. The promise of technology is enormous, but so are the challenges of using that technology to predict a non-linear, chaos-riddled reality. Human forecasters are not about to be rendered entirely obsolete. Even when proven data-driven solutions are at hand, humans will still supervise their design, operation and use, which means humans will still be forecasting, if only at a meta level. Of course, that is itself a forecast, but it is one widely shared by leading researchers in the very fields of technology that some suppose will render human forecasters jobless.

The human factor

An interesting parallel is the famous 1954 observation of American psychologist Paul Meehl that statistical prediction consistently outperforms the subjective judgement of clinicians. Confirmed countless times in the decades since, it is irrefutable evidence that guts should give way to algorithms. But that does not mean humans should mindlessly obey computers. If we want to predict whether someone will go to see a film on a Tuesday night, to use Meehl’s example, data from prior behaviour and good statistical analysis are likely to produce an excellent, reliable algorithm that will do better than any human could. But what if we learn that the person in question broke their leg that morning? The algorithm will not take note and adjust accordingly, but a human would. Of course, in future, the algorithm may be adjusted to learn from this experience, but whether it is a broken leg or something else, the unexpected will always arise. A human with good judgement will always be needed to work out what it means and revise the forecast.

If we cannot expect technology to be a panacea, and human judgement is and will remain an essential component of forecasting, then we need to get serious about testing and improving that judgement. Fortunately, this trail has been blazed by University of Pennsylvania psychologist Philip E Tetlock.

In the late 1980s, Tetlock began to seriously contemplate how expert political judgement could be tested, compared and improved. From this work came what he calls ‘level-playing-field forecasting tournaments’.

The foundation is precision. Forecasters are not asked whether the crisis on the Korean peninsula ‘will worsen’, whether the iPhone will continue to ‘dominate the market’ or whether the president will successfully ‘pass his agenda’. This is the language of punditry, and far too much other forecasting, and it is hopelessly vague. If North Korea shells South Korea but then suggests negotiations, has the crisis ‘worsened’? Some may say yes, others no, but there is no definitive answer. The same is true of the timeframe of the forecast, which is often left vague or implicit. So Tetlock’s tournaments use timeframes that are absolutely clear, for example ‘within six months’ or ‘by the end of the year’. Forecasters, too, must be precise. There is no saying something ‘may’ or ‘could’ happen. Instead, forecasts are numeric, from zero to 100%.

Predictions are also made in abundance, with each person asked to make scores or hundreds of forecasts. This is essential for several reasons. First, it enables the scoring of probability-based forecasts. A single prediction of a 75% chance of something happening is not proved right or wrong whether the thing happens or not. But a large number of such forecasts can be judged: if the forecaster is exactly right, then 75% of the time they say there is a 75% probability something will happen, it will. Large numbers of forecasts allow us to distinguish between luck and skill. Anyone can pick winning lottery ticket numbers once; only those who keep doing so should impress us.
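The check described above, comparing a stated probability with how often the event actually occurs, can be sketched in a few lines of Python. This is only an illustration, not the scoring code used in Tetlock's tournaments: the function name `calibration_table` and the sample record are hypothetical.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Group (probability, outcome) pairs by stated probability.

    Returns {stated probability: (number of forecasts, observed frequency)}.
    `outcome` is True if the event happened and False if it did not.
    """
    buckets = defaultdict(list)
    for probability, outcome in forecasts:
        buckets[probability].append(1 if outcome else 0)
    return {
        p: (len(hits), sum(hits) / len(hits))
        for p, hits in sorted(buckets.items())
    }


# Hypothetical record: eight forecasts made at 75% and four made at 25%.
record = (
    [(0.75, True)] * 6 + [(0.75, False)] * 2
    + [(0.25, True)] * 1 + [(0.25, False)] * 3
)

for stated, (count, observed) in calibration_table(record).items():
    print(f"stated {stated:.0%}: {count} forecasts, "
          f"event occurred {observed:.0%} of the time")
```

In this made-up record the forecaster is perfectly calibrated: the events forecast at 75% happened six times out of eight. With only a handful of forecasts per group such a match could easily be luck, which is why the tournaments ask for scores or hundreds of predictions.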

Tetlock’s first implementation of these techniques started in the late 1980s, when he recruited some 280 expert forecasters, such as intelligence analysts, political scientists, journalists, economists and others whose job involved geopolitical forecasting to some degree. A huge array of forecasts was elicited, involving various timeframes and subject matters. After two decades of research, in 2005, Tetlock published the results. The one that made headlines said the average expert was not much better than a dart-throwing chimpanzee. But far more intriguing was the observation that there were two statistically distinguishable groups of experts, one that did worse than random guessing and one that did considerably better. Factors such as education and ideological inclination made no difference. The real differentiator was the style of thinking. The bad forecasters tended to analyse problems with the one big idea they were sure was right, while those with real foresight preferred to look at problems from multiple perspectives. They were also separated by confidence. Bad forecasters were far more likely to say something was ‘impossible’ or ‘certain’, while good forecasters routinely saw the possibility of events unfolding in unexpected ways and maintained a certain intellectual humility. Perhaps not surprisingly, Tetlock’s data also found an inverse correlation between fame and accuracy. The media and the public are not nearly so interested in cautious, complex thinkers as they are in confident blowhards.

Several years after this research was released, the US intelligence community funded a massive research programme inspired by Tetlock’s work. The idea was to have five university-based teams compete to see who could most accurately make the sorts of forecasts that intelligence analysts are routinely tasked with; for example, whether Russia will seize Crimea, whether Greece will default on its loans and how the Chinese economy will do in the third quarter. Tetlock led the Good Judgment Project, an enormous effort that relied on a small army of volunteer forecasters, a total of more than 20,000 over the four years of the programme.

From this, we have learnt that a basic explanation of the rudiments of good thinking and forecasting improved forecasting accuracy by 10% over the course of a year. More dramatically, an ‘extremizing’ algorithm that aggregated hundreds of forecasts and nudged them further out on the probability scale beat all comers. But perhaps most promising was the discovery of ‘superforecasters’, a small percentage of Tetlock’s volunteers who consistently beat performance benchmarks, prediction markets and even professional analysts with access to classified information. Close inspection of these superforecasters has delivered insight into the habits of mind and analytical styles that deliver the best forecasts.
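To give a feel for what 'extremizing' means, here is a minimal Python sketch. It is not the algorithm used in the research programme, whose details are not given here. The averaging step, the particular transform and the parameter `a` are all assumptions chosen for illustration.

```python
def extremize(probabilities, a=2.0):
    """Average a set of probability forecasts, then push the mean away from 0.5.

    The transform p**a / (p**a + (1 - p)**a) and the default a=2.0 are
    illustrative assumptions, not values taken from the research programme.
    With a == 1 the mean is returned unchanged; a > 1 moves it toward 0 or 1.
    """
    if not probabilities:
        raise ValueError("need at least one forecast")
    mean = sum(probabilities) / len(probabilities)
    return mean**a / (mean**a + (1 - mean) ** a)


# Hypothetical crowd of four forecasters answering the same question.
crowd = [0.6, 0.7, 0.8, 0.7]
print(f"simple average: {sum(crowd) / len(crowd):.3f}")
print(f"extremized:     {extremize(crowd):.3f}")
```

The intuition is that when many forecasters who each hold only part of the relevant information lean the same way, the combined evidence is stronger than a simple average suggests, so the aggregate is nudged further from 50%. A forecast of exactly 50% stays where it is.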

Perfecting practice

Much more forecasting like this is needed, not least because a steady stream of comparable efforts to foresee events in particular domains will, gradually, produce a clearer sense of how far into the future forecasters in those domains can peer. As in meteorology, knowing what can and cannot be foreseen would be a major accomplishment. Fortunately, some key organisations have taken note. The intelligence community is applying many of the lessons learnt from Tetlock’s research and is funding further investigation. Banks and hedge funds are similarly applying Tetlock’s insights and creating their own, internal forecasting tournaments.

The potential benefits for organisations are enormous. Forecasts may be aggregated to distil collective wisdom, unrecognised stars may be revealed and good forecasters may get better. Forecasting is a skill that can be improved with practice and clear, prompt feedback that enables the forecaster to think about what went right or wrong, why, and how they can do better next time. A forecasting tournament modelled along Tetlock’s lines delivers that feedback. Like a driving range for golfers, or a shooting range for soldiers, or a batting cage for baseball players, it is a facility forecasters need to get better and stay sharp.

But it does not take an organisation and a big, formal effort to get serious about forecasting. By asking precise questions and keeping score, individuals can set forecasting challenges for themselves and learn from their results just as someone in a formal forecasting tournament would. It is just a matter of taking seriously something we all do haphazardly. Prior to the presidential election of 2016, everyone had their own view about Donald Trump’s chances. The same was true of the Brexit vote and countless other events. Merely setting yourself a precise question, making a numeric forecast and recording it — ideally with an explanation of the reasoning behind the forecast — forces you to think harder about what you believe and why, and learn from both successes and failures. And like a diary, this sort of forecasting creates a permanent record of what you really thought and felt. That is a vaccine against the uncertainty illusion, ensuring a clearer perception of the past, which is the first step to a clearer perception of the future.
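A personal forecasting record of this kind needs nothing more elaborate than a list and a scoring rule. The Python sketch below is one possible way to do it, not a prescribed method: the `Forecast` record, its field names and the example questions are hypothetical, and the Brier score is a standard scoring rule for probability forecasts that is brought in here as an assumption about how to keep score.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Forecast:
    question: str                   # precise, with an explicit deadline
    probability: float              # 0.0 to 1.0
    reasoning: str                  # why you believed it at the time
    outcome: Optional[bool] = None  # filled in once the question resolves


def brier_score(forecasts):
    """Mean squared gap between stated probability and what happened.

    0.0 is a perfect record; always answering 50% scores 0.25.
    Unresolved forecasts (outcome is None) are ignored.
    """
    resolved = [f for f in forecasts if f.outcome is not None]
    if not resolved:
        raise ValueError("no resolved forecasts to score")
    total = sum(
        (f.probability - (1.0 if f.outcome else 0.0)) ** 2 for f in resolved
    )
    return total / len(resolved)


# Hypothetical diary entries.
diary = [
    Forecast("Will candidate A win the election on date D?", 0.8,
             "Consistent lead in polling averages.", outcome=True),
    Forecast("Will the central bank raise rates by the end of the year?", 0.3,
             "Inflation is below target.", outcome=False),
    Forecast("Will the bill pass within six months?", 0.6,
             "Narrow majority, several undecided members."),  # not yet resolved
]

print(f"Brier score so far: {brier_score(diary):.3f}")
```

Keeping the reasoning alongside the number matters as much as the score: it is the written record of what you actually thought, before hindsight rewrites it.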

Dan Gardner is a Senior Fellow at the University of Ottawa’s Graduate School of Public and International Affairs and co-author of ‘Superforecasting’

This article appears in the RSA Journal Issue 2 2017