Is Superforecasting Really Super?

HeyBo
WRIT340EconSpring2023
8 min read · May 2, 2023

Superforecasting: The Art and Science of Prediction by Philip E. Tetlock and Dan Gardner explores the process of making predictions and how some individuals consistently outperform experts at forecasting future events. These seemingly ordinary individuals are labeled “superforecasters” for their unusually high accuracy in the forecasting tournaments organized by Tetlock. Motivated by their exceptional performance, the authors set out to identify the key characteristics of “superforecasters,” such as humility, intellectual curiosity, and probabilistic thinking, and offer anecdotes about how these individuals overcome the common psychological and arithmetic pitfalls of forecasting. Although the book aims to demonstrate that prediction-making is both an art and a science, several limitations undermine it as a scientific inquiry. Specifically, while Tetlock and Gardner craft an insightful analysis of the characteristics exhibited by “superforecasters,” they fall short of providing a cohesive, scientifically grounded prediction-making framework or establishing that the superforecaster traits generalize to the broader population. Furthermore, the book’s rejection of statistical models and its inattention to institutional factors limit its utility for organizations seeking to improve their forecasting capabilities, highlighting the need for a more comprehensive approach that considers both individual and institutional factors.

The authors introduce a prediction framework used by “superforecasters” called the “outside view first, inside view second” method, which deprioritizes the uniqueness of a specific case (the inside view) in favor of broad base-rate probabilities (the outside view). Tetlock and Gardner demonstrate this point through the question, “Will either the French or Swiss inquiries find elevated levels of Polonium in the remains of Yasser Arafat’s body?” Given the long-standing Israeli–Palestinian conflict, the question is controversial and requires forecasters to predict the outcome of a complex and emotionally charged investigation into the cause of the former Palestinian president’s death. According to Tetlock and Gardner, rather than fixating on the details of the case, the “superforecasters” first considered the likelihood of a positive finding based on the track record of similar investigations (the outside view) and then made minor adjustments to fit the Arafat case. However, this framework seems to be at odds with advice the authors give in a later chapter, “Supernewsjunkies?”: closely follow the news related to the subject being forecasted and frequently revise one’s predictions. Major media outlets often have political biases, and political and business news is frequently sensationalized, packaging information into emotionally charged stories. Since the emotions attached to a specific case belong to the inside view, it is unclear how “superforecasters” can reconcile their commitment to an outside-view-focused probabilistic framework with sensationalized and often biased news reports. Indeed, when predicting a political event, a “superforecaster” named Bill initially made the correct prediction that the ultranationalist former Japanese Prime Minister Shinzo Abe would visit the controversial Yasukuni Shrine in 2013.
Later, as Bill followed news of the US government urging Japan to maintain healthy diplomatic relations with its neighbors Korea and China, he arrived at a different conclusion: Abe would not visit the shrine, which commemorates Japan’s war dead, including many who committed war crimes across East Asia. Tetlock and Gardner’s advice to be a “news junkie” is partly to blame for Bill’s incorrect forecast, as the news overplayed US influence on Japanese politics. Moreover, the book fails to offer a systematic explanation of how superforecasters weigh competing sources of information. There is no instruction on how to determine the weights that Abe’s personal beliefs and his competing political interests as Prime Minister should carry in predicting the outcome of this case, leaving readers with more questions than answers about how to apply the framework in practice and achieve the same level of accuracy as the “superforecasters.”

Tetlock and Gardner also build an ambiguous case for the use of statistics in the forecasting process, making contradictory claims about incorporating statistical reasoning and models. While the authors identify superior numeracy as a key characteristic shared among the “superforecasters,” they claim that it is not a prerequisite to becoming a good forecaster and that these individuals rarely rely on quantitative models in making their predictions. The authors repeatedly emphasize the importance of probabilistic thinking while dismissing Bayes’ theorem, a fundamental concept in probabilistic reasoning, as too rigid and lacking real-world applicability, without further explanation. In fact, Bayes’ theorem is flexible in practice: it starts from prior knowledge of conditions related to an event and updates the probability as new information, including the news, arrives. Tetlock and Gardner make the seemingly paradoxical claim that the “superforecasters” have a “Bayesian spirit” but do not explicitly follow the theorem or crunch the numbers. Since the authors do not elaborate on how one could possess a “Bayesian spirit” without using the formula, readers may struggle to understand and apply the concept in their own forecasts without additional guidance.
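To see how flexible Bayesian updating can be in practice, consider a minimal sketch. All of the probabilities below are hypothetical, chosen only to illustrate the mechanics of revising a forecast when a new piece of news arrives:

```python
def bayes_update(prior, p_news_given_h, p_news_given_not_h):
    """Return P(H | news) given a prior P(H) and the likelihoods of
    observing the news if H is true vs. false (Bayes' theorem)."""
    numerator = p_news_given_h * prior
    evidence = numerator + p_news_given_not_h * (1 - prior)
    return numerator / evidence

# Hypothetical scenario: a forecaster's prior that a leader will make a
# controversial visit is 60%. A news report of diplomatic pressure is
# judged twice as likely to appear if the visit will NOT happen.
posterior = bayes_update(prior=0.60, p_news_given_h=0.3, p_news_given_not_h=0.6)
print(round(posterior, 2))  # prior of 0.60 drops to about 0.43
```

Crucially, the prior (the outside view) anchors the update, so a single sensationalized report moves the forecast only as far as its likelihood ratio warrants, which is arguably all a “Bayesian spirit” amounts to.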

Furthermore, Tetlock and Gardner seem to have underestimated the capabilities of certain statistical and economic models. For instance, in the chapter titled “Superquants?” they dismiss the use of economic models, claiming that such models fail to account for the phenomenon that people assign more weight to certainty than to uncertainty when making predictions. This claim is not entirely true. Behavioral economics models are a powerful tool for predicting how people will behave based on psychology, and many take into account factors such as risk aversion and cognitive biases. In fact, several probability models in behavioral economics explicitly capture people’s preference for certainty. One such model is outlined in “Prospect Theory: An Analysis of Decision under Risk” by Daniel Kahneman and Amos Tversky, which proposes that individuals overweight small probabilities and underweight large probabilities (1977). Prospect theory has been used to explain behavior in a variety of contexts, from lotteries to investments. Another example is “The Uncertainty Effect” by Uri Gneezy and colleagues, which finds that people consistently value risky prospects below their worst possible outcome, an irrational yet consistently observed discounting behavior (2004). These models demonstrate that behavioral economists have long recognized the importance of accounting for people’s preference for certainty and have developed models that accommodate such behavior. Models like these are powerful predictors of investment decisions, labor market outcomes, and policy implications, raising the question of whether the “superforecasters” would achieve higher accuracy if they incorporated such models into the forecasting process.
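The probability-weighting pattern described above can be made concrete. The sketch below uses the one-parameter weighting function from Tversky and Kahneman’s later cumulative prospect theory work, with γ = 0.61 (their estimate for gains); the specific functional form and parameter come from that 1992 follow-up, not the paper cited above:

```python
def prob_weight(p, gamma=0.61):
    """Tversky-Kahneman probability weighting function: overweights
    small probabilities and underweights large ones."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

# A 1% chance is weighted as if it were larger, while a 99% chance
# is weighted as if it were smaller.
print(prob_weight(0.01) > 0.01)  # small probabilities overweighted
print(prob_weight(0.99) < 0.99)  # large probabilities underweighted
```

With γ = 1 the function reduces to the identity, so the single parameter directly measures how far a decision-maker departs from treating stated probabilities at face value.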

In the age of data science and machine learning, not applying statistical models seems like a missed opportunity, as data scientists leverage superior processing power and vast amounts of available data to make timely, highly accurate predictions. Machine learning algorithms, ranging from simple linear regressions and classifiers to complex random forests and neural networks, can process data of many kinds, including numerical, textual, audio, and even visual data, to make predictions about events ranging from stock market trends to political elections. These algorithms, powered by high-speed computers and vast amounts of data, are a great complement to human judgment. For instance, while it is challenging for the human brain to process thousands of pages of textual information, such as product reviews, social media posts, and news articles, it is relatively simple to conduct a sentiment analysis of the aggregated data set using natural language processing and uncover patterns that humans may not easily identify. Furthermore, Tetlock and Gardner repeatedly emphasize removing emotions and biases from forecasting. Algorithms, by design, are not swayed by personal experiences and opinions, though they can inherit biases present in their training data. Moreover, Tetlock and Gardner devote an entire chapter, “Keeping Score,” to the importance of learning from past mistakes to improve predictive accuracy, an essential superforecaster characteristic. As machine learning algorithms are trained on more and more data, their accuracy improves over time, making them the ideal “superforecaster.”
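As a toy illustration of the kind of text processing involved, the sketch below scores sentiment with a hand-picked word list; the lexicon is entirely hypothetical, and a real system would use a trained model over a far larger vocabulary:

```python
# Hypothetical mini-lexicon for financial news; illustrative only.
POSITIVE = {"gain", "growth", "strong", "beat", "optimistic"}
NEGATIVE = {"loss", "decline", "weak", "miss", "pessimistic"}

def sentiment_score(text):
    """Crude lexicon score: (positive hits - negative hits) / word count."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(len(words), 1)

print(sentiment_score("strong growth and an earnings beat"))  # positive
print(sentiment_score("weak sales and a steep decline"))      # negative
```

Even this crude scorer can be applied uniformly to millions of documents, which is the scale advantage the paragraph above describes.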

Lastly, the book draws conclusions from a select group of individuals who participated in Tetlock’s forecasting tournaments in the Good Judgment Project and performed exceptionally well compared to other participants. This is problematic for two reasons. First, a group that voluntarily participates in geopolitical forecasting tournaments for leisure is not representative of the whole population, and its traits may not be generalizable. Without generalizability, the authors’ advice to imitate “superforecasters” in order to become better at making predictions may not work for every reader. It is also unclear whether the characteristics that led to success in tournaments focused on geopolitical events would translate to other domains, such as finance and public health, even though the authors advocate applying the “superforecasting” method across disciplines. Second, one can ask whether these individuals emerged as “superforecasters” because of skill or simply luck, since a large enough participant pool will always produce a lucky few (think lottery winners). Suppose the participants’ performance is normally distributed: most would cluster near the mean, while a small fraction would land near the 99th percentile. One could attribute that tail to skill and then conduct a thorough investigation of the personality traits its members share, which is precisely what Tetlock and Gardner have done. Alternatively, it is reasonable to suspect that luck played a role and that the “superforecasters” are simply good guessers, in which case their shared traits merely correlate with forecasting success rather than directly causing it.
While the book offers a well-organized summary of the traits and behaviors exhibited by successful forecasters, it falls short of establishing external validity (generalizability) and internal validity (causality).
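The luck hypothesis is easy to check with a simulation: give a large pool of purely random guessers a batch of yes/no questions, and a “top 1%” emerges from chance alone. The pool and question counts below are arbitrary:

```python
import random

random.seed(42)  # fixed seed for reproducibility

def simulate_random_forecasters(n_forecasters=2000, n_questions=100):
    """Each 'forecaster' answers every binary question with a coin flip;
    returns each forecaster's fraction of correct answers."""
    return [
        sum(random.random() < 0.5 for _ in range(n_questions)) / n_questions
        for _ in range(n_forecasters)
    ]

scores = sorted(simulate_random_forecasters())
print(f"median accuracy: {scores[len(scores) // 2]:.2f}")  # near 0.50
print(f"best 'forecaster': {scores[-1]:.2f}")  # well above chance
```

If one then studied what the top scorers “have in common,” any shared traits would be pure noise, which is exactly the survivorship concern raised above.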

Despite being an enjoyable and informative read, Superforecasting: The Art and Science of Prediction does not live up to its billing as “the most important book on decision making,” as nominated by Wall Street Journal columnist Jason Zweig (2015). Tetlock and Gardner’s disjointed methodology and unjustified rejection of valuable tools, including statistical and economic models that can be deployed on increasingly powerful computers with massive amounts of available data, make the book less relevant for stakeholders and analysts in organizations that rely on these tools to make forecasts. In finance, for example, while it is difficult to justify high-stakes decisions with individual-level predictions, efficient and scalable quantitative models are widely used to predict market trends and identify profitable investment opportunities. Tetlock and Gardner’s anecdotal storytelling and overemphasis on individual qualities lead them to overlook broader institutional factors that may affect the forecasting process and its accuracy. For organizations seeking to improve their forecasting capabilities, a more comprehensive approach that considers both individual and institutional factors is necessary.

References

Gneezy, U., List, J., & Wu, G. (2004). The uncertainty effect: When a risky prospect is valued less than its worst possible outcome. PsycEXTRA Dataset. https://doi.org/10.1037/e722842011-016

Kahneman, D., & Tversky, A. (1977). Prospect theory: An analysis of decision under risk. https://doi.org/10.21236/ada045771

Tetlock, P. E., & Gardner, D. (2019). Superforecasting: The art and science of prediction. Random House Business.

Zweig, J. (2015, September 26). The trick to making better forecasts. The Wall Street Journal. Retrieved April 3, 2023, from https://www.wsj.com/articles/the-trick-to-making-better-forecasts-1443235983
