Bayesian Leanings for Fun and/or Profit

A betting man’s game

Nicholas Teague
From the Diaries of John Henry
11 min read · Oct 13, 2016


Hurricane Matthew 2016

Having recently lived through the experience of hunkering down for a hurricane that never came (at least to my neighborhood), I found myself with a little time on my hands to ponder the art of prediction, for while the eye may have remained offshore for the most part, there was definitely enough wind in the outer bands to knock down a few power lines here and there (the outages were short-lived, thank you for your concern).

The Northeast blackouts of 2003 it was not.

Please note that for purposes of this post I will follow the philosophy of ‘only writing what you know’ (possibly at the risk of devolving into the kind of semi-delusional free-form word-association ramble of a Trump rally, but hey, nothing ventured nothing gained), for while search engines may be a powerful tool for uncovering new information, any author who relies primarily on keyword results for talking points may not be the best candidate for writing on a subject. I only mention this because it is likely (certain) that there are some gaping holes in my knowledge here, but trust me, this practice will make the post all the more entertaining for any interested party to pick apart. (Since every rule was meant to be broken at least once, I will make an exception for books or papers that I have previously read.)

The image of hurricane Matthew forecasts is a fun example of times when forecasts diverge: based on which model you chose to believe, the storm could have been heading for Canada or Mexico. There are very few public examples where the inner workings of a prediction model are hinted at to the extent that we see in each of these tracking models and their resulting aggregation. Generally the public sees only a model’s single figure output, possibly tracked as a single line curve over an axis of time: the expected temperature, the stock price, or the probability of a politician’s election. What we find in this hurricane forecast is a much more nuanced aggregation of individual models, expressed as a probability band for position over time along with wind speed category watches and warnings. For those who actually make (or attempt to make) a living off of forecasting, these more elaborate models give a real advantage over, say, Joe Schmo on the street. The true informational value of a series of forecasts is sometimes less the actual values themselves than their first and second order derivatives, the velocity and acceleration of change: while there may be a wide uncertainty band around a single point prediction, the same may not be the case for these derivatives.
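
To make that last point concrete, here is a minimal sketch (with made-up forecast values, not actual hurricane data) of reading a forecast series’ velocity and acceleration as its first and second differences:

```python
# A minimal sketch: the "velocity" and "acceleration" of a forecast series
# read off as its first and second differences (made-up values, not real data).
forecasts = [24.5, 25.1, 26.0, 27.3, 29.0]  # e.g. a storm's predicted latitude by day

velocity = [round(b - a, 2) for a, b in zip(forecasts, forecasts[1:])]
acceleration = [round(b - a, 2) for a, b in zip(velocity, velocity[1:])]

print("velocity:", velocity)          # [0.6, 0.9, 1.3, 1.7]
print("acceleration:", acceleration)  # [0.3, 0.4, 0.4]
```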

Weather forecasts, I imagine, present an interesting challenge given the wide band of dimensions and variables at play (temperature, humidity, pressure, wind speed, ad nauseam), particularly because these variables are present not only in the input parameters but also in the output parameters. Just as a quantum computer’s computational capacity is a function of the number of qubits as 2^n, increasing the number of elements in a forecast quickly sends the required computation cycles into the stratosphere of exponentiality.
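
As a toy illustration of that blow-up (my own example, not a meteorological one): if each output variable in a forecast is discretized to just ten levels, the number of joint outcomes grows as 10^n.

```python
# Toy illustration of exponential growth in a forecast's output space:
# with 10 discrete levels per output variable, joint outcomes grow as 10**n.
for n_variables in (1, 2, 5, 10):
    print(f"{n_variables:2d} variables -> {10 ** n_variables:,} joint outcomes")
```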


Fortunately for meteorologists, the interactions within a weather model are well known (some combination of “simple” physics with current and historical weather patterns and trends, depending on the time scale of the forecast), so the biggest limiting factor on forecast accuracy is usually the granularity of the grid: it is much more computationally intense (but also more accurate) to model a weather system down to the scale of a cubic meter versus a cubic mile. Unfortunately for those interested in certainty when planning their weekend beach getaway, weather is a system with chaotic elements, where even seemingly insignificant fluctuations below the scale of the model can have macro effects: the proverbial butterfly flapping its wings in Tallahassee causing a tornado in Kansas (Toto?).
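
As a hedged illustration of that sensitivity, using the textbook logistic map rather than an actual weather model, two trajectories that start one part in a million apart wander off to entirely different values within a few dozen steps:

```python
# Toy demonstration of chaotic sensitivity (logistic map, not a weather model):
# two starting points differing by one part in a million soon diverge completely.
def logistic_trajectory(x, r=4.0, steps=40):
    values = []
    for _ in range(steps):
        x = r * x * (1 - x)
        values.append(x)
    return values

a = logistic_trajectory(0.400000)
b = logistic_trajectory(0.400001)
for step in (10, 20, 30, 40):
    print(f"step {step}: {a[step - 1]:.4f} vs {b[step - 1]:.4f}")
```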


I suspect the wide divergence of the hurricane Matthew predictions shown here came partly because the slow speed of the storm’s crawl made its path more susceptible to the influence of chaotic elements; hence the different models under aggregation, each with its own variation of processing power at play, diverged.

Babbage’s difference engine from the Computer History Museum: Link

Weather is somewhat unique among typical subjects of forecast in the extent to which the governing equations and parameters are known and defined. A more typical forecast subject either has a high degree of invisible internal parameters or is subject to extra-model / outside influences such as black swans. In these cases forecasters can attempt to recapture some semblance of manageability by limiting the output of their model to a single variable of interest. Even with this constraint there still remain some potential variations in tractability, one key being the difference between an unconstrained “vanilla” variable output such as a stock option price versus a variable limited to a final value of 0 or 1 such as for a binary prediction market. Taleb and Tetlock explored the differences between these positions in a 2013 paper, with key takeaways including the potential error of addressing a vanilla risk with a binary hedge; the two carry very different risk profiles due to the potential for wide divergence in their payoffs in the tail.

Source: On the Difference between Binary Prediction and True Exposure With Implications For Forecasting Tournaments and Decision Making Research — Nassim N. Taleb, Philip E. Tetlock

The binary constraint allows a forecaster a coarse graining of focus: by limiting the prediction to a concrete binary outcome, the model can effectively reduce the field of play to a more manageable range, and removing the payoff value from the equation particularly benefits tractability. Note that for the example of a stock option, even though it appears at first glance to be a binary type, in the sense that a stock finishes either above or below the strike price on the closing date, it is only truly bounded on the out-of-the-money side, and the value of the payoff can fluctuate greatly on the in-the-money side (this is admittedly option-trading 101 basic stuff here). It is possible to clip the tails of an option exposure using a combination of call and put trades, however in so doing a trader loses his potential for an antifragile strategy, i.e. to gain from volatility and disorder. If there is any rule of thumb for option trading, it is to prefer trades with capped losses and uncapped gains (antifragile) over the fragile approach of capped gains and uncapped losses, the proverbial picking up pennies in front of a steamroller. A binary trade precludes any type of antifragile strategy.
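
A quick sketch of that payoff divergence (my own toy numbers, not drawn from the Taleb and Tetlock paper): a binary contract pays a fixed amount once the underlying crosses the strike, while a vanilla call keeps growing with the underlying, so the two separate dramatically in the tail.

```python
# Toy payoff comparison at expiration (illustrative numbers only):
# a binary bet pays a fixed amount above the strike, a vanilla call is unbounded.
def binary_payoff(price, strike, payout=1.0):
    return payout if price > strike else 0.0

def vanilla_call_payoff(price, strike):
    return max(price - strike, 0.0)

strike = 100.0
for closing_price in (90.0, 101.0, 110.0, 150.0, 300.0):
    print(f"close {closing_price:6.1f}: "
          f"binary {binary_payoff(closing_price, strike):5.2f}, "
          f"vanilla call {vanilla_call_payoff(closing_price, strike):6.2f}")
```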

You ain’t seen nothin’ yet.

While it is entirely possible to build a security trading strategy using binary trades, my exposure has primarily centered around the prediction market variety. In a prediction market, bettors buy and sell instruments based on the probability that a trading question will close as a yes or a no, so until the time of question resolution an instrument can range in price from 0 to 1 (or perhaps 0 to 100%), with the aggregation of bets revealed through the market price of an instrument potentially serving as a useful tool for estimating the probability of an occurrence. Some examples of candidates for a binary prediction market include political questions such as the outcome of an election, financial questions such as whether a company will meet sales forecasts, geopolitical questions such as whether a conflict will resolve or transpire, or perhaps even something less intense such as the outcome of a sporting event.
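
As a back-of-the-envelope sketch with made-up numbers, the appeal of such a market is that the edge on any single contract is just the gap between the market price and your own probability estimate:

```python
# Back-of-the-envelope expected value of buying a "yes" contract that pays 1
# if the question resolves yes (made-up numbers for illustration).
def expected_value_yes(market_price, your_probability, payout=1.0):
    # win (payout - price) with probability p, lose the price otherwise
    return (your_probability * (payout - market_price)
            - (1 - your_probability) * market_price)

print(round(expected_value_yes(market_price=0.60, your_probability=0.70), 2))  # 0.1
print(round(expected_value_yes(market_price=0.60, your_probability=0.55), 2))  # -0.05
```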

Source: FiveThirtyEight 2016 Election Forecast Oct 13, 2016

It is this type of market that inspired the title of this post, for one popular trading strategy in the binary market is to use the crowd’s aggregate prediction as a prior probability, coupled with some new piece of information that the crowd may not have fully taken into account, in a Bayesian belief update. For an easy introduction to Bayesian belief betting I recommend Nate Silver’s book The Signal and the Noise, and I am providing here an illustrative excerpt from chapter 8 which clearly demonstrates how the addition of some new piece of information can be used to shift a probability judgment from prior to posterior.

Source: Nate Silver — The Signal and the Noise
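
In code form the same prior-to-posterior mechanics look something like the sketch below; the numbers are placeholders of my own, not Silver’s figures.

```python
# Bayes' rule as a belief update: a prior probability is revised in light of
# new evidence. Placeholder numbers for illustration, not Silver's figures.
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator

# Start from the crowd's 40% prior, then fold in a new piece of information
# that is three times as likely to show up if the event is actually true.
print(round(posterior(prior=0.40, p_evidence_if_true=0.75, p_evidence_if_false=0.25), 3))  # 0.667
```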

Funny enough, in the Tetlock and Gardner book Superforecasting (named after their term for people who perform well above average in forecasting tournaments), when they describe some of the traits that make for a superforecaster, they suggest that while these expert forecasters typically use the Bayesian principle as a way to guide their thoughts on a bet, these supers rarely go so far as to compute the exact posterior probability that the formula would generate. Given the degree of uncertainty generally found in a topic under debate this might not be surprising; however, in Mauboussin’s and Callahan’s paper Sharpening Your Forecasting Skills, they suggest that superforecasters generally provide more finely detailed forecasts than other forecasters, pointing out that the more granular forecasts were generally more accurate than less granular ones (e.g. a forecast with one significant figure such as 90% will generally be less accurate than one with two significant figures such as 87%). If this is the case, I know of no better way for a bettor to develop precision than through the discipline of actually performing a detailed Bayesian derivation.

I would suggest to Tetlock that his finding that superforecasters generally did not take this extra step may be less an indication of the model’s usefulness for forecasting than a weakness of the incentive schemes in place, in the tournaments he has run for his studies, to encourage skilled participants to take that extra step. After all, while the Good Judgment Project, the original team which did so well in a tournament between researchers, may have offered forecasters a nominal payment as token remuneration, the payments had no tie to a bettor’s performance. To make matters worse, his current attempt at prediction markets (Good Judgment Open) is volunteer only; participants have no skin in the game other than reputational currency. To some that may be enough; having your name on the line is a pale comparison to having your neck on the line, but at least it’s something. However, that only really works if the reputational currency of participation in the tournament translates into some interaction or opportunity beyond simply visibility on a betting platform; otherwise a bettor may find that after literally years of plugging away, one bet after another, he doesn’t really have anything material to show for it.

That is not to say that I wouldn’t recommend volunteering for a trial participation for those with a desire to hone their probabilistic reasoning, for the tournament definitely accomplishes that through both training and practice. Another viable reason for joining is simply the sheer fun factor that ranked interpretation brings to watching the news and keeping up with current events. It is a particularly fun way to hone your “google-fu” online research skills, which is likely to come in handy no matter your occupation (all stairways lead to knowledge worker if climbed high enough). However, if Tetlock and his team desire to truly keep their forecasters engaged for the long term, they will need to find some way to build in some skin in the game.
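
To put a rough number on why granularity matters, here is a sketch using the Brier score, a common scoring rule in forecasting tournaments: if an event really does occur 87% of the time, a forecaster who rounds to 90% pays a small but persistent penalty on average.

```python
# Why granularity matters under the Brier score (squared error of a probability
# forecast): reporting the true rate 0.87 beats rounding it to 0.90 on average.
def expected_brier(forecast, true_probability):
    # average of (forecast - outcome)^2 over the two possible outcomes
    return (true_probability * (forecast - 1) ** 2
            + (1 - true_probability) * forecast ** 2)

print(f"{expected_brier(0.87, 0.87):.4f}")  # 0.1131
print(f"{expected_brier(0.90, 0.87):.4f}")  # 0.1140
```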

The easiest way to grant forecasters some skin in the game would obviously be to incorporate actual dollars as the currency of betting. The website Intrade did actually gain some success with this model for a while but was eventually shut down in the US. I am not certain of the full rationale for why the model was banned, but I speculate that it may have had to do either with aversions to gambling or possibly with the risk associated with allowing people to make money off of unfortunate events. Just as activist short-sale investors will campaign to see a drop in a company’s stock price, I’m not sure it serves society to have activist prediction market investors campaigning against some desirable occurrence (as a silly hypothetical example, say there was a market betting on the number of burglaries in a city: would we really want to incentivize activist bettors to facilitate additional burglaries to lock in a profitable trade?). There are ways around the ban on prediction market betting, a notable one being shifting the currency to a digital variety such as Bitcoin. When Ron Bernstein, one of the cofounders of Intrade, was asked why he did not take that approach himself, he replied that “We’ve learned from our own experience that regulatory avoidance is not a good business model.”

Source: The Fall Of Intrade And The Business Of Betting On Real Life

For those superforecasters who do take their hobby seriously and strive to possibly even make a career out of it some day, I caution you that, like most jobs in our economy, the robots are coming after yours too. Even today it has been demonstrated that an automated forecasting platform, by monitoring information streams such as news and social media, can predict outbreaks of civil unrest in South America with surprising accuracy.

Source: Virginia Tech — EMBERS research project: Link

It is possible that human bettors will keep their foot in the prediction door by marrying their skills with the robots’, sort of the equivalent of “centaur” chess teams wherein humans are allowed to compete in conjunction with robotic allies. On online gambling sites for poker, fantasy sports, etc., I am guessing that those who consistently outperform probably have some version of this strategy in place today. However, if superforecasters really want to stay engaged at the center of the betting world, I would point them to the Kevin Kelly book The Inevitable, where the author postulates that the last skill the robots are likely to learn is not how to answer a question (which they’re already quite good at now) but how to ask a good question. The usefulness of a prediction market, namely the ability to influence decision makers and illustrate key trends, can only really be realized when bettors are researching a well-crafted question. Bettors, even superforecasters and/or their robots, are interchangeable: given enough forecasters of sufficient skill, no single bettor is irreplaceable. The organizers who can ask piercing and important questions are the ones who individually can really facilitate the value of a betting platform.

Mason Williams — Classical Gas (Smothers Brothers performance)

Suggested further reading: click on the embedded links


Books that were referenced here or otherwise inspired this post:

The Wisdom of Crowds — James Surowiecki


The Signal and the Noise — Nate Silver


Superforecasting — Dan Gardner and Philip Tetlock


The Inevitable — Kevin Kelly


(As an Amazon Associate I earn from qualifying purchases.)

If you enjoyed or got some value from this post feel free to say hello on Twitter at @_NicT_.

For further readings please check out my Table of Contents, Book Recommendations, and Music Recommendations.
