How can FansUnite Offer 1% Margins?

FansUnite

In the financial industry, a 1% margin is considered very high because capital markets are so efficient. A constant, large stream of financial information produces highly informed customers who will casually shop elsewhere, driving transaction margins downwards.

Entire farms of literal rocket scientists are hired by the finance industry to build sophisticated models that shave pennies off the dollar. The only reason they haven't done the same in sports is that the money is modest compared to global capital markets.

How many PhDs can you cram into a server room?

Sports markets carry greater uncertainty than currency markets, but as data has proliferated over the last two decades and become near-instantaneous in delivery, the average bookmaker margin has crept downwards to 2.4% this year.
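
As a quick illustration (with made-up decimal odds rather than real prices), the margin, or overround, of a two-way market is just the amount by which the implied probabilities exceed 100%:

```python
# A minimal sketch: the margin (overround) of a two-way market is the
# amount by which the implied probabilities exceed 100%. The decimal odds
# below are illustrative, not real FansUnite or market prices.

def overround(decimal_odds):
    """Sum of implied probabilities minus 1, i.e. the bookmaker's margin."""
    return sum(1.0 / o for o in decimal_odds) - 1.0

# A typical ~2.4% margin on an evenly matched two-way market:
print(f"{overround([1.952, 1.952]):.2%}")  # ~2.46%

# The same market priced at a ~1% margin:
print(f"{overround([1.980, 1.980]):.2%}")  # ~1.01%
```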

In general, as margins on speculative assets decrease, total global volume goes through the roof. The question for the sports betting industry is how far the High Volume, Low Margin (HVLM) model can be driven down. There are three ways to do this:

  • Exploit massive streams of transaction data to improve the HVLM model;
  • Leverage non-existent startup costs to allow more competition and lower margins; and
  • Use deep learning to detect market bias.

Thankfully, the blockchain addresses points 1 and 2, but to offer a truly historic 1% margin you have to trade efficiently against your customers.

Machine learning is very valuable in the bookmaking industry because we can use it to detect human bias. Building a single predictive model is not very helpful when the data is noisy, and sports modelling is very noisy. In fact, between 20% and 36% of all sports betting results are attributable to sheer luck, depending on the sport.

The scale of luckboxing

The vast majority of hobby sports modellers are just sampling and predicting noise. However, when you use an ensemble of betting agents to make predictions as described previously, you can actually simulate how humans make predictions by inducing intentional overfitting. (fig. 1. All MLB data courtesy of MLBAM Inc.)

fig. 1. Distributions of odds

When odds are chosen by humans who model random noise, you get a lot of spread over the 2,430 MLB moneylines.

When we use an ensemble of deep learning classifiers and intentionally overfit the models (no cross validation), we see that the distribution of odds closely resembles the real one.

Using a cross-validated ensemble with bagging and dropout gives us a much more rational spread of odds over the 2,430 games.

Ensemble models that generalize well discard bias by suppressing the contributions of agents that overfit, smoothing out the noise inherent in the data.
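
To make the contrast in fig. 1 concrete, here is a minimal sketch in scikit-learn, assuming placeholder features and labels rather than our real MLB inputs: a single deliberately overfit net graded on its own training data versus a bagged ensemble of small nets graded by cross validation. Dropout is left out because it needs a deep-learning framework.

```python
# A sketch of the fig. 1 contrast using scikit-learn: one deliberately
# overfit net judged on its own training data versus a bagged ensemble of
# small nets judged by cross validation. X and y are random placeholders
# (2,430 games x 20 features, home team wins ~54% of the time), not the
# real MLB features; dropout is omitted because it needs a DL framework.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2430, 20))
y = (rng.random(2430) < 0.54).astype(int)

# "Hobby modeller" setup: one big net, no cross validation.
big_net = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=500)
print("in-sample accuracy:     ", big_net.fit(X, y).score(X, y))   # flattering
print("cross-validated:        ", cross_val_score(big_net, X, y, cv=5).mean())

# Bagged ensemble of small nets, each trained on a bootstrap sample.
ensemble = BaggingClassifier(
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500),
    n_estimators=50, max_samples=0.8)
print("bagged ensemble (5-fold):", cross_val_score(ensemble, X, y, cv=5).mean())

# Fair decimal odds implied by the ensemble's home-win probabilities.
probs = ensemble.fit(X, y).predict_proba(X)[:, 1]
fair_odds = 1.0 / np.clip(probs, 1e-6, 1 - 1e-6)
```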

The “bettor bias problem” is one that the HVLM model distinctly does not solve. In the HVLM model, bookmakers need to gather enough market opinions at various prices before they can raise limits. Relying on market opinions to accumulate before limits can rise imports the irrational biases of the participants. Large distortions in the price can lead directly to arbitrage traps where you take on large-scale unwanted liability.

However, if you understand what the price ought to be, and you are sure about it because you spun up 1M+ predictive agents, you can instead react to irrational price-taking, pitting a human being's ability to simulate data against a large-scale machine learning ensemble. I know what I would bet on.
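
A hedged sketch of what reacting to irrational price-taking can look like, with illustrative numbers and helper names of our own invention: compare the ensemble's probability for an outcome against the probability implied by the price a bettor wants to take.

```python
# A sketch of reacting to irrational price-taking: compare the ensemble's
# probability for an outcome with the probability implied by the price a
# bettor wants to take. The numbers and helper names are illustrative.

def implied_probability(decimal_odds: float) -> float:
    return 1.0 / decimal_odds

def bettor_edge(model_prob: float, decimal_odds: float) -> float:
    """Bettor's expected value per unit staked at the quoted price."""
    return model_prob * decimal_odds - 1.0

# The ensemble makes the home team a 58% favorite; a customer wants to back
# the away team at 2.30, an implied 43.5% when our model says 42%.
p_home, away_price = 0.58, 2.30
print(f"implied away prob: {implied_probability(away_price):.3f}")
print(f"bettor edge:       {bettor_edge(1.0 - p_home, away_price):+.3f}")
# -0.034: a systematically irrational bet we are happy to take the other side of.
```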

Machine learning dominates human agents in some tasks, particularly problems built on large, noisy datasets. One argument against a 1% book built on machine learning is that simple statistical methods might achieve results comparable to a more complex ensemble of neural nets; humans can just run a logistic regression and get most of the way there anyway.

It is much more likely, however, that 99% of people who build models are overfitting pure noise: testing the model, betting randomly, losing, and then trying another model from scratch. The only way to test whether a model is consistent is to take individual, non-bagged neural nets on random walks through 5,000+ forward-predicted games. The probabilistic borders of these walks form the boundary of random bettors. If you are genuinely picking up bias, you ought to be performing such that you are 95% sure your results are not a random walk (fig. 3).

FIGURE 3. Profit projections for a betting simulation over the 2016, 2017, and 2018 MLB seasons using an ensemble of NNETs and the market (Pinnacle) closing lines. The other colors represent bet segmentation and confidence filtering on noisy predictions. The light-green band represents the 95% CI obtained from 500 simulations of random betting (choose home, choose away, or do not bet).
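
For readers who want to reproduce the benchmark band, the sketch below simulates 500 random bettors over 5,000 games at closing prices carrying a 2.4% margin and takes the 2.5th and 97.5th percentiles of their cumulative profit; the underlying probabilities and prices are synthetic placeholders, not Pinnacle data.

```python
# A sketch of the light-green benchmark band: 500 random bettors over 5,000
# games at closing prices carrying a 2.4% margin, each bettor choosing home,
# away, or no bet at random. The "true" probabilities and prices are
# synthetic placeholders, not Pinnacle data.
import numpy as np

rng = np.random.default_rng(1)
n_games, n_sims, margin = 5000, 500, 0.024

p_home = rng.uniform(0.35, 0.75, n_games)            # synthetic true home-win probs
home_wins = rng.random(n_games) < p_home
odds_home = 1.0 / (p_home * (1.0 + margin))          # efficient closing prices
odds_away = 1.0 / ((1.0 - p_home) * (1.0 + margin))

profits = np.zeros((n_sims, n_games))
for s in range(n_sims):
    choice = rng.integers(0, 3, n_games)             # 0 = home, 1 = away, 2 = pass
    pnl = np.where(choice == 0, np.where(home_wins, odds_home - 1.0, -1.0),
          np.where(choice == 1, np.where(home_wins, -1.0, odds_away - 1.0), 0.0))
    profits[s] = np.cumsum(pnl)

lower, upper = np.percentile(profits, [2.5, 97.5], axis=0)
print("95% random-betting band after 5,000 games:", lower[-1], "to", upper[-1])
# A strategy that genuinely captures bias should finish above this band.
```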

Surprisingly, an unfiltered ensemble almost exactly tracks random betting (blue time series). Given its symmetry with Pinnacle's excellent model, this makes sense; we haven't filtered out the noise. However, when you segment the predictions so that you only consider those unlikely to be generated by random noise, value betting outright destroys the closing lines (red and orange lines).

Segmenting the bets further, you find that almost all losses come from underdog predictions that look like value bets, while the profit is Pareto-distributed, concentrated in the regimes where the ensemble picks the favorite and the odds are greater than 2.0.

FIGURE 3. A strong ensemble uncovers the two extents of bias. Bettors often misunderstand the asymmetry of underdog odds (case 2), and they overvalue a home team or heavy favorite when the data suggest it will be challenged by the perceived underdog (case 3). Where the value sits on favorites, it is very hard to overcome the market's efficiency (case 1).

This analysis shows that further improvements can be made on a purely market-driven approach simply by filtering bets into segmented regimes: in effect, picking off the biased lines with a strategy based on data alone (black curve). Conversely, a sportsbook can set prices and happily accept biased bets, and even attempts to shift the line (arbitrage), if those bets are systematically irrational. We have no fear of trading with our customers, and we believe that doing so is actually much better for the customer in the end.
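
The regime filter itself can be a handful of boolean masks. The sketch below works on a hypothetical table of candidate bets with the ensemble probability and the market's closing odds; the 2.0 odds cut-off mirrors the regime described above, while the edge threshold is illustrative.

```python
# A sketch of the regime filter: keep value bets where the ensemble makes its
# pick the favorite while the market still prices it above 2.0, and drop
# "value" underdogs, where most of the simulated losses sat. The table,
# column names, and edge threshold are hypothetical.
import pandas as pd

bets = pd.DataFrame({
    "model_prob":   [0.56, 0.62, 0.41, 0.38, 0.58],
    "closing_odds": [2.10, 1.70, 2.80, 3.40, 2.05],
})
bets["edge"] = bets["model_prob"] * bets["closing_odds"] - 1.0
bets["ensemble_favorite"] = bets["model_prob"] > 0.5

value = bets["edge"] > 0.05
keep = bets[value & bets["ensemble_favorite"] & (bets["closing_odds"] > 2.0)]
drop = bets[value & ~bets["ensemble_favorite"]]
print(keep, drop, sep="\n\n")
```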

Fansunite.bet is an experimental sportsbook designed to push the efficiency of betting lines to its natural limit with the fansunite.com Protocol on the Ethereum blockchain.

Thanks to the machine efficiencies of a carefully tuned ensemble of neural nets, we can be just a little bit sharper than typical models that rely on weighted prices and in-house proprietary approaches, which may well be over-fitting noise.

Our approach benefits from the elimination of bias across a mass of little machine opinions on what the prices ought to be. Put simply, we don't model noise as much as the average hobby modeller because we don't impose conditions on the analysis of the data. We just let the agents learn and generalize, and with bagging we make sure each agent holds its “opinion” to account on unseen data.
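
One way to hold each agent to account on unseen data is out-of-bag scoring, sketched below on the same placeholder data as before; this is an illustration of the idea, not our production pipeline.

```python
# A sketch of holding each agent's "opinion" to account with out-of-bag
# scoring: every base net is evaluated on the games left out of its own
# bootstrap sample. X and y are placeholder data, as in the earlier sketch.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(2430, 20))
y = (rng.random(2430) < 0.54).astype(int)

ensemble = BaggingClassifier(
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500),
    n_estimators=25, bootstrap=True, oob_score=True).fit(X, y)

# Accuracy of the combined machine opinions on games each agent never saw.
print("out-of-bag accuracy:", ensemble.oob_score_)
```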

We are truly excited to deliver to the fans a historically small 1% margin and push the envelope to build the most efficient type of bookmaker out there.

Stephen Rothwell
FansUnite Head Trader and Machine Learning Specialist
