The Key Features to Beating the Odds

May 21, 2022

Colby Hawker, Kevin Lee

A Streamlit application to provide NFL bettors with information to make safer money-line decisions

In this article, we focus on a novel approach to decision-making in NFL betting. First, we train a predictive model on key features that sportsbooks may overlook beyond the common NFL player and team metrics. Second, we approach game picking as a ‘basket of all options’ rather than the common approach of predicting a series of games anecdotally. Finally, we implement a “swarm” decision-making technique to help bettors maximize both win rate and earnings. There is a fundamental trade-off between these two objectives, since higher-probability games pay out less. Our goal is to simulate bets throughout historical NFL seasons and achieve net positive earnings with a win rate that exceeds the average game-by-game precision of sportsbooks (~65%) over the course of a season. We provide an application to evaluate the trade-off between money risked and probability of winning. One of our core assumptions is that, collectively, people tend to see match-ups beyond the metrics, weighing intangibles such as fatigue, motivation, and momentum. We can draw on those outside insights and take a high-level view of only the best bets.
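To make that trade-off concrete: with American money-line odds, a favorite's higher implied win probability comes with a smaller payout. The small helper below is purely illustrative (the function names are our own, not part of the application):

```python
def implied_prob(ml: int) -> float:
    """Sportsbook implied win probability from American money-line odds."""
    return -ml / (-ml + 100) if ml < 0 else 100 / (ml + 100)

def win_profit(ml: int, stake: float) -> float:
    """Profit on a winning bet of `stake` dollars at money-line `ml`."""
    return stake * 100 / -ml if ml < 0 else stake * ml / 100

# A heavy favorite at -250 has a ~71% implied win probability, but $100 only
# returns $40 in profit; an underdog at +150 implies 40% but returns $150.
print(implied_prob(-250), win_profit(-250, 100))   # 0.714..., 40.0
print(implied_prob(+150), win_profit(+150, 100))   # 0.4, 150.0
```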

Our Swarm-Inspired Decision Architecture

Artificial swarm intelligence is a type of evolutionary computation inspired by organisms that solve optimization problems as a collective. Swarm intelligence (SI) can be applied to either multi-objective (MOEA) or single-objective decision problems, although we will focus on the latter for our application. Some organisms make group decisions so effectively in “swarms” that they approach human-level intelligence for certain tasks. I’ll briefly describe one of these “swarms” that we emulate in our application and the methods that seem to make it so smart. After illustrating the swarm intelligence of these animals, I’ll also briefly touch on how humans make collective decisions in a group setting and examine the question: how do humans compare to “herds” when making decisions as a group?

When we think of how humans make decisions collectively, we can think of offline and online collaboration. Polls and surveys are sometimes inaccurate because they capture no interaction between participants that might produce a better result. Take a simple poll to determine the best location for a team dinner. Some options may be completely infeasible (e.g., because of allergies), which a vote count would never reveal because the factors behind each vote stay hidden. What if the majority rules, but two team members don’t show up because they simply cannot eat the food? Discussion can help here, but it has its own drawbacks.

Pseudo code for an artificial bee colony
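In Python terms, the artificial bee colony (ABC) loop sketched above looks roughly like the following. This is a generic, single-objective textbook version (the function names and parameters are our own), not the code our application runs:

```python
import numpy as np

def abc_minimize(f, bounds, n_bees=20, limit=10, max_iters=100, seed=0):
    """Minimal artificial bee colony sketch: employed, onlooker, and scout phases."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T          # bounds: list of (low, high) per dimension
    dim = len(bounds)
    # Initialize food sources (candidate solutions) and their stagnation counters.
    foods = rng.uniform(lo, hi, size=(n_bees, dim))
    fits = np.array([f(x) for x in foods])
    trials = np.zeros(n_bees, dtype=int)

    def neighbor(i):
        """Perturb one dimension of source i toward/away from a random partner."""
        k = rng.integers(n_bees - 1)
        k = k if k < i else k + 1        # partner must differ from i
        j = rng.integers(dim)
        phi = rng.uniform(-1, 1)
        cand = foods[i].copy()
        cand[j] = np.clip(cand[j] + phi * (cand[j] - foods[k][j]), lo[j], hi[j])
        return cand

    def greedy(i, cand):
        """Keep the candidate only if it improves source i (greedy selection)."""
        c = f(cand)
        if c < fits[i]:
            foods[i], fits[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(max_iters):
        # Employed bees: each refines its own food source.
        for i in range(n_bees):
            greedy(i, neighbor(i))
        # Onlooker bees: recruit to sources in proportion to their quality.
        quality = 1.0 / (1.0 + fits - fits.min())
        probs = quality / quality.sum()
        for i in rng.choice(n_bees, size=n_bees, p=probs):
            greedy(i, neighbor(i))
        # Scout bees: abandon stagnant sources and explore at random.
        for i in np.where(trials > limit)[0]:
            foods[i] = rng.uniform(lo, hi)
            fits[i] = f(foods[i])
            trials[i] = 0

    best = fits.argmin()
    return foods[best], fits[best]
```

The same three roles (survey, recruit, and abandon) are what we loosely map onto the betting workflow described below.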

Forums and group meetings can (a) leave key voices unheard, (b) overweight the voices that dominate the room, or (c) end in gridlock with no decision made. Even offline forums such as product reviews can introduce social influence bias that shapes how someone rates a product. This relates to anchoring bias, where humans rely too heavily on initial information. If humans were to follow the honeybee swarm behaviors described earlier, experience and overall conviction would add weight to any given preference, and individual biases could be kept in check by the collective swarm.

Interactive human swarm optimization brings in elements of both a priori and a posteriori decision-making. A user starts with a priori information about the problem and objective, and on each subsequent iteration of the “swarm” gains a posteriori information about previously acquired, tentative solutions.

The decision architecture loosely follows the honeybee model. The first step is a general survey of the ‘basket of potential picks’ and the key factors that will go into the decision. When honeybees evaluate a potential food source, there are many criteria (objectives) that feed the decision, and each employed bee has to take note of them. Similarly, we collect the key information that stakeholders need to make a decision, and there is a trade-off at its core: the higher the sportsbook’s probability of a team winning, the more money a bettor must risk to back that team (set by the money-line). We therefore provide two critical pieces of information: (1) our own computed win probability, and (2) the money-line risk ratio. Any games that have both a lower probability of a win and a higher upfront money risk than another game are excluded from the Pareto optimal list.

The user of the application can then narrow the list of options based on their own preferences, just as an onlooker bee would. Finally, just as a hive converges on a pick based on the conviction of its scouts, we rely on a consensus pick. In NFL sports betting, sportsbooks sometimes publish betting trends, which show the distribution of bettors and money on a given game. Bettors watch these trends to inform their own decisions, since they are a proxy for conviction; for this reason betting trends are often treated as a ‘consensus pick’. We use this consensus pick as the final decision once we have already narrowed the subset of alternatives.
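As a concrete illustration of that exclusion rule, the sketch below keeps only the non-dominated games of a weekly slate. The column names and numbers are hypothetical, not the application’s actual data:

```python
import pandas as pd

def pareto_front(games: pd.DataFrame) -> pd.DataFrame:
    """Keep only games that are not dominated on both objectives:
    higher model win probability AND lower money-line risk."""
    keep = []
    for i, g in games.iterrows():
        dominated = (
            (games["win_prob"] >= g["win_prob"])
            & (games["ml_risk"] <= g["ml_risk"])
            & ((games["win_prob"] > g["win_prob"]) | (games["ml_risk"] < g["ml_risk"]))
        ).any()
        if not dominated:
            keep.append(i)
    return games.loc[keep]

# Hypothetical weekly slate: ml_risk is the dollars risked per dollar of profit.
week = pd.DataFrame({
    "team":     ["DEN", "KC", "BUF", "NYJ"],
    "win_prob": [0.78, 0.71, 0.83, 0.55],
    "ml_risk":  [2.4, 3.1, 2.9, 1.2],
})
print(pareto_front(week))   # KC drops out: DEN has a higher win_prob and a lower ml_risk
```

A game only drops out when some other game is at least as good on both objectives and strictly better on one, which is the standard Pareto dominance rule.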

Feature Analysis

Features De-scoped for this project

  • “Trap Games” — Debunked sports myth with no solid metrics to predict
  • Player Injuries and impact to the game
  • Madden NFL Video Game ratings
  • Fantasy Football individual player projection sums for each team
  • Reddit poll of experts

Features Implemented for this project

  • 2021 Sportsbook-determined money-lines and spreads — Serves as the basis for analysis
  • 2021 Betting and Handle percentage volumes
  • NFL Stadium data
  • Weather data
  • Historical results of past NFL games

We ran a feature analysis by identifying which features were selected most frequently in our Random Forest model: an ensemble of decision trees that picks the most informative feature at each split. The more often a feature is chosen, the greater its predictive power.

The following graph shows the weights of our different features:

Intuitively, the predicted spread was the strongest predictor, followed by altitude advantage. It is important to note that a decision tree also captures interaction effects (e.g., the team is at home, but high wind neutralizes that advantage).
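For reference, a ranking like this can be produced with scikit-learn’s impurity-based importances. The sketch below is a generic version with placeholder names, not our exact pipeline:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# X holds the game-level features described above (spread, altitude advantage,
# weather, betting percentages, ...); y is 1 when the home team won.
def rank_features(X: pd.DataFrame, y: pd.Series) -> pd.Series:
    model = RandomForestClassifier(n_estimators=500, random_state=0)
    model.fit(X, y)
    # Impurity-based importances: how often (and how effectively) each feature
    # is chosen at a split, averaged over all trees in the forest.
    return pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)
```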

Results

The results of the Streamlit application and its methodology were encouraging for potential stakeholders. We can achieve much higher win rates than standalone prediction models (our 80%+ vs. a ~70% benchmark) because we are selective about the games we bet on, rather than making predictions on every game. However, it is the optimization step that seems to (a) safeguard against money losses and (b) achieve more consistent accuracy.

When analyzing the outputs of different methods, it’s helpful to keep some benchmarks in mind:

Based on the performance data set used, sourced from Kaggle

Note that sportsbooks make the right prediction roughly 65% of the time, and some top models get just beyond 70%, which is impressive when scaled across hundreds of games.

Remember, we are seeking to pick the very best game out of each week, so we intended to outperform both of these benchmarks. We ran an analysis of our system with specific parameters to gauge its success. We found that using the “insider” betting trends alone can achieve a high win rate (82%+) on the weekly basket of games. In other words, simply letting the crowd determine the pick succeeds often, an important part of the ‘swarm’ methodology. However, this result does not include the tradespace in which we weigh money-line risk against probability of a win. The “Best Pareto” method includes that tradespace and yields different results.

When selecting from the Pareto optimal set of games with a probability floor of 70%, and combining it with the betting percentage, the system produced an impressive 89% accuracy and a net payout of $1,005 from a median weekly bet size of ~$100 (see the blue point on the graph).
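For readers checking the arithmetic, net payout is just the sum of money-line profits on wins minus stakes on losses. The ledger below is a hypothetical three-week toy example, not our actual picks:

```python
# Hypothetical season ledger: one pick per week with its money-line and outcome.
picks = [
    {"ml": -250, "stake": 100, "won": True},
    {"ml": -180, "stake": 100, "won": True},
    {"ml": +120, "stake": 100, "won": False},
    # ... one row per remaining week
]

def net_payout(picks):
    """Sum profits on wins and subtract stakes on losses."""
    total = 0.0
    for p in picks:
        if p["won"]:
            total += p["stake"] * (100 / -p["ml"] if p["ml"] < 0 else p["ml"] / 100)
        else:
            total -= p["stake"]
    return total

print(net_payout(picks))   # 40.0 + 55.6 - 100 ≈ -4.4 for this toy ledger
```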

Hypothesis

There is a large difference in payout between using the betting percentage alone and our Pareto optimal approach (weighing probability and money-line first). Although precision was similar, the latter approach likely prevented some bad bets, because it throws out games that are both lower probability and higher money risk.

Streamlit Application

The Streamlit application is simple to use; much of the data, at least for the testing purposes of this project, is included in the GitHub repository the application pulls from (chawk89/SYSEN5160). The application and more details, such as a technical report (PDF), can be found here.

First, the user picks which NFL week to analyze for the best money-line bet. The application could not be tested with real-time NFL season data, as it was developed during the offseason; however, many historical games are available for testing and experimentation.

Choosing the week to analyze in the Streamlit application

Then the user gets the best bets of the week, based on several features determined to be most correlated with the home team winning.

Getting the best bets of the week in the Streamlit application
Pareto optimal front analyzed by the tool to determine the best bets based on the analyzed features

Finally, the user can get the wisdom of the crowd through the “swarm intelligence” methodology. Because this project was developed in a short amount of time with limited resources, and the NFL was in its offseason, historical betting percentage volumes were used to emulate the “swarm”. As mentioned previously, this method reaches almost a 90% win rate.
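Putting the three steps together, a stripped-down version of the app’s flow looks roughly like the sketch below. The file name and column names are assumptions for illustration, not the actual chawk89/SYSEN5160 code:

```python
import pandas as pd
import streamlit as st

# Hypothetical data file with columns: week, home_team, win_prob, ml_risk, bet_pct
games = pd.read_csv("historical_games.csv")

week = st.selectbox("Pick an NFL week to analyze", sorted(games["week"].unique()))
slate = games[games["week"] == week]

# Step 1: keep only non-dominated games (higher win probability, lower money-line risk).
def on_front(row, slate):
    at_least_as_good = (slate["win_prob"] >= row["win_prob"]) & (slate["ml_risk"] <= row["ml_risk"])
    strictly_better = (slate["win_prob"] > row["win_prob"]) | (slate["ml_risk"] < row["ml_risk"])
    return not (at_least_as_good & strictly_better).any()

best = slate[slate.apply(lambda r: on_front(r, slate), axis=1)]
st.subheader("Best bets of the week")
st.dataframe(best)

# Step 2: the 'swarm' consensus, i.e. the non-dominated game the betting crowd backs hardest.
pick = best.loc[best["bet_pct"].idxmax()]
st.metric("Consensus money-line pick", pick["home_team"])
```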

The best team to bet on the money-line for. (For 2021 Week 2, the Denver Broncos actually won!)

Moving Forward

As mentioned previously, we had limited time, resources, and data available when developing this project. Some other improvements we would have loved to make, but simply didn’t get to, are:

  • Test using real-time data during NFL season
  • NFL Playoff analysis, optimization, and prediction
  • FiveThirtyEight QB Elo may prove to be a very useful dataset
  • Other forms of betting outside of money-line
  • Other sports such as NCAA football

Real-time data would be a very interesting case to test, as this application is designed for real-time use, optimizing over current data trends to pick the best money-line bet for the stakeholder/user. The tool currently only analyzes regular season data; playoff data would be just as interesting. Ty Walters achieved a 72% accuracy rate predicting the outcome of every NFL game in the 2020 Kaggle competition using the FiveThirtyEight QB Elo dataset, which would be interesting to implement in this application. Finally, forms of betting beyond the money-line, such as spreads and over/unders, would be interesting as well, since not everyone is interested only in outright winners of NFL games. Extending the analysis, optimization, and prediction to other sports would add to the usefulness of this application.
