Thoughts and Reflections on My College Football Predictive Model

Alex Elfering
Oct 28, 2018 · 6 min read

--

Key Findings

  • An ELO-based model predicted the winning team 76% of the time based on score, margin, and home-field advantage.
  • A Likely Win (50% or greater winning probability) had a margin of 1–25 points 60% of the time. An Unexpected Win (less than 50% probability) had a margin of 1–10 points 42% of the time.

Problem

  • A simulation of 54 games between October 25–27 predicted the winning team only 50% of the time. The incorrectly forecast games raise the question of what can be done to avoid similar mistakes in the future.

Solution & Future Exploration

  • A more robust model that considers variables beyond home-field advantage and score would make future predictions more insightful.

Those who have known me for a long time know I knew next to nothing about football. I could talk about airlines, politics, and international relations for hours, but the most I could say about football was, “What’s a third down again?”

Things started to change after last summer. I was fresh out of college and getting burnt out on politics and airlines. I still enjoy those subjects, but from an analytics perspective, I was dying to sink my teeth into something new. My Dad, being a life-long Husker Football fan, suggested I look into how the Huskers have performed over the years.

The first dashboard I made led to these dashboards last year, which eventually opened the door to my interest in sports. Not only is it interesting to see how the numbers relate to game performance, but football is also an exciting sport to watch. Suffice it to say, I can now tell you how downs work in football.

A couple of weeks ago, I wanted to step even further outside my comfort zone. ESPN and FiveThirtyEight have developed fantastic models that predict the outcomes of individual games and the likelihood that teams reach the playoffs, respectively. Inspired by these two sites, I decided to create a model of my own. The process made me a more inquisitive and detail-oriented analyst.

The model utilizes ELO ratings to rank teams across seasons and to predict the likely outcome of each game. The full code and documentation are here, but here is how the model predicts game outcomes:

  • Each team starts with 1500 ELO points in its very first game. When one team beats another, the winner gains points and the loser loses points.
  • Home teams gain a home-field advantage of 65 points, while games at neutral sites confer no benefit.
  • The probability calculation of a team winning is:

1 ÷ (10^(-EloDiff/400) + 1)

EloDiff is the difference between the rating of the team you are predicting and its opponent’s rating.

  • The spread can be predicted by taking the difference between the teams’ ELO ratings and dividing it by 25. Here is how point spreads work.
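The steps above can be sketched in a few lines of Python. The 1500-point starting rating, the 65-point home-field bonus, and the divide-by-25 spread rule come from the post; the example ratings below are invented for illustration.

```python
def win_probability(elo_a, elo_b, home="neutral"):
    """Probability that team A beats team B: 1 / (10^(-EloDiff/400) + 1)."""
    # Apply the 65-point home-field advantage; neutral sites get no bonus.
    if home == "A":
        elo_a += 65
    elif home == "B":
        elo_b += 65
    elo_diff = elo_a - elo_b
    return 1 / (10 ** (-elo_diff / 400) + 1)

def predicted_spread(elo_a, elo_b):
    """Predicted point spread: the ELO difference divided by 25."""
    return (elo_a - elo_b) / 25

# A hypothetical matchup: a 1600-rated home team against a 1500-rated visitor.
p = win_probability(1600, 1500, home="A")
print(round(p, 3))                    # ~0.721 with the 165-point effective edge
print(predicted_spread(1600, 1500))   # 4.0 points
```

Two evenly rated teams at a neutral site come out at exactly 50%, which is a quick sanity check on the formula.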

The organization of the data is:

Team.A | Team.B | Date | P.A | Update | Elo.A | Elo.B

For each game, the winning team falls under “Team.A” and the losing team under “Team.B”. The “P.A” column holds the predicted probability that Team.A wins its game, and “Update” reflects the resulting change in both teams’ ratings.
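To make the layout concrete, here is one hypothetical row in that format. The column names come from the table above; the teams, date, and numbers are invented for illustration.

```python
# Column order as described in the post.
columns = ["Team.A", "Team.B", "Date", "P.A", "Update", "Elo.A", "Elo.B"]

# One example record: Team.A is the winner, Team.B the loser.
row = {
    "Team.A": "Nebraska",   # winning team
    "Team.B": "Iowa",       # losing team
    "Date": "2018-11-23",
    "P.A": 0.58,            # pre-game probability that Team.A wins
    "Update": 8.4,          # rating points transferred from loser to winner
    "Elo.A": 1542.0,        # Team.A's rating after the game
    "Elo.B": 1478.0,        # Team.B's rating after the game
}

print(list(row) == columns)  # the record matches the documented layout
```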

Overall, the model predicted roughly 3 out of 4 games correctly in the 2018 season (as of October 20th). The green line identifies “Likely Wins” (a 50% probability or higher), and the red line “Unexpected Wins” (less than 50%). I found it interesting that a little more than 30% of wins in 1918 were unexpected. The likelihood of an underdog victory crept back into the low-to-mid 30s in the late 1950s before falling back into the mid-to-upper 20s.

So far this year, what do these victories look like for Likely Wins versus Unexpected Wins? As you might guess, Unexpected Wins have had a winning margin of 1–10 points 42% of the time this season. Surprisingly, 17% of Unexpected Wins had a margin of 20–25 points, though the distribution may look more like the last several seasons by the 2018 Championship (next graph below).

Likely Wins have seen a wide range of margins. Roughly 60% of these victories were won by a margin of 1–25 points. Surprisingly, the share of games won by 40–45 points sat at 5%, rising to 6% for margins of 50+ points.

Winning games between 2007–2017 follow a smoother distribution of margins than this season so far. Over 55% of Unexpected Wins were won by a margin of 0–10 points, and a little over 52% of Likely Wins were won by a margin of 1–20 points. Overall, these two graphs illustrate that the winning margin most often runs anywhere between one and three touchdowns.

I was interested to see how well the model predicted games between October 25–27, and the next section explains the results from this last weekend.

I used the model to predict the outcomes of 54 games across a wide variety of conferences. Overall, the model predicted just 50% correctly. The biggest upsets, games the model forecast as wins, included:

  • TCU vs. Kansas (72% for TCU)
  • Washington vs. California (71% for Washington)
  • San Diego State vs. Nevada (65% for San Diego)

The most significant forecasted losses that turned into wins included:

  • Oregon State vs. Colorado (14% winning for Oregon St)
  • Arizona State vs. USC (26% winning for ASU)
  • Georgia Tech vs. Virginia Tech (29% winning for Georgia Tech)
Games predicted incorrectly. The “p.A” column gives the probability that Team.A defeats Team.B, and the odds column expresses that probability as Team.A winning “x” out of “y” times.

There were positive aspects to the weekend, however. If I miscalled 27 out of 54 games, then I got at least 27 correct (math is weird). I am happy that I called games such as:

  • UAB vs. UTEP (84% chance winning for UAB)
  • Clemson vs. Florida St (70% chance winning for Clemson)
  • Florida vs. Georgia (35% chance winning for Florida)
  • Utah vs. UCLA (67% chance winning for Utah)
Games predicted correctly. The column called “p.A” gives the winning probability that Team.A defeats Team.B. The “p.A Odds” column says that Team.A wins “x” out of “y” times.

I find a lot of the unexpected wins interesting, and not just because upsets make for exciting games. They also send me back to the code to ask whether anything was missing that could have helped predict an upset. For instance, although I gave TCU a 72% chance of winning, the Horned Frogs struggled with two of their main players out: one kicked off the team and the other out indefinitely for surgery.

I decided to utilize an ELO model to predict game outcomes because each team’s rating reflects its strength based on how well it fares against other teams. If Nebraska goes head-to-head with Iowa and loses, Nebraska’s rating decreases while Iowa’s increases; the rating system is zero-sum. The model also rewards an underdog victory with more points than an expected win would earn.
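A minimal sketch of that zero-sum update follows. The post does not state its K-factor (the multiplier that controls how many points change hands), so K = 20, FiveThirtyEight's choice for its NFL Elo ratings, is assumed here purely for illustration; the ratings are also made up.

```python
K = 20  # assumed K-factor; the post does not specify one

def expected_score(elo_a, elo_b):
    """Pre-game win probability for team A, per 1 / (10^(-EloDiff/400) + 1)."""
    return 1 / (10 ** (-(elo_a - elo_b) / 400) + 1)

def update_ratings(elo_winner, elo_loser):
    """Transfer points from loser to winner; upsets move more points."""
    expected = expected_score(elo_winner, elo_loser)
    shift = K * (1 - expected)  # actual result (1) minus pre-game expectation
    return elo_winner + shift, elo_loser - shift

# Favorite (1600) beats underdog (1450): a small transfer.
fav, dog = update_ratings(1600, 1450)
# Underdog (1450) beats favorite (1600): a much larger transfer.
dog2, fav2 = update_ratings(1450, 1600)
print(round(fav - 1600, 1), round(dog2 - 1450, 1))  # ~5.9 vs ~14.1 points
```

The same total number of points always changes hands in both directions, which is what makes the system zero-sum, and the underdog's larger gain is the reward the post describes.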

One of the issues with an ELO rating model is that it does not consider variables such as weather, penalties, or an injured starting quarterback (see TCU above). ELO, as Nate Silver put it, is concerned with final scores. Nebraska’s ELO rating reflects how the Huskers have fared against different teams over many seasons, and that score offers insight into how they might fare in a future game against Ohio State or Wisconsin.

No model can predict the outcome of a tournament with certainty. The goal of a probabilistic program is to take into account the random variables of an event and determine the likelihood of a result under uncertainty. A 72% likelihood of winning still means the opponent has a 28% chance of winning. Utilizing the most important variables in future modeling, however, will provide more insight into predicted game performance.
