Do you make this mistake in following football?

Avoid the pitfalls of the Zeroth Law of sports analytics

In his first game as an NFL head coach, Josh McDaniels and his Denver Broncos faced a 7–6 deficit with 38 seconds remaining in the game. The Cincinnati Bengals had just scored a touchdown and had the Broncos pinned deep at their own 13 yard line.

On 2nd down, Kyle Orton threw towards the left sideline. The pass got tipped by a defender but fell into the hands of Denver’s Brandon Stokley, who ran into the end zone for the winning score. The play became immortalized as the Immaculate Deflection.

The 2009 Denver Broncos won their first 6 games. This impressive start prompted ESPN’s Tom Jackson to declare McDaniels “one of the great ones.”

The 2009 Denver Broncos went 2–8 over the rest of the season. They missed the playoffs when they lost to the Kansas City Chiefs, a team that finished 4–12.

The 2010 Broncos started 3–9. McDaniels was fired.

If sports analytics can teach you only one thing, it should be this: never make a judgement based on a small sample size. Think of this as the Zeroth Law of sports analytics.

Let’s look at the foundation of this law.

How Coach Average fares in the NFL

Football is inherently random. As the Immaculate Deflection shows, a team can win on a lucky bounce when they’ve only mustered two field goals the entire game.

The outcome of a football game is not that different from flipping a coin. To show the fallacy of looking at a small sample size, let’s use coin flipping as a model for how Coach Average performs in the NFL.

Using a random number generator, I generated these results for Coach Average’s first 50 games based on his 50% chance of winning each game.

Just for the record, I only made this random sequence once using eight lines of Python. There was no effort to find a sequence that had 6 wins in a row.
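If you want to try this yourself, here is a minimal sketch of such a simulation. It is not the original script; the function name simulate_games and the seed value are assumptions made for illustration, and a different seed will produce a different sequence of wins and losses.

```python
import random

def simulate_games(n_games, win_prob=0.5, seed=None):
    """Return a list of 'W'/'L' results for n_games independent games."""
    rng = random.Random(seed)  # separate generator so the seed is reproducible
    return ["W" if rng.random() < win_prob else "L" for _ in range(n_games)]

# Hypothetical seed, chosen only so the run can be repeated.
results = simulate_games(50, seed=2009)
print("".join(results))
print("wins:", results.count("W"))
```

Run it a few times with different seeds and count how often the record looks like that of a great coach or a terrible one.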

Coach Average ripped off a sequence of 9 straight wins starting in game 19. Tom Jackson would be starting his petition for the Hall of Fame.
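How unusual is a long winning streak for a coin-flip coach? One rough way to check is to simulate many 50-game careers and count how often a streak of at least 6 wins shows up. The sketch below does that; the function names, the number of trials, and the seed are all assumptions for illustration, and the estimate will wobble a bit from run to run.

```python
import random

def longest_win_streak(results):
    """Length of the longest run of consecutive 'W' entries."""
    best = current = 0
    for r in results:
        current = current + 1 if r == "W" else 0
        best = max(best, current)
    return best

def streak_frequency(n_games=50, min_streak=6, trials=10_000, seed=0):
    """Fraction of simulated 50/50 careers with a win streak >= min_streak."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        results = ["W" if rng.random() < 0.5 else "L" for _ in range(n_games)]
        if longest_win_streak(results) >= min_streak:
            hits += 1
    return hits / trials

print("share of careers with a 6+ game win streak:", streak_frequency())
```

This is a simulation, not an exact calculation, so treat the printed share as an estimate.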

Moreover, Coach Average wins 31 of his first 50 games, 6 more than the expected 25 wins. Only coaches like Bill Belichick have a better career winning percentage than the 62% of Coach Average. He looks extraordinary even over a sample of 50 games.
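To put a number on how lucky 31 wins in 50 games is, you can add up the binomial probabilities for every outcome from 31 wins to 50 wins. Here is a small sketch of that arithmetic; the function name prob_at_least is my own choice, and the model assumes independent games with a fixed 50% chance of winning.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """P(X >= k) when X counts wins in n independent games with win chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

print(f"P(at least 31 wins in 50 games) = {prob_at_least(31, 50):.3f}")
```

The answer comes out to roughly 6%. That is unlikely for any one coach, but with 32 teams in the league, a perfectly average coach with a record like this is not a surprise.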

I generated a sequence of 200 coin flips, not knowing how many would fit in the visual. Coach Average won 106 of those 200 games for a 53% winning percentage.

With a bigger number of coin flips, the winning percentage gets closer to the expected 50%. That’s the consequence of the Law of Large Numbers, the mathematical reason you should never draw conclusions based on small sample size.
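You can see how fast this happens with one formula. For independent games with win probability p, the standard deviation of the winning percentage over n games is the square root of p(1 - p)/n. The sketch below evaluates it for a few sample sizes; the function name win_pct_std and the sample sizes are assumptions chosen for illustration.

```python
from math import sqrt

def win_pct_std(n_games, p=0.5):
    """Standard deviation of the winning percentage over n_games independent games."""
    return sqrt(p * (1 - p) / n_games)

for n in (16, 50, 200, 1000):
    print(f"{n:>5} games: 50% +/- {100 * win_pct_std(n):.1f} points")
```

Going from 50 games to 200 games only cuts the spread in half, from about 7 percentage points to about 3.5. A 62% record after 50 games sits about 1.7 standard deviations above 50%, while 53% after 200 games is less than one standard deviation away.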

The famous “Hot Hand” paper

In 1985, Amos Tversky, a Stanford psychology professor, and his colleagues published a paper called The Hot Hand in Basketball: On the Misperception of Random Sequences. They found two key results relevant to Coach Average.

First, they looked at sequences of made and missed baskets for two NBA teams and asked whether they looked different from the random flipping of a coin. Does a made basket imply the next basket is more likely to go in? No. Were there more streaks of made baskets than one would expect from randomness? No.

The sequence of made and missed baskets looked like a random sequence, much like our coin flipping model for Coach Average.
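The first question boils down to comparing a shooter's success rate after a make with his success rate after a miss. Here is a minimal sketch of that comparison. It is not the paper's exact statistical procedure, the function name rate_after is my own, and the shot log is made up for illustration rather than real data.

```python
def rate_after(shots, previous):
    """Fraction of makes (1) among shots that immediately follow `previous`."""
    followers = [cur for prev, cur in zip(shots, shots[1:]) if prev == previous]
    return sum(followers) / len(followers) if followers else float("nan")

# Hypothetical shot log: 1 = make, 0 = miss (invented, not real data).
shots = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
print("make rate after a make:", rate_after(shots, 1))
print("make rate after a miss:", rate_after(shots, 0))
```

With a hot hand, the rate after a make would be clearly higher than the rate after a miss. With a log this short, any gap between the two is just noise, which is the whole point of the Zeroth Law.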

Second, they did a survey in which they gave people a sequence of X’s and O’s to represent made and missed baskets respectively. This experiment isolates the random sequence from anything sports related.

In a truly random sequence, an X follows an X with 50% likelihood. When shown such sequences, only 32% of participants called them random shooting while 62% called them streak shooting.

People tend to see streaks in randomness, just like you probably saw patterns in the wins and losses of Coach Average.

Tversky and coworkers also generated sequences in which the likelihood of getting an X after an X was less than 50%, or sequences with a higher tendency to alternate between X and O. The participants were more likely to call these sequences random than streaky.
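You can build sequences like these yourself and see which ones feel random. The sketch below generates a string of X's and O's in which each symbol repeats the previous one with a chosen probability. The function name make_sequence, the length of 20, and the seed are assumptions for illustration, not details taken from the paper.

```python
import random

def make_sequence(length, p_repeat, seed=None):
    """X/O string where each symbol repeats the previous one with probability p_repeat."""
    rng = random.Random(seed)
    seq = [rng.choice("XO")]  # first symbol is a fair coin flip
    for _ in range(length - 1):
        if rng.random() < p_repeat:
            seq.append(seq[-1])  # repeat the previous symbol
        else:
            seq.append("O" if seq[-1] == "X" else "X")  # switch symbols
    return "".join(seq)

# p_repeat = 0.5 is a fair coin; below 0.5 the sequence alternates more often.
for p in (0.3, 0.5, 0.7):
    print(p, make_sequence(20, p, seed=1))
```

Show the three lines to a friend without the labels and ask which one came from a coin.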

Even without all the biases inherent in sports, people see patterns in randomness.

Emotion versus logic

On an intellectual level, you see the problem with small sample size. However, the problem arises on the emotional level, especially with the ultra-short season in college football.

In 2015, Utah started the season 6–0. They rose to 3rd in the AP college football poll and seemed like a legitimate playoff contender from the Pac-12.

Numbers suggested Utah might be overrated, and I wrote about this in my column on Bleacher Report. This brought a strong backlash from Utah fans on Twitter.

this guy is an idiot and his reasoning is terrible

Utah would lose 3 games and not even win their division in the Pac-12. However, a situation later in the 2015 season shows how emotions can rule over logic in college football.

After a 7–0 start, LSU lost three games to SEC West opponents. No matter the quality of competition, this skid put coach Les Miles on the hot seat. According to the media, Miles would be fired after their last regular season game against Texas A&M.

However, the LSU administration changed their mind during the 3rd quarter of the game. Perhaps they saw the logic in the Law of Large Numbers. Over his 11 years at LSU, Miles averaged over 10 wins per year with a remarkable 78% winning percentage.

Nah. According to Bleacher Report, it was something else that saved Miles’ job.

There was an energy, some sort of new coach elixir, in the air. Miles’ introduction to the crowd was an event that captured the full applause of the stadium. It was passed around through social media. As LSU took control of a contest that tumbled along, it built up a bit more. By the time the clock had drained and those in the stands poured their hearts and souls into the air, and the players grabbed their head coach like they had just clinched a College Football Playoff berth, it all made sense.

If this seems like a cow turd story made up by a reporter, consider this description from ESPN.

There was the emotional Miles love fest at his Wednesday call-in show, with children and adults alike shedding tears over the possibility that this week would be it for their popular coach. They showed up in full force as the team marched into Tiger Stadium before Saturday’s game, with Miles remarking on LSU’s pre-game radio show that it “was as deep a crowd as I’ve ever seen on Victory Hill.”

Emotions play a big role in following your football team. To keep them from taking over, remember the lessons of Coach Average and try not to judge based on a small sample size.

Ed Feng studied applied probability in earning his Ph.D. from Stanford. He founded the sports analytics site The Power Rank.