Impact of Age on NFL Player Performance: Survivor Bias (Part 4)

Caleb Smith
4 min read · Apr 12, 2022


Imagine you are a strategist for your country’s military. You have been tasked with analyzing fighter planes returning from war and deciding where to add armor. The red dots on the diagram below represent all the spots where these planes have been shot. Based on this data, which parts of the plane would you choose to reinforce?

Credit for this diagram goes to an article by Jonathan Jarry of McGill University. For a more in-depth description of Survivorship Bias, click here.

This exact problem was posed to American military strategists during World War II. If you answered “I’d reinforce the spots with red dots!”, you’d be in agreement with these strategists. Unfortunately, you’d both be wrong. Military leaders were baffled that their new armor wasn’t decreasing the number of planes being shot down. Fed up with their strategists, they turned to a statistician: Abraham Wald. Wald correctly pointed out that their sample of planes was incomplete: it did not contain the planes that had been shot down in battle. Because the only planes being studied were the ones that had survived, the red dots actually represented non-lethal bullet holes. Wald hypothesized that planes hit in spots without red dots were the ones that hadn’t made it back. Therefore, the armor should have been placed in these spots all along. Before Wald stepped in, the Allies had fallen for a common statistical fallacy: Survivorship Bias.

Survivorship Bias showed up pretty quickly in my analysis of passing data. Using the Quarterbacks that previously qualified for my analysis, I calculated average passing yards by age. The results are shown below:¹

Figure 1

Figure 1 could easily be featured in the classic 1954 book “How to Lie with Statistics.” If someone were trying to convince you that Quarterbacks get dramatically better with age (especially after 43!), this is the picture they’d use. The problem is that while good visualizations make real patterns in data easier to understand, bad visualizations can just as easily be used to mislead. The graph above is missing context. Let’s take a different look at the data:

Figure 2

Since 1978, a player has thrown for over 1,000 yards in a season 1,593 times. In 955 of those seasons the player was under 30 years old, and in another 621 he was between ages 30 and 40. Only one player has “survived” to ages 43 and 44: Tom Brady. And Brady didn’t just throw for over 1,000 yards in those two seasons: he threw for 4,633 at age 43 and 5,316 at age 44! This is the “Brady Bump” we see at the tail end of Figure 1.
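The numbers behind Figures 1 and 2 boil down to grouping qualifying seasons by age. Here is a minimal Python sketch of that idea; it is not the code linked in the footnote. Apart from Brady’s two seasons quoted above, every row in `seasons` is an invented placeholder, and the 1,000-yard cutoff is only assumed to mirror the qualification rule. Printing the number of seasons next to each average makes it easy to spot an “average” that rests on a single player.

```python
from collections import defaultdict

# Hypothetical (player, age, passing_yards) records. Only Brady's two
# seasons come from the article; every other row is invented for illustration.
seasons = [
    ("QB A", 25, 3200),
    ("QB B", 25, 2100),
    ("QB C", 25, 900),    # below the cutoff, so it is dropped
    ("QB A", 30, 3600),
    ("QB B", 30, 2400),
    ("Tom Brady", 43, 4633),
    ("Tom Brady", 44, 5316),
]

MIN_YARDS = 1000  # assumed qualification cutoff


def summarize_by_age(records, min_yards=MIN_YARDS):
    """Return {age: (number_of_seasons, average_yards)} for qualifying seasons."""
    yards_by_age = defaultdict(list)
    for _player, age, yards in records:
        if yards > min_yards:
            yards_by_age[age].append(yards)
    return {
        age: (len(values), sum(values) / len(values))
        for age, values in sorted(yards_by_age.items())
    }


summary = summarize_by_age(seasons)
for age, (count, average) in summary.items():
    print(f"age {age}: n={count}, average yards={average:.0f}")
```

In this toy data the averages at ages 43 and 44 are the highest on the list, but each one is built from a single season.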

It is no accident that Brady is the final “surviving” Quarterback. In Figure 2, there is a steady drop-off of qualifying players after the age of 26. To understand why, let’s look at the careers of two Quarterbacks, Brett Favre and Joey Harrington:

Figure 3

Figure 3 does a decent job of providing a snapshot of these two players’ careers. I have three main takeaways from this graph:

  • Brett Favre was a better player than Joey Harrington.
  • Neither player improved with age. In fact, both trend lines slope downward, which suggests that each player got worse as he got older.
  • Under my definition, Joey Harrington no longer qualified as a Quarterback after turning 30.
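The second takeaway rests on each player’s own trend line. As a rough illustration of that within-player view, the sketch below fits an ordinary least-squares slope to each career separately. The yardage figures are invented placeholders, not Favre’s or Harrington’s real numbers, and `ols_slope` is a hypothetical helper rather than the code behind Figure 3.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (yards per year of age)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Invented careers: {age: passing yards}. Not real statistics.
careers = {
    "Long-career QB": {25: 4200, 28: 4000, 31: 3900, 34: 3700, 37: 3500},
    "Short-career QB": {25: 2600, 26: 2500, 27: 2300, 28: 2200},
}

slopes = {
    name: ols_slope(list(by_age.keys()), list(by_age.values()))
    for name, by_age in careers.items()
}
for name, slope in slopes.items():
    print(f"{name}: {slope:+.1f} yards per year of age")
```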

If you are familiar with these two players, none of this should surprise you. Favre is in the Hall of Fame, while Harrington never had a winning record as a starter. Now let’s look at a bad visualization that uses the exact same numbers, but in a different way:

Figure 4

Figure 4 is built the same way as Figure 1, but with two players instead of hundreds, and it shows the same effect. It appears as if performance is improving with age (until the dramatic drop-off at age 41). But we know from Figure 3 that this isn’t true. Performance only appears to be increasing in Figure 4 because the better player “survived” longer than his counterpart. Bad players don’t tend to last long: the NFL will always find someone to replace you. Meanwhile, Hall of Fame-caliber players like Favre and Brady can essentially play as long as they want. Of course, there is a spectrum of careers in between. This is why Figure 1 seems to show a steady improvement with age even before the “Brady Bump”: more players fall off every year, leaving only the best of the best behind.
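The effect is easy to reproduce with toy numbers. In the sketch below, which uses two invented careers rather than Favre’s and Harrington’s actual stats, both players lose yardage every single season, yet the average across whoever is still playing rises the moment the weaker player drops out.

```python
# Invented careers: {age: passing yards}. Both players decline every year.
careers = {
    "Star QB": {27: 4000, 28: 3950, 29: 3900, 30: 3850, 31: 3800},
    "Journeyman QB": {27: 2400, 28: 2300, 29: 2200},  # out of the league at 30
}

# Check that each player's yardage falls from one season to the next.
for by_age in careers.values():
    yards = [by_age[age] for age in sorted(by_age)]
    assert all(later < earlier for earlier, later in zip(yards, yards[1:]))

# Pool the two careers: at each age, average over whoever is still playing.
ages = sorted({age for by_age in careers.values() for age in by_age})
pooled_average = {}
for age in ages:
    active = [by_age[age] for by_age in careers.values() if age in by_age]
    pooled_average[age] = sum(active) / len(active)

for age, average in pooled_average.items():
    print(f"age {age}: average yards={average:.0f}")
```

The pooled average slides from 3,200 at age 27 to 3,050 at age 29, then jumps to 3,850 at age 30. Nobody got better; the sample just lost its weaker member.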

There is another issue with all the graphs above: counting stats like “Yards” are not a great indicator of how good a player is. In Part 5, I will talk about one of the strangest seasons by a player in NFL history, and use it to illustrate why efficiency metrics are superior to counting stats.

Footnotes

1: The coding for this article (including all the graphs) can be found here.

