Impact of Age on NFL Player Performance: Does Position Matter? (Part 3)
I grew up less than an hour from the City of Jeannette. As in many small manufacturing towns in Western Pennsylvania, high school football is king there. In fact, no program in Western PA football history has won as many games as the Jayhawks. But throughout the storied history of their program, one name stands out: Terrelle Pryor. After leading Jeannette to the state championship in 2008, Pryor was listed by Rivals as the number 1 recruit in the country, ahead of future NFL Hall of Famers Julio Jones and Patrick Peterson. Yet after an up-and-down 7-year professional career, no one would still compare him to these all-time greats. So what happened?
From the beginning, Pryor was destined to have a strange career. Due to receiving improper benefits from a tattoo parlor while attending Ohio State, he was forced into the obscure Supplemental Draft, where he was taken by the Oakland Raiders in 2011. Drafted as a Quarterback, he only started 10 games at the position. Nine of these came in 2013, when he threw for nearly 1800 yards. After being released and not playing a single down in 2014, Pryor made the shift to Wide Receiver. He bounced around the league before breaking out with the Cleveland Browns in 2016, catching 77 passes for 1007 yards. In doing so, he became only the second player in league history with both a 1000 yard passing and receiving season. He would never hit either of those numbers again.
One of the most famous concepts in statistics is the “null hypothesis.” This is the assumption that any anomalies in the data can be accounted for by random chance. In the context of our question, the null hypothesis is that position has no impact on aging. This assumption should make even the most casual football fan uneasy. The conventional wisdom is that Running Backs (who take a constant physical beating) don’t last as long as Kickers (who don’t). Let me be clear: the null hypothesis is not always correct, but the logic behind it is sound. We can’t just take the conventional wisdom at face value. It is the data scientist’s job to provide evidence that position impacts aging. Otherwise, we need to treat every position’s aging process the same.¹
To test this assumption, we need to group players into positions. The problem is that a player’s position is not always clear. The obvious cases are guys who have played multiple positions, like Terrelle Pryor. Was he a Quarterback or a Receiver? Next, you have guys who played one position, but have traits of another. LaDainian Tomlinson was a Running Back who caught 100 passes in 2003, and Michael Vick was a Quarterback who ran for 1000 yards in 2006. Sorting out these cases is even harder because accurate position data is hard to come by. Below is a snippet from the complete 2021 passing records compiled by Pro Football Reference (PFR):
Two players above are listed without positions: Kyle Allen and Chad Henne. Diehard fans will recognize them as backup quarterbacks, and may be tempted to fill in their positions by hand. But missing position data is an issue throughout PFR, and manually filling in thousands of records dating back to 1978 would be nearly impossible. Another quirk in the data: Cedrick Wilson Jr. is a Receiver who threw for more yards than Henne (and several other QBs listed further down). This calls into question the usefulness of comparing backups like Henne to full-time starters, even when they technically play the same position. Starters and backups are fundamentally different types of players, so their aging processes may be different.
In order to explore the effect that position has on aging, I sorted every player since 1978 using the decision tree below:
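The idea behind the tree can also be expressed in code. What follows is only a minimal sketch, not the actual tree: the cutoffs (1000 passing yards, 500 rushing yards, 500 receiving yards) are assumptions inferred from figures quoted in this article, the order of the checks and the tie-break rule are guesses, and the `PlayerSeason` class and `classify` function are hypothetical names. Kickers are left out because no cutoff for them is stated here.

```python
from dataclasses import dataclass

# Assumed cutoffs, inferred from figures quoted in this article.
# The real tree may use different values or check them in a different order.
PASS_YARDS_MIN = 1000
RUSH_YARDS_MIN = 500
REC_YARDS_MIN = 500


@dataclass
class PlayerSeason:
    """One player's yardage totals for one season (hypothetical schema)."""
    name: str
    year: int
    pass_yards: int = 0
    rush_yards: int = 0
    rec_yards: int = 0


def classify(season: PlayerSeason) -> str:
    """Label a player-season from production alone, ignoring listed position."""
    if season.pass_yards >= PASS_YARDS_MIN:
        return "Quarterback"
    qualifies_rb = season.rush_yards >= RUSH_YARDS_MIN
    qualifies_rec = season.rec_yards >= REC_YARDS_MIN
    if qualifies_rb and qualifies_rec:
        # Assumed tie-break: whichever yardage total is larger wins.
        if season.rush_yards >= season.rec_yards:
            return "Running Back"
        return "Receiver"
    if qualifies_rb:
        return "Running Back"
    if qualifies_rec:
        return "Receiver"
    # Anything below every cutoff is treated as a backup season.
    return "Backup"


# Three Pryor seasons: 2013 (nearly 1800 passing yards), 2016 (1007
# receiving yards), and 2014 (did not play a down).
pryor_seasons = [
    PlayerSeason("Terrelle Pryor", 2013, pass_yards=1798),
    PlayerSeason("Terrelle Pryor", 2014),
    PlayerSeason("Terrelle Pryor", 2016, rec_yards=1007),
]

for season in pryor_seasons:
    print(season.year, classify(season))
```

Because the label is assigned per season rather than per career, the same player can land in different groups in different years, which is how a career like Pryor's gets handled without filling in any position data by hand.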
In Part 2, I stated my positions of interest going forward would be Quarterbacks, Running Backs, Receivers,² and Kickers. Since 1978, 22 Quarterbacks have run for at least 500 yards in a season, 107 Running Backs have gone over 500 yards receiving, and one player (Pryor) hit all three thresholds (passing, rushing, and receiving).³ Each season, these players were assigned a position by the tree. As a result, some players are considered to have played different positions in different years. For instance, Pryor is classified as a Quarterback in 2013, a Receiver in 2016, and a Backup for the other 5 years of his career. Using these classifications, some interesting patterns begin to emerge:
The peak (or mode) of each distribution is around 25 (24 for Running Backs, 25 for Receivers and Kickers, and 26 for Quarterbacks).⁴ However, the distributions for Quarterbacks and Kickers are far more right-skewed than the distributions for Running Backs and Receivers: their tails stretch much further toward older ages. This means Quarterbacks and Kickers are more likely to play into their late 30s and 40s.
The biggest flaw in comparing the histograms above is the difference in the total number of qualifiers at each position. For example, there have been 3,121 instances of a Receiver with 500 yards in a season since 1978, but only 1,617 instances of a Running Back with 500 yards. Therefore, to make a useful comparison of the two positions, we need to change these numbers to percentages. More 24-year-old Receivers had 500 yards than 24-year-old Running Backs (346 vs 240, respectively). But 15% of 500-yard Running Backs were 24, as opposed to only 11% of 500-yard Receivers.⁵ This can be visualized below:
This graph shows that Kickers tend to last longer than Quarterbacks, who in turn last longer than Receivers and Running Backs. While Running Backs have the highest peak at 24 years old, no 500-yard rushers were older than 37. Meanwhile, Kickers have the lowest peak, but the longest tail (going all the way out to age 47). In fact, almost 4% of qualifying Kickers were in their 40s!⁶
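The conversion from raw counts to percentages is simple enough to check by hand. Here is a minimal sketch using only the Running Back and Receiver figures quoted above; the `share_of_total` helper and the `qualifiers` dictionary are hypothetical names for illustration, not taken from my actual code.

```python
def share_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total


# Figures quoted in the article: qualifying seasons at age 24 and
# qualifying seasons at any age, both since 1978.
qualifiers = {
    "Running Back": {"age_24": 240, "total": 1617},
    "Receiver": {"age_24": 346, "total": 3121},
}

shares = {
    position: share_of_total(counts["age_24"], counts["total"])
    for position, counts in qualifiers.items()
}

for position, pct in shares.items():
    print(f"{position}: {pct:.1f}% of qualifying seasons came at age 24")
```

Dividing by each position's own total is what puts the curves on the same scale: Receivers have more 24-year-old qualifiers in absolute terms, but those seasons make up a smaller share of all Receiver seasons.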
The data clearly show that position has an effect on aging. Thus, we can safely reject the null hypothesis: the assumption that position doesn’t matter was shown to be false.⁷ However, the extent of this effect has not yet been demonstrated. We have not shown that 25-year-old Receivers are better than 35-year-olds, only that there are more of them. In fact, there are some surprising results that seem to contradict the idea that aging negatively affects NFL players. In Part 4, we will examine some strange findings, and discuss an important statistical concept: survivor bias.
Footnotes
1: My goal here is to introduce the concept of the null hypothesis, not the math behind it. My calculations for hypothesis testing (and all other Python code for this article) can be found here.
2: “Receivers” includes both Wide Receivers and Tight Ends. Because of the missing position data, it was too difficult to separate these two groups. Although this isn’t ideal, the two positions are similar enough for the purposes of exploratory analysis.
3: No Kickers qualified as any other position.
4: The mode is the value that appears most often in a distribution, and is represented in each histogram by the highest bar. For example, there are 150 26-year-old QBs with 1000-yard passing seasons, which is more than any other age.
5: The math here is simple: 240/1617 = 14.8%, while 346/3121 = 11.1%.
6: As opposed to about 1% of Quarterbacks, 0.1% of Receivers, and no Running Backs.
7: For the stats nerds: the p-value of the ANOVA test was effectively 0. The most similar positions in terms of aging were QBs and Kickers, but the difference between them was still significant at the 5% level (p = 0.0118).
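For readers curious about what the ANOVA test in footnote 7 measures, here is a minimal sketch of the F statistic in plain Python. It compares how far the group means sit from the overall mean against how spread out the values are within each group. The ages below are made-up toy values chosen only to show the mechanics, not data from this analysis, and `one_way_anova_f` is a hypothetical helper name.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more values than groups")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Variation explained by group membership (between groups).
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation left over inside each group (within groups).
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Toy ages, invented for illustration only.
toy_running_backs = [22, 23, 24, 24, 25, 26, 27, 28]
toy_kickers = [24, 26, 29, 31, 33, 35, 38, 41]

f_stat = one_way_anova_f([toy_running_backs, toy_kickers])
print(f"F statistic on toy data: {f_stat:.2f}")
```

A large F means the gap between positions is big relative to the spread within each position. Turning F into a p-value requires the F distribution; in practice a library routine such as `scipy.stats.f_oneway` returns both numbers at once.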