Next Gen Stats, in some context

The NFL releases Next Gen Stats, but doesn’t do a great job explaining, testing, or doing much else with them.

Dan Pizzuta
Off Coverage
5 min read · Aug 28, 2018


Numbers have become an increasingly prevalent part of digesting and analyzing sports. Some sports, like baseball, have embraced the impact numbers have on the game, while others, like football, have been a little slower to adapt. But the NFL is trying… in its own way.

Over the past few seasons the NFL has produced its own Next Gen Stats, though what’s been made publicly available is just a portion of the data set and comes without a real explanation of what those numbers mean or whether they’re of any use at all. Some of what the NFL offers are top-20 lists that don’t tell us much of anything, like the 20 fastest sacks or 20 longest tackles.

So what follows are a few findings from digging into Next Gen Stats at the positional level to see what, if anything, matters:

Passing

For quarterbacks, there are four main Next Gen Stats. Aggressiveness (AGG) measures how often a quarterback throws into a tight window of one yard of separation or less. Average Time to Throw (TTT) measures the time from snap to throw for each quarterback. Average Intended Air Yards (IAY) measures how far the ball travels from where the quarterback threw it to its intended landing spot (unlike Air Yards, which measures from the line of scrimmage). Air Yards to the Sticks (AYTS) measures how far each pass is from the first down marker.
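To make two of those definitions concrete, here is a minimal sketch of how Aggressiveness and Air Yards to the Sticks could be computed from per-throw data. The `Throw` record, its field names, the sign convention (negative means short of the marker), and the sample throws are all assumptions made up for illustration. The NFL does not publish its exact calculation, so treat this as one reading of the descriptions above and not as the league's method.

```python
from dataclasses import dataclass

@dataclass
class Throw:
    # Hypothetical per-throw record; field names are assumptions.
    separation_yards: float  # distance to the nearest defender at the target
    air_yards: float         # depth of the target past the line of scrimmage
    yards_to_go: float       # distance to the first down marker at the snap

# Threshold taken from the article's definition of a "tight window".
TIGHT_WINDOW_YARDS = 1.0

def aggressiveness(throws):
    """Percent of attempts thrown into a window of one yard or less."""
    tight = sum(1 for t in throws if t.separation_yards <= TIGHT_WINDOW_YARDS)
    return 100.0 * tight / len(throws)

def air_yards_to_sticks(throws):
    """Average of (air yards - yards to go); negative means short of the marker."""
    return sum(t.air_yards - t.yards_to_go for t in throws) / len(throws)

# Made-up throws, not real tracking data.
throws = [
    Throw(separation_yards=0.8, air_yards=12.0, yards_to_go=10.0),
    Throw(separation_yards=3.5, air_yards=4.0, yards_to_go=10.0),
    Throw(separation_yards=1.0, air_yards=7.0, yards_to_go=3.0),
    Throw(separation_yards=2.2, air_yards=15.0, yards_to_go=8.0),
]

print(aggressiveness(throws))       # share of tight-window throws, in percent
print(air_yards_to_sticks(throws))  # average yards past (or short of) the sticks
```

In this toy sample, two of the four throws go into a tight window, and the average pass travels a little past the marker.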

Here’s how those metrics correlate to some more traditional statistics that matter — completion percentage, touchdown rate, interception rate, and yards per attempt — among 80 qualified quarterbacks over the past two seasons.

There’s not a lot of meaningful correlation here. The highest correlation is the negative one between Intended Air Yards and completion percentage, which makes sense: the farther the ball is thrown down the field, the less likely it is to be completed. To varying degrees, all four Next Gen Stats had a negative correlation with completion percentage, and Aggressiveness is the only one that correlates with a worse outcome in every category.
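For anyone who wants to run this kind of check themselves, the correlations here are presumably ordinary Pearson coefficients, though the piece doesn't say so explicitly. Below is a small sketch of that calculation in plain Python. The function name and the four data points are invented placeholders, not real quarterback numbers; they only show the direction of the Intended Air Yards relationship described above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)

# Placeholder values for illustration only, NOT real Next Gen Stats figures.
intended_air_yards = [7.0, 8.0, 9.0, 10.0]
completion_pct = [68.0, 66.0, 61.0, 59.0]

r = pearson_r(intended_air_yards, completion_pct)
print(f"r = {r:.2f}")  # negative: deeper throws pair with lower completion rates
```

With real data, each list would hold one value per qualified quarterback season.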

Thirty-one quarterbacks qualified in both 2016 and 2017, which gives us a small sample for year-to-year correlation. In that group of quarterbacks, IAY had a year-to-year correlation of 0.42 (0.18 r-squared, meaning roughly 18 percent of the variation in IAY could be explained by the previous season), TTT was 0.42 (0.17), AGG was 0.33 (0.11), and AYTS was 0.32 (0.10). Those are modest numbers, but among this group all of them beat out traditional statistics like YPA (0.31, 0.10), INT% (0.17, 0.03), and TD% (0.07, 0.00).
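The r-squared figures in parentheses are just the correlation multiplied by itself. As a quick sanity check, this sketch squares each correlation exactly as reported in the paragraph above. IAY and TTT both show 0.42 but list different r-squared values (0.18 and 0.17); my assumption is that the r-squared values were computed from unrounded correlations, so squaring the rounded numbers only lands within about 0.01 of them.

```python
# (r, r-squared) pairs exactly as reported in the paragraph above.
reported = {
    "IAY": (0.42, 0.18),
    "TTT": (0.42, 0.17),
    "AGG": (0.33, 0.11),
    "AYTS": (0.32, 0.10),
    "YPA": (0.31, 0.10),
    "INT%": (0.17, 0.03),
    "TD%": (0.07, 0.00),
}

def r_squared(r):
    """Share of variance explained by a simple linear relationship."""
    return r * r

for stat, (r, reported_r2) in reported.items():
    # Squaring the rounded r can differ slightly from the published r-squared.
    print(f"{stat:5s} r={r:.2f}  r^2 from rounded r={r_squared(r):.4f}  "
          f"reported={reported_r2:.2f}")
```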

Rushing

There are three Next Gen Stats categories for running backs. Efficiency (EFF) measures how many yards a back runs per positive yard gained (think an open run up the middle vs. a long-developing stretch play to the outside). Time Behind the Line (TLOS) is the average time a running back spends behind the line of scrimmage on each handoff. 8-man Box Percentage (8M%) measures the percentage of rushes on which a back faces eight or more men in the box. Some of these are self-explanatory, but they are worth spelling out anyway.
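Efficiency is the least intuitive of the three, so here is a rough sketch of the idea. It assumes the stat is total distance traveled divided by total yards gained across a back's carries, which is my reading of the description above and not a confirmed formula. The function name and the two sample carries are made up for illustration.

```python
def rushing_efficiency(carries):
    """Yards traveled per yard gained; lower means more direct running.

    `carries` is a list of (distance_run_yards, yards_gained) pairs.
    Dividing the totals (not averaging per carry) is an assumption.
    """
    total_distance = sum(distance for distance, _ in carries)
    total_gained = sum(gained for _, gained in carries)
    if total_gained <= 0:
        raise ValueError("efficiency is undefined without positive yards gained")
    return total_distance / total_gained

# Made-up carries: both gain 10 yards, but one takes a much longer path.
inside_run = [(11.0, 10.0)]    # nearly straight up the middle
stretch_play = [(25.0, 10.0)]  # long lateral run before turning upfield

print(rushing_efficiency(inside_run))    # close to 1: almost no wasted movement
print(rushing_efficiency(stretch_play))  # well above 1: lots of east-west running
```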

We can take a look at how these stats work with each other as well as yards per carry (YPC). Yards per carry isn’t the greatest stat to judge running backs, but for now it works with what we’re trying to accomplish here.

The one thing that stands out is how much Efficiency correlates with yards per carry. In this case, negative correlation is good because the lower the Efficiency number, the better (fewer yards run per yard gained). The minus-0.72 correlation between the two is the strongest relationship between any two stats tested across all the Next Gen Stats. What makes that finding even more interesting is that among a small sample of 27 running backs who qualified both years, Efficiency had a higher year-to-year correlation (0.43, 0.19 r-squared) than yards per carry (0.35, 0.12). This could mean Efficiency has the potential to be a slightly better metric for predicting future performance from running backs than yards per carry.

Both TLOS (0.61, 0.37) and 8M% (0.45, 0.21) had fairly significant year-to-year correlations among this group. Running backs, of course, had the smallest sample of qualified players over the two seasons.

Receiving

This, it appears, is where the money is. Next Gen Stats for receivers include Cushion (CUSH), how far off the closest defender plays at the line of scrimmage; Separation (SEP), how far away the closest defender is at the target point; Targeted Air Yards (TAY), how far away the receiver is from the quarterback on a pass; and Targeted Air Yards Percentage (TAY%), which is the percentage of a team’s air yards a receiver sees.
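Targeted Air Yards Percentage is a simple share, and a short sketch shows how it falls out of per-target air yards. The receiver labels, the numbers, and the function name are all hypothetical, and I'm assuming the share is a receiver's total targeted air yards divided by the team total.

```python
def targeted_air_yards_share(team_targets):
    """Return each receiver's percentage of the team's targeted air yards.

    `team_targets` maps a receiver label to the air yards on each target.
    """
    totals = {name: sum(yards) for name, yards in team_targets.items()}
    team_total = sum(totals.values())
    if team_total <= 0:
        raise ValueError("team has no targeted air yards")
    return {name: 100.0 * total / team_total for name, total in totals.items()}

# Hypothetical receivers and made-up air yards per target.
team_targets = {
    "Receiver A": [12.0, 25.0, 8.0],
    "Receiver B": [5.0, 10.0],
    "Receiver C": [40.0],
}

shares = targeted_air_yards_share(team_targets)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% of team air yards")
```

Note that Receiver C gets a large share from a single deep target, which is why TAY% says more about how a player is used than how often he is thrown to.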

Let’s dive into some of the data, which also includes targets and receptions among a group of 190 receivers over the past two seasons.

The highest correlation here is between Cushion and Separation, which makes sense. The farther off a defender plays a receiver, the more space a receiver has to work, especially given the next point: Separation and Targeted Air Yards are negatively correlated. That means more separation comes on shorter passes. There’s a lot of logic to that, though admittedly I always picture someone getting open deep down the field whenever we talk about “creating separation.” The reality is most of the separation happens much closer to the line of scrimmage.

With that in mind, it’s also interesting that Cushion has a negative correlation with both targets and receptions. While more cushion leads to more room to work shallow, playing off a receiver is typically in preparation for him to go deep, where the probability of a catch falls.

The wide receiver stats were also the stickiest for year-to-year correlation among the group of 67 receivers who qualified in both 2016 and 2017. TAY had the highest correlation at 0.81 (0.66 r-squared), followed by CUSH (0.74, 0.55), TAY% (0.61, 0.37), and SEP (0.60, 0.36).

There’s still a lot of work to be done with Next Gen Stats. This testing helps show that some can be useful in predicting what will happen, while others are more descriptive. Hopefully as more stats are released and we get more years of data, more work can be done here and the uses for these numbers can grow instead of being hidden away on an oft-overlooked section of NFL.com.
