‘Stats-ing’ With Steve, Part I: Corsi, the Beginning of It All

Explaining one of the most relevant advanced statistics in hockey


Advanced metrics in hockey have more or less exploded onto the scene. Teams (the smart ones, anyway) are hiring people to assess the game statistically and analytically, and the metrics have become so popular that even journalists and beat writers have had no choice but to learn them and cite them when assessing team and individual performance.

Numerous statistics are publicly available for anyone’s eyes, but when you visit sites such as the excellent corsica.hockey (by Emmanuel Perry), you’ll notice that there are over forty columns of different statistics to look at. If you haven’t looked into analytics before, it can be difficult to try to determine what you should focus on.

Which brings me to the point of these articles: to show which metrics are most predictive of future success and/or failure. A team can rattle off a string of wins and the casual observer might think the team is genuinely good, when the underlying numbers indicate the wins came as a result of luck (we’ll get into how to measure luck in a later post). Often, there’s a lot more going on under the surface of wins, losses, goals, and assists.

Corsi was the first statistic of this kind to be tracked, back in 2007–08. It began when Vic Ferrari (an Internet pseudonym, FYI) heard then-Buffalo Sabres GM Darcy Regier talking on the radio about shot differential. He decided to develop a formula for properly calculating shot differentials, counting all shot attempts: those on goal, those that missed, and those that were blocked.

Rumor had it that Ferrari had heard Jim Corsi, then Buffalo’s goalie coach, speaking about shot differentials and that was why the stat was named for him. Turns out, Ferrari just liked his ‘stache. Ironically enough and unbeknownst to Ferrari, Jim Corsi had already had the idea to count all shot attempts, which is where Regier likely got his talking points. What a world!

For practical use, we convert Corsi into a percentage: specifically, Corsi For % (CF%). As its name indicates, we calculate it as Corsi For (shot attempts by the player’s team while he is on the ice) divided by total Corsi (attempts for plus attempts against while he is on the ice). This percentage is typically the quickest way to determine if a player or team has been performing well: 50% is exactly break-even. Anything above 50% means that when the player is on the ice, his team attempts more shots than it surrenders; anything below 50% means the reverse.
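To make the arithmetic concrete, here is a minimal Python sketch of the CF% calculation described above. The function name and the on-ice totals are made up for illustration; they are not taken from any stats site.

```python
def corsi_for_pct(corsi_for: int, corsi_against: int) -> float:
    """Return CF% from on-ice shot attempts for and against."""
    total = corsi_for + corsi_against
    if total == 0:
        # No attempts either way, so the percentage is undefined.
        raise ValueError("no shot attempts recorded")
    return 100.0 * corsi_for / total


# Hypothetical on-ice totals for one player: 55 attempts for, 45 against.
print(round(corsi_for_pct(55, 45), 1))  # 55.0, i.e. above break-even
```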

You can view the CF% column to judge both an individual player and a team. We use CF% because it is less susceptible to luck; that is, it is typically a little more stable over time than something like goals or points. That makes sense: there are way more shot attempts in a given game or season than there are goals or points, which increases the sample size and gives us a better idea of what’s actually going on. CF% is more reliable when it is measured over a larger data set; game-to-game, CF% can fluctuate pretty wildly, but by the twentieth-or-so game, it should stabilize enough to become useful. Single-game CF% can be helpful to determine how a game went for a team or player, but shouldn’t be used to make final judgments about either.
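A quick way to see the difference between single-game and larger-sample CF% is to compare per-game values with a running total. The sketch below uses invented per-game attempt counts and hypothetical function names, and it assumes every game has at least one shot attempt.

```python
from itertools import accumulate

# Hypothetical (attempts for, attempts against) for five games.
games = [(12, 20), (25, 15), (18, 18), (22, 14), (16, 19)]


def single_game_cf_pct(games):
    """CF% for each game on its own."""
    return [100.0 * cf / (cf + ca) for cf, ca in games]


def cumulative_cf_pct(games):
    """CF% over all games played so far, recomputed after each game."""
    cf_totals = accumulate(cf for cf, _ in games)
    ca_totals = accumulate(ca for _, ca in games)
    return [100.0 * cf / (cf + ca) for cf, ca in zip(cf_totals, ca_totals)]


for game_no, (single, running) in enumerate(
    zip(single_game_cf_pct(games), cumulative_cf_pct(games)), start=1
):
    print(f"game {game_no}: single {single:.1f}%, running {running:.1f}%")
```

In this made-up run the single-game numbers swing between the high 30s and the low 60s, while the running figure settles close to 50% after the second game.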

A few years ago, Micah Blake McCurdy pointed out that raw Corsi does not account for score effects: when a team is losing, it usually “turns it on” to try to make up the difference in the score. In addition, the home team was found to outshoot its opponent more often than the road team does. Testing his theory, he found that Corsi adjusted for score and venue was more predictive than “raw” Corsi, and many stats sites (like Corsica) offer options to score- and venue-adjust every stat.
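One way to picture an adjustment like this is as a weighting scheme: each shot attempt counts for a little more or less depending on the score and venue situation of the team that took it. The sketch below is only an illustration of that idea. The weights are placeholders invented for this example, not McCurdy’s published coefficients, and the function and dictionary names are hypothetical.

```python
# Placeholder weights, invented for illustration only.
# Attempts by a trailing team or a home team are easier to come by,
# so they are counted for slightly less than one attempt.
SCORE_WEIGHT = {"trailing": 0.9, "tied": 1.0, "leading": 1.1}
VENUE_WEIGHT = {"home": 0.95, "away": 1.05}


def adjusted_cf_pct(attempts):
    """Weighted CF% from (is_for, score_state, venue) tuples.

    score_state and venue describe the team that took the attempt.
    """
    weighted_for = 0.0
    weighted_against = 0.0
    for is_for, score_state, venue in attempts:
        weight = SCORE_WEIGHT[score_state] * VENUE_WEIGHT[venue]
        if is_for:
            weighted_for += weight
        else:
            weighted_against += weight
    total = weighted_for + weighted_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * weighted_for / total


# Hypothetical game: our home team trails and fires 6 attempts,
# while the leading road team manages 4.
attempts = [(True, "trailing", "home")] * 6 + [(False, "leading", "away")] * 4
print(round(adjusted_cf_pct(attempts), 1))
```

Raw CF% for that hypothetical game would be 60%; with these placeholder weights the adjusted figure comes out lower, because a trailing home team was expected to outshoot its opponent anyway.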

This isn’t to say that goals aren’t important or shouldn’t be considered in evaluation, just that they can fluctuate from year to year if a player gets hot or goes cold, and may not capture the full scope of what’s occurring on the ice. For example, to the casual observer, Calgary Flames winger Johnny Gaudreau might appear to be struggling because of his lower-than-normal goal and point totals this season. But in reality, he has been snakebitten, and has actually seen his CF% increase this season from last. Sure, goals and points are the desired result, but you can’t say the process is failing in this example.

For what it’s worth, Fenwick (and Fenwick For %, FF%) is a shot differential metric similar to Corsi, calculated with the same method, but it leaves out shots that are blocked. It gives a good idea of who is blocking more shots, but Corsi is typically viewed as a better proxy for useful possession, as it is more predictive of future success. Fenwick can be score- and venue-adjusted like Corsi, but even then it isn’t found to be as predictive.
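The only difference between the two calculations is which attempts get counted, which a short Python sketch can show side by side. The attempt labels, function names, and totals here are hypothetical, chosen just for the example.

```python
def count_attempts(attempts, include_blocked=True):
    """Count shot attempts, optionally leaving out blocked ones."""
    return sum(1 for kind in attempts if include_blocked or kind != "blocked")


def for_pct(attempts_for, attempts_against, include_blocked=True):
    """Corsi For % (include_blocked=True) or Fenwick For % (False)."""
    n_for = count_attempts(attempts_for, include_blocked)
    n_against = count_attempts(attempts_against, include_blocked)
    total = n_for + n_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * n_for / total


# Hypothetical on-ice attempts: each entry is how one attempt ended.
attempts_for = ["on_goal"] * 30 + ["missed"] * 12 + ["blocked"] * 13
attempts_against = ["on_goal"] * 25 + ["missed"] * 10 + ["blocked"] * 10

cf_pct = for_pct(attempts_for, attempts_against)                         # Corsi
ff_pct = for_pct(attempts_for, attempts_against, include_blocked=False)  # Fenwick
print(f"CF% {cf_pct:.1f}, FF% {ff_pct:.1f}")
```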

Alright folks, it was fun ‘stats-ing’ with you. Look out for future articles in the “‘Stats-ing’ with Steve” series.