Imagining the NHL’s 2019–20 season without COVID: Simulating the cancelled games and resulting playoffs

Meg Ellingwood
Kenyon College Sports Analytics
Feb 10, 2021

Photo from CNN

In March of 2020, with the coronavirus pandemic beginning to swell, the NHL made the unprecedented decision to pause its season out of safety concerns to protect both the teams and the general public. It was unclear at the time when the season might resume, but the league ended up not playing the cancelled games and instead returning to play within strictly controlled bubbles with an altered playoff format.

A question that has come up in many different areas of life since the start of this pandemic is what might have happened over the course of this year if it were not for the virus. How might the season have played out if the NHL had not been forced to cancel the rest of its season? (To be clear, I am not saying that the league should have continued to play. I think they did the right thing, but it is interesting to speculate.)

Answering this question proves a bit complicated, since when the season was paused, 189 of the season’s 1271 games were left to be played. The strict cutoff date meant that different teams had played different numbers of games so far in the season, which made calculating rankings somewhat difficult and complicated the comparison of season stats. In a normal season, each team’s schedule is determined by setting up a given number of within-division games, a given number of within-conference outside-division games, and a given number of outside-conference games. Different teams had games from different categories left to play, and each team had between 10 and 13 games remaining.

I set out to simulate the end of the regular season by defining each matchup left to be played and determining the winner by generating each team’s goals scored and comparing them. This would give me insight into how the regular season might have been played out if the league had not shut down in March.

However, I also wanted to see how the playoffs might have gone if the normal situation had occurred, with the teams who made it being determined by regular season standings, and going directly to the playoffs as soon as the regular season ended. How much did the altered format of the play-in round change the results? How did the several-month gap between the abrupt end of the season and the start of the playoffs affect the outcome? Would the Tampa Bay Lightning still have come out on top if this year had been a normal one? The answers to all of these questions might not be specifically teased out by a simulation, but I could still get a sense of what might have happened.

Procedure and Results

The results of the portion of the regular season that was played (i.e. prior to March 12, 2020) were obtained from Hockey Reference, while the planned schedule for the portion of the season that was cancelled was obtained from ESPN’s individual team pages. In all, 189 games were cancelled at the end of the 2019–20 season, out of 1271 planned games.

Initial goals-per-game values were calculated based on the games played in the regular season. These values were also corroborated by comparing with the Hockey Reference calculated goals per game for the regular season, and a dataframe was also assembled with each team’s goals in each of the games they played so that running averages could be calculated when additional games were added via simulation.

Then, the planned end of the season was simulated by looping through the cancelled games and simulating the goals scored in each game. For each matchup, the home and away teams were extracted from the schedule, and their corresponding goals-per-game values were calculated based on their games played, both real and simulated, up to that point. Goals in the simulated game were generated with random numbers from a Poisson distribution with lambda equal to the respective team’s average goals per game. If these simulated values were the same for both teams in the matchup, a Bernoulli trial with probability of success of 0.5 was used to determine which team got the extra game-winning goal in overtime or a shootout.

The results of the simulated game were put into the schedule as if the game had been played, including a note of whether or not it went beyond regulation time, and each team’s goals were added to their running totals. In this way, teams’ goals-per-game were updated with each simulated matchup, just like they would be in a real season.
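As a rough illustration of the per-game procedure described above, here is a minimal Python sketch. It is not the original R code: the function names, the dictionary layout, and the made-up goal logs are all assumptions for illustration, and the Poisson draw uses Knuth's multiplication method only so that the example needs nothing beyond the standard library. Adding the coin-flip goal to the winner's running total is also an assumption.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_game(home, away, goals, rng):
    """Simulate one game and append each team's goals to its running log.

    `goals` maps team -> list of goals in every game so far (real and
    simulated), so the mean of each list is the running goals per game.
    """
    lam_home = sum(goals[home]) / len(goals[home])
    lam_away = sum(goals[away]) / len(goals[away])
    home_goals = poisson(lam_home, rng)
    away_goals = poisson(lam_away, rng)

    overtime = home_goals == away_goals
    if overtime:
        # Coin flip (Bernoulli trial, p = 0.5) decides who gets the extra goal.
        if rng.random() < 0.5:
            home_goals += 1
        else:
            away_goals += 1

    # Assumption: the deciding goal counts toward the running totals.
    goals[home].append(home_goals)
    goals[away].append(away_goals)
    return {
        "home": home,
        "away": away,
        "home_goals": home_goals,
        "away_goals": away_goals,
        "winner": home if home_goals > away_goals else away,
        "overtime": overtime,
    }


rng = random.Random(2020)
goals = {"Team A": [3, 4, 2, 5], "Team B": [4, 3, 5, 2]}  # made-up goal logs
result = simulate_game("Team A", "Team B", goals, rng)
print(result)
```

Because each simulated score is appended to the team's log, the next call automatically uses the updated average, which mirrors the running goals-per-game idea.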

The number of points obtained by each team based on their wins and losses was then calculated. This was done by assigning two points to the winning team, as well as assigning a single point to a losing team if the game was tied at the end of regulation time.
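The points rule can be sketched in a few lines of Python. The record layout (`home`, `away`, `winner`, `overtime`) and the three games below are hypothetical, chosen only to show the two-point and one-point cases.

```python
def award_points(results):
    """Tally standings points: 2 for a win, 1 for a loss beyond regulation."""
    points = {}
    for game in results:
        winner = game["winner"]
        loser = game["away"] if winner == game["home"] else game["home"]
        points[winner] = points.get(winner, 0) + 2
        points[loser] = points.get(loser, 0) + (1 if game["overtime"] else 0)
    return points


# Made-up results for three hypothetical teams.
results = [
    {"home": "Team A", "away": "Team B", "winner": "Team A", "overtime": False},
    {"home": "Team B", "away": "Team C", "winner": "Team C", "overtime": True},
    {"home": "Team C", "away": "Team A", "winner": "Team C", "overtime": False},
]
points = award_points(results)
print(points)
```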

I also ranked the teams to see where they finished in relation to the whole league, but this procedure glossed over some nuances in the real NHL rules. In reality, rank is first determined by team points on the season, and ties are broken by comparing results on several different statistics. When computing rank with R, though, I was not able to incorporate these tie-breaking factors, so ties in team points were broken arbitrarily, with the team whose name was earlier in the alphabet getting the better rank. This is not the best way to handle this situation, as it gives an unfair advantage to teams like Arizona, which would be consistently ranked higher in tie-breakers, probably undeservedly. This ranking simplification also extended to rankings within divisions and conferences, which were used to determine playoff slots.
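The simplified ranking rule (points first, then alphabetical order) amounts to a two-part sort key. The Python sketch below uses hypothetical team labels and point totals, and is only meant to show how the alphabetical bias arises.

```python
def rank_teams(points):
    """Rank by points (descending); break ties alphabetically by team name."""
    ordered = sorted(points, key=lambda team: (-points[team], team))
    return {team: position for position, team in enumerate(ordered, start=1)}


# Made-up totals: "Team A" and "Team B" are tied on points.
points = {"Team B": 90, "Team A": 90, "Team C": 95}
ranks = rank_teams(points)
print(ranks)
```

In this toy case "Team A" always lands ahead of "Team B" purely because of its name. A fuller version could extend the sort key with the real tie-breaking statistics if they were tracked.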

Beyond just determining team performance in the regular season taking into account the simulated end of the season, though, I also wanted to simulate the results of the playoffs and see how they compared with the playoffs that really occurred when the league returned to play in their bubbles. However, this was quite a challenge because of the way that the playoffs work, and how matchups are assigned.

Within each conference, the top three teams in each division make the playoffs, and then there are two wildcard slots for the next two best teams in the conference, regardless of division. Eight teams from each conference ultimately make it to the playoffs, and from there the matchups are determined. The second and third ranking teams in each division play each other in the first round, but the matchups between the top team in each division and the wildcard teams are determined by rank within the conference. The top team in the conference plays the lower-ranked wildcard team, and the other division leader plays the higher-ranked wildcard team. Once these matchups were determined, simulation of the playoffs themselves could begin.
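The seeding logic for one conference could look something like the following Python sketch. The function name, the input layout, and the team labels and point totals are all hypothetical, and ties are broken alphabetically to match the simplification described earlier.

```python
def first_round_matchups(divisions):
    """Build first-round pairings for one conference with two divisions.

    `divisions` maps division name -> {team: points}.
    """
    def ordered(table):
        # Points descending, then alphabetical as the simplified tie-break.
        return sorted(table, key=lambda team: (-table[team], team))

    conference = {t: p for table in divisions.values() for t, p in table.items()}
    top3 = {name: ordered(table)[:3] for name, table in divisions.items()}
    seeded = {team for teams in top3.values() for team in teams}

    # Wildcards: the two best remaining teams, regardless of division.
    leftovers = {t: p for t, p in conference.items() if t not in seeded}
    wildcard_high, wildcard_low = ordered(leftovers)[:2]

    # The better division leader draws the lower-ranked wildcard.
    leaders = {teams[0]: conference[teams[0]] for teams in top3.values()}
    best_leader, other_leader = ordered(leaders)

    matchups = [(best_leader, wildcard_low), (other_leader, wildcard_high)]
    matchups += [(teams[1], teams[2]) for teams in top3.values()]
    return matchups


# Made-up standings with hypothetical team labels.
east = {
    "Atlantic": {"A1": 110, "A2": 100, "A3": 95, "A4": 90, "A5": 80},
    "Metropolitan": {"M1": 105, "M2": 98, "M3": 97, "M4": 93, "M5": 70},
}
matchups = first_round_matchups(east)
print(matchups)
```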

For each series determined by the matchups set up above, game results were simulated based on goals per game for each team. In a given game, each team’s goals per game was calculated as a running average just as in the regular season simulation, and their goals in that individual game were generated from a Poisson distribution with that average as lambda. The goals scored were compared, and once again ties were broken with a “coin flip” (Bernoulli trial with probability of success of 0.5). The winner of the game was recorded, and as soon as one team accumulated four wins, the series ended and they were recorded as the winner.
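A best-of-seven series along these lines might be sketched in Python as follows. Again, this is an illustration rather than the original R code: the names and goal logs are made up, home and away are ignored, and the Poisson sampler is Knuth's method so the example stays within the standard library.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_series(team_a, team_b, goals, rng, wins_needed=4):
    """Play games until one team reaches `wins_needed` wins.

    `goals` maps team -> list of goals per game so far; it is updated in
    place so the running average carries over from game to game.
    """
    wins = {team_a: 0, team_b: 0}
    while max(wins.values()) < wins_needed:
        score = {}
        for team in (team_a, team_b):
            lam = sum(goals[team]) / len(goals[team])
            score[team] = poisson(lam, rng)
        if score[team_a] == score[team_b]:
            # Coin flip (Bernoulli trial, p = 0.5) breaks the tie.
            lucky = team_a if rng.random() < 0.5 else team_b
            score[lucky] += 1
        for team in (team_a, team_b):
            goals[team].append(score[team])
        wins[max(score, key=score.get)] += 1
    return max(wins, key=wins.get), wins


rng = random.Random(7)
goals = {"Team A": [3, 4, 2, 5], "Team B": [4, 3, 5, 2]}  # made-up goal logs
winner, wins = simulate_series("Team A", "Team B", goals, rng)
print(winner, wins)
```

Returning the `wins` dictionary alongside the winner also makes the length of the series available, since it is just the sum of the two win counts.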

Moving into the second round, the winners from the first round were paired into new matchups. The two remaining teams who had represented each division (i.e. the winner of the 2 vs. 3 series, and the winner of the 1 vs. wildcard series) were matched up for a second round series, which was then simulated in the same way as the first round series.

For the third round, which is also known as the conference finals, the two remaining teams from each conference were paired up, and the series was simulated the same way as before for each matchup.

Finally, the winners of the third round were matched for the final round, and the same simulation was conducted to determine the winner of the Stanley Cup. In this simulation, the Final matchup was between Minnesota and Boston, and Boston came out on top to be the Stanley Cup Champion. The full bracket for the playoffs is shown below.

Figure 2. Playoff bracket and results from initial simulation of the end of the season and playoffs.

Now that I had a procedure for simulating the end of the season and the playoffs, I did not just want to run it once. I needed to run it many times to see the full range of possible outcomes. Specifically, I wanted to estimate the probability of Tampa Bay winning the Stanley Cup, but I also wanted to see the performance in the regular season for all the teams, and look at how frequently different teams made the playoffs and how far they made it.

I built a loop to run my simulation 10000 times. For each iteration, I extracted the team points on the season and ranks for the regular season, including the simulated games, so that I could build graphs of how well the different teams did over the many simulations. The boxplots showing regular season performance are shown below, split by division.
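The structure of such a loop can be sketched in Python as below. The `toy_season` function is a stand-in with made-up behaviour, not the real season simulation. It exists only so that the tallying code runs on its own, and every name here is hypothetical.

```python
import random
from collections import Counter, defaultdict


def run_simulations(simulate_season, n_iter, seed=0):
    """Repeat a season simulation and collect per-team points and Cup wins.

    `simulate_season(rng)` is assumed to return (points_by_team, champion).
    """
    rng = random.Random(seed)
    points_log = defaultdict(list)  # team -> season point totals, one per run
    champions = Counter()           # team -> number of simulated Cup wins
    for _ in range(n_iter):
        points, champion = simulate_season(rng)
        for team, total in points.items():
            points_log[team].append(total)
        champions[champion] += 1
    return points_log, champions


def toy_season(rng):
    """Stand-in only: random points and a random champion for three teams."""
    teams = ["Team A", "Team B", "Team C"]
    points = {team: 90 + rng.randint(0, 20) for team in teams}
    return points, rng.choice(teams)


points_log, champions = run_simulations(toy_season, 10_000)
print(champions.most_common())
```

The per-team lists in `points_log` are the kind of data a boxplot would be drawn from, and `champions` holds the counts behind an estimated Cup probability.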

Figure 3. Distributions of regular season performance by team

These boxplots show the variability as well as central tendency of the distributions of outcomes for each team in the regular season. There seems to be less variability in end-of-season results than might have been expected, but this does make sense in light of the proportion of games that remained to be played. Each team was beginning from a set starting point and only adding a bit of variability from there. I do think it is interesting to see the differences in performance between the different divisions, with Boston in the Atlantic division getting up to about 125 points on the season, which is almost unheard of, while the top team in the Pacific division, Vegas, did not even reach 110 points in its best season. I think this is something that fans definitely notice, especially going into the playoffs, with teams in certain divisions having to fight harder for playoff slots than teams in weaker divisions.

For regular season performance, I also examined the frequency of each team winning the Presidents’ Trophy, which is awarded to the team with the most points at the end of the regular season. Counts are out of the 10,000 simulated seasons.

Figure 4. Frequency of ranking first overall in points in the regular season by team

Figure 4 is quite interesting because it reveals that there is less variability in the results of the regular season simulation than I would have expected. Reflecting on this, though, there were only about 10 games (up to 13) remaining in the season at the time of the cancellation. This means that teams had already earned most of their points for the season, and standings were relatively set by that time. Examining the points and standings as of March 11, the day before the suspension of the season, Boston was at the top of the league, with 100 points, while the number two slot was occupied by St. Louis, with only 94 points. Judging from this, Boston was poised to win the Presidents’ Trophy as of mid-March, almost regardless of the outcome of the rest of the season. Thinking this way, it is less surprising that there were only a few iterations of the simulation in which a team other than Boston ranked first overall. The other teams that ever made it to that spot were St. Louis, Colorado, and Tampa, which together with Boston were the top four teams at the time of the season suspension. In conclusion, we can be relatively confident that Boston would have finished the season at the top of the whole league, positioning themselves well going into the playoffs.

Moving into assessing the playoff simulations, I wanted to see how frequently different teams made the playoffs, and how far they made it through the rounds. I created a graph counting the number of simulations in which each team made it to the first round, second round, third round, and fourth and final round, as well as how many times they won the Stanley Cup. Counts are once again out of 10,000 simulated seasons.

Figure 5. Simulated frequency of making the playoffs and frequency of playoff outcomes by team.

Note: For the teams that only infrequently made the playoffs (e.g. Montreal, New Jersey), the points for earlier rounds are masked by the points for later rounds, making it appear that these teams made later rounds more frequently than they made earlier rounds, which of course is not the case.

This plot shows quite a bit more variability in outcomes than the results for the regular season did, probably because many more games were simulated, between all the series in each round. Some teams made the playoffs just about every simulated season, like Boston and Colorado, while others only made it a few times, like Arizona. Only 26 teams ever made the playoffs across all of the simulations, so there were 5 teams that never made it, including Anaheim and Detroit.

While the plot is very interesting, showing the full range of outcomes, I also wanted to look closer at the frequencies of winning the Stanley Cup. The team with the most Stanley Cup wins over the 10000 simulations was Washington, with 1050 wins, for an estimated probability of winning the Cup of 0.105. They were closely followed by Colorado, with 1032 wins, and Tampa, with 1001 wins. Interestingly, Boston was the ninth most likely to win the Cup, with only 541 wins out of 10000 trials. This could be an artifact of the way the simulation was conducted, as Boston did not have as many goals per game as some other teams, with only 3.3 goals per game in the initial simulation, while Tampa and Toronto each had 3.6. However, even if it is an artifact, it does do a good job reflecting the upsets that frequently occur in the playoffs: fans know that just because a team placed first overall in the regular season, that does not mean that they are guaranteed to win the Stanley Cup.
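To put the win counts quoted above in context, the short Python sketch below turns each count into an estimated probability with a binomial standard error. Treating the simulated seasons as independent trials and using a normal-approximation interval are assumptions added here, not part of the original analysis.

```python
import math


def cup_probability(wins, n_sims):
    """Return the estimated Cup probability and its binomial standard error."""
    p_hat = wins / n_sims
    std_err = math.sqrt(p_hat * (1 - p_hat) / n_sims)
    return p_hat, std_err


# Cup wins out of 10,000 simulated seasons, as reported in the text.
cup_wins = {"Washington": 1050, "Colorado": 1032, "Tampa": 1001, "Boston": 541}
estimates = {team: cup_probability(w, 10_000) for team, w in cup_wins.items()}

for team, (p_hat, std_err) in estimates.items():
    low, high = p_hat - 1.96 * std_err, p_hat + 1.96 * std_err
    print(f"{team}: {p_hat:.4f} (approx. 95% interval {low:.4f} to {high:.4f})")
```

With 10,000 runs the standard error for the leading teams is roughly 0.003, so the gaps among Washington, Colorado, and Tampa are of about the same size as the simulation noise, and their exact ordering should probably be read loosely. Boston's estimate sits well below all three.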

Discussion and Conclusions

After building a simulation to determine the results of the end of the 2019–20 season that was cancelled due to the coronavirus pandemic and to determine the results of the playoffs that would have occurred afterward, I found that the results of the regular season were close to set at the time the season was terminated. My simulation of the regular season showed that Boston was likely to maintain their top-of-league rank to the end of the season, but the playoffs were a different story.

There were many more possible outcomes of the simulated playoffs, with 24 teams winning the Stanley Cup in at least one of 10000 simulations. Boston, the regular season leader, only had an estimated probability of winning the Cup of 0.0541, while Washington was the most likely to win. However, the simulation also showed that the results of the true playoffs that occurred in the bubble in the late summer were not unheard of. In the simulation, Tampa, the winner of the real playoffs, was the third most likely to win, with an estimated probability of about 0.1. If the season had been allowed to continue normally (and again, I’m not saying that it should have, only speculating about what might have been), Tampa would still have had about a 10% chance of winning the Cup, which is pretty good considering the playoffs can often be full of upsets and unpredictability.

However, the work does have some limitations and things that could have been done differently. As discussed in the methods section, rankings were a bit problematic in that ties in rank were broken by the alphabetical order of the team names. This probably did not affect the top and bottom of the rankings, teams like Boston and Detroit, but it might have introduced bias in the middle of the pack, if Carolina was consistently and falsely ranked above New Jersey, for example. At the same time, though, resolving this issue would have involved incorporating the factors that are really used to break ties in the rankings, such as goal differential and regulation and overtime wins, which would have added another layer of complication that is probably beyond the scope of this project.

Another limitation is that the simulated playoffs did not record how many games were played in each series. This omission is somewhat defensible, because in the real playoffs it makes no difference to advancement whether a series is a sweep or is dragged all the way out to seven games; all that matters is who is first to four wins. Still, it is interesting to see how many games each series required, and it might have been good to save that information in the simulation, possibly to see how closely matched the teams were, or how exciting the playoffs might have been to watch.

Having built this model to simulate the results of a handful of games, I would like to turn it into a model that would simulate a whole season. It would start out with average goals per game determined by the previous season, but as simulated matchups went on, the goals per game stats would be modified as a rolling average. This would simulate the results of a whole season before the games had even begun, but more importantly, I would like to plug in the results of real games as they went on to improve the model with real results going forward. I think this would be a very useful extension of the work done here.

References and Data Sources

ESPN Enterprises, Inc. (2020). Arizona Coyotes Schedule 2019–20 [ESPN page]. Retrieved from https://www.espn.com/nhl/team/schedule/_/name/ari/seasontype/2 [accessed equivalent pages for all teams in the NHL by replacing team name acronyms in URL]

NHL.com (2020). NHL to pause season due to coronavirus [NHL.com news article]. Retrieved from https://www.nhl.com/news/nhl-coronavirus-to-provide-update-on-concerns/c-316131734

PrintYourBrackets.com (2020). Fillable NHL Playoff Bracket [pdf]. Retrieved from https://www.printyourbrackets.com/images/fillable-nhl.pdf

Sports Reference LLC. (2020). 2019–20 NHL Standings [Hockey Reference page]. Retrieved from https://www.hockey-reference.com/leagues/NHL_2020_standings.html

Sports Reference LLC. (2020). 2019–20 NHL Schedule and Game Results [Hockey Reference page]. Retrieved from https://www.hockey-reference.com/leagues/NHL_2020_games.html
