NBA Neural Networks: Quantifying Playoff Performance by Number of Games Won and Seeding

Published in re-HOOP*PER-rate, Aug 21, 2020

In my NCAA postseason March Madness neural networks, I use a metric known as “Wins Above Seeding” to determine how well a team will perform relative to its position within the playoff bracket. At its core, “Wins Above Seeding” is a valuation metric that determines whether a team is overvalued or undervalued relative to its position within the bracket. It was inspired by the Warren Buffett / Benjamin Graham value investing approach, which is constantly on the lookout for undervalued equities to invest in and overvalued equities to avoid. In this case, the neural network is trained using previous seasons’ team statistics and “Wins Above Seeding” values. Based on the patterns found by the neural network, the teams in the current season’s NCAA postseason are evaluated.

I’ve always wanted to adapt this approach to the NBA postseason, but there’s an extra complication there: unlike the single-elimination NCAA March Madness, the NBA playoffs consist of 4 rounds of best-of-7 series. In other words, an effective NBA playoff prediction tool must tell us not only which teams will win and advance, but also how many games each team will play before advancing. This is information that isn’t captured by a simple “Wins Above Seeding” metric. In fact, compared to the wild upsets in March Madness, the 7-game series format ensures that upsets rarely occur in the NBA playoffs. Instead, participants in bracket challenges like the NBA Pick ’Em Bracket Challenge gain points on the competition by predicting the exact number of games in a series (i.e. how many games are played before one of the teams reaches 4 wins). Instead of “Wins Above Seeding”, my neural network approach would require a new metric that captures both which teams advance and how many games each team takes to advance, all while accounting for the team’s 1-through-8 playoff seeding.

After playing around with the numbers, I found a relatively good way to quantify this seeding-adjusted NBA playoff performance using the number of games a team wins in the postseason. After all, the number of games a team wins is a direct reflection of how far it advances: a team that advances past the first round wins at least 4 games, a team that advances past the second round into the conference finals wins at least 8 games, and so on until a champion is crowned after winning 16 games. As legendary Philadelphia 76er Moses Malone once put it, winning an NBA championship comes down to “Fo’, Fo’, Fo’, Fo’”. My new metric, which I call LOgarithmic Victories-Scaled-Seed (LOVSS), is defined by:

LOVSS = floor(log2(number_of_wins * (seed + 1)))

where number_of_wins is the number of games the team wins in the postseason, seed is the team’s seeding within its conference between 1 and 8 (where 1 is the best regular season team and 8 is the worst), and the floor operation returns the highest integer value less than or equal to the result of the log base 2 operation. One interesting wrinkle here is adding 1 to a team’s seed before multiplying by the number of wins to calculate the LOVSS. I did this to achieve a nice symmetry: an 8 seed that upsets a 1 seed in the first round but doesn’t win any more games will have a LOVSS of:

LOVSS = floor(log2(4 * (8 + 1))) = floor(5.17) = 5

while a 1 seed that wins the championship will have a LOVSS of:

LOVSS = floor(log2(16 * (1 + 1))) = floor(5) = 5.
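
As a quick check on the arithmetic above, here is a minimal Python sketch of the metric. The function name lovss is only illustrative, and returning 0 for a team with no wins is an assumption: the logarithm is undefined at zero, so the formula as written does not cover that case.

```python
import math


def lovss(number_of_wins: int, seed: int) -> int:
    """LOVSS = floor(log2(number_of_wins * (seed + 1)))."""
    if not 1 <= seed <= 8:
        raise ValueError("seed must be between 1 and 8")
    if number_of_wins < 0:
        raise ValueError("number_of_wins cannot be negative")
    if number_of_wins == 0:
        # Assumption: log2(0) is undefined, so a winless team is scored 0.
        return 0
    return math.floor(math.log2(number_of_wins * (seed + 1)))


# The two cases worked through in the text.
print(lovss(4, 8))   # 8 seed that wins one series: floor(log2(36)) = 5
print(lovss(16, 1))  # 1 seed that wins the title: floor(log2(32)) = 5
```

Under this definition the largest possible value is 7, for an 8 seed that wins all 16 games (16 * 9 = 144, and log2(144) is about 7.17).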

I used a Python scraping script to get historical LOVSS data for every NBA playoff team from the 2013 to 2019 seasons (ignoring the lockout-shortened 2012 season and all the seasons before it). I also leveraged the data from my previous NBA restart training dataset (which includes each team’s season-average ranking relative to the rest of the league in Three Point Percentage, Two Point Percentage, and Rebounds / Assists / Steals / Blocks / Turnovers / Points per 100 possessions). The result looked something like this:

To read the data above: Miami won the championship in 2013 (a LOVSS of 5 for LeBron and Dwyane Wade’s first-seeded Heat team), while ranking 2nd in the league in three point percentage, 1st in two point percentage, 16th in free throw percentage, and so on. Meanwhile, their first round opponent, the Milwaukee Bucks, had a LOVSS of 0, having won no games at all that postseason.

After that, I grabbed the same per-100-possessions data for this season’s playoff teams. Using the historical playoff data, I trained a neural network with 3 hidden layers in Keras (very similar to MadNet, my March Madness neural network):

Neural Network with 3 hidden layers and Dropout

using a softmax activation layer (note I had to convert all of the LOVSS data to a one-hot representation before I did this) and adding regularization and dropout after each layer to prevent overfitting. I found the “elbow” of the training curve after around 128 epochs of training:

so that’s what I used to obtain my results. I ended up training the neural network over 20 times on this data, and used each of these neural networks to calculate a LOVSS score based on 2020 NBA data. Initially, I used data that didn’t include stats from the bubble restart, since 1) I had that data handy from my previous NBA neural network and 2) I wanted to compare it against the results when NBA restart stats are included, to see which teams gained momentum from playing in the NBA bubble.
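
To make the one-hot step and the averaging over repeated training runs concrete, here is a small sketch in plain Python (no Keras required). It rests on several assumptions that the text above does not confirm: there are 8 classes (LOVSS values 0 through 7), each network's prediction is taken as the most probable class, and the final score is the mean across networks. The helper names and the softmax outputs are made up for illustration.

```python
from statistics import mean

# Assumption: LOVSS ranges from 0 to 7, since 16 wins * (8 + 1) = 144
# and floor(log2(144)) = 7. That gives 8 output classes.
NUM_CLASSES = 8


def one_hot(lovss_value, num_classes=NUM_CLASSES):
    """Encode an integer LOVSS label as a one-hot vector."""
    vector = [0.0] * num_classes
    vector[lovss_value] = 1.0
    return vector


def decode(probabilities):
    """Return the index of the most probable class (argmax)."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def ensemble_lovss(per_network_probabilities):
    """Average the decoded LOVSS over several trained networks."""
    return mean(decode(p) for p in per_network_probabilities)


# Hypothetical softmax outputs for one team from three separate runs.
outputs = [
    [0.0, 0.1, 0.1, 0.2, 0.5, 0.1, 0.0, 0.0],  # most probable class: 4
    [0.0, 0.0, 0.1, 0.2, 0.3, 0.4, 0.0, 0.0],  # most probable class: 5
    [0.0, 0.1, 0.1, 0.1, 0.6, 0.1, 0.0, 0.0],  # most probable class: 4
]
print(ensemble_lovss(outputs))  # mean of 4, 5 and 4
```

Averaging over many runs smooths out the run-to-run variation that dropout and random weight initialization introduce, which is presumably why training more than 20 networks is worthwhile on such a small dataset.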

To identify how well a team would do after the restart, I used the average LOVSS value predicted by the neural networks and reversed the equation shown above (i.e. instead of taking a logarithm, I calculated 2^LOVSS and then divided by seed + 1 to get the number of games won). I also took into account historical data on how an NBA playoff series between 2 teams with specific seedings and LOVSS metrics evolved. Here’s the bracket built from the LOVSS predictions on the pre-bubble NBA stats:

Neural Network generated NBA Playoff bracket based on pre-bubble data

As we can see, the neural network is high on the Bucks, as everyone was before the bubble started. As for the Lakers, even before their restart bubble woes the neural network didn’t particularly like their team profile. The neural network didn’t particularly like the Raptors or Sixers either, and expected Denver to give the Clippers a really competitive second round series out in the West.
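
For reference, the inversion step used to turn an average LOVSS back into a win total can be sketched as follows. This is only an illustration: the function name is made up, the floor is ignored when inverting, and capping the result at 16 wins is an assumption rather than something stated above.

```python
def estimated_wins(avg_lovss: float, seed: int) -> float:
    """Invert LOVSS = log2(wins * (seed + 1)), ignoring the floor."""
    if not 1 <= seed <= 8:
        raise ValueError("seed must be between 1 and 8")
    wins = 2 ** avg_lovss / (seed + 1)
    # Assumption: no team can win more than 16 playoff games.
    return min(wins, 16.0)


print(estimated_wins(5.0, 1))  # 32 / 2 = 16 wins, i.e. a champion
print(estimated_wins(5.0, 8))  # 32 / 9, a little under 4 wins
```

Because the floor throws away the fractional part of the logarithm, an integer LOVSS only pins the win total down to a range. The 8 seed example from earlier has 4 wins and a LOVSS of 5, but inverting 5 gives roughly 3.6, the bottom of that range. Averaging the LOVSS over many networks yields fractional values, which recovers some of that lost resolution.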

Now let’s see how using data that included stats from the NBA restart bubble would shake things up:

Neural Network generated bracket with NBA Bubble data

Well, that certainly changed things! The Bucks’ bubble troubles have them losing in the second round to the surging Heat, who are en route to their first Finals appearance since the LeBron era. Meanwhile, the Clippers are now expected to make it out of the West and win the championship with relative ease. The Lakers are now expected to be pushed to the brink by the Blazers in the first round (before eventually defeating them in a close game 7, a likely scenario given the NBA refs’ noted favoritism towards both LeBron and the Lakers) before again succumbing to the Rockets in the second round.

This bracket is really interesting and actually conforms with what I’ve been seeing with the “eye test” watching games during the restart. In particular, it has the Heat making an upset run, a real possibility given Bam Adebayo’s emergence as a cerebral big man who always seems to make the right play and Duncan Robinson’s emergence as the fastest-trigger 3 point shooter in the league (yes, his release looks like it may be faster than Steph Curry’s, though he obviously doesn’t have Steph’s unstoppable versatility). That’s before we talk about veteran stars like Jimmy Butler, Goran Dragic and Andre Iguodala, who round out the Heat roster. Meanwhile, if any team can counter all of the perimeter weapons that offensive-minded teams like the Heat or the Rockets can throw out there, it’s a Clippers team stacked on the wings with versatile defenders like Patrick Beverley, Kawhi Leonard, and Paul George (and oh yeah, Kawhi and Paul George are pretty much unstoppable on offense too). In short, I really like how this bracket plays out!

Out of pure curiosity, I adjusted this bubble restart bracket based on a “momentum factor” in which I took the difference between each team’s post-bubble LOVSS score and their pre-bubble LOVSS score, and added it to their post-bubble LOVSS score. The main difference in that bracket is that it now has the surging Blazers upsetting the Lakers before pushing the Rockets to a game 7, and now has a scorching hot Heat team upsetting the inconsistent Bucks in just 4 games.
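
The momentum adjustment described above is simple enough to write down directly. Here is a minimal sketch; the function name and the sample scores are hypothetical and are not taken from the actual predictions.

```python
def momentum_adjusted(pre_bubble: float, post_bubble: float) -> float:
    """Add the pre-to-post change in LOVSS on top of the post-bubble score."""
    momentum = post_bubble - pre_bubble
    return post_bubble + momentum


# A team that improved in the bubble is pushed further up...
print(momentum_adjusted(3.0, 3.5))   # 3.5 + 0.5 = 4.0
# ...and a team that slipped is pushed further down.
print(momentum_adjusted(5.0, 4.25))  # 4.25 - 0.75 = 3.5
```

Algebraically this is 2 * post - pre, so the change between the two datasets is effectively counted twice, which explains why the adjusted bracket exaggerates the swings for hot and cold teams.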

Given the tedium of quarantine, I’m incredibly glad the NBA playoffs are there to keep me sane. Here’s hoping the playoffs evolve exactly like my predictions. After all, there’s a 1 in ~35 trillion chance I’ll get every pick right and win the NBA’s million dollar grand prize! (That figure matches 15 series with 8 possible outcomes each, or 8^15, if every outcome were equally likely.)

Written by reHOOPerate