Experimenting to Gain Insights from BoxNet, the convolutional neural network for March Madness

reHOOPerate
re-HOOP*PER-rate
Mar 1, 2020

Previously, I wrote about using Genetic Algorithms with the multi-class classification neural network MadNet to glean insights about what kind of teams perform well in March Madness. Now let’s do the same with BoxNet, the convolutional neural network that I applied to team “box scores” showing the average points, rebounds, steals, blocks and turnovers per game for each player on the team. Essentially, my BoxNet code treats this data as a “picture” and finds patterns within that picture: some patterns correspond to overperformance relative to a team’s seed and some correspond to underperformance. For more on how I use a “Wins Above Seeding” metric to indicate how well a team will do in March, take a look at my original March Madness neural network blog post.
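To make the “picture” idea concrete, here is a minimal Python sketch of how a team’s per-player averages could be laid out as a grid that a convolutional network can scan. The column order, the choice to sort players by scoring average, and the sample numbers are all assumptions made for illustration, not the actual BoxNet input format.

```python
# Column order is an assumption; assists are included because the
# profiles discussed later in the post refer to passing.
STATS = ["points", "rebounds", "assists", "steals", "blocks", "turnovers"]


def to_grid(players, stats=STATS):
    """Lay out per-player averages as a grid: one row per player, one column per stat.

    Players are sorted by scoring average (an assumption) so that the same
    row position means roughly the same role from team to team.
    """
    ordered = sorted(players, key=lambda p: p["points"], reverse=True)
    return [[float(p.get(s, 0.0)) for s in stats] for p in ordered]


# Made-up averages for a three-player "team", purely illustrative.
team = [
    {"points": 8.0, "rebounds": 2.5, "assists": 5.0, "steals": 1.5, "blocks": 0.1, "turnovers": 2.0},
    {"points": 17.5, "rebounds": 9.0, "assists": 1.5, "steals": 0.8, "blocks": 1.9, "turnovers": 2.4},
    {"points": 11.0, "rebounds": 4.0, "assists": 2.0, "steals": 1.2, "blocks": 0.4, "turnovers": 1.1},
]

grid = to_grid(team)
for row in grid:
    print(row)
```

Each row is a player and each column is a stat, so a small convolution filter sliding over the grid sees combinations such as “top scorer who also rebounds” as a local pattern.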

Once I had a BoxNet convolutional neural network trained on the historical data, I generated “box scores” filled with random values for each player’s average stats. Then I normalized those box scores (as I do with the original BoxNet data) and ran the teams through BoxNet, keeping either the best performing teams or the worst performing ones. Note that I didn’t bother using a genetic algorithm this time: basketball box scores are so complex that new box score “chromosomes” bred from existing box scores don’t necessarily correspond to better performance. In addition, since the data is randomly generated, certain teams may have deeply unrealistic statistical profiles. I decided it would be too difficult to add “realism constraints” to the randomly generated box score profiles, so I went ahead with full randomization for these experiments.
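The generate, normalize, score and select loop described above can be sketched roughly as follows. This is not the actual BoxNet code: the stat ranges in `MAX_PER_GAME`, the per-team “divide by the stat leader” normalization, and especially `toy_predict` (an arbitrary stand-in for the trained network) are assumptions introduced only so that the example runs on its own.

```python
import random

STATS = ["points", "rebounds", "assists", "steals", "blocks", "turnovers"]

# Rough per-game upper bounds, invented for illustration only.
MAX_PER_GAME = {"points": 25.0, "rebounds": 12.0, "assists": 8.0,
                "steals": 3.0, "blocks": 3.0, "turnovers": 4.0}


def random_box_score(rng, n_players=8):
    """Fully random box score: no realism constraints, as in the post."""
    return [[rng.uniform(0.0, MAX_PER_GAME[s]) for s in STATS]
            for _ in range(n_players)]


def normalize(grid):
    """Scale each stat column by the team's leader in that stat (assumed scheme)."""
    n_cols = len(grid[0])
    col_max = [max(row[c] for row in grid) for c in range(n_cols)]
    return [[row[c] / col_max[c] if col_max[c] > 0 else 0.0
             for c in range(n_cols)] for row in grid]


def toy_predict(grid):
    """Placeholder for the trained network. The formula is arbitrary and is NOT BoxNet."""
    cells = [v for row in grid for v in row]
    return sum(cells) / len(cells) - 0.5


def extremes(predict, n_teams=1000, keep=3, seed=0):
    """Score n_teams random box scores; return the lowest and highest `keep` of them."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_teams):
        grid = normalize(random_box_score(rng))
        scored.append((predict(grid), grid))
    scored.sort(key=lambda pair: pair[0])  # ascending by predicted score
    return scored[:keep], scored[-keep:]


worst, best = extremes(toy_predict)
print("lowest scores:", [round(score, 3) for score, _ in worst])
print("highest scores:", [round(score, 3) for score, _ in best])
```

Swapping `toy_predict` for a call into a real trained model is the only change needed to turn the sketch into an actual experiment. Everything else is plain random sampling and sorting, which is why no genetic algorithm is required.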

Out of 1000 randomly generated team profiles, only 2 had a Wins Above Seeding greater than 2. The first overperforming team has a dominant forward or big man who leads the team in both points and rebounds, maintains a below average turnover rate, and takes on a fair share of passing duties as well. The team also has a dominant guard who leads in assists and steals while scoring almost as much as the big man, keeping a low turnover rate and taking care of some rebounding duties. Two more scoring contributors are above average defensively, similar to the “three-and-D” wings we see in real life. This profile seems to fit a classic dominant basketball team with talent all around, and some of the best examples of this prototype include National Champion Duke in 2015 (with Jahlil Okafor as the dominant big, Tyus Jones as the star point guard, and Grayson Allen, Matt Jones, and Justise Winslow as the wings), or National Champion Kentucky in 2012 (with Anthony Davis as the dominant big man, Marquis Teague at point guard and players like Doron Lamb and Terrence Jones filling out the wings).
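One way to read a profile like this off a generated box score is to filter for teams above the Wins Above Seeding cutoff and then check which player leads each stat. The sketch below assumes a grid with one row per player and one column per stat. The function names, the column order, the sample grid and the sample predictions are all hypothetical.

```python
STATS = ["points", "rebounds", "assists", "steals", "blocks", "turnovers"]


def above_cutoff(scored, cutoff=2.0):
    """Keep (wins_above_seeding, grid) pairs whose prediction exceeds the cutoff."""
    return [(was, grid) for was, grid in scored if was > cutoff]


def stat_leaders(grid, stats=STATS):
    """Map each stat name to the row index of the player who leads it."""
    leaders = {}
    for col, stat in enumerate(stats):
        leaders[stat] = max(range(len(grid)), key=lambda row: grid[row][col])
    return leaders


# Invented normalized values: row 0 plays like the dominant big,
# row 1 like the lead guard, row 2 like a turnover-prone wing.
example = [
    [1.00, 1.00, 0.55, 0.40, 1.00, 0.45],
    [0.92, 0.50, 1.00, 1.00, 0.10, 0.40],
    [0.60, 0.35, 0.30, 0.80, 0.30, 1.00],
]

# Invented predictions paired with the same grid, just to exercise the filter.
scored = [(2.3, example), (0.4, example), (-1.1, example)]

for was, grid in above_cutoff(scored):
    print(was, stat_leaders(grid))
```

Here player 0 leads points, rebounds and blocks while player 1 leads assists and steals, which is the “dominant big plus dominant guard” shape described above.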

Another prototypical dominant / overperforming team features an all-around player who efficiently gets lots of rebounds and assists (a Draymond Green / Denzel Valentine type that seems to be common at Michigan State), a do-it-all guard who scores, rebounds and passes well but suffers from being turnover prone, and 2 supporting scoring guards who also defend well (those three-and-D wings again). All the players on this type of team pass at greater than 50% relative to the team’s best passer, indicating that there is no dominant ball handler and lots of versatility across the board. Examples of National Championship teams with a profile like this include UConn in 2014 (with Shabazz Napier as the do-it-all guard and Ryan Boatright as the do-it-all forward in the Draymond Green mold), Villanova in 2018 (with Mikal Bridges as the versatile forward and lots of guards who could shoot and pass), and to a lesser extent UVA in 2019 (with De’Andre Hunter playing the versatile forward role, Ty Jerome playing the do-it-all guard role and Kyle Guy single-handedly playing the role of multiple scoring guards with his accurate high-volume shooting).
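The “greater than 50% relative to the team’s best passer” check is simple to express in code. The following is a small sketch under the assumption that passing is measured by assists per game. The function names and the two sample teams are invented for illustration.

```python
def assist_shares(assists_per_game):
    """Express each player's assists as a fraction of the team's best passer."""
    best = max(assists_per_game)
    if best <= 0:
        return [0.0 for _ in assists_per_game]
    return [a / best for a in assists_per_game]


def has_no_dominant_ball_handler(assists_per_game, floor=0.5):
    """True when every player passes at more than `floor` of the leader's rate."""
    return all(share > floor for share in assist_shares(assists_per_game))


# Invented assists-per-game figures for two hypothetical teams.
shared_load = [4.0, 3.5, 2.8, 2.2]   # everyone is above half of the leader
one_creator = [7.0, 1.5, 1.0, 0.8]   # a single player does most of the passing

print(has_no_dominant_ball_handler(shared_load))
print(has_no_dominant_ball_handler(one_creator))
```

Because the comparison is relative to the team’s own leader, it works the same way on raw per-game numbers and on box scores that have already been normalized.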

Source: https://commons.wikimedia.org/wiki/File:20170213_Villanova-Depaul_Mikal_Bridges_dunk.jpg

I did the same thing with teams that underperform relative to their seed, and out of 1000 randomly generated teams, 3 had a Wins Above Seeding of less than -2 (corresponding to a first or second seed being upset in the first round). What do these easily upset teams look like? The first team had a player leading in scoring and rebounding while also having a very low assist rate (far below the team’s best passing player). This team also had a leading assist player who was very turnover prone. To some extent, this corresponds to college basketball teams with a single traditional “back to the basket” big man. Teams like this have a history of underperforming in March Madness (Shaq never made it past the second round of the tourney, and Tim Duncan never reached the Final Four). A more recent team that fits this profile is UVA in 2018, a team on which Isaiah Wilkins grabbed many rebounds while doing very little passing, and on which Ty Jerome had a poor assist-to-turnover ratio by his standards (3.9 assists to 1.6 turnovers per game, versus 5.5 assists per game the next season).

The next under-performing team profile had a leading scorer who was also the leading rebounder but rarely passed the ball, a second top scorer who didn’t rebound much or get many assists, and no other standout players. Again, we see that teams relying too heavily on a traditional low post big man don’t do well in March. The closest prototype for this team may be the second seeded Georgetown team that was upset by Florida Gulf Coast University in 2013. That team featured a big in Otto Porter who wasn’t much of a passer in spite of the team’s high assist rate, and the team’s second leading scorer, Markel Starks, barely rebounded.

The last under-performing team profile had a top scorer and rebounder who was very turnover prone (again indicative of a traditional big man, and even sounding like Deandre Ayton, whose fourth seeded Arizona Wildcats were upset by Buffalo in the first round in 2018). On the rest of the team, no single player does significantly more than his teammates in points, steals, rebounds, blocks or assists. Meanwhile, all of the players are very turnover prone. This team looks a lot like the second seeded Duke team that lost to CJ McCollum and Lehigh back in 2012: lots of scorers with no clear cut go-to player and a high turnover rate among every member of the team.

If there’s one thing that’s consistent across the five team box score profiles that significantly under- or over-performed in March, it’s that versatility is important. The best teams have big men who can pass, and guards who can rebound. On the other hand, teams that get upset have big men who can only score and grab boards, and guards who can only shoot but don’t pass (or are very turnover prone when they do pass). When it comes to March Madness, versatility truly is the key to a deep run through the tourney.
