March Madness Neural Network specifically for Bet Sizing

reHOOPerate · Jan 27, 2020 · 3 min read

When I first created a neural network for March Madness predictions back in 2018, I treated it as a two-class classification problem. The resulting bracket got none of the Final Four picks correct, though it did correctly predict some huge upsets, like Buffalo over Arizona and Nevada over Cincinnati. At the same time, I also created a multiclass classification neural network, which at least correctly predicted that third-seeded Michigan would make the Final Four and Championship Game (though it missed all of its other Final Four picks). As a result, when it came time to rework my original neural network in Keras with updated and cleaned-up data, I decided to use multiclass classification with ReLU activation.

However, my binary classification algorithm in TensorFlow was very effective at picking early-round upsets, giving me very strong indications of a 13-over-4 upset in both 2018 and 2019. In particular, running the binary classification algorithm multiple times gives me clear statistics on how "strongly" the neural network thinks a given team can pull off an upset, as well as how likely a team is to be upset. I simply look for teams that consistently show up with a high Wins Above Seeding (WAS) and see when they match up against teams that consistently show up with a low WAS. Taking the lower-seeded team's average WAS and subtracting the higher-seeded team's average WAS even gives a numerical value corresponding to how much "confidence" to put in the given upset occurring.
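That confidence number is just a difference of averages across runs. Here is a minimal sketch of the calculation; the team names, WAS values, and the `upset_confidence` helper are all hypothetical stand-ins, not the actual numbers my network produced:

```python
import numpy as np

def upset_confidence(underdog_was_runs, favorite_was_runs):
    """Upset confidence: the underdog's average Wins Above Seeding (WAS)
    across multiple runs, minus the favorite's average WAS."""
    return float(np.mean(underdog_was_runs) - np.mean(favorite_was_runs))

# Hypothetical WAS values from five runs of the binary classifier:
underdog_was = [1.8, 2.1, 1.9, 2.0, 1.7]      # 13 seed, consistently high WAS
favorite_was = [-0.9, -1.1, -0.8, -1.0, -1.2]  # 4 seed, consistently low WAS

# A larger positive value means more confidence in the upset.
print(upset_confidence(underdog_was, favorite_was))
```

A pairing where the underdog's WAS is consistently high and the favorite's is consistently low yields a large positive score, which is exactly the signal I look for.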

In fact, this binary classification neural network came back to me recently as I was reading Dr. Marcos López de Prado's excellent book, Advances in Financial Machine Learning. Chapter 3, Section 6 of the book discusses "meta-labeling" as a way of sizing bets on Wall Street. In short, Dr. López de Prado describes a meta-labeling machine learning model as a model used specifically to size bets (as opposed to another ML model that decides which side of the bet to take). The idea makes a lot of sense for the March Madness neural network as well: one model (the MadNet multiclass classification neural net) is used just for determining which upsets to pick, and another model (which, taking after Dr. López de Prado's term meta-labeling, I've decided to call "MetaNet") is used to determine how much I will bet on a given upset in Vegas (where I'll be headed the first week of this March Madness to try my luck at the sportsbooks). In fact, my current MetaNet implementation trains this binary classification neural network many times, storing each run's predictions in an array:

Meta-labeling neural network being run multiple times
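The original screenshot isn't reproduced here, but the structure of such a loop is straightforward. This is a sketch of the idea only: it uses scikit-learn's `MLPClassifier` and randomly generated stand-in data rather than my actual Keras model and tournament features:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 8))            # stand-in matchup features
y_train = (X_train[:, 0] > 0).astype(int)      # stand-in upset labels
X_test = rng.normal(size=(20, 8))              # stand-in tournament matchups

N_RUNS = 10
predictions = []
for run in range(N_RUNS):
    # Re-train from a different random initialization on each run
    model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                          random_state=run)
    model.fit(X_train, y_train)
    # Store this run's upset probabilities
    predictions.append(model.predict_proba(X_test)[:, 1])

predictions = np.array(predictions)  # shape: (N_RUNS, number of matchups)
print(predictions.shape)
```

Each row of `predictions` is one independently trained network's view of the same matchups, which is what makes run-to-run statistics possible.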

Then I examine the results from the multiple runs to determine which teams consistently have a high Wins Above Seeding, which teams consistently have a low Wins Above Seeding, and even the variance of each team's WAS across runs. I'm looking forward to trying out MetaNet (in conjunction with MadNet) in Vegas, and I will be updating this blog before my trip with the bets I expect to make, so stay tuned for that!
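The per-team summary boils down to a mean and a variance over the stored runs. A minimal sketch, using hypothetical team names and made-up WAS values (a team with a similar mean but higher variance is a less consistent, and therefore riskier, upset pick):

```python
import numpy as np

# Hypothetical WAS values recorded for each team over four runs
was_runs = {
    "Team A": [1.8, 2.1, 1.9, 2.0],  # high mean, low variance: consistent
    "Team B": [1.2, 0.4, 1.9, 0.1],  # lower mean, high variance: erratic
}

for team, runs in was_runs.items():
    print(f"{team}: mean WAS = {np.mean(runs):.2f}, "
          f"variance = {np.var(runs):.3f}")
```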

The best place to spend March Madness!
