NBA Hackathon Rundown
School is back in full swing, and my weekends tend to stay fairly busy, so my post rate has slowed down. It's kind of a shame, but such is life, and I'm going to try to keep updating semi-regularly. I'm writing today to give you a rundown of the NBA Data Analytics Hackathon, which happened on September 24th, 2016. I honestly had a great time, and I thought it would be cool to walk you guys through it.
If you haven’t read my past post about the Hackathon, where I described the application process a bit, check it out here. I know there are some glaring typos, but I was under pretty heavy time and character constraints when writing the essay.
A couple weeks after I wrote that essay, I got the following email:
The email left me feeling pretty excited. I immediately said yes, and I definitely wanted to work with a team, so I agreed to that too.
A week or two after that, I was told who I would be working with. We didn't know each other when we began, but I'm really glad I got to work with them. All three were really smart people and made the experience very enjoyable. We took a group photo at the end, which I'll show here:
We also got a heads up that we would, for a limited amount of time, have access to some top-secret confidential super data that the NBA keeps on the hush. The data ended up being pretty extensive information about the (x,y,z) coordinates of players and the ball at every instant during the playoffs, the number of dribbles a player took, the speed of a player, the distance they traveled, the defender distance, the distance from the hoop, and other really granular data that makes any data nerd squeal in excitement.
With these things in mind, the Data Analytics Hackathon slowly approached.
The day started for me at approximately 4:30 AM because I needed to catch the 5:27 AM train from New Brunswick to NY Penn Station. It's about a 68-minute ride, so I arrived in NYC at around 6:35 and ended up sitting at Starbucks for about an hour waiting for the event to start. While I was waiting there, I met Andrew, the guy standing at the podium in the above picture, and we talked for a little bit. As it turns out, he spotted my Reddit post about the Hackathon a while back, so hopefully he spots this one too. Hi Andrew!
Once we were in the actual Hackathon, we were addressed by a number of pretty cool people, including Adam Silver, the NBA Commissioner, and John Starks, a former Knicks player and Sixth Man of the Year. While I don't feel as though any of them said anything particularly awe-inspiring, it was pretty impressive that they came out to talk to us. After they finished addressing us, they gave us our prompts.
Initially, we had considered trying to develop a method to visualize defensive switches in the data, but after that proved to be a little too difficult, we scrapped it. After some discussion, we ultimately ended up working on prompt 1, where we were looking to see how much dribble penetration different defenders give up, with the hypothesis that good perimeter defenders wouldn’t really allow much dribble penetration.
This project essentially broke down into two parts: (1) developing a model to predict the expected value of a shot based on defender distance and distance from the bucket, and (2) developing an algorithm to determine how much dribble penetration each player gave up. I worked primarily on the first, so that's the part I know best, while Andrew, Zach, and Aditya worked on the second.
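For anyone wondering what "expected value of a shot" looks like in code, here's a minimal sketch. To be clear, this is not the model we built: the logistic form, the coefficient values, and the single 23.75 ft three-point distance (which ignores the shorter corners) are all placeholder assumptions I picked just to illustrate the idea of multiplying a make probability by the points a shot is worth.

```python
import math


def make_probability(shot_distance_ft, defender_distance_ft,
                     b0=0.5, b_shot=-0.06, b_def=0.08):
    """Logistic make probability.

    b0, b_shot, and b_def are made-up placeholder coefficients,
    not values fitted to any real data.
    """
    z = b0 + b_shot * shot_distance_ft + b_def * defender_distance_ft
    return 1.0 / (1.0 + math.exp(-z))


def expected_points(shot_distance_ft, defender_distance_ft,
                    three_point_line_ft=23.75):
    """Expected points = (points the shot is worth) * P(make).

    Simplification: one three-point distance for the whole arc.
    """
    points = 3 if shot_distance_ft >= three_point_line_ft else 2
    return points * make_probability(shot_distance_ft, defender_distance_ft)


# Same 5 ft shot, tightly contested vs. wide open
contested = expected_points(5.0, 2.0)
open_look = expected_points(5.0, 6.0)
```

With these placeholder coefficients, an open look is worth more than a contested one from the same spot, which is the kind of behavior you'd want any real fitted model to show.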
Ultimately, we were able to develop a pretty interesting model, for which I’ll give the general form below:
It's actually a pretty cool model, and it recovers some values that I thought were pretty true to real life. You can see my predictions below:
I suggested that the next few steps would probably involve analytically fitting the model, so it could be entirely mathematically represented. I was pretty close to doing so, but I couldn’t actually finish.
We also got pretty close to producing actual penetration data, but it took 2 hours just to read the files into Python (if anyone has any tips on reading really large text files, let me know, because I would like to learn), and from there it became a challenge to figure out how to attribute penetration to players. My teammates got pretty in-depth with it and we almost had something, but we finished about 30 minutes after the deadline, so unfortunately we couldn't include it in our submission.
After the problem-solving part of the hackathon was done, Shareef Abdur-Rahim (former Sacramento Kings player) came out to talk to us, which was also really cool.
We were then shown the five finalists' presentations, which, if memory serves, were as follows:
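Since I asked about large files, here's one approach I'd try next time: stream the file row by row instead of loading it all at once, and process it in small batches. This is only a sketch under assumptions. The column names and values are made up, the tiny in-memory string stands in for a real tracking file, and `iter_rows` and `iter_chunks` are hypothetical helper names, not anything we used at the Hackathon.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List, TextIO


def iter_rows(handle: TextIO, delimiter: str = ",") -> Iterator[Dict[str, str]]:
    """Yield one row at a time as a dict, without loading the whole file."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        yield row


def iter_chunks(rows: Iterable[Dict[str, str]],
                size: int) -> Iterator[List[Dict[str, str]]]:
    """Group rows into lists of at most `size` rows each."""
    chunk: List[Dict[str, str]] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # leftover rows that did not fill a full chunk
        yield chunk


# Tiny in-memory stand-in for a huge file (made-up columns and values).
sample = io.StringIO(
    "player_id,x,y,z\n"
    "1,10.0,5.0,0.0\n"
    "2,11.5,6.0,0.0\n"
    "1,10.5,5.5,0.0\n"
)

# Count rows per player while only ever holding one chunk in memory.
rows_per_player: Dict[str, int] = {}
for chunk in iter_chunks(iter_rows(sample), size=2):
    for row in chunk:
        key = row["player_id"]
        rows_per_player[key] = rows_per_player.get(key, 0) + 1
```

For a real file you would swap the `io.StringIO` for `open(path, newline="")` inside a `with` block. If you're using pandas, `read_csv` also accepts a `chunksize` argument that returns an iterator of DataFrames, which gets you the same batching idea.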
- A topological data analysis approach to defense in the NBA that examined how many dimensions different NBA defenders could address
- A logistic regression on a defender’s position that determined the best angle and distance from an offensive player (3rd place)
- A Defensive Versatility Index, which evaluated how many positions an NBA defender could cover based on the change in opposing players' eFG% when guarded by that defender (2nd place)
- Heroball in the NBA Playoffs, which examined passing, dribbling, and movement in the playoffs compared to the regular season (1st place)
- I can't remember the fifth, but I will edit this if I do
Overall, it was a pretty cool experience and I hope they do this again next year!