Statistical Physics of Basketball
Hey,
This has been a fascinating week for basketball. Kevin Durant announced that his next destination would be the 2015 champions and 2016 runners-up, the Golden State Warriors, which forced the Warriors to clear cap room by letting go of Harrison Barnes, Andrew Bogut, Festus Ezeli, and a couple of others. Tim Duncan's expected retirement announcement is looming, after a 19-year career with five championships, two MVPs, and three Finals MVPs on his resume. And Ben Simmons showcased some incredible court vision in the Utah Summer League, where his passes essentially stole the show.
The most fascinating moments in basketball for me, though, came off the court and out of basketball history. This past week I read two particularly fascinating papers, Random Walk Picture of Basketball Scoring and Basketball scoring in NBA games: an example of complexity, and got to see a genuinely different approach to the game. Both papers are undeniably reductionist, with assumptions that have to be quite strong for the problems they examine to be tractable, but I still think the approach is interesting, and it lends itself to further analysis of how particular teams do and don't buck the trends it suggests.

For example, Random Walk Picture of Basketball Scoring effectively introduces a theory of runs in basketball, seeking to identify the probability distribution of scoring runs. While its data is aggregated across the league, it suggests an interesting project: testing what I think is a commonly held notion in most basketball circles, that jump-shooting teams are subject to a lot of variance. If this hypothesis is true, play-by-play data should show that the scoring runs of jump-shooting teams, such as the Warriors, have significantly more variance than those of teams whose identity is built on drives and two-point shots, like the Grizzlies. To do this, I'd need to figure out how to scrape play-by-play data from ESPN's servers, and then use Python or R to sort through it. I have some experience with this, but I suspect I'll need some time to think it through in detail. A rough sketch of the run-counting step is below.
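To make the run idea concrete, here is a minimal Python sketch of just the counting step, assuming the play-by-play has already been scraped into an ordered list of (team, points) scoring events; that data format, and the toy events at the bottom, are hypothetical stand-ins for whatever ESPN's pages actually yield.

```python
from collections import Counter

def run_size_distribution(scoring_events):
    """Empirical distribution of scoring-run sizes, where a run is the
    total points one team scores before the other team answers.
    `scoring_events` is an ordered list of (team, points) tuples --
    a hypothetical format; real play-by-play would need parsing first."""
    runs = []
    current_team, current_points = None, 0
    for team, points in scoring_events:
        if team == current_team:
            current_points += points          # run continues
        else:
            if current_team is not None:
                runs.append(current_points)   # other team answered: run ends
            current_team, current_points = team, points
    if current_team is not None:
        runs.append(current_points)           # close out the final run

    counts = Counter(runs)
    total = sum(counts.values())
    return {size: n / total for size, n in sorted(counts.items())}

# Toy example: a short, made-up stretch of play-by-play.
events = [("GSW", 3), ("GSW", 2), ("MEM", 2), ("GSW", 3),
          ("MEM", 2), ("MEM", 2), ("MEM", 1), ("GSW", 2)]
print(run_size_distribution(events))  # {2: 0.4, 3: 0.2, 5: 0.4}
```

Comparing the variance, or the tail weight, of these distributions between a team like the Warriors and a team like the Grizzlies would be the actual test of the jump-shooting hypothesis.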
The other paper, Basketball scoring in NBA games: an example of complexity, demonstrates a scale-free relationship between point difference and frequency. Scale-free relationships are of interest in statistical physics because they often signal critical behavior: near a phase transition, thermodynamic quantities follow power laws described by critical exponents. They're also characteristic of self-organized criticality, which shows up in systems ranging from proteins to evolution. I've actually had the chance to study, and even publish a little on, such concepts. I think the results of this paper are interesting, but they partly reflect something we already know: once the point differential in a basketball game gets high enough, the outcome is essentially determined and the game is no longer worth watching. I wonder whether teams take studies like this into consideration when deciding lineups for "garbage time." It'd be interesting to find out for sure, though I'd imagine it'd be pretty bad for viewership: the NBA can't have people tuning out just because the score differential exceeds a certain threshold, even though I'm sure a lot of people already do. A rough sketch of how one might check for that scale-free relationship follows.
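As a first-pass illustration, here is a small Python sketch that estimates a power-law exponent from a set of point differentials by fitting a line in log-log space. The synthetic Zipf-distributed data is just a stand-in for real game results, and a least-squares fit like this is a crude estimator; maximum-likelihood methods are preferred for serious work.

```python
import numpy as np
from collections import Counter

def power_law_exponent(point_differences, min_count=5):
    """Estimate alpha in f(d) ~ d**(-alpha), where f(d) is the frequency
    of absolute point difference d, via a least-squares line in
    log-log space. Crude, but fine for a first look."""
    counts = Counter(abs(d) for d in point_differences if d != 0)
    # Keep only well-populated bins; singleton counts in the tail
    # would otherwise flatten the fitted slope.
    ds = np.array([k for k in sorted(counts) if counts[k] >= min_count])
    freqs = np.array([counts[k] for k in ds], dtype=float)
    freqs /= freqs.sum()  # normalize to empirical frequencies

    slope, _ = np.polyfit(np.log(ds), np.log(freqs), 1)
    return -slope

# Toy example: synthetic differentials drawn from a Zipf law with
# exponent 2, standing in for real final-score differences.
rng = np.random.default_rng(0)
diffs = rng.zipf(2.0, size=5000)
print(power_law_exponent(diffs))  # should come out near 2.0
```

Running the same fit on real NBA score differentials would show whether the scale-free behavior the paper reports holds up across seasons, and where it breaks down.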
Anyway, those are just some interesting things I read about this past week. I hope someone else finds this stuff as interesting as I do.
Regards,
VS