Do NBA athletes play harder when the spotlight shines bright?

Kavi Munjal
7 min read · May 1, 2020

Dennis Rodman, one of the all-time great hustlers in NBA history.

Of course, it’s no secret that when the lights shine brightest, in the NBA Playoffs, teams and players step up their games. But there are also moments during the regular season when the stage is bigger, whether in the form of a rivalry matchup, a push for the playoffs, a nationally televised game, or a contract year for a particular player. The latter two give individual players the opportunity to make a name for themselves and to put themselves in the headlines league-wide or nationwide. Do players increase their effort and performance in such situations?

In this project, I drew on data from several different sources. For season totals and averages, I scraped 67 variables from Basketball Reference dating back to the 1990–91 season. While Basketball Reference also has salary information, it does not maintain historical team payrolls. Instead, I scraped my salary information from HoopsHype, which maintains formatted salary tables dating back to the 1990–91 season, both in nominal terms and adjusted for inflation to today’s dollars. I retrieved box score information for individual games in the 2019–20 season from NBA.com, which offers 133 variables ranging from traditional stats to sophisticated movement tracking data. Scraping the data was rather difficult, especially from NBA.com, which relies heavily on JavaScript and does not have as accessible an API as it seems it once did. I found a package on GitHub, nbastatR, that was useful to some degree for scraping all of these sites (especially Basketball Reference), although many of its functions did not work. All of these datasets were cleaned, filtered, and joined in different ways for different analyses.

Determining what statistics embody performance and effort was difficult, especially when trying to use some of the earlier salary data. Many advanced metrics and much of the underlying advanced data only go back a handful of years. For the analyses stretching further back in time, I settled on popular stats that are meant to capture a player’s overall performance and value added: Box Plus-Minus (Offensive, Defensive, and overall), Value Over Replacement Player, and Player Efficiency Rating. It is not necessary to understand these stats in detail for the purposes of this analysis. I simply measured how players performed in their contract year (the final year of a contract) compared to their average non-contract-year performance.

Determining and assigning the binary variable for contract years was another challenge. The salary datasets did not specify which seasons were contract years, so I had to set the thresholds myself. I decided to flag a season as a contract year if the player did not have a salary the next season, or if their salary changed by more than 10% in either direction. This certainly led to some error due to retirements, extensions, options, and so on, but that threshold seemed the most reasonable given the maximum allowed extensions and options. The salaries I used also do not account for inflation over the years, which leaves many players from earlier seasons looking lower paid.
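As a rough illustration, the contract-year rule described above could be sketched in Python along these lines. The function name, the dictionary layout, and the sample salaries are all made up for this example and are not the code used in the project; the sketch also assumes every salary is positive.

```python
def flag_contract_years(salaries, threshold=0.10):
    """Return {season: is_contract_year} for a single player.

    `salaries` maps a season's starting year (e.g. 2018) to the salary
    paid that season (assumed positive). A season is flagged when the
    player has no salary in the following season, or when the next
    salary differs by more than `threshold` (as a fraction of the
    current salary) in either direction.
    """
    flags = {}
    for season, salary in salaries.items():
        next_salary = salaries.get(season + 1)
        if next_salary is None:
            # No salary next season: treat as the end of a contract.
            flags[season] = True
        else:
            change = abs(next_salary - salary) / salary
            flags[season] = change > threshold
    return flags


# Made-up salaries for a hypothetical player.
example = {2015: 5_000_000, 2016: 5_250_000, 2017: 5_500_000, 2018: 12_000_000}
flags = flag_contract_years(example)
```

Note that under this rule the last season in the dataset is always flagged, since no later salary exists yet. That is one of the sources of error this kind of threshold introduces.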

I calculated players’ “outperformance” by comparing their metrics in their contract years to their career averages in non-contract years. I did this by first creating a table filtered for non-contract years and summarizing the average of the selected metrics by player, then joining that table back onto the main stats dataframe. Once I had the outperformances calculated, I took the yearly average and plotted it across the 30-year period for which I have salary data. The first chart shows Box Plus-Minus, broken down further into the Offensive and Defensive metrics. Overall BPM and OBPM tend to move together, while DBPM remains relatively stable. This suggests that players tend to try to up their offensive game in contract years, while their defense remains about the same. That does not always pay off, however, as players actually underperformed in OBPM in contract years in the late 2000s and early 2010s, dragging BPM down with it. It is possible that players tried to force too many plays on offense in an effort to make a good impression.
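The filter, summarize, and join steps described above can be sketched in plain Python as follows. This is only a minimal illustration under assumed names: the row layout (`player`, `season`, `contract_year`, `bpm`) and the sample numbers are hypothetical and not taken from the actual dataset.

```python
from collections import defaultdict
from statistics import mean


def add_outperformance(rows, metric):
    """Return contract-year rows with an added 'outperformance' field."""
    # Step 1: keep non-contract seasons and average the metric per player.
    by_player = defaultdict(list)
    for row in rows:
        if not row["contract_year"]:
            by_player[row["player"]].append(row[metric])
    baseline = {player: mean(values) for player, values in by_player.items()}

    # Step 2: join the baseline back on and subtract it in contract years.
    # Players with no non-contract season have no baseline and are skipped.
    out = []
    for row in rows:
        if row["contract_year"] and row["player"] in baseline:
            diff = row[metric] - baseline[row["player"]]
            out.append({**row, "outperformance": diff})
    return out


def yearly_average(out_rows):
    """Average the outperformance across players within each season."""
    by_season = defaultdict(list)
    for row in out_rows:
        by_season[row["season"]].append(row["outperformance"])
    return {season: mean(values) for season, values in sorted(by_season.items())}


# Made-up rows for two hypothetical players.
rows = [
    {"player": "A", "season": 2016, "contract_year": False, "bpm": 1.0},
    {"player": "A", "season": 2017, "contract_year": False, "bpm": 2.0},
    {"player": "A", "season": 2018, "contract_year": True, "bpm": 3.0},
    {"player": "B", "season": 2017, "contract_year": False, "bpm": 0.0},
    {"player": "B", "season": 2018, "contract_year": True, "bpm": -1.0},
]
out = add_outperformance(rows, "bpm")
averages = yearly_average(out)
```

Here player A beats a non-contract baseline of 1.5 by 1.5 points, while player B falls 1.0 short of a baseline of 0.0, so the 2018 average is slightly positive.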

Next we have Value Over Replacement Player. This metric was relatively volatile, which could be the result of some outliers slipping through the filters (players who played very few minutes can be outliers in these metrics, so I filtered out players with fewer than 48 minutes of play). There is also a decreasing trend over time.

A decreasing trend is seen again here in Player Efficiency Rating, with a slight increase in recent years similar to Box Plus-Minus. The metrics all seem to tell a similar story: players in the 90s tended to turn in better performances than normal in their contract years, while players in the early 2010s tended to turn in worse ones. Perhaps further digging into basketball history could reveal the root cause. Contracts have gotten shorter over time by mandate, and pay increases have grown larger and larger. Perhaps the sheer number of players in contract years led to too much hero-ball or too much pressure to consistently outperform. In a regression, contract year came back as a significant predictor of all three of these metrics, with a coefficient around -0.695, meaning players actually tend to underperform in their contract year.
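To see why a negative coefficient on a contract-year dummy reads as underperformance, here is a small Python sketch of a one-predictor least-squares fit. The exact regression specification used in this project is not shown, so the single-predictor setup, the function name, and the toy numbers are all assumptions for illustration. With a 0/1 predictor, the fitted slope is simply the contract-year mean minus the non-contract-year mean.

```python
from statistics import mean


def ols_slope_intercept(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Toy data: 1 marks a contract-year season, 0 any other season.
contract_year = [0, 0, 0, 1, 1]
bpm = [1.0, 2.0, 3.0, 1.0, 2.0]
slope, intercept = ols_slope_intercept(contract_year, bpm)

# Group means for comparison with the fitted coefficients.
mean_contract = mean(b for c, b in zip(contract_year, bpm) if c == 1)
mean_other = mean(b for c, b in zip(contract_year, bpm) if c == 0)
```

In this toy example the intercept matches the non-contract average and the slope matches the gap between the two groups, so a slope below zero means the contract-year group averaged lower.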

The chart below plots Box Plus-Minus outperformance against the salary each player was paid in that year. As mentioned, the growth in pay over time has led to a large grouping towards the lower end of the salary range. There are still some takeaways to be made. Defense still seems to hold steadier than offense, although the trend lines running across the chart show a slight decrease in defensive performance as salaries increase. Perhaps higher-paid players feel they are compensated more for their offense than for their defense. On the lower end of the salary range there is more variance in performance. Lower-paid players may be making a larger effort to make a good impression going into contract negotiations, which seems to result in both more outperformances and more underperformances.

The other aspect I attempted to address was effort in nationally televised games. I could only scrape the national TV schedule from NBA.com for this season. This was fine since there were already 973 games played in this currently halted season, and hustle and tracking statistics only go back four or five years anyway. I created another binary variable for national TV games based on whether each game ID appeared in my scraped schedule. I then calculated similar “outperformance” statistics for the variables below.
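The national TV flag and the matching “outperformance” calculation might look roughly like the Python sketch below. It assumes each box score row carries a `game_id`, a `player`, and the metric of interest, and that outperformance is a player's average in national TV games minus their average in all other games. The field names, game IDs, and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def national_tv_outperformance(box_scores, national_tv_ids, metric):
    """Per player: mean of `metric` in national TV games minus other games."""
    tv_ids = set(national_tv_ids)  # set membership gives a fast lookup
    tv, other = defaultdict(list), defaultdict(list)
    for row in box_scores:
        # Binary flag: does this game ID appear in the scraped schedule?
        bucket = tv if row["game_id"] in tv_ids else other
        bucket[row["player"]].append(row[metric])
    # Only players with games in both groups can be compared.
    return {p: mean(tv[p]) - mean(other[p]) for p in tv if p in other}


# Made-up box score rows; distances are in miles.
box_scores = [
    {"game_id": "G1", "player": "A", "distance": 2.0},
    {"game_id": "G2", "player": "A", "distance": 2.5},
    {"game_id": "G3", "player": "A", "distance": 2.2},
    {"game_id": "G1", "player": "B", "distance": 1.8},
]
result = national_tv_outperformance(box_scores, ["G2"], "distance")
```

Player B never appears in a national TV game in this toy data, so only player A gets a value.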

Thanks to state-of-the-art player tracking technology, NBA.com has tracking statistics on how far and how fast a player moves in each game. I decided to examine distance run and average speed to see if these hustle metrics differed in national TV games. I also labeled the points according to whether the player is in a contract year. As seen below, deviations in average speed are not too significant, save for a large group of what appear to be outliers or really tired players, maybe playing three games in four nights. Performances vary more in distance run, with more and greater outperformances than underperformances. Contract year does not make a particularly noticeable difference, although there are more outperformances than underperformances in average speed. So, maybe players do run harder on national TV with a contract on the line.

Finally, I wanted to see whether players play more selfishly in nationally televised games. To do so, I examined outperformances in number of passes versus outperformances in shots attempted. Unlike shots made and assists, which represent results, these stats represent process. I found that the two metrics are positively correlated, meaning that players who take more shots on national TV also tend to pass more. Thus, it may be that certain players simply try to get their hands on the ball more when there are extra cameras in the building. This group does not appear to be the contract-year players, though, who are scattered along the trendline. These players do seem to have fewer outliers, which means they may actually try to stay closer to their normal ratios of passes to shots.
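The positive relationship described above is the kind of thing a Pearson correlation coefficient captures. The Python sketch below computes one by hand for made-up pass and shot outperformances; the numbers are purely illustrative and the variable names are assumptions, not the project's data.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-player differences: national TV average minus other games.
pass_outperformance = [-4.0, -1.0, 0.0, 2.0, 3.0]
shot_outperformance = [-2.0, 0.0, -1.0, 1.0, 2.0]
r = pearson_r(pass_outperformance, shot_outperformance)
```

A value of `r` above zero means players who shoot more than usual on national TV also tend to pass more than usual, which matches the "more touches overall" reading rather than a trade-off between passing and shooting.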

As someone who has always been fascinated by sports analytics, this project inspired me to dig deeper into my passion and try to derive more trends and conclusions from the incredible amount of data available, however tricky it is to get into a form suitable for analysis. The toughest part of this project was definitely scraping, cleaning, and combining the data. Now that I have it, however, I am excited to dig further into this project by examining inflation-adjusted salaries, determining contract years more cleanly, and assessing effort at the quarter and season-progression levels. I also hope to use the variables to create models to optimize team performance, player fit and value, and my beloved 76ers’ odds of winning a championship.

Kavi Munjal is a senior at the University of Pennsylvania, studying Finance and Business Analytics with minors in Computer Science and Data Science. He is a former president of the Penn Sports Analytics Groups and a lifelong 76ers fan. This data project was undertaken for his Digital Analytics class. The large letters in the introduction spell out “OIDD” in recognition of the class code.
