Defining Modern NBA Player Positions - Applying Machine Learning to Uncover Functional Roles in Basketball

Published in

han_

12 min read · Apr 17, 2017

— Skip “WATER PISTOL PETE JR.” Bayless — 1.4 ppg — Position: Troll

Jalen Rose loves to say that player positions were created to help fans understand the game. Point guard, shooting guard, small forward, power forward, and center map onto the five players who share the court. But do all basketball players fit neatly into these five groupings?

The way the game is played today is dramatically different from even a decade ago. As players have begun to shoot better from longer range, teams have become smarter about exploiting the advantage of the three-pointer. A typical benchmark for 2-point field goals is shooting 50%, yielding an expected production of 1 point per shot. To match that expected production from 3-point land, a player only needs to shoot 33% consistently from behind the arc. There are ancillary benefits too: stationing players behind the three-point arc spaces the floor by drawing defenders further from the basket to guard against the shot, which leaves even more room for other players to operate inside the arc. Before, it was only smaller players (guards and wings) who would consistently shoot threes. Every position has since increased its share of three-point shooting. This style was popularized most notably by the “Seven Seconds or Less” Suns run by Steve Nash, who played primarily without a center and shot quickly from all over the floor.
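The break-even arithmetic above can be written out in a few lines. This is only a sketch of the expected-value reasoning; the function names are made up for illustration.

```python
def expected_points(make_rate, shot_value):
    """Expected points per attempt for a shot worth `shot_value` points."""
    return make_rate * shot_value


def break_even_rate(target_points, shot_value):
    """Make rate needed for a shot of `shot_value` to yield `target_points` per attempt."""
    return target_points / shot_value


# A 50% two-point shooter produces 1 point per shot...
two_point_value = expected_points(0.50, 2)

# ...so a three-point shooter only needs to make one in three to match it.
three_point_break_even = break_even_rate(two_point_value, 3)

print(f"2PT at 50%: {two_point_value:.2f} points per shot")
print(f"3PT break-even rate: {three_point_break_even:.1%}")
```

Any three-point percentage above that break-even rate is worth more per attempt than a 50% two-point shot, before even counting the spacing benefit.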

Three point shooting is more than a little man’s game.

As a result, basketball has undergone a positional revolution in the last 5–7 years. Increasingly, positions are becoming blurred as multi-dimensional players have emerged. Recently, the term “unicorn” has been coined to reflect the rare big man who can both defend the paint — a traditional role of a center — while also displaying guard skills like passing and shooting threes.

Battle of the NBA Unicorns

Porzingis, Embiid, and Giannis are once-in-a-generation talents — and they’re still under 23. But whose future is…

theringer.com

This concept of a “stretch 5”, a center who shoots threes, can be traced back to Dirk Nowitzki, who was the first true floor-stretching big man to find great success. Years after Dirk reigned, the trend culminated this year in the Skills Challenge. Kristaps Porzingis, a 7'3" center from Latvia drafted fourth overall by the Knicks in 2015, won a contest typically dominated by guards. It is visually astounding to see such a large player beating guards head to head in a contest of dribbling, passing, and shooting.

Hunting for Unicorns in the N.B.A.

In the past, a handful of basic precepts defined how the N.B.A. and the shoe-company nation-states dependent on its…

www.nytimes.com

In light of these changes, we need an effective way to designate positions in the NBA based not on physical traits such as height, but on function, such as shooting and defense. A framework for modern NBA positions is important both for understanding how players have evolved and for effective roster construction. I set out to dissect the positional landscape in the NBA today.

My goal was to:

1. Use unsupervised clustering to delineate true functional positions of NBA players.
2. Uncover insight into the evolution of NBA player positions over time, and into the relationships between similar players.

Data Acquisition and Processing

I started with each player’s performance over one season as one data point. This reduces noise and gives a holistic view of a player’s contribution to a team’s performance over a whole season. I examined the period from 2000 to 2016: this is the post-lockout era (1999 was a shortened lockout season) and a time when the trend of positional fluidity began to accelerate. It was also the time when the Lakers’ dynasty started, so it was just a beautiful era of basketball.

The player performance data was readily available on www.basketball-reference.com, which does an amazing job of tracking and archiving detailed basketball data. I wrote a web-scraping script in Python using the Selenium library to compile the data from the team pages for each season (for example, the Lakers 2017 season page http://www.basketball-reference.com/teams/LAL/2017.html). After gathering the data, I was left with ~11k player-seasons over 16 years. I left out 2017 because the season was not over, so the data was incomplete. I took season totals, per-game averages, and per-100-possession averages for standard basketball counting stats: points, minutes, fouls, blocks, etc. I also took a number of advanced and rate stats such as usage, PER, VORP, block rate, true shooting, etc.
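The scraping script itself is not shown here, but the list of pages to visit follows directly from the URL pattern in the Lakers example. A minimal sketch, assuming that pattern holds for every team and season; the two team codes are placeholders, and real codes may differ from season to season:

```python
# URL pattern taken from the Lakers 2017 example above.
BASE_URL = "http://www.basketball-reference.com/teams/{team}/{year}.html"


def team_season_urls(teams, first_year, last_year):
    """Return one team-page URL per (team, season) pair, both years inclusive."""
    return [
        BASE_URL.format(team=team, year=year)
        for team in teams
        for year in range(first_year, last_year + 1)
    ]


# Placeholder team codes for illustration only, not the full league.
example_teams = ["LAL", "BOS"]
urls = team_season_urls(example_teams, 2000, 2016)

print(len(urls), "pages to fetch")
print(urls[0])
```

Each URL could then be handed to whatever fetches and parses the page, which was Selenium in my case.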

A summary of the player season performance dataset

Standardization

It was important to standardize the data so that I could effectively compare performance across different seasons. The pace has been increasing from 2000 to 2016, so it is important to examine relative performance within a season. I used the MinMaxScaler class from scikit-learn’s sklearn.preprocessing module to scale each attribute relative to that season’s minimum and maximum. I was left with a dataframe in which every attribute lies between 0 and 1. Normalizing the data is also essential for the distance-based clustering algorithms I was planning to use, such as k-means, hierarchical clustering, and DBSCAN. It ensures that I am not overweighting an attribute simply because its range of possible values is wider than that of other attributes.
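Here is a rough sketch of what per-season scaling can look like. The column names and numbers are invented for illustration, and the loop is one of several ways to apply MinMaxScaler group by group, not necessarily the exact code I ran.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy data: column names and values are made up for illustration.
stats = pd.DataFrame({
    "season": [2000, 2000, 2000, 2016, 2016, 2016],
    "pts_per_game": [10.0, 20.0, 30.0, 12.0, 18.0, 24.0],
    "blk_per_game": [0.0, 1.0, 3.0, 0.5, 1.0, 1.5],
})
feature_cols = ["pts_per_game", "blk_per_game"]

scaled = stats.copy()
for season, rows in stats.groupby("season"):
    # Fit a fresh scaler per season so that 0 and 1 correspond to
    # that season's minimum and maximum for each attribute.
    scaled.loc[rows.index, feature_cols] = MinMaxScaler().fit_transform(rows[feature_cols])

print(scaled)
```

Fitting one scaler on the whole table instead would mix seasons together, which is exactly what the per-season loop avoids.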

To reduce the noise, I kept only players who played more than 500 minutes in a season. That eliminates scrubs who barely play and also small-sample-size wonders who can put up unsustainable rate stats.

A histogram of minutes played. The distribution has a large spike under 500 minutes, which I removed from my analysis.

Modeling the Data

The goal of this analysis is to use unsupervised learning techniques to identify relevant player clusters.

The final dataframe consisted of 81 attributes for each player-season. I purposely left in attributes that were similar and collinear. For example, total field goals made in a season is bound to overlap with the per game scoring of a player, in terms of describing a player’s overall performance. But including both in our analysis allows us to capture all of the variance that those variables are able to explain.

In order to use these collinear attributes and cluster effectively, I first needed to perform dimensionality reduction. Unsupervised clustering algorithms are especially vulnerable to the curse of dimensionality: with too many dimensions, every point ends up far from every other point, and the distances become too similar for clusters to be identified effectively.

Dimensionality Reduction

I used the PCA class from the sklearn.decomposition module to find the top principal components of the 81 features. A crucial feature of this class is that it reports the variance explained by each component, so I can see how much of the data each principal component describes. It turns out that there are 12 components that each explain more than 1% of the total variance. In aggregate, these 12 components account for 92% of the total variance. Essentially, we have been able to summarize 92% of the variance in the 81 original features using only 12 principal components.
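The selection rule described above (keep every component that explains more than 1% of the variance) can be sketched on synthetic data. The random matrix below merely stands in for the real 81-feature table, so the counts it prints will not match the 12 components and 92% reported here.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 rows, 10 features driven by 3 hidden factors plus noise.
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.05 * rng.normal(size=(200, 10))

# Fit with all components first to inspect the explained-variance profile.
pca = PCA().fit(X)
ratios = pca.explained_variance_ratio_   # sorted from largest to smallest

keep = int(np.sum(ratios > 0.01))        # components explaining > 1% each
covered = float(ratios[:keep].sum())     # their combined share of the variance

# Refit with only the retained components to get the reduced dataset.
X_reduced = PCA(n_components=keep).fit_transform(X)

print(f"kept {keep} components covering {covered:.1%} of the variance")
print("reduced shape:", X_reduced.shape)
```

The 1% threshold is a judgment call; a stricter or looser cutoff changes how many components survive.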

I discovered that there are 12 essential Functional “Genes” that describe a player’s functional role. Examining the makeup of each gene based on its original feature contributions, I gave each player gene a name. The top 10 original attributes by magnitude that make up each gene, along with the gene designation, are displayed in the following graphic.

The first 6 basketball “genes” that I have identified.

The next 6 basketball “genes” that I have identified.

The majority of a player’s production can be summed up by these 12 traits. For example, the “Scoring” Gene consists of playing many minutes, making many field goals, taking many field goal attempts, etc. The “Hustle Stats” Gene, conversely, consists of low free throw attempts, but high offensive rebounding, high defensive rating and blocks.
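To see which original attributes make up a component, one can rank the entries of each row of the fitted PCA’s components_ array by absolute value. The sketch below does this on made-up data; the feature names are hypothetical and the helper top_features is written just for this illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical feature names; the real table has 81 attributes.
feature_names = ["minutes", "fg_made", "fg_attempts", "off_rebounds", "blocks"]

rng = np.random.default_rng(1)
X = rng.normal(size=(100, len(feature_names)))
# Make the first three columns move together, like volume scoring stats do.
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=100)
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=100)

pca = PCA(n_components=2).fit(X)


def top_features(component, names, k=3):
    """Return the k (name, loading) pairs with the largest absolute loading."""
    order = np.argsort(np.abs(component))[::-1][:k]
    return [(names[i], float(component[i])) for i in order]


for i, component in enumerate(pca.components_):
    print(f"component {i}:", top_features(component, feature_names))
```

The sign of a loading matters when naming a gene: a large negative loading means the component is associated with low values of that attribute.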

Already, we are gaining insight into not only what attributes describe a player, but what attributes go together. Players tend to excel in the same group of stats at once, whether they are offensive, defensive, or mixed groupings of attributes.

With our dataset reduced to its essential components, I could begin clustering.

Unsupervised Learning: Clustering Techniques

I tested three major types of clustering: k-means, DBSCAN, and hierarchical clustering. All three rely on distances between points. K-means and hierarchical clustering are more flexible because they can return whatever number of clusters the user specifies, but that puts the burden on the user to choose the number well. DBSCAN takes a different, density-based approach and only returns clusters that meet user-specified density criteria. This makes DBSCAN more informative, but the model can be difficult to tune so that it yields a practical number of useful clusters.

k-means clustering - Wikipedia

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster…

en.wikipedia.org

DBSCAN - Wikipedia

Density-based spatial clustering of applications with noise ( DBSCAN) is a data clustering algorithm proposed by Martin…

en.wikipedia.org

Hierarchical clustering - Wikipedia

In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method…

en.wikipedia.org

After testing, hierarchical clustering was the most effective method. DBSCAN and k-means both struggled to identify meaningful clusters. In particular, I believe that DBSCAN struggled to create clusters because the data was very uniform, with no clear boundaries. Hierarchical clustering, by contrast, slowly groups clusters by their distance from one another, which gives us insight into which players are most similar to one another.

Dendrogram after performing hierarchical clustering. You can see the clusters begin to form as the algorithm runs to completion.

The hierarchical clustering that I used is in the scipy.cluster.hierarchy module. From this module, I used the dendrogram, linkage, cophenet, and fcluster functions. This SciPy implementation uses agglomerative, bottom-up clustering. Every point starts as its own cluster, and at each iteration the two closest clusters are merged into one. If the method runs to completion, all points end up grouped together. The key to using this modeling technique is to identify at what point to stop clustering and view the clusters that have formed.
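A minimal sketch of that workflow on synthetic data is shown below. The three blobs stand in for the PCA-reduced player data, and the Ward linkage method is an assumption made for the example rather than a setting stated in this post.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Synthetic stand-in for the reduced data: three well-separated blobs of 20 points.
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

# Agglomerative clustering: Z has one row per merge (n - 1 rows in total);
# column 2 of each row is the distance at which that merge happened.
Z = linkage(X, method="ward")  # "ward" is an assumption for this sketch

# Cophenetic correlation: how faithfully the tree preserves pairwise distances.
coph_corr, _ = cophenet(Z, pdist(X))

# Cut the tree so that at most 3 flat clusters remain.
labels = fcluster(Z, t=3, criterion="maxclust")

print("cophenetic correlation:", round(float(coph_corr), 3))
print("cluster sizes:", np.bincount(labels)[1:])
```

Passing Z to dendrogram draws the merge tree, which is how a plot like the one above is produced.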

SciPy Hierarchical Clustering and Dendrogram Tutorial

This is a tutorial on how to use scipy's hierarchical clustering. One of the benefits of hierarchical clustering is…

joernhees.de

Using a useful technique outlined in the link above, I can essentially choose a reasonable number of clusters to stop the clustering process. By viewing the speed and acceleration of distance growth, I can choose the point where the acceleration of distance growth is the largest. That is where clusters show the best separation.

The blue line is the speed of distance growth and the green line is the acceleration of distance growth. Spikes occur at 3, 6, and 11 clusters.

The plot shows the speed (blue) and acceleration (green) of distance growth as the dendrogram is developed. There are spikes in acceleration at 3, 6, and 11 clusters. 3 and 6 clusters are too few: that is roughly the same granularity as the 5 “standard” positions that are part of our typical basketball lexicon. I wanted to separate players further by functionality, so I chose n=11 as the number of clusters.
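The acceleration heuristic can be sketched as a second difference over the last few merge distances in the linkage table. The helper name and the toy data below are invented for illustration. Note that the sketch returns only the single largest spike, whereas I looked at several spikes and picked the one that suited the problem.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage


def clusters_at_max_acceleration(Z, last_n=10):
    """Cluster count at the largest jump in merge-distance growth.

    Looks at the final `last_n` merges of the linkage table Z, takes the
    second difference of their distances, and maps the largest value back
    to a number of clusters (the very last merge corresponds to k = 2).
    """
    merge_distances = Z[-last_n:, 2]
    acceleration = np.diff(merge_distances, 2)
    return int(acceleration[::-1].argmax()) + 2


rng = np.random.default_rng(0)
# Toy data with an obvious answer: three tight, well-separated blobs.
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

Z = linkage(X, method="ward")  # linkage method is an assumption for this sketch
k = clusters_at_max_acceleration(Z)
print("suggested number of clusters:", k)
```

On real data the spikes are rarely this clean, so it is worth plotting the whole acceleration curve rather than trusting one number.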

Now, we can view the relevant clusters that have materialized:

Top 3 PCA components plotted. The colors designate the 11 different player positions, or clusters, that were identified.

Key Takeaways and Conclusions

What are the 11 modern NBA positions that were identified? I viewed the 11 clusters by their PCA components to understand the attributes that make up each cluster. I plotted the median value of each of the 12 player Functional “Genes” for the 11 clusters.
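Computing the per-cluster median of each component is a one-line group-by once the cluster labels are attached to the reduced data. The tiny table below is invented for illustration, with two hypothetical genes and two clusters.

```python
import pandas as pd

# Made-up component scores for six player-seasons in two clusters.
components = pd.DataFrame({
    "scoring":    [0.9, 0.8, 0.7, 0.1, 0.2, 0.3],
    "rebounding": [0.2, 0.1, 0.3, 0.8, 0.9, 0.7],
    "cluster":    [1, 1, 1, 2, 2, 2],
})

# One row per cluster, one column per gene: the median profile of each position.
profile = components.groupby("cluster").median()
print(profile)
```

A table like profile is what feeds a heat map of genes by position.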

By viewing the characteristics of each cluster, I was able to name each of the 11 positions. Effectively, modern NBA players exist as 1 of 11 updated functional positions defined by their statistical performance:

These are examples of current and historical players falling into each position.

The player groupings are very insightful: they tell us who occupies similar functional roles across eras. Shaq and the Brow are both Elite Bigs who excel at the “Scoring” and “Rebounding” genes. Elite wings are the rarest, with only 79 player-seasons qualifying. Viewing the heat map, it becomes clear that elite wings are so special because they are a unique blend of a strong “Scoring” gene and a strong defensive “Sticky Hands” gene. They also possess strength in other categories, including “High-efficiency Production” and “Foul Drawing.” This makes players like Kobe and LeBron truly irreplaceable.

This framework also classifies types of role players. Beyond the three elite player types, there are different bigs and guards who occupy specific niches. For example, defensive anchors like Ben Wallace and Andre Drummond rebound very well and protect the rim, but are mediocre scorers. Mid-Range Monsters like Corey Maggette and Zach Randolph score and rebound well, but heavily specialize in 2-point field goals. I am extremely encouraged to see players who are visually different but functionally similar grouped together. The Shaq-Kobe Lakers of the 2000s were one of the first teams to feature a “stretch 4”, with Robert Horry essentially spacing the floor for Kobe to operate in the midrange and Shaq to dominate the post. Horry falls into the same category as modern floor spacers such as Danny Green. In future posts, I will go into an in-depth study of the evolution of the Lakers roster in the 2000s.

Use Cases:

So with these modern positions identified, what are possible use cases?

Player Development Profiles-

We can use this to view how players have developed throughout their careers. The Kobe versus Shaq debate has echoed throughout the Lakers fandom. We can now use real data to see how their career trajectories evolved. Kobe was actually an elite player for longer, well into the last years of his career. In 2016, he finally succumbed to his Achilles and became a functional Secondary Ball Handler. Shaq, on the other hand, had a few seasons at the tail end of his career where he was no longer an Elite Big, and was rather a Mid-Range Monster and even a Non-Scoring Big in his final years with the Suns and Celtics. He was able to squeeze in one last year of elite play on the Cavs in 2009 as LeBron’s sidekick.

Team Roster Development Profiles-

Worst decision ever.

Miami roster development profile after the Big 3 was formed.

Another use case is to examine roster development for teams over time. The perfect example is one of the most intriguing roster constructions of all time: the formation of the Big 3. In 2011, LeBron and Bosh joined Wade in Miami to form the formidable and despised trio of the “Big 3” who would win not 4, not 5, not 6 championships together, only to lose in the Finals against the Mavs in their first year together. They were literally 3 of the top 7–10 players in the league playing on the same team. What do their positions tell us?

LeBron, Wade, and Bosh are all elite players. Bosh was an elite big from 2011–14. But interestingly, in 2011, both LeBron and Wade functioned as elite wings. Then, from 2012–14, Wade actually became an elite guard. This is a fascinating development: the first year they played together, LeBron and Wade were often cited as too redundant in the way they played and the areas of the floor they occupied. This was distilled to the arcane argument of who was Batman and who was Robin. Well, we can now start to answer this question with machine learning techniques: it looks like Wade was the one to adapt his game for the team. Wade took on more of a guard role, facilitating more, allowing LeBron to continue to be the elite wing. As their roles became more complementary, the Heat went on to win the next two championships and reached the Finals against the Spurs in the third year before the superteam broke up.

The results from this analysis have been extremely fascinating and encouraging because they confirm many “gut” feelings that I have had and have consistently heard from the basketball community. What I have begun to develop is a statistical, data-based framework to categorize players effectively and analyze positional development using functional performance.

This just scratches the surface of the types of analysis and conclusions that I hope to reach. There are many additional topics and analyses I hope to tackle in the near future, so stay tuned!

Below are a couple of next steps that come to mind:

Create a predictive model
In-depth dive into the Lakers dynasty of the 2000s and their roster evolution
Look at the 2017 end of season stats and how those players fall into our modern positions
Generalize to predict playoff performance

If you have any suggestions for questions that I can try to answer with this analysis, feel free to post in the comments below.

Written by Han Man