A Look at which NBA Champions have been the Most Important over the Last 20 Years.

Shea Versey
INST414: Data Science Techniques
6 min read · Feb 24, 2024

Here’s a question: over the last 20 years of NBA basketball, hundreds of players have been crowned champions, but which of them have been the best of the best? I believe that network data on past Finals-winning teams can show which players have been most important to winning NBA championships. The NBA Honors Committee would have an interest in the answer. Every year roughly 15 players win a championship, but a ring alone is not enough to warrant entry into the Naismith Memorial Basketball Hall of Fame. This analysis can be a key piece of information in determining which players have truly dominated the league over recent decades, and it can offer greater insight into their Hall of Fame potential.

Answering this question requires the roster of every championship team from the last 20 seasons. Ideally, the data stores each team’s full roster as a list so that I can iterate through the rosters to build a network. The data would include fields such as team name, championship year, roster size, and player names. Not all of these fields are needed for the calculations, but they provide context. With each roster available, I can create a node for every player and connect players based on whether they have played together.

As it turns out, the data I was looking for was not as readily available as I had expected. While searching, I found plenty of datasets with pieces of what I needed. Many had team stats for past NBA champions but did not include the player rosters. Others included the roster for just one team rather than all the teams in one dataset. In the end I had to collect the data myself and build usable data frames in Python using pandas. I used statmuse.com to import the roster of each championship-winning team individually, dating back to 2004.

For this analysis, a node represents a player on any team that won the NBA Finals between 2004 and 2023. There are roughly 15 players on each NBA team. The data includes 267 different players, so the network has 267 nodes. An edge connects two players if they were on the same roster when winning an NBA championship. For example, take LeBron James and Anthony Davis. They won the 2020 NBA championship as teammates on the Los Angeles Lakers, so their two nodes share an edge on the graph. LeBron James and another player, like Stephen Curry, would not share an edge because they never won an NBA championship as teammates.
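The node and edge definitions above can be sketched in a few lines of plain Python. This is only an illustration, not the code from the original analysis: the `rosters` list is a hypothetical, heavily truncated stand-in for the real data, and `build_teammate_graph` is a name made up for this example. It also assumes a simple graph, so two players who won several titles together still share just one edge.

```python
from itertools import combinations

# Hypothetical, truncated rosters for illustration only;
# real championship rosters have roughly 15 players each.
rosters = [
    ["LeBron James", "Anthony Davis", "Danny Green"],      # 2020 Lakers (partial)
    ["Stephen Curry", "Klay Thompson", "Draymond Green"],  # 2018 Warriors (partial)
    ["Kawhi Leonard", "Kyle Lowry", "Danny Green"],        # 2019 Raptors (partial)
]


def build_teammate_graph(rosters):
    """Return {player: set of players they won a championship with}."""
    graph = {}
    for roster in rosters:
        # Every player becomes a node, even on a one-player roster.
        for player in roster:
            graph.setdefault(player, set())
        # Every pair of players on the same roster shares an edge.
        for a, b in combinations(roster, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


graph = build_teammate_graph(rosters)
print(len(graph))                                 # 8 distinct players (nodes)
print("Anthony Davis" in graph["LeBron James"])   # True: 2020 teammates
print("Stephen Curry" in graph["LeBron James"])   # False: never won together
```

Storing neighbors in a set means a player who appears on several rosters, like Danny Green here, simply accumulates teammates from each one.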

In this network analysis, importance is defined by degree centrality. Just as social media accounts may be considered popular or important based on their degree, we can make the same judgment here. Players with a higher number of edges have either won more championships or won championships with a larger variety of teammates. This matters because in professional team sports it can be hard to rank individual greatness by team success; that is simply the nature of a team sport. However, individuals can show more evidence of their talent by winning with various teammates, which suggests that they are in fact the common denominator for winning championships.

The three highest-ranking nodes by degree centrality are Danny Green (0.214), LeBron James (0.203), and Stephen Curry (0.173).

This image of code shows the top 10 players/nodes based on degree centrality
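For readers who want to reproduce a ranking like this, here is a minimal sketch of normalized degree centrality, where a player’s score is the number of distinct championship teammates divided by the number of other players in the network (n - 1). The article does not state which normalization was used, so this is an assumption, and the function names and toy rosters are hypothetical. If the scores above were normalized this way, 0.214 across 267 nodes would correspond to roughly 0.214 × 266 ≈ 57 distinct teammates for Danny Green.

```python
from itertools import combinations


def degree_centrality(rosters):
    """Map each player to (distinct teammates) / (n - 1)."""
    neighbors = {}
    for roster in rosters:
        for player in roster:
            neighbors.setdefault(player, set())
        for a, b in combinations(roster, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        # With fewer than two nodes there is nobody to connect to.
        return {player: 0.0 for player in neighbors}
    return {player: len(mates) / (n - 1) for player, mates in neighbors.items()}


def top_players(scores, k=10):
    """Return the k highest-scoring (player, score) pairs, ties broken by name."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical, truncated rosters for illustration only.
toy_rosters = [
    ["LeBron James", "Anthony Davis", "Danny Green"],
    ["Stephen Curry", "Klay Thompson", "Draymond Green"],
    ["Kawhi Leonard", "Kyle Lowry", "Danny Green"],
]

scores = degree_centrality(toy_rosters)
for player, score in top_players(scores, k=3):
    print(f"{player}: {score:.3f}")
```

Even in this toy network, the player who appears on two different rosters comes out on top, which mirrors why a well-traveled role player can outrank a superstar on this metric.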

Now let us look back at the question we posed earlier. Now that these nodes have been ranked by degree centrality, what does the ranking say about which of these players deserve Hall of Fame consideration from the NBA Honors Committee? Looking at the results, we see that Danny Green, LeBron James, and Stephen Curry are the most important NBA champions of the last two decades. It is important, however, to consider context when looking at this data. Take LeBron and Stephen, for example. These two players are universally recognized as some of the greatest players of all time, and LeBron in particular is deemed the G.O.A.T. (Greatest Of All Time) by many people. Both are going to be no-doubt Hall of Famers after they retire, and their ranking among the results supports the merit of degree centrality on this data. The more interesting player is Danny Green, whom many people may have never even heard of. Danny Green is a three-time NBA champion, winning with the Spurs in 2014, the Raptors in 2019, and the Lakers in 2020. His difference from LeBron and Steph is that Danny Green was not one of the top two players on any of the championship teams he was a part of. He is what is considered a role player in the NBA, not a superstar like LeBron and Stephen. Based on this context, it is very unlikely that Danny Green will be a future Hall of Famer. However, this result is a strong argument for him to be considered one day, because of how impressive it is to win championships on different teams with so many different teammates. He is one of just four players in NBA history, a group that also includes LeBron James, to have won championships with three or more different teams.

The cleaning process for this data was simple, considering that I mostly put the data together myself. Once I imported the necessary data into a data frame for each team, I narrowed each data frame down to just a list of player names from that roster.

Data Frame for Golden State Warriors 2018 Championship Team
List of lists with each inner list being the roster of a different NBA Championship Team

The only real cleaning required was removing some coaches’ names from the data. There was a bug where the data I imported would sometimes include the team’s coach, who is irrelevant to this analysis, so I excluded those names before building the network. Otherwise the data was accurate and usable.
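The cleaning steps described above might look something like the sketch below. It uses plain Python dictionaries as a stand-in for the rows of one team’s pandas data frame, and everything named here is hypothetical: the `NAME` field, the `KNOWN_COACHES` set, and the `roster_from_rows` helper are assumptions for illustration, not the actual column names or code from the analysis.

```python
# Hypothetical rows standing in for one imported team table
# (the 2018 Warriors, heavily truncated).
rows = [
    {"NAME": "Stephen Curry"},
    {"NAME": "Klay Thompson"},
    {"NAME": "Steve Kerr"},        # head coach, not a player
    {"NAME": " Draymond Green "},  # stray whitespace from the import
]

# Assumed to be maintained by hand for the championship teams in the data.
KNOWN_COACHES = {"Steve Kerr"}


def roster_from_rows(rows, coaches, name_field="NAME"):
    """Reduce a team table to a list of player names, dropping coaches."""
    roster = []
    for row in rows:
        name = row[name_field].strip()
        # Skip blanks, coaches, and any name already collected.
        if name and name not in coaches and name not in roster:
            roster.append(name)
    return roster


roster = roster_from_rows(rows, KNOWN_COACHES)
print(roster)  # ['Stephen Curry', 'Klay Thompson', 'Draymond Green']
```

Running this once per team and collecting the results gives the list of roster lists that the network is built from.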

Potential limitations of this analysis stem from the lack of context. People seeing this data without background knowledge of the NBA would likely assume that Danny Green is a more important player than LeBron James and Stephen Curry, which isn’t true. The data does not include individual stats, which makes it harder to draw accurate conclusions. It also leaves out the previous forty-plus years of NBA history. The results would certainly differ if more years were included, although that would also call for a larger result set than three players to make an accurate interpretation of the results.

Degree is just one of several centrality metrics that could be used to evaluate NBA greats from the last 20 years, and other metrics could surely offer insight as well. LeBron and Stephen, we will see you in Springfield. Danny Green, good luck.

Link to GitHub repo
