Are NBA Draft Class Analysts Really Experts… My Testing Proves Otherwise

Danny Leese
The Sports Scientist
7 min read · May 8, 2020

In recent weeks there has been a lot of discussion regarding the upcoming NBA draft class. Most media members covering the NBA have considered the 2020 NBA draft class to be weak.

Are draft analysts able to accurately predict whether an entire draft class is strong or weak? The short answer is probably not, but I dug into the numbers to answer this question, which proved to be a very challenging task.

My methodology was to collect data on the sentiment around each draft class before the draft took place. The sentiment of a draft class refers to whether the media predicts that the class will be strong or weak. I then compared the sentiment of each class to its actual performance over time.

I began by looking at targeted Google searches within the year prior to each draft in order to find the sentiment of each class. I used several variations of search terms like “2015 NBA draft.” Although there were several articles on the topic, most of them were mock drafts with only one line in the whole article alluding to the draft class being strong or weak. This resulted in noisy text data that would be hard to clean. My next approach was to look at Twitter posts in the months prior to each draft. Although there was more direct discussion of the draft class sentiment, there were not enough credible Tweets (that is, from verified accounts or with a minimum number of likes) on the draft class in the early 2010s.

The next approach I attempted was to look at the language in the analysis section of players on NBAdraft.net.

By looking at the language used for every player in a draft class and comparing it to other years, I could determine which class had better sentiment than others.

This process is called opinion mining or sentiment analysis. I used text mining tools to understand the emotional context of the text. To do this, I analyzed the sentiment of individual words using the NRC Word-Emotion Association Lexicon (NRC). The NRC is a list of words, each associated with an emotion (e.g. fear or trust) or with a positive or negative sentiment. After summing all the positive and negative sentiments in the text, I could assign a sentiment score to each draft class, or in other words, evaluate which draft class the authors preferred.
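To make the scoring idea concrete, here is a minimal Python sketch of lexicon-based scoring. It is not the R script used for this article. The tiny `TOY_LEXICON` is a hypothetical stand-in for the NRC lexicon, the sample scouting report is invented, and the function name `sentiment_score` is my own. It also assumes each word carries a single polarity, which is a simplification.

```python
import re
from collections import Counter

# Hypothetical stand-in for the NRC lexicon (the real one has thousands of
# entries). Each word maps to one polarity, which is a simplification.
TOY_LEXICON = {
    "excellent": "positive",
    "strong": "positive",
    "weak": "negative",
    "poor": "negative",
}

def sentiment_score(text, lexicon):
    """Return (# positive words) - (# negative words) found in the text."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    # Count the polarity of every token that appears in the lexicon.
    counts = Counter(lexicon[w] for w in words if w in lexicon)
    return counts["positive"] - counts["negative"]

# Invented scouting report, for illustration only.
report = "Excellent athlete with a strong frame and excellent vision, but a weak shooter."
score = sentiment_score(report, TOY_LEXICON)
print(score)  # 3 positive words minus 1 negative word -> 2
```

Words that are missing from the lexicon are simply ignored, so the score depends entirely on how well the lexicon matches the vocabulary of scouting reports.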

I chose to use nbadraft.net because they used the same format for each year and are widely considered an excellent source for NBA draft information. Of course, using more draft analysis websites would have been ideal, but for the sake of consistent language, it made the most sense to use only one website.

I looked at every player’s analysis from the 2010–2020 NBA drafts simply because that is how long nbadraft.net has been creating draft analysis. This left me with nearly 500,000 words to investigate. Here is a word cloud of the 300 most commonly used words.

For those interested, here is the R Script which collected all 660 names from the last 11 NBA drafts.

Naturally, there were some problems assessing NBA analysis using the NRC. Several words that are generally considered negative are considered positive when describing an NBA prospect. Below is a list of the 25 most common positive and negative words used in the mock draft analysis.

In real life, being called “explosive” and “aggressive” is insulting, but when it comes to being in the NBA, these are the traits players aspire to have. This is a slight drawback of the analysis, but overall it hardly affected the process because most of the negative words were accurately represented.

The scoring system is based on a sentiment score, calculated as the number of positive words minus the number of negative words. The chart below on the left displays the aggregate details of each pick, sorted by total sentiment. For example, first overall picks had the highest total sentiment. The chart below on the right shows a clear right-skewed distribution of sentiment across draft positions. As expected, the higher draft positions generally scored a higher sentiment.

I also considered that when a reporter writes about a draft class as being strong or weak, they are referring to the top picks. You rarely hear a draft analyst call a draft weak because of the players in the second round. Therefore, I adjusted the data to look only at the sentiment of the first ten picks. This proved quite successful because, as you can see below, the sentiment of the first ten picks is much more differentiated than that of the next 20.
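As a rough illustration of the top-ten adjustment, the Python sketch below sums per-player sentiment scores for picks 1 through 10 in each class and then ranks the classes. The records are made-up numbers, the function names are hypothetical, and treating the aggregate as a simple sum is an assumption.

```python
from collections import defaultdict

# Made-up records for illustration: (draft_year, pick, sentiment_score).
player_scores = [
    (2018, 1, 12), (2018, 2, 9), (2018, 11, 3),
    (2019, 1, 15), (2019, 5, 7), (2019, 25, 1),
]

def class_sentiment(records, max_pick=10):
    """Sum sentiment per draft year, keeping only picks 1..max_pick."""
    totals = defaultdict(int)
    for year, pick, score in records:
        if pick <= max_pick:
            totals[year] += score
    return dict(totals)

def rank_classes(totals):
    """Return draft years ordered from highest to lowest total sentiment."""
    return sorted(totals, key=totals.get, reverse=True)

totals = class_sentiment(player_scores)
ranking = rank_classes(totals)
print(totals)   # picks 11 and 25 are excluded from the sums
print(ranking)
```

Changing `max_pick` to 30 would reproduce the whole-first-round variant of the same aggregation.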

Using the aggregated sentiment of the first ten picks in each draft class, here is the resulting draft class sentiment ranking.

The 2017 draft class ranked as the most highly touted on the list. It featured Markelle Fultz, Lonzo Ball and Jayson Tatum, who were high-profile prospects entering the draft. This year’s 2020 class was the second least touted, which seems to align with the sentiment of the media.

After determining the sentiment of each class, it was time to assess the performance of each draft class. There are several ways to determine the on-court performance of a draft class. I chose to use the average minutes per game of the top ten players as the measure of performance. To check the accuracy of minutes per game as a performance indicator, below are the top ten players by minutes per game over the past five seasons.

Although it is not a perfect indicator, it works well for evaluating performance, and it is the best variable for analyzing the past ten drafts. For one, minutes can capture the potential of the younger players. Of course, there will be outliers, but if a player is playing high minutes in his second or third year, there is reason to believe he is going to improve even if his career statistics aren’t up to par yet. For example, Trae Young and Jaren Jackson Jr. from the 2018 draft have Win Shares per 48 minutes roughly equal to the league average. These two players are projected to have well above average careers, which is reflected in their averages of 33 and 27 minutes per game, respectively. On the other hand, rookies with high minutes who do not exhibit potential will see their minutes fall off in years two and three. For example, Kevin Knox from the 2018 draft class saw his minutes fall from 29 to 18 per game in his second year. In addition, if a veteran player has averaged high minutes throughout his career, he has likely had a successful career, and his statistics will tell a similar story.

If I were to use standard performance measurements (e.g. PER or Win Shares), it would heavily favour the older players. For example, using Win Shares per 48 minutes, the 2011 draft class would rank first. The three highest Win Shares per 48 minutes among the top ten picks of this draft belong to Kyrie Irving, Jonas Valanciunas, and Enes Kanter. It would be hard to argue that this draft is the best performing draft class in the last ten years.

Finally, the moment you have been waiting for: these are the results of the draft class sentiment vs. the draft class performance from 2010 to 2019.

As you can see, the sentiment does not follow a similar pattern to the performance. The correlation between sentiment and performance was only 0.27. This led me to believe that the opinions regarding a draft class are not a strong indication of performance. Looking at the sentiment of the entire first round of each class gave similar results: a correlation of only 0.03, meaning there is essentially no linear relationship between sentiment and performance.
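For readers who want to run the same kind of check on their own numbers, here is a minimal Python sketch of a correlation calculation. It assumes the figures quoted above are Pearson correlations, which the analysis does not state explicitly. The sample values are invented and are not the data behind the charts.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Invented values: one sentiment total and one average minutes-per-game
# figure per draft class.
sentiment = [21, 22, 18, 30, 25]
minutes = [24.0, 19.5, 22.1, 23.3, 20.8]
print(round(pearson_r(sentiment, minutes), 2))
```

With only ten draft classes, a coefficient computed this way is sensitive to one or two unusual years, which is worth keeping in mind when reading a value like 0.27.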

Looking forward:

Analyzing the performance of an entire draft class proved to be very difficult, especially when the players are still playing. In another five years, this type of analysis will be easier: there will be a larger data set to use, the draft classes’ performance will be clearer, and the predictive power of the draft sentiment can be measured more accurately.

So, does this mean the sentiment surrounding a draft class is useless? I can confidently say it is likely not a strong predictor of how a draft class will perform. And if teams are persuaded to trade into or out of a draft because of the negative sentiment surrounding it, this is likely a bad strategy, but that’s a topic for another time.

Danny Leese
The Sports Scientist

Director of Basketball Analytics — Western University Men’s Basketball Team