My Terrible, No-Good, Awful Word2Vec Draft Model (No, Seriously)

You may have seen previously that I started playing around with word vectors trained on a DraftExpress corpus consisting of pre-draft player profiles. (If you have no idea what I’m talking about one sentence into this article, you should definitely go back and read the first one.) I weak-promised a Part 2, and here it is — not because the draft model I came up with is good (it’s not), but because it’s a) interesting and b) potentially redeemable and/or useful anyway, at least in terms of building future models.

Ok, so the premise here is that the word vectors trained on DX profiles — and only those — are going to be the features for our model. I am not adding any other features to the current model, because I wanted to see if the word vectors alone have any predictive value. The target variable I chose is very basic: did a player become an All-Star, make an All-NBA team, or otherwise become a clearly above-average NBA starter? Of the 700 or so players in the data set used for training, roughly 10% fit these rather loose criteria. Because the target variable is binary, we are building a classifier. Using logistic regression, however, we end up with a probability score, which we can use to rank players. A player looks to the model something like this (all_star=1 is a player like Stephen Curry):

Row(player='Kim English', model=DenseVector([0.0128, -0.0, -0.0444, -0.0138, -0.015, -0.0774, -0.0802, 0.0367, 0.076, -0.0195, 0.0381, -0.0022, 0.0641, -0.0265, -0.1075, -0.0722, -0.0444, -0.1443, 0.0047, -0.0259, -0.1148, 0.032, 0.0295, 0.0438, -0.0131, 0.1138, -0.0498, 0.0563, 0.0256, -0.1178, -0.0226, -0.0307, -0.0746, -0.1515, -0.0092, 0.0388, -0.0125, -0.1422, 0.1292, 0.0476, 0.0705, 0.0738, 0.0027, 0.0132, -0.0531, -0.0351, 0.0605, -0.0113, 0.0866, -0.086, -0.0206, 0.0489, -0.0431, -0.1, -0.0441, -0.0339, 0.0702, 0.0413, -0.0841, 0.0391, 0.0152, -0.0418, -0.0194, -0.0178, -0.0471, 0.0203, -0.0217, -0.0769, 0.0263, -0.0366, -0.0162, 0.0984, -0.0413, -0.1392, -0.0174, 0.0072, -0.1132, -0.0468, 0.0194, 0.0594, -0.0459, -0.0885, -0.0024, 0.0076, 0.0029, 0.0314, 0.0454, -0.0879, -0.0146, -0.0929, -0.0175, -0.0685, 0.0676, -0.0616, -0.0224, -0.0618, -0.0088, 0.0037, 0.1133, 0.021]), draft_year=2012, all_star=0.0)
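
For the curious, here is roughly what that setup looks like in code. This is a minimal sketch using scikit-learn with synthetic stand-in data — the real pipeline ran on Spark, and the vectors, labels, and player counts below are illustrative, not the actual DraftExpress data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-ins for the real data: ~700 players, a 100-dim Word2Vec profile
# vector per player, and a ~10% positive rate on the all_star label.
X = rng.normal(scale=0.05, size=(700, 100))   # one profile vector per player
y = (rng.random(700) < 0.10).astype(int)      # 1 = became a star/quality starter

clf = LogisticRegression(max_iter=1000).fit(X, y)

# The model is a classifier, but predict_proba gives a score in [0, 1]
# that we can sort on to produce a draft-board-style ranking.
scores = clf.predict_proba(X)[:, 1]
ranking = np.argsort(-scores)   # best prospects (per the model) first
```

The key point is that last step: even though the label is binary, the fitted probabilities give us a continuous score to rank the whole class with.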

For the actual training of the model I used all data going back to 2005, except that I held out players from the 2016 and 2017 Drafts. The model is probably still biased, because some players from the 2014 or 2015 classes may yet go on to become stars or above-average starters (though probably not enough of them to make a material difference here).
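
That time-based holdout is just a mask on the draft year. A trivial sketch (the specific years below are made up for illustration):

```python
import numpy as np

# Hypothetical draft years for six players; the real split trains on
# 2005-2015 and holds out the 2016 and 2017 classes entirely, so the
# model never sees the players it is later asked to rank.
draft_year = np.array([2005, 2010, 2015, 2016, 2017, 2012])

train_mask = draft_year <= 2015   # 2005-2015: training data
test_mask = ~train_mask           # 2016-2017: held-out draft classes
```

Splitting by year (rather than randomly) matters here: a random split would leak profile language from a draft class into both train and test sets.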

The model was trained using 10-fold cross-validation, and the training AUC of the best model (i.e. the best set of hyperparameters) was 0.79. The average AUC on the hold-out folds during cross-validation was 0.70. To put that into perspective, a coin flip is 0.5 and a perfect classifier would have a score of 1.0. In many real-world cases a classifier with an AUC of 0.70 could be perfectly useful. But for looking at basketball players, it’s…well, did you read the title of this post?
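
The cross-validation setup translates to something like the following — again a sketch with synthetic data, since the real model was trained in Spark. The grid of regularization strengths is a made-up stand-in for whatever hyperparameters were actually searched:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = rng.normal(scale=0.05, size=(500, 100))
y = (rng.random(500) < 0.10).astype(int)

# Grid-search the regularization strength with 10-fold CV, scoring each
# fold by ROC AUC -- the same evaluation described above.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0]},
    scoring="roc_auc",
    cv=10,
)
search.fit(X, y)

# Mean held-out AUC of the best hyperparameter setting across the folds.
best_cv_auc = search.best_score_
```

On this synthetic noise the held-out AUC hovers around the coin-flip 0.5; the real model's 0.70 is the analogous number, computed on actual profile vectors.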

Let’s get to it then. Here are the rankings for the 2016 Draft Class:

2016 Word2Vec Draft Ratings

(Sorry about the formatting; I have no idea how to embed tables in Medium, so I’m using this hacky gist.)

Your immediate reaction to this may be something like:


Are you over it? I said the model isn’t good — I explicitly said that (read the title again!). It may not be as bad as a coin flip, though. I mean, gun to your head, would you take the top 1/3 or the bottom 1/3 of this list? In the top 10, you have Hernangomez, Chriss, and Murray, who all look like they will probably be productive starters for a long time. You might even argue that Labissiere is in that category. In the bottom 10, Dejounte Murray is about the only guy who looks like an NBA player, although Brogdon falls just outside the bottom 10.

With that, let’s look at this year’s crop of players. (I’ve taken the DraftExpress Top 100 prospects list, which includes a bunch of players who either won’t get drafted at all or may already have withdrawn.)

2017 Word2Vec Draft Ratings

I know, I know.


These ratings/rankings are probably mostly useless, so I’m not going to proclaim any hidden “sleepers” (although I’m sure that won’t stop some of you from doing it!).

There are a few takeaways, though. First, we have to consider the various sources of error at work:

1. DraftExpress, as much as they have written over the years, is a fairly small corpus for NLP features such as Word2Vec. Although the word vectors themselves seem pretty legit (judging by the word similarities in the previous article), each player has at most a few thousand words written about them. Some players have only one article and a couple hundred words. Perhaps, in the future, I will filter those players out (out of the draft model, not out of the word vector training).
2. DraftExpress player opinions/projections are not perfect. Duh. DX may simply be much more excited about certain players than others, and that enthusiasm may swamp any real signal we’re getting.
3. Ideally, what we need for building useful NLP models for the Draft is a lot of data and a lot of unbiased, objective (as much as is possible) analysis of player traits.
4. Maybe the most important point I can make: textual data doesn’t have to stand alone. It is entirely likely that if we combine numeric features (the kind that already exist and that everyone is already using) with these kinds of draft profile features, we could bump up the predictive power of our models. It wouldn’t have to improve much to be worth doing!
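
That last point can be as simple as concatenating the two feature blocks before fitting. A hedged sketch — the numeric features here (measurements, per-game stats, age, and so on) are made-up stand-ins, as are their dimensions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)

# Hypothetical stand-ins: 12 traditional numeric features per player
# alongside the 100-dim profile word vectors.
numeric = rng.normal(size=(300, 12))
word_vecs = rng.normal(scale=0.05, size=(300, 100))
y = (rng.random(300) < 0.10).astype(int)

# Simplest possible combination: stack the feature blocks side by side
# and let the regularized model sort out the relative weighting.
X = np.hstack([numeric, word_vecs])

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)
```

Scaling matters when mixing feature families like this — raw counting stats and small-magnitude word-vector components live on very different scales, so the standardization step keeps the regularizer from implicitly favoring one block.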

Ok, now go talk with your friends about how Evan’s model has Cam Oliver #8 and how much of a steal he is going to be for some team. You have my blessing. Enjoy Draft Night!