So… can NCAA free throws really predict NBA 3-point shooting ability?

ricky
8 min read · Jan 20, 2018

In this study, I take a deep dive into the numbers, discover whether NCAA free throws can actually predict NBA 3-point shooting and develop my own statistical model to accurately predict 3-point percentages for a number of prospects.

NCAA free throw percentage is often used in NBA prospect evaluation as an indicator of a player’s ability (or lack thereof) to translate his NCAA three-point shooting to the next level.

There are reasons to believe it is a useful indicator. Very few good three-point shooters are poor free throw shooters. Free throw percentage also gives you an idea of a player’s touch and the consistency of his shooting mechanics. But is there actually any reasonable amount of correlation between the two?

That’s what this article will explore. Three-point attempt rate (3PAr), three-point percentage (3P%) and total three-point attempts (3PA) are also considered indicators, so we will take a look at those numbers as well.

But let’s start with free throws, then make our way into other stats and stat combinations.

Jayson Tatum jumps off the page as a recent prospect whose free throw percentage (FT%) was a better indicator of his NBA three-point shooting than his NCAA 3P% was. He is shooting an absurd 46% from three in the NBA after making just 34% at Duke. He made 85% of his free throws at Duke, however.

How common is this? To find out, I dove into a historical base of 400 players. The sample wasn’t completely random. About half were pulled from a Basketball Reference Play Index list of players who attempted at least 750 threes in the NBA and also played in the NCAA; the other half consists of (almost) every player drafted since 2011 who has played at least 100 NBA games. I designed it to be balanced, with some older players but mostly recent ones, because threes are shot differently in a team context than they were 30 years ago. I also wanted the majority of the sample to have a 3P% truly reflective of their NBA shooting ability, hence the 750 attempts.

The tricky part of building the sample was accounting for the possibility of a player being a non-shooter. For example, if a player would make only 10% of his threes over 500 attempts, chances are that player will have very few NBA attempts. Or he could be like Hassan Whiteside for example, who is 2-for-2 from three in his career for a perfect 100% 3P%. That screws with the data a little bit because suddenly, a player with so-so indicators is the best three-point shooter of the entire sample.

I decided to just stick with it anyway, because most of the non-shooters (like the Plumlees, Steven Adams, Tristan Thompson and Jahlil Okafor) had terrible percentages despite their few attempts. Overall, I don’t think having players with few career attempts skewed the results too much. Excluding them would probably have skewed the results even more. There simply needed to be players in the sample with low 3PAr, low 3PA totals, low 3P% and low FT% who also had small 3PA samples in the NBA, because otherwise the sample would have been composed entirely of somewhat-good shooters.

After all, isn’t the point of this exercise (and looking at peripheral indicators in general) to predict with smaller sample sizes? If a player shot 500 college threes, you probably have a good idea who he is as a shooter. The interesting conversations surround the player who made 40 of 100 and has a great FT%, or 60 of 120 and has a bad FT%.

I collected all 400 players’ shooting stats partially by importing the Play Index queries as an Excel workbook (shouts out to Basketball Reference for including that feature) and partially through arduous manual data entry. For each player I recorded 3P%, 3PA, FT% and 3PAr at both the NBA and NCAA levels.

Google Sheets is handy because, like Excel, it lets you manipulate data in tons of fun and useful ways with an easy-to-understand language. I standardized the variables by converting all those stats into percentiles using “=PERCENTRANK.” That put them on the same 0–1 scale with roughly the same levels of variance. FT% generally runs from about 40% to over 90%, a range of 50 percentage points, while 3P% is generally either 0 or between 25% and 45%; 3PA can be anywhere from 10 to, like, 100; and 3PAr has its own range too. Putting them on the same scale makes comparing players a little less confounding.
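For anyone who would rather do this step outside a spreadsheet, here is a minimal Python sketch of the percentile conversion. It assumes the simplest reading of “=PERCENTRANK”: for a value already in the column, the share of the other values that are strictly smaller. The function name `percent_rank` and the four FT% values are made up for illustration; they are not from the study’s sample, and the spreadsheet function may round differently.

```python
def percent_rank(values):
    """Rank each value on a 0-1 scale: the share of the other values that are strictly smaller."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to rank")
    return [sum(other < v for other in values) / (n - 1) for v in values]


# Made-up FT% values for four hypothetical players (not from the study's sample).
ft_pct = [0.70, 0.85, 0.60, 0.90]
ft_percentile = percent_rank(ft_pct)
print(ft_percentile)
```

The worst free throw shooter in the column lands at 0 and the best at 1, which is what puts every stat on the same scale.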

So, after standardizing the variables, the next step was to find the correlation (r) between the various NCAA percentiles and NBA 3P% percentile using “=CORREL.”

For FT%, I found r = 0.511. Any r > 0.5 is worth considering as having some modicum of predictive power, but 0.511 is not exactly foolproof. We can probably do better.
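The correlation step can be sketched in Python as well. This is the standard Pearson correlation coefficient, which is what “=CORREL” computes; the function name `pearson_r` and the sample percentiles are placeholders I made up, not the study’s data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up percentiles for five hypothetical players (not from the study's sample).
ncaa_ft_percentile = [0.00, 0.25, 0.50, 0.75, 1.00]
nba_3p_percentile = [0.25, 0.00, 0.75, 0.50, 1.00]
print(round(pearson_r(ncaa_ft_percentile, nba_3p_percentile), 3))
```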

I ran “=CORREL” for every individual variable and the medians and averages of every combination of the four.
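Here is a rough Python sketch of that search, assuming each stat is already a list of percentiles in the same player order. The stat keys (`ft_pct`, `tp_pct`, `tpa`, `tpar`), the helper `combo_correlations` and the toy numbers are all my own placeholders rather than the spreadsheet’s actual layout or data.

```python
import math
from itertools import combinations
from statistics import mean, median


def pearson_r(xs, ys):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(spread)


def combo_correlations(ncaa, nba_target):
    """Correlate the mean and the median of every stat combination with the NBA target."""
    results = {}
    names = sorted(ncaa)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # One row per player, holding that player's percentiles for this combo.
            rows = list(zip(*(ncaa[name] for name in combo)))
            for label, fn in (("mean", mean), ("median", median)):
                score = [fn(row) for row in rows]
                results[(combo, label)] = pearson_r(score, nba_target)
    return results


# Made-up percentiles for four hypothetical players (not from the study's sample).
ncaa = {
    "ft_pct": [0.0, 1 / 3, 2 / 3, 1.0],
    "tp_pct": [1 / 3, 0.0, 1.0, 2 / 3],
    "tpa": [0.0, 2 / 3, 1 / 3, 1.0],
    "tpar": [2 / 3, 1 / 3, 0.0, 1.0],
}
nba_3p_percentile = [0.0, 1 / 3, 2 / 3, 1.0]

results = combo_correlations(ncaa, nba_3p_percentile)
best = max(results, key=results.get)
print(best, round(results[best], 3))
```

With four stats there are 15 combinations, so the search produces 30 correlations once you count both the mean and the median of each.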

Of the individual variables, FT% had the second strongest correlation with NBA 3P%. Unsurprisingly (and for a somewhat unceremonious conclusion), NCAA 3P% had the strongest. But the different combinations generally were much stronger than any of the variables individually.

The strongest correlation overall was the average of 3P%, 3PA and FT%, where r=0.704.

Again, 0.704 isn’t incredibly strong, but it is strong enough that it might have some actual predictive power. So, I put all the players’ NBA 3P% in a column next to a column with their player score (their average percentile of 3P%, 3PA and FT%) and calculated the slope and y-intercept in the interest of turning these findings into an actual NBA 3P% predictor tool.
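The slope and y-intercept here are just the ordinary least-squares line through the (player score, NBA 3P%) points. A minimal Python sketch follows; `fit_line` and `predict` are hypothetical names and the three data points are made up, so the fitted numbers are not the ones from my sheet.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(player_score, slope, intercept):
    """Predicted NBA 3P% for a given player score (average percentile)."""
    return slope * player_score + intercept


# Made-up player scores and NBA 3P% values for three hypothetical players.
scores = [0.0, 0.5, 1.0]
nba_3p = [0.30, 0.35, 0.40]
slope, intercept = fit_line(scores, nba_3p)
print(round(predict(0.8, slope, intercept), 3))
```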

I toyed around with that for a while, refitting it for every individual variable and combination, and finding the median error between each player’s predicted 3P% and actual NBA 3P% in order to test the tool’s accuracy.
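As a sketch of that accuracy check, the snippet below takes the median of the absolute gaps between predicted and actual 3P%. Treating “error” as an absolute difference is my assumption here, and both the function name and the numbers are made up for illustration.

```python
from statistics import median


def median_abs_error(predicted, actual):
    """Median of the absolute gaps between predicted and actual values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of the same length")
    return median(abs(p - a) for p, a in zip(predicted, actual))


# Made-up predictions and outcomes for three hypothetical players.
predicted_3p = [0.35, 0.30, 0.40]
actual_3p = [0.33, 0.36, 0.40]
print(round(median_abs_error(predicted_3p, actual_3p), 3))
```

The median is a reasonable choice for this job because one wild miss (a 2-for-2 shooter, say) does not drag the whole score the way it would with an average.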

The results here were very interesting.

The lowest median error of predicted NBA 3P% vs. actual NBA 3P% was for NCAA 3P% at 2.575. The player score (the combination of variables with the strongest correlation with NBA 3P% percentile) was fourth-lowest, behind FT%, 3PA and the average of FT% and 3PA. Basically, although the player score had a stronger correlation, NCAA 3P% was predicting NBA 3P% more accurately in the linear equation.

My gut instinct was that NCAA 3P% was benefiting from predicting good shooters with large samples at both levels extremely accurately while struggling with poor shooters with smaller samples. Since the latter group made up a smaller part of the player base, I hypothesized that it was cheating a little by using its accuracy on large-sample players to cover up its poor results with smaller-sample players. So I sorted the player base by total NCAA 3PA and tested the formulas again.

Indeed, NCAA 3P% predicted the top 50 players in total NCAA 3PA extremely accurately. It was almost dead-on. But the player score predicted the bottom 100 with far greater precision.

So, since the player base consisted mostly of shooters with at least 750 career attempts, 3P%’s median error was lower. But if you want to predict NBA 3P% for a player with, say, 150 or fewer NCAA attempts, the player score is a better bet.
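The sort-and-retest step can be sketched like this in Python. The field names (`ncaa_3pa`, `nba_3p`, `pred_from_3p`, `pred_from_score`) and the four players are invented for the example; the real comparison used the top 50 and bottom 100 of the 400-player base.

```python
from statistics import median


def errors_by_volume(players, top_n, bottom_n):
    """Median absolute error of each predictor for the highest- and lowest-volume NCAA shooters."""
    if top_n <= 0 or bottom_n <= 0:
        raise ValueError("top_n and bottom_n must be positive")
    ranked = sorted(players, key=lambda p: p["ncaa_3pa"], reverse=True)
    groups = {"top": ranked[:top_n], "bottom": ranked[-bottom_n:]}
    report = {}
    for label, group in groups.items():
        report[label] = {
            model: median(abs(p[model] - p["nba_3p"]) for p in group)
            for model in ("pred_from_3p", "pred_from_score")
        }
    return report


# Four made-up players (not from the study's sample).
players = [
    {"ncaa_3pa": 600, "nba_3p": 0.40, "pred_from_3p": 0.40, "pred_from_score": 0.38},
    {"ncaa_3pa": 450, "nba_3p": 0.37, "pred_from_3p": 0.36, "pred_from_score": 0.35},
    {"ncaa_3pa": 90, "nba_3p": 0.33, "pred_from_3p": 0.27, "pred_from_score": 0.32},
    {"ncaa_3pa": 40, "nba_3p": 0.30, "pred_from_3p": 0.38, "pred_from_score": 0.31},
]
report = errors_by_volume(players, top_n=2, bottom_n=2)
print(report)
```

The toy numbers are rigged to show the same pattern I found: the 3P%-only predictor wins on the high-volume group and loses on the low-volume one.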

When you’re evaluating talent, the goal is essentially to make the best possible guess based on relatively limited information, especially in the one-and-done era, where a player like Jayson Tatum does not create a significant statistical footprint in the NCAA but is clearly highly talented and deserves the attention of scouts and evaluators. Making an informed evaluation from small stat samples demands more information than just 3P% or FT%. Things like team context, shot context and shooting mechanics matter a lot. Even hand size and arm length have been floated as possible indicators of shooting ability.

Apparently the average of a player’s 3PA, 3P% and FT% percentiles can also be useful.

So now that we’ve found out how to predict NBA 3P% with some degree of accuracy, let’s put the predictor to the test!

The historical base did not include players from the 2017 draft. So most of 2017’s draft lottery will be our test case (we’re excluding Markelle Fultz and Bam Adebayo because they have combined for 4 NBA 3PA), albeit with the caveat that their careers are young and the samples are small.

Lonzo Ball:

  • College: .412
  • NBA: .303
  • Predicted: .332

Jayson Tatum:

  • College: .345
  • NBA: .457
  • Predicted: .339

Josh Jackson:

  • College: .378
  • NBA: .262
  • Predicted: .299

De’Aaron Fox:

  • College: .246
  • NBA: .295
  • Predicted: .285

Jonathan Isaac:

  • College: .348
  • NBA: .278 (has been injured for most of the season)
  • Predicted: .321

Lauri Markkanen:

  • College: .423
  • NBA: .373
  • Predicted: .378

Dennis Smith Jr.:

  • College: .359
  • NBA: .327
  • Predicted: .315

Frank Ntilikina:

  • Europe: .38
  • NBA: .309
  • Predicted: .299

Zach Collins:

  • College: .476
  • NBA: .324
  • Predicted: .336

Malik Monk:

  • College: .397
  • NBA: .339
  • Predicted: .378

Luke Kennard:

  • College: .383
  • NBA: .429
  • Predicted: .389

Donovan Mitchell:

  • College: .329
  • NBA: .348
  • Predicted: .341

So clearly, our model isn’t perfect, but it is generally a much stronger predictor of NBA three-point shooting than NCAA 3P% or FT% alone.
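To put a number on that, here is a quick Python check using only the twelve sets of figures listed above (Ntilikina’s pre-NBA number is from Europe rather than college). The variable names are mine, and the comparison covers only college 3P% versus the model, since FT% isn’t listed for these players.

```python
from statistics import median

# (college or Europe 3P%, NBA 3P% so far, model prediction), copied from the list above.
lottery = {
    "Lonzo Ball": (0.412, 0.303, 0.332),
    "Jayson Tatum": (0.345, 0.457, 0.339),
    "Josh Jackson": (0.378, 0.262, 0.299),
    "De'Aaron Fox": (0.246, 0.295, 0.285),
    "Jonathan Isaac": (0.348, 0.278, 0.321),
    "Lauri Markkanen": (0.423, 0.373, 0.378),
    "Dennis Smith Jr.": (0.359, 0.327, 0.315),
    "Frank Ntilikina": (0.38, 0.309, 0.299),  # Europe, not college
    "Zach Collins": (0.476, 0.324, 0.336),
    "Malik Monk": (0.397, 0.339, 0.378),
    "Luke Kennard": (0.383, 0.429, 0.389),
    "Donovan Mitchell": (0.329, 0.348, 0.341),
}

# Absolute miss if you simply assumed the pre-NBA 3P% would carry over.
college_err = [abs(college - nba) for college, nba, _ in lottery.values()]
# Absolute miss of the model's prediction.
model_err = [abs(pred - nba) for _, nba, pred in lottery.values()]
# Number of players for whom the model was the closer guess.
model_closer = sum(m < c for m, c in zip(model_err, college_err))

print(round(median(college_err), 4), round(median(model_err), 4), model_closer)
```

By my arithmetic, the median miss is about .064 for raw college 3P% and about .02 for the model, and the model is the closer guess for 11 of the 12 players, with Tatum the lone exception. The small-sample caveat still applies.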

This Google Sheets document has an interactive tool that you can use either to enter your own stats or pick a 2018 prospect from a dropdown menu. Most of this year’s projected high-lottery talent has automatically-updating stats in this document, but others might be a little out of date, so feel free to just check their Sports Reference pages for yourself.

Let me know if you have any thoughts or feedback on the doc, this article or literally anything else at ricky!

ricky

contributor for The Stepien, Nylon Calculus & Orlando Magic Daily.