Minor League Stat Stickiness — Pitchers

Jon Anderson
The Sports Scientist
4 min read · Jun 15, 2020

In a previous post, we went through an extensive analysis of the relationship between Minor League and Major League hitter stats. For the full details and results of that study, check that post out.

I will quickly rehash what we are doing here for anybody who did not read the first hitter post.

The goal is to find which Minor League statistics are most predictive of Major League statistics at the individual player level. If we can find statistical categories in which players usually stay relatively consistent from their Minor League careers to their Major League careers, we might get better at evaluating players in the future. This is especially useful for fantasy baseball purposes when you are trying to identify rookies who can contribute to your fantasy team.

Here was my process in attempting this challenge:

  • Retrieve all Minor League pitching stats for the last 10 seasons (2010–2019) for levels A+, AA, and AAA
  • Retrieve all Major League pitching stats for the last five seasons (2015–2019)
  • Loop through every pitcher who had 50 or more innings pitched at the Major League level over the 2015–2019 seasons and compare his Major League statistics with his Minor League statistics
  • Find the overall correlation coefficients for each category to give evidence on which categories are the most predictive
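
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: the row layout `(player_id, innings, strikeouts)`, the function names, and the toy numbers are all made up for the example, and it assumes each pitcher's seasons are summed into career totals before rates are computed. It also assumes innings are stored as true decimals (50.33 rather than the box-score style 50.1).

```python
from collections import defaultdict
from math import sqrt


def career_totals(rows):
    """Sum innings and strikeouts per player across season lines.

    rows: iterable of (player_id, innings, strikeouts) tuples (hypothetical layout).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for player_id, innings, strikeouts in rows:
        totals[player_id][0] += innings
        totals[player_id][1] += strikeouts
    return dict(totals)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def k9_stickiness(milb_rows, mlb_rows, min_mlb_ip=50):
    """Pair each qualifying pitcher's Minor and Major League K/9, then correlate."""
    milb = career_totals(milb_rows)
    mlb = career_totals(mlb_rows)
    # Keep only pitchers with Minor League data and enough Major League innings.
    players = sorted(p for p in mlb if p in milb and mlb[p][0] >= min_mlb_ip)
    milb_k9 = [9 * milb[p][1] / milb[p][0] for p in players]
    mlb_k9 = [9 * mlb[p][1] / mlb[p][0] for p in players]
    return players, pearson(milb_k9, mlb_k9)


# Made-up toy rows, only to show the shape of the inputs.
milb_rows = [("a", 60.0, 60), ("a", 40.0, 40), ("b", 100.0, 80),
             ("c", 100.0, 120), ("d", 100.0, 90)]
mlb_rows = [("a", 60.0, 50), ("b", 90.0, 60), ("c", 120.0, 140), ("d", 20.0, 25)]

players, r = k9_stickiness(milb_rows, mlb_rows)
print(players, round(r, 2))
```

Pitcher "d" is dropped here because he falls under the 50-inning Major League cutoff. The same pairing works for any other rate stat by swapping the strikeout column for walks, earned runs, and so on.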

Results

There is really not much else to do other than give you the results, so this will be a short post overall. What we found here is much less predictive power coming from the Minor League stats than we saw for hitters. I checked six categories: K/9, BB/9, K/BB, ERA, WHIP, and HR/9.
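
For readers less familiar with these rate stats, here is a small sketch of their standard definitions. The function name and argument names are my own for illustration, and the sample numbers are invented; innings are again assumed to be true decimals.

```python
def rate_stats(ip, so, bb, h, er, hr):
    """Standard pitching rate stats from counting stats.

    ip: innings pitched, so: strikeouts, bb: walks,
    h: hits allowed, er: earned runs, hr: home runs allowed.
    """
    return {
        "K/9": 9 * so / ip,      # strikeouts per nine innings
        "BB/9": 9 * bb / ip,     # walks per nine innings
        "K/BB": so / bb,         # strikeout-to-walk ratio
        "ERA": 9 * er / ip,      # earned runs per nine innings
        "WHIP": (bb + h) / ip,   # walks plus hits per inning pitched
        "HR/9": 9 * hr / ip,     # home runs allowed per nine innings
    }


# Invented line: 90 IP, 90 K, 30 BB, 60 H, 30 ER, 10 HR
print(rate_stats(ip=90, so=90, bb=30, h=60, er=30, hr=10))
```

Note that walks appear in BB/9, K/BB, and WHIP, so those three categories are not independent of one another.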

Predictive :: Strikeouts and Walks

There is no overwhelmingly strong correlation in any of these categories, but strikeouts and walks come the closest. Strikeouts per nine came out as the winner, with a correlation of .59, which is moderate strength. You can definitely see where a straight line would go through those points. Not many pitchers saw their strikeout rates change drastically when making the move from the Minor Leagues to the Major Leagues, as evidenced by the white space in the top left and bottom right corners of the plot.

Walks per nine innings was similarly related, but with a weaker .47 correlation coefficient.

Taking the direct ratio between strikeouts and walks results in a correlation of .36, and we see that it is much harder overall to post a high K/BB ratio in the Major Leagues than in the Minors. This is not a big surprise.

Non-Predictive :: ERA, WHIP, Home Run Rate

You probably figured that ERA wouldn’t have a super strong correlation between the Majors and the Minors, but I was definitely surprised by how weak the relationship is. ERA came in with a puny correlation coefficient of just .18, suggesting almost no relationship whatsoever. Look at this plot:

This is just a random smattering of points. Minor League ERA has essentially no predictive power.

WHIP is firmly non-predictive as well, but not to the extent that ERA is. It came in with a .26 correlation coefficient. This makes sense because we are figuring walk rate into the calculation, and we already knew that was somewhat predictive. Here is the plot.

Still largely a random dump of points, but there is a little more white space in the corners as compared to ERA.

Home runs allowed per nine came in with a similar score to WHIP, at .24. This was expected because so much of your home run rate depends on your league and home ballpark environment. Some ballparks are just way more conducive to the long ball than others, which makes contextless prediction a pretty bleak affair.

Conclusion

Pitching is much the same as hitting, with the only real predictive statistics being the things that the pitcher himself has full control over. Any stat that is influenced by environmental factors out of the pitcher’s control will really randomize our plots.

With hitting, we saw strikeout and walk correlations over 0.7, but with pitchers we did not even reach the 0.6 mark. This is very notable and should be considered when evaluating pitching prospects. While their strikeout and walk numbers are the best ones to look at if you have to pick, the relationships are really just too weak to inspire any confidence in prediction.
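
One way to put these coefficients in perspective is to square them. In a simple straight-line fit, r squared is the share of the variation in the Major League stat that the Minor League stat accounts for. The sketch below just squares the correlations reported in this post; the 0.70 entry stands in for the "over 0.7" hitter result as a lower bound, and the variable names are mine.

```python
# Correlations reported in this post (pitchers), plus 0.70 as a floor for the hitter results.
reported_r = {
    "K/9": 0.59,
    "BB/9": 0.47,
    "K/BB": 0.36,
    "WHIP": 0.26,
    "HR/9": 0.24,
    "ERA": 0.18,
    "Hitter K/BB rates (floor)": 0.70,
}


def variance_explained(r):
    """Share of variance explained by a one-variable linear fit with correlation r."""
    return r ** 2


for stat, r in reported_r.items():
    print(f"{stat:26s} r = {r:.2f}   r^2 = {variance_explained(r):.2f}")
```

Seen this way, the best pitching category (K/9) accounts for roughly 35% of the variation in the Major League numbers, against at least 49% for the hitter strikeout and walk rates, and ERA accounts for only about 3%.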

Nobody said it better than the great Niels Bohr: “It is very hard to predict, especially the future.”

Click here to see my Google Colab Python notebook that has all the code I used.
