Toronto Blue Jays: Player Projections

Elijah Cavan
Published in Top Level Sports · 3 min read · May 30, 2021

One of my favourite things to do when I was growing up watching sports was to check out a player’s projected statistics early in the season. I’d run to the TSN app every time Ovie (Alex Ovechkin) scored a goal to see what he was ‘on pace’ to finish the season with. As I got older and started paying more attention to baseball, it was the ZiPS projections on FanGraphs that got my attention. ZiPS uses historical data (for the player being projected and for other players who are statistically similar) to make projections before the season even starts! The analysis that follows is more in line with the ‘on pace’ projection, and we’ll be doing it for several Toronto Blue Jays players. As of this writing, the Jays have played exactly 50 games, so let’s see what statistics we expect the players to end up with at the end of the year.

One of the biggest parts of this project was getting the actual data. I used the MLB-StatsAPI Python package (https://github.com/toddrob99/MLB-StatsAPI/wiki), developed by Todd Roberts, to get the data. For example, if you wanted the OPS (on-base plus slugging) leaders on the Blue Jays for the 2021 season, you would make a request to the team_leaders endpoint:

import statsapi
print(statsapi.team_leaders(141, 'ops', limit=6, season=2021))

Here ‘141’ is the team ID for the Jays. I made similar requests to get the box score data for all the Jays games up to tonight’s postponed game against Cleveland (Sat. May 29th). Then I wanted to project the statistics from now to the end of the season. The problem? The data was very noisy. In a typical week a batter might have hits per game that look like this: [1, 0, 0, 3, 2, 0, 1], meaning he had 1 hit in the first game, 0 in the second, 3 in the fourth and so on. To make the data easier to project, I looked at the cumulative sum of each player’s statistics. For example, the cumulative sum of the list I just gave would be [1, 1, 1, 4, 6, 6, 7]: each entry is the current game’s value added to the running total of the games before it. If you do that, you get much more linear-looking graphs:

Cumulative Runs per Player (image by author)

The graph shows cumulative runs over the 50 games the Jays have played so far. You could come up with similar-looking graphs for hits, home runs, walks, etc.
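The cumulative-sum step described above can be sketched in a few lines. This minimal example uses only the Python standard library and the one-week hits list from the text; the variable names are chosen here for illustration. NumPy's np.cumsum produces the same running totals.

```python
from itertools import accumulate

# Hits per game for one week, taken from the example in the text
hits_per_game = [1, 0, 0, 3, 2, 0, 1]

# Running total: each entry is the sum of all games up to and including that one
cumulative_hits = list(accumulate(hits_per_game))

print(cumulative_hits)  # [1, 1, 1, 4, 6, 6, 7]
```

Because a running total can never decrease, the noisy game-to-game swings turn into a steadily rising curve, which is what makes a straight-line fit reasonable.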

There are many ways to project these stats forward. You might use Bayesian linear regression, a machine learning model or a neural net. I used the simple, tried-and-true method of ordinary least squares regression. Ordinary least squares (OLS) finds the straight line that minimizes the sum of the squared vertical distances between the data points and the line, so no other line fits the points more closely by that measure. Luckily, even if you find that confusing, OLS is very simple in Python thanks to the SciPy package:

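import numpy as np
from scipy import stats

def get_estimate(df, player):
    # Per-game values of the statistic for this player, in game order
    values = df[df.Name == player].Statistic.values
    x = np.arange(1, len(values) + 1)  # game number: 1, 2, ..., n
    y = np.cumsum(values)              # running total after each game
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    # Evaluate the fitted line 112 games ahead (162 - 50 games remaining)
    return slope * (len(values) + 112) + intercept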

This function gives me the estimate for any given statistic and player from the data I pulled from the API. I summarized my results in this table:

Toronto Blue Jays 2021 On Pace Projections (image by author)
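For readers curious about what the least-squares fit is doing under the hood, here is a minimal pure-Python sketch of the standard closed-form slope and intercept for a straight-line fit. The helper name fit_line and the toy data are hypothetical and are not taken from the notebook; the toy player is simply assumed to score one run every other game.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = sum of cross-deviations / sum of squared x-deviations
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data (hypothetical): cumulative runs for a player who scores every other game
games = list(range(1, 11))                  # game numbers 1..10
cumulative_runs = [g // 2 for g in games]   # 0, 1, 1, 2, 2, 3, 3, 4, 4, 5

slope, intercept = fit_line(games, cumulative_runs)

# Extend the fitted line to game 162 to get an 'on pace' style estimate
projected_162 = slope * 162 + intercept
print(round(slope, 3), round(projected_162, 1))
```

The slope is roughly the player's per-game rate, so the projection is essentially that rate carried through the rest of the schedule.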

You can get a lot out of this table. For example, the league-average OPS is currently low, around .705. If that trend were to continue, Vladdy would project to have an OPS+ of 149, meaning he’s projected to be 49% better than a league-average player (not to be confused with a ‘replacement player’, who would have a much lower OPS). Have fun looking through the projections, and be continually in awe of the season that Vladimir Guerrero Jr. and many of the other Blue Jays hitters are having offensively.
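As a rough illustration of the OPS+ arithmetic, here is a sketch that treats the index as a plain ratio of player OPS to league OPS, scaled so that 100 is league average. This is an assumption made for illustration: the official OPS+ compares OBP and SLG to the league separately and adjusts for ballpark, so the real figure can differ. The function name and the 1.050 input are hypothetical, since the projection table is an image; only the .705 league average comes from the text.

```python
def simple_ops_plus(player_ops, league_ops):
    """Simplified index where 100 is league average (not the official, park-adjusted OPS+)."""
    return 100 * player_ops / league_ops

league_ops = 0.705    # league-average OPS quoted in the text
example_ops = 1.050   # hypothetical player OPS, chosen only for illustration

index = simple_ops_plus(example_ops, league_ops)
print(round(index))  # 149
```

Under this simplified reading, an index of 149 corresponds to an OPS about 49% above the league average.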

You can find the full notebook here: https://www.kaggle.com/sportsstatseli/bluejays-stats-prediction

As always, if you like what you read, consider checking out more of my work:

https://elicavan.wixsite.com/site

https://www.linkedin.com/in/elijah-cavan-msc-14b0bab1/
