Quantifying Defense in College Baseball

Connor Curtiss
Iowa Baseball Managers
Feb 23, 2022

In Major League Baseball, advanced defensive metrics like Defensive Runs Saved (DRS), Outs Above Average (OAA), and Ultimate Zone Rating (UZR) can objectively measure a fielder’s ability. But without player tracking, how can we quantify defensive value in college baseball? I set out to create a defensive metric for college baseball using Trackman data. The metric uses a model that predicts the probability of an out and estimates the position responsible for the play. Finally, the number is adjusted for the park, season, and positional averages, creating an objective defensive evaluation metric at the collegiate level.

Creating the Model

Statcast’s model for expected stats relies primarily on the exit velocity and launch angle of a batted ball to predict xBA (Expected Batting Average). Adding spray angle gives a more accurate picture of how often a batted ball becomes a hit and, more importantly, offers insight into a team’s defense. My model uses exit velocity, launch angle, spray angle, and distance to estimate each batted ball’s probability of being a hit, and that number lets us grade fielders on each individual play. Note that errors are counted as hits under this definition. Otherwise, an error would be treated as an out and credited to the defender who committed it. The plot below shows the hit probability for all batted balls in our dataset.
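The article does not say what kind of model produces the hit probability, so the following is only a minimal sketch of one possible approach: an empirical hit rate over "comparable" batted balls, grouped into buckets on the four features. The bucket widths, the fallback probability, and every function name here are assumptions made for illustration.

```python
from collections import defaultdict

# Bucket widths are illustrative assumptions, not values from the article.
BUCKET_WIDTHS = {"exit_velo": 5.0, "launch_angle": 5.0, "spray_angle": 10.0, "distance": 25.0}
FEATURES = tuple(BUCKET_WIDTHS)


def bucket_key(ball):
    """Map a batted ball (dict of features) to a tuple of bucket indices."""
    return tuple(int(ball[f] // BUCKET_WIDTHS[f]) for f in FEATURES)


def fit_hit_rates(balls):
    """Return {bucket: fraction of balls in that bucket that were hits}.

    Each ball needs a 0/1 "hit" field. Errors count as hits (hit = 1),
    matching the definition used in the article.
    """
    totals = defaultdict(lambda: [0, 0])  # bucket -> [hits, balls in play]
    for ball in balls:
        counts = totals[bucket_key(ball)]
        counts[0] += ball["hit"]
        counts[1] += 1
    return {key: hits / n for key, (hits, n) in totals.items()}


def hit_probability(rates, ball, default=0.3):
    """Look up P(Hit) for a ball; unseen buckets get an assumed default."""
    return rates.get(bucket_key(ball), default)
```

A smoother model (for example a gradient-boosted classifier) would likely generalize better than raw buckets, but the bucketed version makes the idea of "how often do similar batted balls become hits" easy to inspect.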

To grade each defender individually, I had to infer which position is responsible for each play. I used traditional defensive alignments and historical data to create cutoffs between positions. Because it is hard to predict when a batted ball will be fielded by the pitcher or catcher, those two positions are omitted from positional assignment, and every ball in play is assumed to be the responsibility of one of the other seven defenders. This method of inferring the fielder is vulnerable to shifting, but in the absence of player tracking it is the most practical way I found to assign fielder responsibility.
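To make the idea of positional cutoffs concrete, here is a rough sketch. The article does not publish its cutoffs, so every number below is a hypothetical placeholder, as is the angle convention: spray angle runs from -45 degrees (left-field line) to +45 degrees (right-field line), with 0 straight up the middle.

```python
# All cutoffs below are hypothetical placeholders, not the article's values.
INFIELD_MAX_DISTANCE_FT = 150.0

# (upper spray-angle bound in degrees, position), ordered left to right.
INFIELD_CUTOFFS = [(-25.0, "3B"), (0.0, "SS"), (25.0, "2B"), (45.0, "1B")]
OUTFIELD_CUTOFFS = [(-15.0, "LF"), (15.0, "CF"), (45.0, "RF")]


def assign_position(spray_angle, distance):
    """Infer the responsible fielder from spray angle and distance.

    Pitchers and catchers are never assigned, as in the article.
    """
    cutoffs = INFIELD_CUTOFFS if distance < INFIELD_MAX_DISTANCE_FT else OUTFIELD_CUTOFFS
    angle = max(-45.0, min(45.0, spray_angle))  # clamp to fair territory
    for upper_bound, position in cutoffs:
        if angle <= upper_bound:
            return position
    return cutoffs[-1][1]  # not reached after clamping; kept as a safeguard
```

In practice the cutoffs would be fit to historical data on which position actually fielded similar balls, rather than hard-coded like this.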

Calculating Outs Above Average

Using the estimated position and hit probability, I could calculate a rough Outs Above Average (OAA) for every position on every team. Let P(Hit) denote the probability of a hit, and let Hit equal 1 if the ball in play resulted in a hit or an error and 0 if it resulted in an out. For a single play, OAA = P(Hit) - Hit, and a position’s OAA is the sum over all plays assigned to it. For example, if a batted ball had a .900 hit probability and it was caught for an out, the defender would be credited with +0.9 OAA (.900 - 0). If it was a hit or an error, the fielder would receive -0.1 OAA (.900 - 1).
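The per-play formula is simple enough to write out directly. This is a small sketch with hypothetical function names; the only logic is OAA = P(Hit) - Hit from the paragraph above.

```python
def play_oaa(hit_probability, was_hit):
    """OAA for one ball in play: P(Hit) - Hit.

    was_hit is True for a hit or an error and False for an out.
    """
    return hit_probability - (1 if was_hit else 0)


def total_oaa(plays):
    """Sum per-play OAA over (hit_probability, was_hit) pairs."""
    return sum(play_oaa(p, h) for p, h in plays)
```

For instance, play_oaa(0.9, False) gives the +0.9 from the example above, and play_oaa(0.9, True) gives roughly -0.1 (up to floating-point rounding).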

This rough calculation is already an upgrade from fielding percentage, but it needs some adjustments. First of all, we need to adjust for park sizes. A center fielder playing at a field comparable to the massive size of Comerica Park would likely have an inflated OAA with this method due to the large center field area. Conversely, a left fielder playing somewhere similar to Fenway Park will have his OAA deflated on account of the limited left-field area due to the Green Monster. To adjust for this variability, I used stadium averages to account for differences in outfield area, while leaving infield numbers unaffected.
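The article does not spell out how the stadium averages are applied, so the sketch below assumes one plausible reading: for outfield positions, subtract the mean per-play OAA observed at that stadium and position, and pass infield plays through unchanged. The field names and the function name are hypothetical.

```python
from collections import defaultdict

OUTFIELD = {"LF", "CF", "RF"}


def park_adjust(plays):
    """Return park-adjusted per-play OAA values, in input order.

    Each play is a dict with "stadium", "position", and "oaa" keys.
    Outfield plays have the stadium/position mean subtracted (an assumed
    form of the adjustment); infield plays are left unaffected.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for play in plays:
        if play["position"] in OUTFIELD:
            key = (play["stadium"], play["position"])
            sums[key] += play["oaa"]
            counts[key] += 1

    adjusted = []
    for play in plays:
        key = (play["stadium"], play["position"])
        if key in counts:  # only outfield keys were recorded above
            adjusted.append(play["oaa"] - sums[key] / counts[key])
        else:
            adjusted.append(play["oaa"])
    return adjusted
```

For the stadium mean to reflect the park rather than the home team’s talent, it should pool plays from every team that played there, home and visiting.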

Next, I had to adjust for seasonal and positional differences. I found the average OAA for each combination of season and position, and graded each defender relative to their appropriate average. Without this adjustment, defenders in recent years would likely have higher OAA because defensive positioning has improved over time. The finalized Outs Above Average statistic incorporates exit velocity, launch angle, spray angle, distance, position, season, and stadium.
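Grading relative to the season and position average amounts to a group-wise centering step. A minimal sketch, assuming team-level rows with hypothetical field names:

```python
from collections import defaultdict
from statistics import mean


def relative_to_average(team_rows):
    """Subtract the season/position mean from each team's OAA at that position.

    team_rows: list of dicts with "team", "season", "position", "oaa".
    Returns new dicts with an added "oaa_adj" field; inputs are not modified.
    """
    groups = defaultdict(list)
    for row in team_rows:
        groups[(row["season"], row["position"])].append(row["oaa"])
    averages = {key: mean(values) for key, values in groups.items()}
    return [
        {**row, "oaa_adj": row["oaa"] - averages[(row["season"], row["position"])]}
        for row in team_rows
    ]
```

After this step, an adjusted value of 0 means a league-average defense at that position in that season, which is what makes comparisons across years meaningful.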

Use Cases

Now that we have finished calculating OAA, we can explore the multitude of insights that we can gain from this metric. Here is a simple visualization of a team’s defensive value by position in 2021.

Each value represents the OAA for the team at the position indicated. From this visualization, a team can gain insights into where their defensive strengths and weaknesses are. Evidently, the right side of the infield is strong while the outfield is weak. If you track which fielder is playing where and when, you can compare defensive ability between specific players to make more informed decisions regarding your defensive lineup.

We can get even more granular and view each fielder’s performance by direction. Since each batted ball carries its own OAA value, we can filter by direction and create visualizations that show where a defender struggles and where he excels. With this information, coaches can adjust their player development plans and positional alignment strategies ahead of games. The visualization below shows a center fielder who is much better going to his right than to his left.

It is important to keep in mind that we don’t know exactly where the center fielder was positioned at the beginning of each play. He could have been shading towards left-center for most plays throughout the season, which could explain this skewed distribution. But assuming that there’s not any extreme abnormality in the positioning of this defender, this visualization is extremely beneficial in assessing how a fielder has been performing, and where he can improve.

We can also analyze trends throughout a season by looking at a player’s OAA in different parts of the year. Coaches can gain insight into what is or isn’t helping a player’s defensive development. This graph shows a rolling sum of Outs Above Average for a third baseman throughout the 2021 season. It certainly appears that this player improved with his glove as the year progressed.
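A rolling sum like the one described can be computed from the per-play OAA values in chronological order. The window size is not stated in the article, so it is left as a parameter in this sketch.

```python
from collections import deque


def rolling_sum(values, window):
    """Rolling sum over the last `window` values (shorter at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running = 0.0
    out = []
    for value in values:
        if len(recent) == window:
            running -= recent[0]  # oldest value is about to leave the window
        recent.append(value)
        running += value
        out.append(running)
    return out
```

A larger window smooths out single-play noise at the cost of reacting more slowly to real changes in a player’s defense.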

For infielders, it can be useful to see how each defender handles soft and hard contact. Below, we can see that the first baseman and second baseman from this team struggle with hard-hit balls while the third baseman and shortstop handle hard-hit balls well but struggle with soft contact.
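A split like this comes from grouping per-play OAA by position and contact type. The sketch below assumes a 95 mph exit-velocity threshold for "hard" contact; the article does not state its cutoff, so treat that number and the field names as placeholders.

```python
from collections import defaultdict

HARD_HIT_MPH = 95.0  # hypothetical threshold; the article's cutoff is not given


def oaa_by_contact(plays):
    """Sum OAA per (position, contact type).

    Each play is a dict with "position", "exit_velo", and "oaa" keys.
    """
    totals = defaultdict(float)
    for play in plays:
        contact = "hard" if play["exit_velo"] >= HARD_HIT_MPH else "soft"
        totals[(play["position"], contact)] += play["oaa"]
    return dict(totals)
```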

It is easy to point out that soft contact creates more difficult plays for shortstops than it does for first basemen, which could explain these numbers. However, the model accounts for exit velocity and spray angle, so the difficulty of each play is already reflected in its hit probability, and there should be little built-in bias toward any position on soft or hard contact. Everything throughout this process is graded against the average, including what is shown in this table.

You might be wondering if there is a significant difference between the best and worst defensive teams. Why should a team care about this metric if it barely matters? As it turns out, the difference between the top and bottom defensive teams in 2021 was over 100 outs.

Even at one position, there is a huge difference between the top and bottom of the leaderboard. Here, the best team at the shortstop position added 30 more outs than the worst.

Naturally, we can also look at how this metric correlates with winning games. Obviously, it is not a perfect predictor, but it is evident that defense has a positive relationship with winning percentage.
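One simple way to put a number on that relationship is the Pearson correlation between each team’s OAA and its winning percentage. The article does not say which statistic it used, so this is just one reasonable choice, sketched without any external libraries.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Calling pearson_r(team_oaa, win_pct) with one entry per team returns a value between -1 and 1, where a positive value indicates that better defensive teams tend to win more often.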

Limitations

The biggest limitation in this process is the uncertainty about whether defensive value should be attributed primarily to a defender’s ability or to his positioning. A third baseman will receive substantially negative OAA when a left-handed hitter hits a routine grounder to third for a single because the defense is shifted and no one is there to field it. The reverse also happens. If a ground ball is hit hard up the middle, it will have a high probability of being a hit. But if the defense is shifted, the shortstop might make the play with ease. The shortstop’s OAA for the play will be high, but should it be? He barely had to move. More of the defensive value for that play came from the positioning of the fielder than from his talent. As a team, you want to optimize both your defenders’ ability and their positioning anyway, but it would still be valuable to know how much is attributed to each.

Secondly, there is some ambiguity about which position is responsible for a play. Especially today, when shifts are more common, it is hard to be certain whether a grounder up the middle is fielded by the shortstop or the second baseman. Additionally, we do not have historical data for the specific players that fielded each ball, so we cannot assign OAA to individual players, but rather the position as a whole. When one player plays a position for practically the entire season, it isn’t an issue, but when multiple players play substantial parts of the year at a position, it becomes unclear how much value each fielder is contributing. To avoid this problem, we plan to tag the fielder responsible for each play this season to improve our defensive analysis.

Lastly, this method only assigns defensive value to the fielder at the estimated position. This means if a first baseman drops a throw from a shortstop, the shortstop loses value instead of the first baseman. If there’s a routine grounder to first and the pitcher forgets to cover, the first baseman will receive negative OAA. The metric does not measure the quality of throws either. If a batter hits a shallow sacrifice fly and the outfielder makes a poor throw home, he still gets positive OAA just for making the catch. While the effect these plays have in the long term is minor, it is still something to keep in mind when analyzing this statistic.

Conclusion

Advanced pitching and hitting data analysis has entered the college baseball world with full force, but defensive analysis has been almost completely neglected. To be the most successful, data-driven program possible, every part of the game needs to be measured objectively. Fielding percentage is an outdated method of assessing defense. Our version of Outs Above Average quantifies defensive value, provides numerous actionable insights, and has the potential to revolutionize fielding analysis in college baseball.
