What does it mean to barrel a baseball?

Adam Maloof
Published in SABR Tooth Tigers
9 min read · Aug 9, 2024

Whether building a pitch grader or monitoring progress in the cage, sometimes you just want to know whether a batter is finding the sweet spot on the bat.

Let me give you some context before launching into this analysis that ended up longer than I had hoped. At Princeton, we have multiple types of player evaluation that rely heavily on some form of grading batted balls. By grading I mean (generically) measuring the exit velocity and launch angle of a batted ball, and estimating how frequently that ball would be a hit or not. This practice has led to many conversations about what it means to barrel a baseball, and what hit grade(s) would be most effective to feed into our models. This question led me down the rabbit hole you can read below.

To hit the ball on the barrel means finding the ball with the sweet spot of the bat (typically on a hard swing). However, pounding the ball into the ground or hitting it straight into the air does not usually get you on base. So a barreled ball typically is defined in terms of both exit velocity (EV; the speed of the ball off the bat), and launch angle (LA; the angle of the ball off the bat, with 0 being horizontal, -90 being straight down, and +90 being straight up).

Strangely, the MLB rule for a barreled ball is not based on acceptable EV and LA ranges, but is instead defined as the collection of EV and LA pairs that satisfy AVG≥0.500 and SLG≥1.500 (Figure 1).

Figure 1. I use over 392,000 batted balls recorded by Trackman in D1 stadiums to construct this polar plot of exit velocity (radius) vs launch angle (angle). The MLB definition of a barreled ball is depicted by the yellow contour, where AVG≥0.500 and SLG≥1.500.
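As a rough illustration of the AVG/SLG-based definition, the barrel zone can be sketched by binning batted balls in EV and LA and keeping the bins that clear both thresholds. The bin widths, the minimum count per bin, and the result labels below are assumptions for illustration, not the settings behind Figure 1.

```python
from collections import defaultdict

# Total bases per play result; anything else (outs, errors, ...) counts as 0.
# These labels are hypothetical and would need to match the data source.
TOTAL_BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def barrel_zone(batted_balls, ev_step=2.0, la_step=2.0, min_count=20):
    """Return the (EV, LA) bin corners where AVG >= 0.500 and SLG >= 1.500.

    batted_balls: iterable of (exit_velocity_mph, launch_angle_deg, result).
    ev_step, la_step, min_count: assumed binning choices.
    """
    counts = defaultdict(int)
    hits = defaultdict(int)
    bases = defaultdict(int)
    for ev, la, result in batted_balls:
        # Lower-left corner of the bin this batted ball falls into.
        key = (ev // ev_step * ev_step, la // la_step * la_step)
        total_bases = TOTAL_BASES.get(result, 0)
        counts[key] += 1
        hits[key] += 1 if total_bases > 0 else 0
        bases[key] += total_bases

    zone = set()
    for key, n in counts.items():
        if n < min_count:
            continue  # too few batted balls to trust AVG and SLG in this bin
        avg = hits[key] / n
        slg = bases[key] / n
        if avg >= 0.500 and slg >= 1.500:
            zone.add(key)
    return zone
```

A contour like the yellow line in Figure 1 would then be the outline of the returned bins; smoothing the binned AVG and SLG surfaces first is a natural refinement.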

By this definition, the EV-LA barrel zone is remarkably similar in Division 1 college baseball and MLB (Figure 2). In fact, the mean exit velocity, mean launch angle, mean AVG, and mean SLG within the barrel zone all are within uncertainty of each other for D1 vs MLB (Figure 2). Even comparing the relative fraction of outcomes in the barrel zone shows a virtually identical distribution of play results for D1 and MLB (Figure 2).

Figure 2. The ‘barrel zone’ defined in exit velocity vs launch angle space is similar between D1 and MLB. The area of the barrel zone is 6.5% larger for D1 than for MLB, but that result likely is due to the nearly 6x larger dataset for D1 (2023–2024 Trackman) vs. MLB (all fair balls from April 2022–June 2024). Play results within the barrel zone also are nearly identical between D1 and MLB, with a few more doubles in MLB and a few more outs in D1. As pointed out by mlb.com, the mean AVG and SLG within this barrel zone are far higher than the individual AVG≥0.500 and SLG≥1.500 requirements.

On the one hand, I find it interesting that this simple definition of a barreled ball results in virtually identical batted ball and play result statistics in D1 and MLB. On the other hand, why should this seemingly arbitrary definition of barrel rate be an effective statistic for evaluating players? Should a barrel really require a 50% home run rate (Figure 2)? Should laser beam singles also count as barrels even if they did not find gaps, especially in D1 where on base percentage is even more important than in MLB (Maloof, 2024a)?

My impression is that evaluators care a lot about barrel rate, and fold this concept in with expected stats to try to see through the noise/luck and get a sense of what kind of contact a batter makes (Maloof, 2024c). Various definitions of barrel rate also are an essential component of Quality At Bats (QABs), a favorite approach-centered statistic espoused by coaches at all levels.

However, defining barrels in terms of AVG and SLG muddles the barrel rate statistic and just turns it into another expected stat like xBA and xSLG (Note: an expected statistic is computed from probabilistic outcomes based on EV, LA, etc…, as described here). Would barrel rate be more useful as an independent statistic if it were defined strictly in terms of EV, LA, and possibly spray direction (SD)? Do a bunch of line-drive outs with low xSLG suggest you are seeing the ball well and going to hit more barrels that drop for hits? Or do a series of line-drive outs with low xSLG suggest that your triad of {EV, LA, SD} does not profile well for getting hits (or beating a shift)?

Unfortunately, I can’t think of a way to answer this question using college baseball TrackMan data (anyone have any ideas?). However, the 2023 camera upgrade to MLB Hawk-Eye systems allows for tracking of the swing path, and a purportedly accurate measure of bat_speed at contact. These new bat speed data have been available for MLB games since April 4, 2024, and now allow us to determine whether a hit was ‘squared-up’. ‘Squared-up’ really just means hit on the barrel or sweet spot of the bat, so perhaps now I can evaluate the relationship between xSLG and a “squared-up” ball.

Statcast defines a squared-up contact as a hit that achieves ≥80% of the theoretical maximum exit velocity given a bat velocity and pitch velocity (Equation 1).

ExitSpeed = q × ReleaseSpeed + (1 + q) × BatSpeed

Equation 1. Nathan (2003) derives an expression relating ExitSpeed (the exit velocity of the ball off the bat) to a combination of ReleaseSpeed (the velocity of the pitch) and BatSpeed (the velocity of the bat at contact) through a single collision efficiency, q. A few notes: (a) I use these variable names to be consistent with the variable naming of Statcast data on Baseball Savant. (b) ReleaseSpeed really should not be the pitch velocity measured at release, but rather the ZoneSpeed measured at contact. ZoneSpeed is a function of the ever-changing drag coefficient, but frequently is estimated as ZoneSpeed≈0.92*ReleaseSpeed.

The closer a hit is to being squared-up, the higher q will be, and the higher the exit velocity will be for a given pitch velocity and bat speed. I need to know the maximum theoretical collision efficiency, qₘₐₓ, in order to determine the range of observed q values that equate to squared-up hits. However, we do not know the value of qₘₐₓ a priori, which likely depends on the weight of the bat, species of wood, the grain orientation and density of wood, temperature, humidity, etc. I can estimate q empirically for any hit by rearranging Equation 1, and plugging in directly measured values for ExitSpeed, BatSpeed, and ReleaseSpeed.

q = (ExitSpeed − BatSpeed) / (ReleaseSpeed + BatSpeed)

Equation 2. Rearranging Equation 1 shows how the collision efficiency, q, can be estimated for any combination of exit velocity, bat speed, and pitch velocity observations.
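A minimal sketch of this calculation, assuming Equation 1 takes the Nathan (2003) form ExitSpeed = q × ReleaseSpeed + (1 + q) × BatSpeed. The function names are mine, and the default q_max is the value estimated later in Figure 3.

```python
Q_MAX = 0.214  # empirical maximum collision efficiency reported in Figure 3


def exit_speed_from_q(q, bat_speed, release_speed):
    """Equation 1 (assumed form): exit speed for a given collision efficiency."""
    return q * release_speed + (1.0 + q) * bat_speed


def collision_efficiency(exit_speed, bat_speed, release_speed):
    """Equation 2: solve Equation 1 for q. All speeds in mph."""
    return (exit_speed - bat_speed) / (release_speed + bat_speed)


def squared_up_fraction(exit_speed, bat_speed, release_speed, q_max=Q_MAX):
    """Collision efficiency as a fraction of the maximum, q / q_max."""
    return collision_efficiency(exit_speed, bat_speed, release_speed) / q_max


# Hypothetical batted ball: 105 mph off the bat, 75 mph swing, 95 mph pitch.
q = collision_efficiency(105.0, 75.0, 95.0)    # 30 / 170
frac = squared_up_fraction(105.0, 75.0, 95.0)  # q / 0.214
is_squared_up = frac >= 0.8                    # 80% threshold described above
```

Like the text, this uses ReleaseSpeed rather than ZoneSpeed, so q here is a convenient coefficient rather than a strict material property.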

The maximum theoretical collision efficiency is the right-edge of the distribution of q observations depicted in Figure 3.

Figure 3. 135,494 batted balls (after removing outliers beyond three times the interquartile range) indicate a maximum possible collision efficiency of qₘₐₓ = 0.214 (remember that I am using ReleaseSpeed instead of ZoneSpeed, so you should think of this qₘₐₓ as a convenient coefficient, rather than a strictly accurate material property). I estimate the 95% confidence interval for qₘₐₓ (0.192–0.237) by finding the two inflection points at the top and bottom of the right edge of the kernel density estimation (KDE) in 1000 bootstrap resamples. The KDE is non-Gaussian with a mean of q=0.078, suggesting the average MLB batted ball reaches about 36% of the maximum collision efficiency (0.078/0.214).
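The outlier filter and bootstrap in Figure 3 can be sketched roughly as follows. Two loud assumptions: I read "three times the interquartile range" as fences at Q1 − 3·IQR and Q3 + 3·IQR, and I swap the KDE inflection-point search for a much simpler stand-in, a high quantile of each bootstrap resample. This shows the resampling mechanics only and will not reproduce the 0.192–0.237 interval.

```python
import random
import statistics


def iqr_filter(values, k=3.0):
    """Drop values beyond k * IQR outside the quartiles (assumed fence rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    fence = k * (q3 - q1)
    return [v for v in values if q1 - fence <= v <= q3 + fence]


def upper_quantile(values, p):
    """Simple empirical quantile: the value at rank floor(p * n)."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(p * len(ordered)))
    return ordered[idx]


def bootstrap_edge(values, p=0.99, n_boot=1000, seed=0):
    """95% bootstrap interval for the p-th quantile of the q distribution.

    The quantile is a stand-in for the right edge of the KDE, not the
    inflection-point method described in the caption.
    """
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        upper_quantile(rng.choices(values, k=n), p) for _ in range(n_boot)
    )
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return lower, upper
```

Replacing `upper_quantile` with a function that fits a KDE and locates the inflection points on its right edge would bring the sketch closer to the procedure in the caption.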

Armed with this estimate of qₘₐₓ (Figure 3), I can evaluate the relationship between squaring-up a ball and the raw (exit velocity, launch angle) stats (Figure 4).

Figure 4. Squared-up fraction is defined as q/qₘₐₓ for each batted ball (ignoring bunts, etc.). (Left) Hits become more common than outs (red) when squared-up fraction exceeds ~0.8 and exit velocity (launch speed) exceeds 100 mph. (Right) Hits are more common than outs when launch angle is between 10–20° (line drives), regardless of how low squared-up fraction is. When squared-up fraction exceeds 0.8, the hit region expands to include launch angles between 5–35°.

Figure 4 illustrates why q/qₘₐₓ ≥ 0.8 might be both a good and bad Statcast definition of squaring-up the ball. In the context of exit velocity, q/qₘₐₓ ≥ 0.8 marks a boundary to the left of which few hits are found (in other words, when exit velocity is above 100 mph, there are virtually always more hits than outs, while when exit velocity drops below 100 mph, even for very high q/qₘₐₓ, outs are still more common than hits). However, in the context of launch angle, hits are found at virtually any q/qₘₐₓ as long as they fall in a narrow range of line drive launch angles. Building on the analysis in Figure 4, now I want to see how squared-up fractions map into MLB barrels, and visualize distributions of exit velocities and launch angles across q/qₘₐₓ ranges.

Figure 5. (Left) The distribution of squared-up fractions is narrow and centered at q/qₘₐₓ=0.864±0.117 for MLB-barrels (AVG≥0.500 & SLG≥1.500; Figure 1), while the distribution of squared-up fractions for non MLB-barrels (AVG<0.500 or SLG<1.500; Figure 1) is virtually uniform. Studying the region of overlap between MLB-barrels and non-barrels reveals consistently higher exit velocities (Middle) and launch angles (Right) for MLB-barrels compared to non-MLB-barrels. Furthermore, as squared-up fraction increases, non-MLB-barrel exit velocities increase and launch angles decrease, suggesting that those nearly perfectly squared-up balls are low line drives and grounders that translate into outs and singles, not reaching the MLB-barrel criterion.

Figure 5 tells the story behind the pattern we already intuited from Figures 2 and 4. The MLB definition of a barrel is just a modified version of expected slugging percentage (xSLGfb). This MLB barrel definition includes batted balls with mostly high q/qₘₐₓ, but does not cover the spectrum of hits that were squared-up (Figure 5). Even in the group of most thoroughly squared-up hits (q/qₘₐₓ > 0.9), singles represent more than 25% of the outcomes, and home runs less than 15% of the outcomes (Figure 6).

Figure 6. Each horizontal line has an x-range equal to the range of squared-up fractions included in that calculation, a y-position equal to the xAVGfb or xSLGfb for that collection of hits, and a label along the y2 axis depicting the number of batted balls in that collection. The horizontal lines are colored by the relative fraction of each possible outcome (out, single, double, triple, home run). For each narrower range of q/qₘₐₓ, xAVGfb and xSLGfb increase, the fraction of outs decreases, but the fraction of singles remains nearly constant. For comparison, pie charts depicting the relative fraction of each outcome for batted balls satisfying the MLB definition of a barrel (AVG ≥ 0.500 & SLG ≥ 1.500, n = 4753) have y-locations equal to their xAVGfb and xSLGfb. Even compared to the collection of most squared-up balls (q/qₘₐₓ = 0.90–1.05), MLB-barrels have much higher xAVGfb and xSLGfb, virtually no singles, and a much higher fraction of home runs.

With all that analysis in mind, I suggest we rescue the term “barrel” from the MLB definition’s emphasis on xSLGfb. Studying exit velocity and launch angle in terms of expected slugging percentage is valuable and predictive, but not an accurate measure of whether a batter is hitting the ball on the barrel of the bat.

For the new college baseball pitch grader I am building right now, xSLGfb is important, but I also want to include information about how often a pitch is squared-up (and we just saw that the MLB definition of a barrel does not help). The new Statcast bat speed data are revolutionary for measuring squared-up fraction… but alas, there are no bat speed data in college. Is there any way to measure squared-up fraction without bat speed data? What if I could estimate q/qₘₐₓ without bat speed?

In Figure 7, I ask how variable a typical player’s bat speed is. It turns out that the majority of players with more than 100 batted balls in 2024 have bat speeds that vary by 8.5±1.7%.

Figure 7. 398 players have put more than 100 balls in play so far in 2024, and 84% of those batters display <10% variability in their bat speed.
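One way to compute a per-player variability like the one in Figure 7 is sketched below. The text does not say which measure of variability was used, so the coefficient of variation (standard deviation divided by mean) is my assumption, as is the input layout.

```python
import statistics
from collections import defaultdict


def bat_speed_variability(swings, min_batted_balls=100):
    """Percent variability of bat speed for each qualifying player.

    swings: iterable of (player_id, bat_speed_mph), one entry per batted ball.
    Returns {player_id: 100 * stdev / mean} for players with more than
    min_batted_balls entries. Using the coefficient of variation is an
    assumption; the article does not specify the measure.
    """
    by_player = defaultdict(list)
    for player, speed in swings:
        by_player[player].append(speed)
    return {
        player: 100.0 * statistics.stdev(speeds) / statistics.mean(speeds)
        for player, speeds in by_player.items()
        if len(speeds) > min_batted_balls
    }


def share_below(variability, threshold=10.0):
    """Fraction of qualifying players whose variability is under threshold."""
    if not variability:
        return 0.0
    return sum(v < threshold for v in variability.values()) / len(variability)
```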

If individual players have fairly consistent bat speeds (Figure 7), maybe I can design a simulation that estimates the collision efficiency of each hit (and thus estimates whether the ball was squared-up). Here is my setup:

  1. Figure 8 Left: Identify a hitter’s most squared-up balls by searching for the highest ExitSpeed/ReleaseSpeed ratios. This search is under-constrained without BatSpeed, but if an individual’s BatSpeed does not vary too much (Figure 7), it might be okay.
  2. Figure 8 Middle: Assume q/qₘₐₓ ≈ 1, and qₘₐₓ = 0.214±0.022 (Figure 3) for these batted balls with the highest ExitSpeed/ReleaseSpeed ratios, and use Equation 2 to estimate BatSpeed. Propagate additional uncertainty in the BatSpeed estimate by adding the 10% random noise typical of MLB hitters (Figure 7).
  3. Figure 8 Right: Estimate q’ using a Monte Carlo simulation of Equation 2 that draws from the distribution of possible BatSpeeds for each batted ball.

Figure 8. Ground truthing an example of collision efficiency estimation without bat speed data. (Left) I search for Guerrero Jr.’s batted balls most likely to have a high squared-up fraction by studying the distribution of Exit Velocity / Release Speed ratios. (Middle) I assume that these ‘best barrels’ have q≈qₘₐₓ, and plug them into Equation 2 (solved for bat speed) to estimate bat speed. I include an additional 10% random noise to simulate an MLB player’s typical bat speed variability (Figure 7). While the Kolmogorov-Smirnov test suggests that the distribution of true bat speeds is different from the distribution of estimated bat speeds, the two distributions are similar, with nearly identical means and comparable standard deviations. (Right) Using the simulated bat speeds to estimate q, and then comparing true q to estimated q, I find a slope that is within uncertainty of m=1 and explains 88% of the variance. The average 1σ uncertainty on any single estimate of q is σ(q’) = 0.0265.
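The three steps above can be sketched as a small Monte Carlo routine. Several details are my assumptions rather than the setup behind Figure 8: the share of batted balls treated as 'best barrels' (`top_fraction`), treating the uncertainty on qₘₐₓ and the 10% bat speed noise as Gaussian 1σ values, and pooling the simulated bat speeds across all best barrels.

```python
import random
import statistics

Q_MAX, Q_MAX_SIGMA = 0.214, 0.022  # Figure 3
BAT_SPEED_NOISE = 0.10             # assumed 1-sigma fractional noise (Figure 7)


def estimate_q(batted_balls, top_fraction=0.1, n_draws=2000, seed=0):
    """Estimate q' for each batted ball of one hitter without bat speed data.

    batted_balls: list of (exit_speed_mph, release_speed_mph).
    Returns a list of (mean q', standard deviation of q') per batted ball.
    """
    rng = random.Random(seed)

    # Step 1: the most squared-up balls have the highest exit/release ratios.
    ranked = sorted(batted_balls, key=lambda b: b[0] / b[1], reverse=True)
    best = ranked[: max(1, int(top_fraction * len(ranked)))]

    # Step 2: assume q ~ q_max for those balls and solve Equation 2 for
    # bat speed, then add noise for swing-to-swing variability.
    def draw_bat_speed():
        exit_speed, release_speed = rng.choice(best)
        q = rng.gauss(Q_MAX, Q_MAX_SIGMA)
        bat_speed = (exit_speed - q * release_speed) / (1.0 + q)
        return bat_speed * (1.0 + rng.gauss(0.0, BAT_SPEED_NOISE))

    bat_speeds = [draw_bat_speed() for _ in range(n_draws)]

    # Step 3: push every simulated bat speed through Equation 2.
    results = []
    for exit_speed, release_speed in batted_balls:
        qs = [(exit_speed - b) / (release_speed + b) for b in bat_speeds]
        results.append((statistics.mean(qs), statistics.stdev(qs)))
    return results
```

For hitters with measured bat speed, comparing these q’ values to the q from Equation 2 is the ground-truthing check shown in the right panel of Figure 8.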

Figure 8 depicts an example of trying to estimate q when no bat speed data are available. Studying MLB players (for whom we have bat speed data) allows me to estimate the typical uncertainty on an estimate of q. Let me illustrate with two examples. If I estimate q’=0.2051, then my squared-up fraction calculation would give:

q’/qₘₐₓ = 0.2051/0.214 = 0.958 ± 0.158

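where the total uncertainty of 0.158 comes from this uncertainty propagation for division:

σ(q’/qₘₐₓ) = (q’/qₘₐₓ) × √[(σ(q’)/q’)² + (σ(qₘₐₓ)/qₘₐₓ)²], with σ(q’) = 0.0265 (Figure 8) and σ(qₘₐₓ) = 0.022 (Figure 3)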

I would conclude that the lowest likely q/qₘₐₓ is 0.80, which still classifies the hit as properly squared-up. In contrast, let’s say I estimate q’ = 0.1722, then:

q’/qₘₐₓ = 0.1722/0.214 = 0.805 ± 0.149

In this case, if I use the Statcast threshold for squared-up fraction of 0.8, then I would only be right to call this ball squared-up about half the time. Furthermore, while I have tried to propagate uncertainty as carefully and conservatively as possible, there are significant additional sources of uncertainty that become important if I apply this method to college baseball: (a) BBCOR bats have a higher and more variable qₘₐₓ for which I have no empirical data (except lots of confusing experiences conducting the bat testing before games!), and (b) college players may have less refined approaches and more variable bat speeds.
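The two worked examples above can be reproduced with standard uncertainty propagation for a quotient, using σ(q’) = 0.0265 (Figure 8) and qₘₐₓ = 0.214 ± 0.022 (Figure 3). The last function treats the propagated uncertainty as Gaussian, which is my assumption, to put a number on "about half the time".

```python
import math

Q_MAX, Q_MAX_SIGMA = 0.214, 0.022  # Figure 3
Q_SIGMA = 0.0265                   # average 1-sigma uncertainty on q' (Figure 8)


def squared_up_with_uncertainty(q, q_sigma=Q_SIGMA,
                                q_max=Q_MAX, q_max_sigma=Q_MAX_SIGMA):
    """Return (q / q_max, 1-sigma uncertainty) via propagation for division."""
    frac = q / q_max
    # Relative uncertainties of numerator and denominator add in quadrature.
    rel = math.hypot(q_sigma / q, q_max_sigma / q_max)
    return frac, frac * rel


def prob_squared_up(q, threshold=0.8):
    """P(q / q_max >= threshold), assuming a Gaussian error (an assumption)."""
    frac, sigma = squared_up_with_uncertainty(q)
    z = (frac - threshold) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for q_est in (0.2051, 0.1722):
    frac, sigma = squared_up_with_uncertainty(q_est)
    print(f"q' = {q_est}: q/q_max = {frac:.3f} ± {sigma:.3f}")
# q' = 0.2051: q/q_max = 0.958 ± 0.158
# q' = 0.1722: q/q_max = 0.805 ± 0.149
```

Under the Gaussian assumption, `prob_squared_up(0.1722)` comes out just over one half, in line with the statement above.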

I think the take-home message is that without bat speed data, we can’t reliably measure how completely a batter squared-up the ball. For our pitch grader, and other player evaluations, we are going to have to slum it with intelligent use of xAVGfb and xSLGfb for now. Hawk-Eye, please bring your system to college baseball! While we wait for Hawk-Eye, we will be attempting to constrain bat speed with a stereo-pair of iPhone cameras (240 fps vs. Hawk-Eye’s 300 fps) and a little structure-from-motion magic to see whether we can measure squared-up fraction at least for our guys during practice.

Adam Maloof
SABR Tooth Tigers

Prof. of Geosciences, studies the coevolution of life & climate in layers of rock, works on baseball analytics, shags flies, farms figs & flowers, plants trees