Pitcher Substitutions: Manager Performances

J.B. Moore, Ph.D.
Aug 23, 2018 · 8 min read


Every baseball aficionado has seen this situation: a setup man throws a 3-up, 3-down inning, then the closer comes in for the 9th, immediately loads the bases and gives up a grand slam for a home-team walk-off win. You wonder why the manager didn’t let the setup man pitch the 9th. “They left him in too long” and “They pulled him too soon” are familiar complaints. This article presents a method for measuring the quality of pitcher substitutions and ranks the teams and their managers on substitutions made during the first half of the 2018 MLB season. It’s likely no surprise that the Yankees and Astros rank best and that Colorado and Kansas City rank worst.

We also rank starting pitcher performances by team and compare the starter rankings with the relief pitcher rankings. For example, a chart presented later shows that although the Yankees have the highest-ranked relievers, their starters rank only 16th.

The AL and NL average 3.2 relief pitchers per game (see chart). Since pitching accounts for almost all defensive results and baseball is a 50–50 game, the choice of starting and relief pitchers is very important, and the quality of those decisions is worth measuring. Batting, of course, is equally important, but this article focuses exclusively on pitching performance.

Factors When Choosing Relief Pitchers

There are many reasons for choosing a particular replacement pitcher. The criteria fall into three main groups, associated with the pitcher being replaced, the game situation and the choices available. Attributes of the outgoing pitcher include his pitch count, recent performances and age. Game situation factors include the inning, the score differential, the next opposing batter and the on-base situation. Candidate choices are influenced by lefty-righty matchups, tiredness, need for work, injuries, mop-up roles and experience, to name a few. Some of these factors are subjective whereas others can be measured quantitatively.

When measuring the quality of choices made by a manager, one only needs to look at the results of the decisions and not why the choices were made.

How Should Pitching Substitutions Be Evaluated?

Hindsight is 20–20. Whether a substitute pitcher is good or bad depends on the results. A few of the classical pitching stats measure aspects of relief pitcher performance. The obvious ones are saves and blown saves. Much has been written about the weaknesses of these stats. In our view, the definition of “save” is open to argument and does not provide sufficiently good input to evaluate the quality of pitcher substitutions. More generally, it excludes specific factors such as the offensive threat at the time of the substitution and downstream performance factors measured by other pitching stats.

In our view, the answer to the question “Was it a good substitution or a bad one?” should be based on “Did the incoming pitcher make the situation better or worse compared to the expected results if no substitution was made?” That is, we should compare the resulting performance of the incoming pitcher to the game-to-date performance of the outgoing pitcher. If the substitute pitcher performs better, it was a good decision. The same logic applies to replacing one relief pitcher with another: compare the performance of the incoming pitcher with that of the pitcher being replaced. The next section describes how this can be done.

Comparing Pitching Performances Using RunPlusMinus

The article “The Best Baseball Statistic” argues that the best single measure of overall on-field player performance must satisfy five requirements and that none of the classical stats meets all of them. RunPlusMinus™ claims to do so and is derived from the values of four components: batting, running, pitching and fielding. Results for each component are values of the RPM statistic. Pitchers of course also bat, run and field, so it is important to extract the “pure” pitching performance from pitchers’ overall performances. In the charts which follow, pitcher performances are compared using the RPM values of the pitching component.

A second challenge is to make fair comparisons of pitching performances based on the number of batters faced by each pitcher. A starting pitcher may face 30 batters but a relief pitcher may only face 2 or 3 batters. Because RPM values are calculated for every player’s involvement in every play, a starting pitcher has an opportunity to “build up” a total RPM value that could exceed a relief pitcher’s total value regardless of a reliever’s outstanding performance. However, because there is a 100% correlation between team RPM totals in a game and winning the game, the primary basis used for ranking pitching performances of both starters and relievers is each pitcher’s total RPM value rather than his RPM value per batter faced.
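To make the difference between the two bases concrete, here is a minimal Python sketch. The class name PitchingLine, its fields and all of the RPM numbers are hypothetical and chosen only for illustration; only the batter counts of 30 and 3 echo the example above, and the per-batter rate is assumed to be the total divided by batters faced.

```python
from dataclasses import dataclass


@dataclass
class PitchingLine:
    """One pitcher's line in one game (hypothetical structure)."""
    name: str
    batters_faced: int
    total_rpm: float  # total RPM of the pitching component

    @property
    def rpm_per_batter(self) -> float:
        # Assumed definition: total pitching RPM divided by batters faced.
        return self.total_rpm / self.batters_faced


# Made-up values: a long, decent start versus a short, excellent relief outing.
starter = PitchingLine("starter", batters_faced=30, total_rpm=1.5)
reliever = PitchingLine("reliever", batters_faced=3, total_rpm=0.6)

lines = [starter, reliever]
best_by_total = max(lines, key=lambda p: p.total_rpm)
best_by_rate = max(lines, key=lambda p: p.rpm_per_batter)

print(best_by_total.name)  # starter
print(best_by_rate.name)   # reliever
```

The two bases can disagree, which is why the choice of total RPM as the primary basis matters for the rankings that follow.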

Data From a Single Game

To explain how the team rankings were obtained, we first look at the data for one game. The data below, for example, comes from the Texas at Detroit game on July 5, 2018. Texas used 4 pitchers. The Pitcher RPMs column shows that the starter, Gallardo, faced 23 batters and had a cumulative total of minus 2.38 RPMs before leaving the game. Pitchers in general have negative RPM totals for two reasons. First, RPM values attribute responsibility for both actual runs scored and the change in potential runs on each play. Second, runs are scored in every game and pitchers have the primary defensive responsibility for runs allowed. It follows that pitchers’ RPM totals are usually negative. As explained here, the RunPlusMinus methodology removes this bias against pitchers when calculating a player’s overall rating. However, in this article we are focusing on pitching performance only and not on a player’s overall performance.

Note that the two relievers who follow Gallardo have positive total RPMs, indicating they performed above average. The columns labelled “Increase” provide insight when comparing performance on a per-batter-faced basis. The arrows indicate how the “Increase” values are obtained. For example, Diekman was better than the starter by 2.43 RPMs per batter faced. On the other hand, Kela’s total was .67 RPMs worse than Diekman’s (-.26 - .41 = -.67).
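The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch using the values quoted in the text (Gallardo’s 23 batters and -2.38 RPMs, Diekman’s .41 and Kela’s -.26). The variable names are illustrative, and the per-batter rate is assumed to be the total divided by batters faced.

```python
# Values quoted in the text for the Texas at Detroit game, July 5, 2018.
gallardo_total, gallardo_batters = -2.38, 23
diekman_total = 0.41
kela_total = -0.26

# Assumed definition of a per-batter rate: total RPM / batters faced.
gallardo_rate = gallardo_total / gallardo_batters

# Change in total RPM when Kela replaced Diekman (negative means worse).
kela_vs_diekman = round(kela_total - diekman_total, 2)

print(f"{gallardo_rate:.3f}")  # -0.103
print(kela_vs_diekman)         # -0.67
```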

The chart above shows that, when RPM values are used to evaluate each substitution, the Texas management decisions resulted in 1 good pitching change (Gallardo to Rodriguez) and 2 bad replacements (Diekman and Kela).
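The counting of good and bad changes in one game can be sketched in Python as follows. The function name count_outcomes is hypothetical, treating a tie as a bad change is an assumption, and Rodriguez’s total is a placeholder: the text says only that it was positive and that replacing him with Diekman counted as a bad change, so any value above .41 gives the same counts.

```python
from typing import List, Tuple


def count_outcomes(pitcher_totals: List[float]) -> Tuple[int, int]:
    """Return (good, bad) counts for one team's pitching changes in a game.

    pitcher_totals holds each pitcher's total pitching RPM in order of
    appearance, starter first. A change counts as good when the incoming
    pitcher's total is strictly higher than that of the pitcher he replaced
    (counting ties as bad is an assumption of this sketch).
    """
    good = bad = 0
    # Pair each pitcher with the one who replaced him.
    for outgoing, incoming in zip(pitcher_totals, pitcher_totals[1:]):
        if incoming > outgoing:
            good += 1
        else:
            bad += 1
    return good, bad


# Placeholder only: Rodriguez's actual total is not given in the text.
RODRIGUEZ_PLACEHOLDER = 1.00

# Gallardo, Rodriguez, Diekman, Kela
texas = [-2.38, RODRIGUEZ_PLACEHOLDER, 0.41, -0.26]
print(count_outcomes(texas))  # (1, 2)
```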

Before we look at the team and manager results, the chart below shows some stats regarding league differences.

As you would expect, NL teams substitute for pitchers more frequently than AL teams do. Since there is little difference in total batters faced (or, equivalently, total plate appearances) between the two leagues, the Batters Faced per pitcher is slightly higher in the AL. The chart also shows that, collectively, AL pitchers have slightly better average RPM contributions than NL pitchers. See the article “Which Team Has the Best Pitching” for data that ranks overall team pitching performance.

Team Pitcher Substitution Data and Rankings

The following chart shows the results of analyzing data for games in the first half of the 2018 season. The data for each team consists of counts, totals and averages of its pitcher data. The column headings are explained below; the highlighted columns are the most important.

  • “Relief Rank” is the ranking of relief pitcher performances for each team. It is based on the values in the column “Avg RPMs per 100 Subs”.
  • “Avg RPMs per 100 Subs” is the average RPM total of relief pitchers per 100 substitutions. The value of 100 was chosen arbitrarily to give values with understandable magnitudes.
  • The Starting Pitcher Ranks are based on the values in the Starting Pitcher RPMs column.
  • The rightmost column is the difference between each team’s starting pitcher ranking and its relief pitcher ranking.
  • The “Batters” columns count how many batters were faced by the relievers and starters of each team.
  • The “Good Outcome” columns show how many incoming pitchers did better than the pitchers they replaced.

For example, the Rangers’ substitutions improved their game-winning potential 57.4% of the time, so by this measure their manager, Banister, could be considered the best at making reliever choices.
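The two highlighted measures could be computed along the following lines. This is a sketch under stated assumptions, not the RunPlusMinus implementation: the TeamRelief structure, the exact formula (relievers’ RPM total divided by substitutions, times 100) and the three made-up teams are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TeamRelief:
    """Season-to-date relief data for one team (hypothetical structure)."""
    team: str
    substitutions: int         # number of pitching changes
    reliever_rpm_total: float  # sum of relievers' pitching RPM totals
    good_outcomes: int         # changes where the incoming pitcher did better

    @property
    def rpm_per_100_subs(self) -> float:
        # Assumed formula for "Avg RPMs per 100 Subs".
        return 100 * self.reliever_rpm_total / self.substitutions

    @property
    def good_outcome_pct(self) -> float:
        return 100 * self.good_outcomes / self.substitutions


def relief_ranks(teams: List[TeamRelief]) -> Dict[str, int]:
    """Rank teams by RPMs per 100 substitutions, best (highest) first."""
    ordered = sorted(teams, key=lambda t: t.rpm_per_100_subs, reverse=True)
    return {t.team: rank for rank, t in enumerate(ordered, start=1)}


# Made-up teams and numbers, for illustration only.
teams = [
    TeamRelief("Team A", substitutions=250, reliever_rpm_total=-20.0, good_outcomes=130),
    TeamRelief("Team B", substitutions=300, reliever_rpm_total=-45.0, good_outcomes=141),
    TeamRelief("Team C", substitutions=200, reliever_rpm_total=5.0, good_outcomes=108),
]

ranks = relief_ranks(teams)
for t in teams:
    print(t.team, ranks[t.team], t.rpm_per_100_subs, t.good_outcome_pct)
```

Note that the two measures need not agree: a team can lead in Good Outcome percentage without having the best Relief Rank, since one counts how often a change helped and the other measures by how much.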

Comparing Starter and Reliever Performances

The preceding chart ranks the starter and reliever cohorts for each team. Of significant interest is the difference between the two rankings for many of the teams. The chart below is ordered by decreasing values of (Reliever Rank - Starter Rank). For example, the largest positive difference is for Cleveland, which ranked 3rd among starting staffs but 24th among reliever staffs. The Yankees have the best reliever staff but their starters rank 16th. The data also shows that 10 teams have differences between 5 and -5, which suggests the two groups of pitchers on those teams are approximately equally competent or incompetent. Note also that the top-performing teams year-to-date are in this middle group.
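The ordering used in that chart can be reproduced with a short Python sketch. Only the two teams whose ranks are quoted in the text are included; the dictionary layout and function name are hypothetical, and treating the bounds of 5 and -5 as inclusive is an assumption.

```python
# Ranks quoted in the text (1 = best).
team_ranks = {
    "Cleveland": {"starter": 3, "reliever": 24},
    "Yankees": {"starter": 16, "reliever": 1},
}


def rank_difference(ranks: dict) -> int:
    """Reliever Rank minus Starter Rank; positive means stronger starters."""
    return ranks["reliever"] - ranks["starter"]


# Order teams by decreasing difference, as the chart does.
ordered = sorted(team_ranks, key=lambda team: rank_difference(team_ranks[team]), reverse=True)
for team in ordered:
    print(team, rank_difference(team_ranks[team]))
# Cleveland 21
# Yankees -15

# Teams whose two staffs rank within 5 places of each other (inclusive, assumed).
balanced = [t for t in team_ranks if abs(rank_difference(team_ranks[t])) <= 5]
```

Neither of these two teams falls in the balanced middle group, since both differences are well outside the range of 5 to -5.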


  • Relief pitcher performances are important in determining the overall success of a team
  • Many factors influence the choice of a relief pitcher
  • The quality of a manager’s choice can be measured by comparing the performance of a pitcher with the performance of the pitcher being replaced. Conventional pitching statistics do not have sufficient granularity to evaluate performance differences.
  • The RPM statistic from RunPlusMinus can be used to measure the quality of each substitution and produce rankings of substitution decisions for each team
  • The contributions of starting pitchers can be measured and compared with those of relief pitchers. There are large differences in the team rankings of starters and relievers.

Until next time…

Stay tuned for our future reports due out every week this season. If you want to be reminded whenever we release new content, please subscribe to our mailing list to be kept up to date!

If you have any questions, comments, requests or complaints, please feel free to add them in the comments below or to email us at info@runplusminus.com

You can learn more about the RunPlusMinus™️ statistic at RunPlusMinus.com



John B. Moore is a professional writer and speaker and Professor Emeritus at the University of Waterloo.