RunPlusMinus 2018 MLB Team Performance Summary

Ivan Lukianchuk
RunPlusMinus
Jan 14, 2019

Introduction

The article here claims that the RunPlusMinus methodology is the best single statistic for measuring the on-field performance of MLB players. Over the course of the 2018 season, play-by-play data was collected and analyzed for all 2,430 MLB regular season games. Because RunPlusMinus statistic values are additive, they can be used to measure team performance as well as player performance. This article provides:

  • A concise summary of the RunPlusMinus methodology
  • Evidence supporting the claim to be the best single statistic
  • Rankings of the teams on Batting, Running, Pitching and Fielding (all MLB teams, AL teams, NL teams)
  • Updated Expected Run values calculated from 2018 games. (Expected Runs is a foundation of OpenWAR, RunPlusMinus and other respected performance-measuring methodologies.)

The year-end report of player performance can be found here.

RunPlusMinus Methodology

The RunPlusMinus statistic calculates a “better off” value for each player’s involvement in every event in a game. That is, the result of each player’s actions leaves his team better or worse off. These better off values can be accumulated to give a positive or negative total that measures how much each player’s performance is above or below average. The principles and methods used to calculate RunPlusMinus values satisfy the five CRAZI properties listed below, which are necessary (but not sufficient) to claim the title of best overall performance statistic:

  • Comprehensive: must measure every player’s involvement in every play in every game
  • Run-based: must be measured in runs (the only objective in a game is to score more runs than the opposing team)
  • Additive: team performances must be the sum of player performances
  • Zero-sum: offense and defense values must be equal and opposite in every play
  • Independent: player performance in each play must be independent of player performance values in preceding plays (this is the reason for using Expected Runs values)

The underlying formula that assigns a performance value to the offense team in every play is:

RPM Value = runs scored + change in potential to score runs

Where the change in potential to score runs = (Expected Runs at the end of the play) minus (Expected Runs at the start of the play).
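The formula can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the RunPlusMinus implementation; the function name `rpm_value` and its parameter names are assumptions, and the numbers are taken from the worked example in the Expected Runs section of this article.

```python
def rpm_value(runs_scored: float, er_start: float, er_end: float) -> float:
    """Offense RPM value for one play: runs scored plus the change in Expected Runs."""
    return runs_scored + (er_end - er_start)


# Numbers from the article's worked example: 1 out with runners on 2nd and 3rd
# (Expected Runs 1.487); a two-run double leaves the play at Expected Runs 0.733.
offense = rpm_value(runs_scored=2, er_start=1.487, er_end=0.733)
defense = -offense  # zero-sum property: the defense gets the opposite value

print(round(offense, 3), round(defense, 3))  # prints 1.246 -1.246
```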

The RPM (better off) value of each player involved in a play is a fraction of his team’s RPM value where the fraction reflects the player’s degree of responsibility for the outcome of the play.
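As a rough sketch of this allocation step, the snippet below divides a team's play value among the players involved. The article does not publish the responsibility fractions, so the shares, the player labels and the function name `split_team_value` are all hypothetical and chosen only to show the mechanics.

```python
def split_team_value(team_value: float, shares: dict[str, float]) -> dict[str, float]:
    """Divide a team's RPM value for one play among the players involved."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("responsibility shares must sum to 1")
    return {player: team_value * share for player, share in shares.items()}


# Hypothetical shares; the real fractions are not given in the article.
offense = split_team_value(
    1.246, {"batter": 0.7, "runner_on_2nd": 0.15, "runner_on_3rd": 0.15}
)
defense = split_team_value(-1.246, {"pitcher": 0.8, "center_fielder": 0.2})

# Player values add back up to the team values, and the two teams cancel out.
print(round(sum(offense.values()) + sum(defense.values()), 9))
```

Because every share set sums to 1, the player values remain additive and the play stays zero-sum regardless of which fractions are chosen.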

Because runs are scored in every completed game and because pitchers have the largest defensive responsibility, the raw RPM values for each player are biased by the player’s position. This bias can be eliminated by using standard deviations and appropriate weights for each of the four performance components (batting, running, pitching, fielding). The result is the RPM Rating statistic that measures each player’s overall contribution to success.
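The article does not spell out how the standard deviations and weights are applied. One common way to combine components measured on different scales is to convert each to z-scores and take a weighted sum, and the sketch below shows that approach. It is an assumption, not the published RPM Rating formula; the raw totals, the weights and the function name `standardize` are invented for illustration.

```python
from statistics import mean, pstdev


def standardize(raw: dict[str, float]) -> dict[str, float]:
    """Convert raw totals to z-scores (mean 0, standard deviation 1)."""
    values = list(raw.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {name: 0.0 for name in raw}
    return {name: (value - mu) / sigma for name, value in raw.items()}


# Made-up raw RPM totals per component, for illustration only.
raw_totals = {
    "batting": {"A": 12.0, "B": -3.0, "C": -9.0},
    "fielding": {"A": -1.0, "B": 2.0, "C": -1.0},
}
# Hypothetical weights; the article does not state the real ones.
weights = {"batting": 0.7, "fielding": 0.3}

z = {component: standardize(totals) for component, totals in raw_totals.items()}
ratings = {
    player: sum(weights[c] * z[c][player] for c in weights)
    for player in ["A", "B", "C"]
}
print({player: round(value, 3) for player, value in ratings.items()})
```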

Evidence Supporting the Claim of Best Baseball Statistic

Of the 117+ baseball statistics listed in Wikipedia, only the RPM statistic satisfies all five CRAZI criteria. (A comparison with OpenWAR can be found here.)

Some examples: most statistics, such as batting averages or ERAs, are either offense or defense measures and are therefore not comprehensive or zero-sum. Many are not additive because they involve aggregate measures such as averages or use regression methods. (For example, a team’s batting average is not the sum of the players’ batting averages.) RBI values are not independent of a player’s position in the lineup. Stats such as caught-stealing percentage are stated as fractions and are not run-based. The RPM Rating statistic does satisfy all five necessary criteria.
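The additivity point about batting averages can be checked with a small calculation. The batting lines below are made up for illustration; only the arithmetic matters.

```python
# Made-up batting lines for two players: (hits, at-bats).
batting_lines = {"A": (30, 100), "B": (10, 50)}

team_hits = sum(hits for hits, _ in batting_lines.values())
team_at_bats = sum(at_bats for _, at_bats in batting_lines.values())

team_average = team_hits / team_at_bats
sum_of_averages = sum(hits / at_bats for hits, at_bats in batting_lines.values())

print(team_hits)                  # 40: hits, a counting stat, add up
print(round(team_average, 3))     # 0.267
print(round(sum_of_averages, 3))  # 0.5: not the team average
```

Counting totals such as hits sum cleanly from players to team, while a ratio such as batting average does not.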

Question 1: Does the Game Winner Always Have a Positive RPM Value?

Yes. Team RPM totals at the end of a game are the sum of the RPM values in each play. The play value for each of the opposing teams can be positive or negative and the sum of the two team values is always zero. As you would expect, a team that wins by a large margin has a large positive RPM total and a team that loses by one run has a small negative RPM total. The relationship between the winning run margin and the average RPM value of the winning team is shown in the chart below. The values are derived from all MLB regular season games in 2018.

The correlation coefficient is 99.9%. It is not 100% because there are variations in how the run difference was achieved. The “Max” line shows that for some run differences there were significant differences in the RPM winning margins. This can happen when, for example, two games are each won by a single run. If one of them had many double plays and the other had many 1-2-3 innings, the RPM values of the winning teams will be different.

Question 2: Do RPM Team Ratings Predict League Standings?

As the season progresses, how do league standings correspond to RPM Ratings? The chart below shows the relationship between team RPMs and games won in the 2018 MLB season. A chart containing the values of wins and RPM totals is included later in this article. The correlation between these values is 96%. It is not higher because league schedules are unbalanced. (Teams do not play each other the same number of times, and therefore the strength of the opposition is different for each team.) Since the RPM values of the two teams in each game sum to zero, it necessarily follows that the average of the teams’ RPMs over the course of the season is also zero.
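The zero-average consequence can be illustrated with a toy schedule. The team labels and game values below are invented; the only property carried over from the article is that the two teams' RPM values in a game are equal and opposite.

```python
from collections import defaultdict

# Made-up results: (team_1, team_2, RPM total of team_1 in that game).
games = [("X", "Y", 2.3), ("Y", "Z", -1.1), ("Z", "X", 0.4)]

season_totals = defaultdict(float)
for team_1, team_2, team_1_rpm in games:
    season_totals[team_1] += team_1_rpm
    season_totals[team_2] -= team_1_rpm  # zero-sum: the opponent gets the negative

league_average = sum(season_totals.values()) / len(season_totals)
print({team: round(total, 1) for team, total in season_totals.items()})
print(round(league_average, 9))
```

However the individual games turn out, every value added to one team is subtracted from another, so the league-wide sum and average stay at zero.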

Question 3: How Good Are RPM Ratings at Predicting Game Winners?

Predictions of game winners are based on:

  1. Each team’s RPM Ratings in the previous 28 days
  2. The performance of the probable starting pitchers
  3. Which team has the home field

The accuracy of predictions for games in the 2018 season was 57%. Recognizing that the best teams only win approximately 66% of their games and that there are other probabilistic variables in every game, this is a very good result. In the 2019 season, we will provide daily forecasts of the winner in upcoming games.

Team Performances

There are 3 charts of team performance: one for the combined 30 teams and one for each of the AL and NL. In each case the best and worst performances are highlighted in green and red respectively.

MLB Rankings and Ratings

The chart below rates all 30 teams with respect to team wins, RPM totals and component performances. In addition, the four rightmost columns show the actual team salaries published by Spotrac Payrolls. The “Performance Based” team salary column is derived from team RPMs and is intended to show what a fair payroll would be given a team’s performance. The Over/Under columns show the difference between the Performance Based salary and the actual team salary. For example, the Red Sox were justified in paying their players $128.2 million, but their actual salary total was $190.3 million, meaning that they overpaid their players by $62.1 million. Note that the original “moneyball” team, the Athletics, got excellent results from their payroll. (They underpaid their players by $64.6 million while getting their 97 wins.) The Tigers were best at getting what they paid for, with an overpayment of only $0.1 million.
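The Over/Under arithmetic is a simple difference, shown here with the Red Sox figures quoted above. The function name `overpayment` is an assumption, and how the Performance Based salary itself is derived from RPMs is not described in the article, so it is treated as a given input.

```python
def overpayment(actual: float, performance_based: float) -> float:
    """Actual minus Performance Based payroll, in millions of dollars.

    A positive result means the team overpaid; a negative result means it underpaid.
    """
    return round(actual - performance_based, 1)


# Red Sox figures from the article: actual 190.3, Performance Based 128.2.
print(overpayment(actual=190.3, performance_based=128.2))  # 62.1
```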

AL Team Performances

The chart below shows the rankings within the AL teams. The Red Sox had the highest payroll and ranked first in batting but only 5th in pitching. The Rays were the most frugal, underpaying their players by $70.1 million while still having the 6th highest number of wins in the AL.

NL Team Performances

The chart below shows the rankings within the NL teams. It shows, for example, that although the Diamondbacks, Nationals and Pirates each had 82 wins, the Pirates achieved this with the smallest payroll of the three teams.

It is interesting that in the AL the Overpaid and Underpaid totals are quite close, whereas in the NL the difference is $30 million, primarily due to the high salaries paid by the Giants.

Expected Runs Update

Each completed half inning begins with zero out and the bases empty and ends with either three out or a walk-off win. Each play begins in one of the 24 possible [bases occupied, outs] states and usually ends in a different [bases occupied, outs] state. The offensive team has a greater probability of producing runs starting from some states than from others. The table below shows the Expected Runs values used in RPM calculations for the 2018 MLB season. For example, if the bases are empty and nobody is out, the average number of runs scored in the remainder of the half inning is 0.497. Expected Runs values are easy to calculate by keeping track of the offensive team’s score at the start of every play and its score at the end of the half inning.

The row and column totals in the table show only the relative importance of each bases-occupied state and each number of outs, respectively.
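The bookkeeping behind an Expected Runs table can be sketched as follows: for each play, record the starting state and the runs the offense scores from that point to the end of the half inning, then average by state. The play records and the state encoding below are made up for illustration, and the function name `expected_runs` is an assumption; a real table would be built from a full season of play-by-play data.

```python
from collections import defaultdict


def expected_runs(plays):
    """Average runs scored from each state to the end of the half inning.

    plays: iterable of (state, score_at_start_of_play, score_at_end_of_half_inning),
    where state is a (bases occupied, outs) pair.
    """
    run_sums = defaultdict(float)
    play_counts = defaultdict(int)
    for state, score_at_start, score_at_end in plays:
        run_sums[state] += score_at_end - score_at_start
        play_counts[state] += 1
    return {state: run_sums[state] / play_counts[state] for state in run_sums}


# Made-up play records; "-23" means runners on 2nd and 3rd, "---" means bases empty.
plays = [
    (("---", 0), 0, 1),
    (("---", 0), 3, 3),
    (("-23", 1), 1, 3),
    (("-23", 1), 0, 1),
]
table = expected_runs(plays)
print(table[("---", 0)], table[("-23", 1)])  # prints 0.5 1.5
```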

One example: suppose there is 1 out with runners on 2nd and 3rd, and the batter hits a double that scores 2 runs. The RPM value for the offensive team equals 2 + (0.733 - 1.487), which equals 1.246 runs. The RPM value is less than 2 because the potential to score runs at the end of the play is less than it was at the start of the play. That is, “it’s not just what you make but what you leave”.

Although each season has slightly different Expected Runs values, the values in the preceding table were used for this and all other forthcoming end-of-season reports. The Expected Runs values for “pure AL” games and “pure NL” games differ very slightly. One could go further and calculate Expected Runs for each team’s home field but our testing shows that this would have a negligible effect when looking at RPM values for the entire season. The values in the preceding table will be used for RPM calculations in the 2019 season.

Conclusions

  • The RunPlusMinus statistic measures how much each player makes his team better off by his involvement in a play
  • The RunPlusMinus statistic has significant explanatory power and good predictive power
  • The power arises from the 5 important characteristics that underlie the methodology of RunPlusMinus calculations. No other published statistic satisfies these five criteria
  • As well as demonstrating almost perfect correlation between 1) Winning run margins and RunPlusMinus rating margins, and 2) Games won and RPM totals, RPM values can be used to suggest team payrolls that are justified by each team’s on-field performance
  • The team performance charts shown in this article also rank the performance of each team in the four components: batting, running, pitching and fielding
  • Expected Runs values in 2018 were slightly different than those in the 2017 season

If you have any questions, comments, requests or complaints, please feel free to add them in the comments below or to email us at info@runplusminus.com

You can learn more about the RunPlusMinus™️ statistic at RunPlusMinus.com

Ivan Lukianchuk
RunPlusMinus

Entrepreneur, Metalhead, Computer Scientist. Currently CTO @RunPlusMinus — The best baseball stat. Principal Consultant at Strattenburg.