RunPlusMinus™ & Wins Above Replacement Comparison

Published in RunPlusMinus, Feb 13, 2019

How does the newcomer RPM compare with, and differ from, WAR?

This article summarizes the main similarities and differences between the RunPlusMinus (RPM) methodology and the Wins Above Replacement (WAR) methodology. We assume the reader has read the description of the assumptions and logic underlying the RunPlusMinus methodology found here. (The last section of this article gives a brief history of its development.)

Several WAR implementations (fWAR, bWAR, rWAR, openWAR) exist, with numerous commonalities and many differences. A chart of the functionality included in each significant implementation of WAR can be found here. WAR implementations have existed for many years. As such, there have been many public articles and presentations that discuss the strengths and weaknesses of the WAR methodology. RunPlusMinus is the new kid on the block, and while a great deal of information is available through its website, it has not yet received the same public scrutiny.

Similarities

(* denotes some minor differences between the two)

Purpose: To calculate numeric values that measure the overall value of a player. The focus is on the outcome of player efforts, not on why the outcome occurred
Input data: Primarily detailed play-by-play data of all MLB games
Park effects included in performance evaluations *
Calculates player contributions in runs *
Values for a team are the sum of participating player values
Graphs of WAR values and RPM ratings have a bell shape with mean value zero
Proof of quality: both are highly correlated with team wins
Both eliminate the inherent biases of many traditional baseball statistics

Data excluded:

Information such as age, experience, injuries, environmental descriptors such as day/night, playing field surface, schedule density, contract status, momentum, etc.
Highly granular data available from Statcast such as pitch speed, length of lead-offs, deployment, etc.

Differences

There are numerous differences between the two methodologies. These have been partitioned into two groups: significant and less significant differences.

Significant Differences

The concept of a replacement player is fundamental to WAR; RPM evaluates the on-field performance of MLB players in a given season.
WAR produces separate values for batters (including running contributions), pitchers and fielding performance, and offense and defense WAR values cannot be added to give a single measure of overall performance. RPM calculates normalized performance ratings for each of batting, running, pitching and fielding, and these ratings can be added to give a single measure of overall player performance.
RPM assigns a responsibility value to each player involved in every play
RPM values are calculated for every player in every play in every game; WAR values do not exist at the play level.
WAR uses many subjective estimates and weighting factors in its formulas; RPM uses far fewer, and its parameters can be modified by a team.
WAR’s unit of performance is measured in wins; RPM player ratings are in runs
WAR calculations are much more complex than RPM calculations. (Explanations of WAR frameworks probably run 200+ pages.)
WAR calculations embody several aggregate statistics; RPM uses only averages and standard deviations
RPM reporting options allow a user to select arbitrary subsets of players and games for analysis

RPM ratings can be used to:

Compare offense and defense players on a single scale
Predict game winners
Support in-game decision making
Compare performance-based salaries with actual salaries

Less Significant

RPM evaluates the consequences of Field Manager decisions — e.g. intentional walks — and calculates RPM Ratings for each
WAR implementations differ in many ways (see chart here); RunPlusMinus has only one implementation
Consequences of errors are calculated differently
Effect of number of games played, plate appearances and batters faced is handled differently

Requirements for an Ideal Metric of Player Performance

The foundations of the RunPlusMinus methodology were first postulated in the late 1980s by John B. Moore, then a professor of Management Science at the University of Waterloo. The “ah-ha” moment came with the realization that modeling each half inning of a baseball game as a Markov chain of events would allow player performance to be measured without the biases associated with the multitude of traditional statistics (Wikipedia describes over 120 of them).
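To make the Markov chain idea a little more concrete, here is a minimal Python sketch. It is not the RunPlusMinus implementation, whose details are not given in this article. It assumes a state is a pair of outs and occupied bases, estimates the expected runs from each state to the end of the half inning from play-by-play records, and values a play as the runs it scored plus the change in expected runs. The names `HALF_INNINGS`, `run_expectancy` and `play_value` are hypothetical, and the three half innings are invented toy data.

```python
from collections import defaultdict

# A state is (outs, runners), with runners written as "---", "1--", "12-", etc.
# Each half inning is a list of (state_before_play, runs_scored_on_play).
# The data below is invented purely to make the sketch runnable.
HALF_INNINGS = [
    [((0, "---"), 0), ((0, "1--"), 0), ((1, "1--"), 2), ((1, "---"), 0), ((2, "---"), 0)],
    [((0, "---"), 0), ((1, "---"), 0), ((2, "---"), 0)],
    [((0, "---"), 0), ((0, "1--"), 0), ((1, "1--"), 0), ((2, "1--"), 0)],
]

END = (3, "---")  # absorbing state of the chain: three outs


def run_expectancy(half_innings):
    """Average runs scored from each state to the end of the half inning."""
    totals = defaultdict(float)
    visits = defaultdict(int)
    for plays in half_innings:
        remaining = sum(runs for _, runs in plays)
        for state, runs in plays:
            totals[state] += remaining  # runs still to come, including this play
            visits[state] += 1
            remaining -= runs
    table = {state: totals[state] / visits[state] for state in totals}
    table[END] = 0.0  # no runs can score after the third out
    return table


def play_value(table, before, after, runs):
    """Run value of one play: runs scored plus the change in expected runs."""
    return runs + table[after] - table[before]


table = run_expectancy(HALF_INNINGS)
# The two-run play that cleared the bases with one out in the first half inning.
print(play_value(table, (1, "1--"), (1, "---"), 2))
```

Because the value of a play depends only on the state before it and the state after it, the same play is worth the same amount regardless of who is batting or how the inning reached that state.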

With the advent of publicly available play-by-play data, it became feasible to implement the RPM methodology. Along with the implementation came the realization that the ideal metric of player performance must satisfy the following five criteria, which are easily remembered via the acronym CRAZI. The RunPlusMinus methodology satisfies all five criteria, as explained below.

Comprehensive: Input data incorporates and evaluates every player’s performance in every play
Run-based: The unit of performance is “runs”. The offensive team’s objective in every play is to produce runs and/or increase the potential to produce runs.
Additive: RPM values are additive. This allows the performances of individual players or groups of players to be compared in user-definable subsets of plays. Team performances are the sum of player performances.
Zero-sum: Offensive and defensive RPM totals sum to zero in every play and every game. Average player performance and team totals across all games are zero.
Independence: Individual player performances in each play are independent of the events that led to the starting state of the play and of a player’s position in the lineup. Offensive and defensive players can be compared using the same metric.
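The additive and zero-sum criteria can be illustrated with a short Python sketch. This is an assumption-laden illustration, not the RunPlusMinus algorithm: it supposes that a play has a single run value from the offense's point of view and that each side's share of it is divided among the players involved using responsibility shares that sum to 1. The names `Credit`, `split_play` and `total`, and the share values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Credit:
    player: str
    side: str    # "offense" or "defense"
    runs: float  # signed run credit for one play


def split_play(value, offense_shares, defense_shares):
    """Split one play's run value among the players involved.

    value: run value of the play from the offense's point of view.
    offense_shares, defense_shares: hypothetical responsibility shares
    per player; each dict is assumed to sum to 1.
    """
    credits = [Credit(p, "offense", value * s) for p, s in offense_shares.items()]
    credits += [Credit(p, "defense", -value * s) for p, s in defense_shares.items()]
    return credits


def total(credits, keep=lambda c: True):
    """Additivity: sum the credits over any subset of players or plays."""
    return sum(c.runs for c in credits if keep(c))


# One play worth +0.5 runs to the offense (made-up value and shares).
credits = split_play(
    0.5,
    offense_shares={"batter": 0.75, "runner": 0.25},
    defense_shares={"pitcher": 0.5, "shortstop": 0.5},
)
print(total(credits))                                 # whole play
print(total(credits, lambda c: c.side == "offense"))  # offense only
```

In this sketch the whole play sums to zero because every run credited to the offense is debited from the defense, and any subset of credits (one player, one team, one game) can be totaled with the same function.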

WAR implementations satisfy some of the criteria to a greater or lesser degree as described below.

WAR is comprehensive at the aggregate level but does not use the same level of granularity when calculating player ratings.
WAR values are in wins rather than runs. Wins are calculated using a heuristic that converts runs to wins.
Team WAR values are totals of player WAR values. However, adding offense and defense player WAR values lacks validity as a measure of total value.
WAR values are zero-sum for position-based, pitching and fielding values.
WAR values rely on estimates of replacement player values, unlike RunPlusMinus ratings, which depend entirely on players’ on-field performances.
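As an illustration of what a runs-to-wins heuristic can look like, the Python sketch below divides a run total by a runs-per-win constant. The default of 10 is the commonly cited rule of thumb that roughly ten runs are worth one win; it is an assumption here, not the constant used by any particular WAR implementation. The function name `runs_to_wins` is hypothetical.

```python
def runs_to_wins(runs_above_baseline, runs_per_win=10.0):
    """Convert a run total into wins using a fixed runs-per-win constant.

    runs_per_win defaults to the rough rule of thumb of ten runs per win;
    treat it as a tunable assumption.
    """
    if runs_per_win <= 0:
        raise ValueError("runs_per_win must be positive")
    return runs_above_baseline / runs_per_win


# A player credited with 25 runs above the chosen baseline.
print(runs_to_wins(25.0))
```

The conversion is a single scaling step, which is why a run-based rating and a win-based rating built on the same run values will rank players identically; the two differ in the baseline and in how the run values themselves are derived.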

Conclusions

WAR values and RunPlusMinus ratings are valuable metrics for evaluating on-field player performances
RunPlusMinus values measure a player’s performance relative to the performances of all other MLB players; WAR adjusts actual performance values by the performance value of a potential replacement for each MLB player
RunPlusMinus ratings have several important applications other than simply calculating a number that measures total player value.