Even more on WAR

Published in

Joe Blogs

7 min read · Nov 22, 2017

The last few days, my Twitter account and probably yours have been buzzing with countless thoughts on Wins Above Replacement. I must admit that I’ve not followed it as closely as I normally would; I’m into the heavy writing on my book on Houdini* and will likely descend deeper and deeper into that for the next couple of months (though there is also PLENTY of Hall of Fame stuff on the way).

*I don’t know if I mentioned it, but I’m writing a book about Harry Houdini … and why he still resonates and inspires and matters today.

Still, I have seen some things. Sean Forman talks about adding a Defensive Leaderboard to Baseball Reference, which will be awesome. He also takes offense at my suggestion that BR does not take great care in translating runs to wins in WAR (this was not exactly what I meant to say but I can see how he took it that way). Dave Cameron over at Fangraphs wrote an excellent piece trying to put WAR in context. Jonathan Judge at Baseball Prospectus wrote about Bill James and how we should handle the noise of baseball. Mitchel Lichtman went on several Tweetstorms about how little some of us understand the real use and purpose of WAR, which in my case is undoubtedly true. And then lots of people offered various opinions about WAR and what it does and what it should do and so on.

I think, overall, this is all very good. Some of the talk has been heated, and I don’t know if anyone is getting their feelings hurt on this — I hope not — but I think this is a necessary discussion to have now.

Here’s why: WAR has won. Baseball statistics come out all the time. Very few of them transcend and become a part of baseball’s language the way WAR has. It is a testament to the power of the idea of Wins Above Replacement and the excellent work of Baseball Reference and Fangraphs and Baseball Prospectus and so on. Runs Created, a magnificent invention of Bill James, never did achieve that level of power. Tom Boswell’s Total Average never did. Fielding Independent Pitching, FIP, is a fascinating and brilliant way to look at pitching, and yet it has not reached that sort of tipping point.

But WAR is now a part of baseball’s mass culture. You will see WAR on scoreboards now. You will hear broadcasters talking about WAR every now and again, and you will see baseball writers include WAR in their stories without even explaining what exactly it means (the ultimate compliment to a stat). Right or not, WAR has unquestionably played a gigantic role in the awards voting of the last few years. WAR is one of the first things anyone talks about when talking about the Hall of Fame.

And with such exposure and power comes great responsibility. Alfred Nobel invented dynamite for the sole purpose of blasting rock; he had no intention for it to be used to blow open safes in bank robberies. At some point, the CORRECT USAGE for WAR is a theoretical thing. What matters is how people actually use it.

And people actually use WAR to compare players’ seasons and determine which one had the better year. People who understand WAR will tell you that’s not necessarily the right way to do it. But part of WAR’s wonder, part of the reason that it has become a beloved stat, is that it gives you a single number that pretty nimbly combines a player’s defense, offense and baserunning. And there’s a glorious simplicity to it.

Baseball Reference: Jose Altuve 8.3, Aaron Judge 8.1.

Fangraphs: Aaron Judge 8.2, Jose Altuve 7.5.

Ah, those numbers speak to me.

But WAR is what WAR is. As Mitchel says, WAR reflects skill and future performance and NOT past contribution to wins. I don’t know how many people who use WAR understand this, though. I like this bit in the BP piece: Jonathan Judge points out that Baseball Reference WAR accounts for about 87 percent of all runs in baseball.

“That’s pretty darned good and not atypical for the various WAR systems. Does that leave 13% of what happens on a baseball field unaccounted for? It sure does. But again, so what? WAR doesn’t pretend this variance does not exist; it merely refuses to punish individual players for the inherent volatility we enjoy seeing in the game.”

That is, more or less, at the heart of what we are talking about. What do we do with that 13 percent? WAR says that the 13 percent of variance (or chance … or luck … or timing) should not belong to the player. It isn’t Anthony Rendon’s fault that he was hitting sixth and so after he hit a double, he ended up stranded at second. It isn’t Joey Votto’s fault that his super-high on-base percentage often goes to waste because other Reds can’t hit. And, a bit more controversially, it isn’t Aaron Judge’s fault that so many of his home runs were hit in blowouts rather than high-leverage situations.

Bill James’ point is that while it might not be the PLAYER’S FAULT (he is the one who uses the word “luck” to describe variance), it is not right to simply write that 13% of volatility out of existence when looking at a player’s season. The point, Bill says, is wins.

So how do we math-challenged mortals feel about this? Well, this is interesting, I think. Tom Tango conducted an experiment, one that I was proud to be involved in. He had four of us (me, our colleague Mike Petriello, Mitchel and Tango himself) put the exact same poll on our Twitter accounts. The poll reads like this:

Now, as you can see here — my followers voted STRONGLY that wins matter. They voted strongly that when considering the Yankees and Astros players — and I think everyone can guess this is directly connected to Judge and Altuve — you have to adjust for the Astros having won more games.

Mike Petriello, who I suspect has a similar batch of followers, ran the exact same poll with the exact same wording, and got similar results. His followers voted 75–25 for wins matter.

Mitchel ran the poll; he has a smaller group of followers, but they are clearly much more sabermetrically inclined than either Mike’s or mine. His followers went exactly the opposite way, voting 71–29 that the players should be judged evenly because only runs matter.

And Tango, whose followers probably skew somewhere in the middle, got a somewhere in the middle result, with his followers voting 52–48 in favor of treating the players equally and ignoring the win difference.

Before going any further, I should add that Bill James ran a similar poll; in his poll, he asked whether luck that leads directly to wins and losses should be ignored because it is luck or if it must be included because it is too important to ignore.

53% said it’s too important; 47% said you have to ignore luck.

Now, you can take from this what you want — or nothing at all — but it seems to me that what we’re seeing is a different comfort level with how statistics treat variance/luck. You could argue that the statistics we have known all our lives — batting average, slugging percentage, ERA, home runs, strikeouts — do not take into account variance/luck. And so we should be comfortable in this world.

This is true. But WAR, as I said, has become hugely popular for a reason. People want to believe that it uses extremely complex calculation and reasoning to give us a wonder-stat, one that answers all questions and sees all worlds. Is that fair? No. But it is real. The more sabermetrically minded people are perfectly at ease with what WAR does and does not do; they are comfortable in the space where variance/luck/timing is assumed to be variance/luck/timing, and a player’s skill is highlighted.

The more casual fan, though, wants a WAR-type number (something more than Win Probability Added, I think, which has not caught on) that reflects directly on what happened during the season.

It’s a mindset. Why do people want this? Why can’t everyone be happy with what WAR does and does not do? Well, I think that many just have a harder time dropping things that happen on the field into the variance box. Maybe it is luck. But it happened. It matters. It changes games. I think of the scene in “Crimes and Misdemeanors” where Sol, the ultra-religious father of the Martin Landau character, is asked: “What if you are wrong about God?”

Sol: “I’ll still have a better life than all those that doubt.”

“Wait a minute,” he is told. “Are you telling me that you prefer God to the truth?”

Sol: “If necessary, I will always choose God over truth.”

The people who invented and circulate WAR certainly do not have to conform their statistic to us dummies who want to believe that luck is more than luck or at least want to see luck incorporated into the stats. Of course they don’t. But they should know that there is a hunger out there. Most people who quote WAR couldn’t tell you one thing about how it works. They assume those things that sabermetricians call timing, variance and luck are already in there.

Baseball Prospectus and Tom Tango have both talked about adding something to WAR, a variance add-on if you will. I must admit I LOVE this concept. I think the ideas are similar, but I’ll talk about Tango’s idea because I might (might) understand it better. His idea is that you can fairly easily have a plus/minus number next to WAR to express timing/variance/luck/whatever you want to call it. So, for instance, if you are comfortable in the world of WAR, you don’t even have to pay attention to the adjustment.

But if you want something that does incorporate timing in there (let’s call this the MVP add-on), you could have a stat that looks like: Judge: 8.2 WAR (-1.3). The 8.2 is his WAR. The -1.3 (and I just chose that number randomly) might represent 1.3 wins lost because of poor timing, lousy teammate performance, bad luck, etc.
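For the spreadsheet-minded, here is a rough Python sketch of how that display could work. It is only an illustration: the `WarLine` class, its field names and the idea that you fold timing in by simply adding the two numbers are all assumptions made for this example, not anything Tango or Baseball Prospectus has published.

```python
from dataclasses import dataclass


@dataclass
class WarLine:
    """One player's WAR plus a hypothetical timing/luck add-on, both in wins."""
    player: str
    war: float
    timing_adj: float  # assumed convention: negative means wins lost to timing

    def label(self) -> str:
        # The "+" flag always prints the sign, so a gain would show as (+0.5).
        return f"{self.player}: {self.war:.1f} WAR ({self.timing_adj:+.1f})"

    def with_timing(self) -> float:
        # Assumption: folding timing in is plain addition, rounded to one decimal.
        return round(self.war + self.timing_adj, 1)


# 8.2 is the Fangraphs figure quoted earlier; -1.3 is the made-up number from the text.
judge = WarLine("Judge", 8.2, -1.3)
print(judge.label())
print(judge.with_timing())
```

Anyone comfortable in the world of WAR reads only the first number; anyone who wants timing counted can look at the combined figure.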

This would offer a little something for different kinds of fans. I don’t know. I think people might really like that.


Written by Joe Posnanski