Best and Worst of Baseball — Batting (Part I)

Mehul Mehta
Published in Five Guys Facts
May 30, 2017

Today we’re going to nerd out on baseball, the OG stat-lover’s sport. As I’m sure you all remember from Moneyball, baseball is perfect for analytics — it’s a series of discrete events that can be analyzed to a depth that is hard to port to more abstract sports like football, basketball, and soccer. Before we dig into some of the specifics, I figured it would be worth describing what some of the more advanced statistics mean.

OBP: On-Base Percentage, the number of plate appearances that ended with the batter on base (by hit, walk, or hit-by-pitch) divided by roughly the total number of plate appearances. Essentially like the standard batting average, but walks count the same as hits. OBPs over .370 are considered great, and anything over .390 is excellent.
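For the code-inclined, here is a minimal Python sketch of the standard OBP formula, which counts hits, walks, and hit-by-pitches in the numerator and at-bats plus walks, hit-by-pitches, and sacrifice flies in the denominator. The stat line in the example is made up for illustration.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    # Guard against an empty stat line.
    return times_on_base / chances if chances else 0.0


# Hypothetical season: 150 hits, 80 walks, 5 HBP, 500 at-bats, 5 sac flies.
obp = on_base_percentage(hits=150, walks=80, hit_by_pitch=5,
                         at_bats=500, sac_flies=5)
print(f"{obp:.3f}")
```

That hypothetical line lands just under .400, which would clear the “excellent” bar above.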

SLG: Slugging Percentage, the total number of bases a player earns on hits divided by at-bats (which, by definition, exclude plate appearances ending in walks). So if in 10 at-bats a player hit 2 singles (1 base each), a double (2), and a home run (4), he would have an SLG of 8/10, or 0.800. SLGs over .450 are good, and anything above .500 is amazing.
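The same arithmetic as a small Python sketch, using the 10 at-bat example from the paragraph above. The function name is just an illustration.

```python
def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0


# The example from the text: 2 singles, 1 double, 1 home run in 10 at-bats.
slg = slugging(singles=2, doubles=1, triples=0, home_runs=1, at_bats=10)
print(f"{slg:.3f}")  # prints 0.800
```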

OPS: On-Base plus Slugging, or OBP + SLG. A somewhat simple way to take a holistic view of batting. It rewards players who are hit/walk machines (OBP) and players who are maybe not hitting all that often, but when they do, it’s flying out of the park (SLG). An OPS above .800 is considered good, and 0.900+ OPS seasons are Hall-of-Fame caliber.

ISO: Isolated Power, calculated as the difference between SLG and regular batting average (AVG). It’s meant to be a crude measure of how much raw power a player has. ISOs over .200 are crazy, and over .250 is Bonds-ian.
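OPS and ISO are both one-line calculations once you have the rate stats. A quick sketch: the OPS example uses the Bonds 2001–2004 OBP and SLG quoted later in this post, and the ISO example reuses the 10 at-bat slugging example (4 hits in 10 at-bats, so a .400 average against an .800 SLG).

```python
def ops(obp, slg):
    """On-base plus slugging."""
    return obp + slg


def isolated_power(slg, avg):
    """Extra bases per at-bat: slugging minus batting average."""
    return slg - avg


bonds_ops = ops(obp=0.559, slg=0.806)
example_iso = isolated_power(slg=0.800, avg=0.400)

print(f"{bonds_ops:.3f}")    # prints 1.365
print(f"{example_iso:.3f}")  # prints 0.400
```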

wOBA: Weighted On-Base Average, a more refined way of accomplishing OPS’s goal of giving a full view of a batter’s value. Each outcome at the plate is given a linear weight meant to capture how much that outcome contributes to generating runs for your team. It addresses a core issue with OBP and SLG: OBP weights singles, doubles, triples, HRs, and walks all the same. That’s clearly not right, and it understates the impact of a power hitter relative to a speedy infield-grounder type of guy. But SLG overcorrects: it says a double is 2x as valuable as a single/walk. Is that true? To that end, is a single worth the same as a walk? A walk only advances runners by force, whereas a single would often score a runner on second. And really, is a home run worth 4x a single/walk?

So basically some really smart baseball statisticians ran the math on how impactful each outcome was on generating runs. Then, to calculate wOBA, you multiply these values by the number of times a batter produced each outcome, add them up, and divide by the number of plate appearances. You are then left with a figure, scaled to look like OBP, that reflects how much run value a player generates per plate appearance for his team. wOBAs above .370 are great, and above 0.400 is excellent.
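Here is a rough Python sketch of that weighted sum. The weights below are illustrative assumptions in the general neighborhood of published values; the real ones are recalculated every season. The denominator is simplified to plain plate appearances (the official version also excludes intentional walks), and the stat line is made up.

```python
# Illustrative linear weights, assumed for this sketch only.
# Published weights are recomputed each season and differ slightly.
WEIGHTS = {
    "walk": 0.69,
    "hit_by_pitch": 0.72,
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "home_run": 2.10,
}


def woba(outcomes, plate_appearances):
    """Weighted sum of outcomes divided by plate appearances (simplified)."""
    weighted = sum(WEIGHTS[name] * count for name, count in outcomes.items())
    return weighted / plate_appearances if plate_appearances else 0.0


# Hypothetical 600 plate appearance season.
season = {"walk": 70, "single": 100, "double": 30, "triple": 3, "home_run": 25}
print(f"{woba(season, 600):.3f}")

# Under these assumed weights, a home run is worth well under 4x a single.
hr_vs_single = WEIGHTS["home_run"] / WEIGHTS["single"]
print(f"{hr_vs_single:.2f}")
```

With weights like these, a homer comes out at a bit more than twice a single rather than four times, which is the kind of answer wOBA gives to the questions above.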

Side note: wOBA is a great way to see the impact of the steroid era. Look at this graph:

wRC+: Weighted Runs Created+. The holy grail of advanced baseball batting statistics. There is a great explanation here, but it’s really quite a complicated calculation. The gist of it is that it’s meant to express how valuable a player has been at creating runs on an intuitive scale, using wOBA as the foundation. The average wRC+ is 100. Every point above or below 100 is a percentage point above or below the league average in generating runs. So a wRC+ of 150 means that a player generated 50% more runs than the average player would have in the same number of plate appearances. The best parts of this stat are the adjustments it includes. A player’s wRC+ is adjusted for both the time period and the parks in which their plate appearances happened.
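This is not the full wRC+ calculation (that needs league and park data), but the way to read the number can be sketched in a few lines of Python. The league rate of 0.12 runs per plate appearance is a hypothetical value chosen for illustration, and the function name is made up.

```python
def runs_created_estimate(wrc_plus, plate_appearances, league_runs_per_pa):
    """Scale league-average run creation by wRC+ / 100.

    Reads wRC+ at face value; no park or era adjustment is done here.
    """
    return wrc_plus / 100 * league_runs_per_pa * plate_appearances


LEAGUE_RUNS_PER_PA = 0.12  # hypothetical league average, for illustration

average_hitter = runs_created_estimate(100, 600, LEAGUE_RUNS_PER_PA)
star_hitter = runs_created_estimate(150, 600, LEAGUE_RUNS_PER_PA)

# A 150 wRC+ hitter creates 1.5x the runs of an average hitter.
print(round(star_hitter / average_hitter, 2))  # prints 1.5
```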

One interesting thing in baseball is that it’s the one major sport without fully standard specifications for field size. So you have to account for the fact that what would be a routine fly ball at AT&T Park in SF is a home run in Yankee Stadium, so as to not bias your view of Giants hitters vs. Yankees hitters over their careers. This stat does that.

Side note: the degree of difference is staggering. At Chase Field in Arizona, for example, you are 1.7 times as likely to hit a home run as at the average park. On the flip side, at Citi Field in New York you are only 0.5 times as likely to hit a home run as at an average park.
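To get a feel for what those multipliers do, here is a deliberately crude Python sketch that divides a home run count by a park’s home run factor. Real park adjustments are more involved (they account for road games and use multiple seasons of data), so treat this as an illustration of the idea only. The home run counts are made up; the 1.7 and 0.5 factors are the ones quoted above.

```python
def park_neutral_home_runs(home_runs_at_park, hr_park_factor):
    """Crude estimate of home runs in a neutral park."""
    if hr_park_factor <= 0:
        raise ValueError("park factor must be positive")
    return home_runs_at_park / hr_park_factor


# Hypothetical: 17 homers hit at Chase Field, 10 hit at Citi Field.
chase_neutral = park_neutral_home_runs(17, 1.7)
citi_neutral = park_neutral_home_runs(10, 0.5)

print(round(chase_neutral, 1))  # prints 10.0
print(round(citi_neutral, 1))   # prints 20.0
```

So the hitter with fewer raw homers at Citi Field would look like the bigger power threat once the parks are evened out.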

Okay, now that we have those basics out of the way, let’s dig in.

Best Hitters In History

Let’s look at this a few different ways.

Top 5 (plus other interesting ones) by OPS:

  1. Babe Ruth (1.164)
  2. Ted Williams (1.116)
  3. Lou Gehrig (1.080)
  4. Barry Bonds (1.051)
  5. Jimmie Foxx (1.038)

  9. Mark McGwire (0.982)
  13. Mike Trout (0.975)
  16. Joey Votto (0.961)
  17. Albert Pujols (0.959)
  19. Miguel Cabrera (0.958)
  22. Gary Sanchez (0.951) (in less than 10% of the plate appearances of the rest, however)

Top 5 by wOBA:

  1. Babe Ruth (0.513)
  2. Ted Williams (0.493)
  3. Lou Gehrig (0.477)
  4. Jimmie Foxx (0.460)
  5. Rogers Hornsby (0.459)

  13. Barry Bonds (0.435)
  36. Mike Trout (0.413)
  43. Joey Votto (0.410)
  58. Jackie Robinson (0.406)

Top 5 by wRC+:

  1. Babe Ruth (197)
  2. Ted Williams (188)
  3. Lou Gehrig (173)
  4. Rogers Hornsby (173)
  5. Barry Bonds (173)
  6. Mike Trout (170)

  12. Mark McGwire (157)
  15. Joey Votto (157)
  22. Gary Sanchez (153)
  31. Aaron Judge (151) (same Gary Sanchez caveat)

Interesting lists, huh? First things first: Babe Ruth is as good as we’ve all been led to think. Notice how clustered wRC+ gets after spot #3. Babe Ruth’s wRC+ was 24 points higher than that of Lou Gehrig at #3 (about 14% more run creation), which is just absolutely stunning. That’s the same gap as the one between Lou Gehrig and Jeff Bagwell, or between Mark McGwire and David Wright. No matter how you slice it, the Babe was the GOAT.
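One subtlety when comparing wRC+ values: a gap in wRC+ points is measured against the league average, which is not quite the same as one player out-producing another by that percentage. A small Python sketch using the Ruth and Gehrig figures from the list above shows both views.

```python
def point_gap(wrc_plus_a, wrc_plus_b):
    """Gap in wRC+ points, i.e. percentage points of league average."""
    return wrc_plus_a - wrc_plus_b


def relative_gap(wrc_plus_a, wrc_plus_b):
    """How much more player A produced relative to player B."""
    return wrc_plus_a / wrc_plus_b - 1


ruth, gehrig = 197, 173

print(point_gap(ruth, gehrig))               # prints 24
print(f"{relative_gap(ruth, gehrig):.1%}")   # prints 13.9%
```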

Even more interesting to me is that while Barry Bonds was obviously aided by some illicit substances, he still doesn’t match up to the greats of old, Ruth, Williams, and Gehrig. We’ll revisit this later.

Before looking at these lists, I had no idea just how good Joey Votto has been. I knew he was a superstar right now, but I didn’t realize it was to this degree. By this metric, he’s been 4 points better than the all-time greats Hank Aaron and Joe DiMaggio. Crazy. Makes articles that ask questions about his HOF candidacy like this one seem a little foolish. (Side note: that article is actually incredible. The author crafts a pretty rigorous machine-learning model to judge Hall of Fame candidacy. If that’s interesting to you at all, I highly recommend the read.)

And we’d be remiss to not look at the unbelievable pace Mike Trout is on. Around 40% of the way through his career (with seemingly some of his best baseball in front of him), he’s breathing down the roided-up neck of Bonds, and could likely claim the #3 spot behind Ruth and Williams in the next few years. It makes sense that teams are thinking about doing crazy shit like intentionally walking him with the bases loaded.

Finally, the Baby Bombers of the Bronx are doing some truly unfathomable stuff through the first 5% of their careers. Obviously we’ll see where they go from here, but if Sanchez and Judge are getting better (Judge, especially, cut his strikeout rate in half from last year to this year), we could have some all-timers on our hands.

Let’s go to the flip-side — who has been the worst hitter in history (min. 1000 at-bats)?

Bottom 5 by wRC+:

  1. Bill Bergen (22)
  2. Luis Gomez (36)
  3. Mario Mendoza (38)
  4. Del Young (39)
  5. Mick Kelleher (40)

In the words of the New York Times about chart-topper Bill Bergen, “if you are going to be bad at something, be spectacularly bad.” This feels like good advice for our dear friend Maurice Flitcroft.

Bergen was a catcher for the Brooklyn Superbas back in the 1900s. He holds several records of futility: the lowest single-season batting average (0.139), the lowest career average (0.170), and the longest streak of hitless at-bats (46). He hit two home runs in his 11-year career and had an ISO of 0.031, which is honestly unbelievable. He played so long because he was a superb defensive catcher, and in the dead-ball era he played in, a plurality of runs were scored via bunts and stolen bases. His rocket arm was apparently much more valuable than the black hole he created in the lineup. In the words of Brian Dorsey describing Davis Treybig in the WBODC 2017, he was, quite simply, “an automatic out.”

He was actually phenomenal behind the plate though — he is in the top 20 of catcher assists in history, and he once threw out six people stealing bases in a single game. The best anecdote about him describes a game in 1904:

His (strong) suit is his wonderful throwing. While playing in the interstate league with Fort Wayne, Ind., Bergen saved the game for his team one day when the bases were full and no one out by catching three men napping (while leading off), one after the other, allowing his team to win.

Amazingly, Bergen’s wRC+ was 14 points lower than that of the second-worst batter of all time, Luis Gomez.

The other notable player on the list is Mario Mendoza. His batting average always hovered around 0.200, and he played in a time (1980s) that was much better covered than Bergen’s days. One of his teammates was making fun of another one, telling him he better watch out or he “might fall below the Mendoza line.” Ever since then, anyone batting below 0.200 in a season is shamefully branded as below the Mendoza line. It has since become a famous part of popular culture, and has an entire Wikipedia page about it. Other uses include:

  • “I don’t think you could find any other figure in politics who has run this far below the Mendoza line and still managed to get taken seriously as a presidential candidate.” — a scathing indictment of Mitt Romney’s political abilities in 2011.
  • “Republican pollster Neil Newhouse… argues that these numbers have crossed below the political ‘Mendoza line’…” — a similarly brutal take on George Bush’s polling numbers 3 years after his Presidency ended
  • From one of our OG shows — HIMYM:

Now I said above we’d return to the Bonds question. Obviously we must view every Bonds statistic with a big fat asterisk, but it was interesting to me that his numbers didn’t stack up with the all-timers given all the hoopla about his run in the early 2000s. Maybe this is a result of the very PED question — he went from skinny Bonds to Hulk Bonds over his career, so maybe the natural days are weighing down his career stats a bit.

10 year difference — wowza

So let’s look at the top 10 single seasons in MLB history:

  1. Barry Bonds (2002) — 244 wRC+
  2. Babe Ruth (1920) — 239
  3. Barry Bonds (2001) — 235
  4. Barry Bonds (2004) — 233
  5. Babe Ruth (1923) — 231
  6. Babe Ruth (1921) — 224
  7. Ted Williams (1957) — 223
  8. Rogers Hornsby (1924) — 221
  9. Ted Williams (1941) — 221
  10. Mickey Mantle (1957) — 217

This is absurd. Bonds was 37 years old in 2002, and he led the league in walks, batting average, on-base percentage, slugging, and OPS. He was slugging 0.799 (keep in mind the mark for “excellent” was 0.500). 2001, at age 36, was the year he hit 73 home runs, and from 2001–2004, his average OBP/SLG/OPS line was 0.559/0.806/1.365. Bonkers.

Maybe the craziest stat of all of this: in the 2004 season, Barry Bonds was intentionally walked 120 times. The next-closest figure for a full season from a player not named Barry Bonds is Willie McCovey’s 45 in 1969. Bonds was walked 37.6% of the time he stepped up to the plate; the next-highest non-Bonds figure is Ted Williams in 1954 with 25.9%. If Bonds had never been intentionally walked in 2004, he would have been on pace to hit 15 more home runs.
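The post does not show how the 15-homer figure was derived. One plausible back-of-the-envelope reconstruction (an assumption, not necessarily the method used here) is to apply his home run rate per at-bat to the 120 intentional walks. The 45 home runs and 373 at-bats are his 2004 totals from the public record, not numbers quoted in this post.

```python
def extra_home_runs(intentional_walks, home_runs, at_bats):
    """Home runs expected if every intentional walk had been an at-bat.

    Assumes the hitter keeps the same home-run-per-at-bat rate.
    """
    return intentional_walks * home_runs / at_bats


# Bonds in 2004: 120 intentional walks, 45 home runs in 373 at-bats.
extra = extra_home_runs(intentional_walks=120, home_runs=45, at_bats=373)
print(round(extra, 1))  # prints 14.5
```

That rounds to about 15, so the estimate is at least consistent with this simple rate-based approach.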

Skinny Bonds was also pretty awesome — his OBP/SLG/OPS in 1990–1993 was 0.432/0.595/1.028 and the mark for an excellent batter is something like 0.400/0.500/0.900. But still — Bonds at his peak was unlike anything we’ve ever seen.

Well, except for Babe Ruth. His 1920–1923 line was 0.506/0.782/1.288. Not quite Bonds-ian, but also… not on steroids.

There are other crazy things to note on here. Ted Williams had two of the top 10 seasons of all time — and they were separated by 16 years. Ages 22 and 38. He also didn’t play three seasons (1943–1945) because he was a badass and a patriot and was serving in the military for the US in WWII.

Finally, a quick return to Mike Trout. About a third of the way through the 2017 season, he is on pace for the 11th best season in history by wRC+. Let’s see if he keeps the pace up, but he’s on track for something special.

To be continued…
