Revisiting Billy Beane’s “On Base Percentage”: Does OBP correlate with success in the modern game?

In 2002, Oakland’s payroll was only a little more than half the MLB average. At just over 40 million dollars, it was the third lowest in the league. Against all odds, the A’s scrapped their way to the top of the American League West and went on an extraordinary 20-game winning streak in the process. For comparison, the New York Yankees’ payroll that year was $125,928,583, more than triple Oakland’s. The A’s went on to lose the American League Division Series to the Minnesota Twins in heartbreaking fashion, but what will be remembered from this season is Billy Beane’s revolutionary use of statistics to assemble a division-winning team of undervalued players. At the heart of this revolution was Beane’s emphasis on on-base percentage. He constantly stressed the importance of players simply getting on base rather than relying on traditional forms of player scouting. After Beane’s success in the early 2000s, many teams adopted his philosophy. For example, the Boston Red Sox used a similar approach on the way to winning the World Series in 2004, ending the 86-year-long Curse of the Bambino.

Using Lahman’s extensive database of MLB statistics, I wanted to assess whether on-base percentage (OBP) actually correlates with wins, compared with the statistic most commonly used to assess hitters: batting average (BA). Surprisingly, OBP does not exist as a column in the database, so I had to calculate it myself and add it to the dataset. The statistic accounts for the main ways a batter reaches base: hits, walks, and being hit by a pitch. OBP is calculated by dividing the sum of hits, bases on balls, and times hit by a pitch by the sum of at-bats, bases on balls, times hit by a pitch, and sacrifice flies.
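
As a rough illustration of the formula just described, here is a minimal Python sketch. The function name, argument names, and the season totals are made up for the example and are not taken from the Lahman data or from my actual analysis.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = at_bats + walks + hit_by_pitch + sac_flies
    if denominator == 0:
        # No qualifying plate appearances, so OBP is undefined.
        return None
    return (hits + walks + hit_by_pitch) / denominator


# Hypothetical team season totals, for illustration only.
example_obp = on_base_percentage(
    hits=1450, walks=600, hit_by_pitch=50, at_bats=5500, sac_flies=50
)
print(round(example_obp, 3))
```

Returning None for a zero denominator is a design choice for the sketch; it keeps empty rows from raising a division error when the function is applied across a whole table.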

I found that OBP declined steadily from its peak in 2006 through 2015, followed by a slight resurgence. Overall, the average OBP per team has dropped roughly 10% since the peak of its popularity in the early 2000s. This may suggest that as baseball tactics have evolved in the past decade, less importance has been placed on this statistic. A Pearson correlation test showed that total wins in a season have only a moderate correlation (0.425) with OBP across the 2000–2021 seasons. On the other hand, the relationship between OBP and runs scored showed a noticeably stronger correlation (0.600). Overall, it would seem that OBP is a better predictor of offensive output than of overall success in a season.
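
For readers who want to see what a Pearson correlation test computes, here is a small sketch in plain Python. The helper name pearson_r and the five sample team seasons are hypothetical, so the printed values will not match the 0.425 and 0.600 reported above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up team seasons, for illustration only.
team_obp = [0.310, 0.325, 0.340, 0.318, 0.332]
team_wins = [74, 81, 95, 88, 79]
team_runs = [650, 720, 810, 700, 760]

r_wins = pearson_r(team_obp, team_wins)
r_runs = pearson_r(team_obp, team_runs)
print(f"OBP vs wins: {r_wins:.3f}")
print(f"OBP vs runs: {r_runs:.3f}")
```

The coefficient always falls between -1 and 1, with values near 0 indicating little linear relationship.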

Another commonly used statistic I tested was batting average (BA). BA is calculated by dividing the number of hits by the number of at-bats. Compared to OBP, this stat uses a far narrower criterion: it measures only a player’s hitting ability rather than how often they get on base. Just as I did with OBP, I calculated the average BA for each team over the 2000–2021 seasons. The trend was almost exactly the same: average BA has consistently declined over the past couple of decades. I then ran a correlation test for the relationship between BA and the number of wins in a season. The correlation coefficient was almost the same as for OBP, a moderate 0.403. The same pattern held for BA and runs scored: as with OBP, the correlation rose, to 0.623.
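
The batting average calculation and the per-season averaging can be sketched in a few lines of Python. I am assuming team rows keyed by yearID, H, and AB, in the style of Lahman column names; the function names and the numbers are hypothetical.

```python
from collections import defaultdict


def batting_average(hits, at_bats):
    """BA = H / AB, or None when there are no at-bats."""
    return hits / at_bats if at_bats else None


def mean_ba_by_year(team_rows):
    """Average the team-level BA within each season."""
    by_year = defaultdict(list)
    for row in team_rows:
        ba = batting_average(row["H"], row["AB"])
        if ba is not None:
            by_year[row["yearID"]].append(ba)
    # Return seasons in chronological order.
    return {year: sum(values) / len(values) for year, values in sorted(by_year.items())}


# Made-up rows, for illustration only (two teams per season).
sample_rows = [
    {"yearID": 2000, "H": 1500, "AB": 5600},
    {"yearID": 2000, "H": 1480, "AB": 5550},
    {"yearID": 2021, "H": 1350, "AB": 5450},
    {"yearID": 2021, "H": 1320, "AB": 5400},
]

for year, mean_ba in mean_ba_by_year(sample_rows).items():
    print(year, round(mean_ba, 3))
```

Averaging the team-level BAs weights every team equally, which matches the "average BA for each team" framing; pooling all hits and at-bats across the league would give a slightly different number.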

In conclusion, popular statistics such as OBP and BA should not be equated with success. It’s common for the media to measure a player’s success based on these stats alone, but this is far from the whole picture. The correlations show that OBP and BA have only a moderate relationship with wins. They are better suited to describing offensive output, but for overall performance and game impact, more stats need to be considered, such as pitching or defensive statistics. Billy Beane revolutionized the use of statistics, especially OBP, but I feel that there is still a lot more to discover in the future of baseball analytics.
