Predicting Attendance
What factors contribute to a team’s home attendance?
A nice, simple one today. When I was young, I assumed the best teams always had the most fans. When I got a little older, I learned that the size of the market matters more for average attendance. But other factors may also influence whether people decide to go to a game! To predict average home attendance, I identified the following primary factors:
- Winning Percentage (via ESPN)
- Population in the team’s metro area (via 2010 US Census)
- Opening Day Payroll (via Spotrac)
- Runs Scored / Game and Runs Against / Game (via FanGraphs)
- Average Ticket Price (via Statista)
- Stadium Size (via ESPN)
Sticking all of these into a multiple regression, we find that, taken together, these variables do a good job of predicting average home attendance! The bad news: with all of them in the model at once, no individual factor is the smoking gun for predicting attendance. This is likely because a lot of these variables are correlated with each other. That is a problem because when two variables, such as runs scored per game and runs allowed per game, carry overlapping information, the regression struggles to separate their individual effects, so they don't both need to be in the model.
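One quick way to see how much the predictors overlap is to compute the correlation between each pair. Here is a minimal sketch in plain Python. The function names (`pearson`, `correlation_table`) and the numbers are made up for illustration; they are not the data or the software used for this post.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_table(columns):
    """Correlation for every unordered pair of named columns."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Made-up numbers for five hypothetical teams, NOT real data.
predictors = {
    "win_pct": [0.40, 0.45, 0.50, 0.55, 0.60],
    "runs_scored_per_game": [3.9, 4.2, 4.4, 4.8, 5.1],
    "runs_allowed_per_game": [5.0, 4.7, 4.5, 4.1, 3.9],
}

for pair, r in correlation_table(predictors).items():
    print(pair, round(r, 2))
```

Pairs with a correlation near +1 or -1 are telling the model nearly the same story, which is a hint that one of them can probably be dropped.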
To fix this issue, let’s redo the regression analysis with a model that predicts attendance using only the three variables that were the most significant in the first try: stadium size, average ticket price, and runs allowed per game.
Voilà! Our model now has an adjusted R-squared of about 70% and an overall P-value reported as 0.000 (that is, less than 0.001), with each individual variable’s P-value being significant at the 0.01 level.
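For anyone curious why the adjusted figure is the one to watch: adjusted R-squared takes the raw R-squared and docks it for every predictor in the model, so piling on extra variables doesn't earn free credit. A small sketch of the standard formula follows. The inputs (a raw R-squared of 0.73 and 30 teams) are hypothetical values picked for illustration, not the numbers from this analysis.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    degrees_of_freedom = n_obs - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_obs - 1) / degrees_of_freedom


# Hypothetical inputs: raw R-squared of 0.73 across 30 teams.
print(round(adjusted_r_squared(0.73, 30, 3), 3))  # three predictors -> 0.699
print(round(adjusted_r_squared(0.73, 30, 7), 3))  # seven predictors -> 0.644
```

With the same raw fit, the seven-variable model gets penalized more than the three-variable one, which is part of the appeal of trimming the model down.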
So there you have it! According to the data, fans like a cheap date, run prevention, and a big ol’ ballpark. Take notes, Portland!
Thanks to Spencer Weisberg!