Are the Chicago Cubs really the ‘lovable losers’?
108 years is a long time. In fact, it is the record held by the Chicago Cubs for the longest championship drought in all the major sports leagues in the United States. My grandmother lived her whole life without ever seeing her beloved Cubbies win a World Series. I assumed my life would be the same. The Billy Goat Curse. Steroids. Steve Bartman. Injuries. Bad trades. Cheap owners. We generated more excuses for losing than we did wins. Then in 2016 the ‘lovable losers’ did the unthinkable and won it all. Our curse had ended, the drought was over, and we were no longer the lovable losers; we were winners.
The more things change though, the more they stay the same. The thing about the Cubs and their fans is that we love our team win or lose. We earned the moniker of the ‘lovable losers’ because, despite the results on the field, the people in the stands were having a good time. We stuck with our team no matter what. We wanted a winner, but we did not need one. When I first learned about sentiment analysis, I immediately knew I wanted to apply my new-found knowledge to this topic. My question was simple: are the Chicago Cubs really the lovable losers?
Hypothesis:
My null hypothesis is that sentiment toward a team rises and falls with the team’s win/loss record, and that positive sentiment drops once the team is eliminated from playoff contention. My alternative hypothesis is that sentiment surrounding the Chicago Cubs stays positive regardless of record or playoff eligibility. I planned to test this by mining tweets about the Chicago Cubs from Twitter and analyzing their sentiment.
Method:
I used snscrape, which scrapes Twitter search results without requiring official API credentials, to collect tweets from the 2022 season. I then applied the sentiment analysis module of the TextBlob library. It provides a simple API that takes a piece of text and returns a polarity score ranging from -1 to 1, where -1 indicates negative sentiment, 0 indicates neutral sentiment, and 1 indicates positive sentiment.
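To make the scoring step concrete, here is a minimal sketch of how a polarity score could be turned into a positive, neutral, or negative label. The helper names and the cut-off (exactly 0 counts as neutral) are my assumptions for illustration, not necessarily what the notebook does. The only TextBlob call used is `TextBlob(text).sentiment.polarity`.

```python
def polarity_of(text):
    """Return TextBlob's polarity score for `text`, a float between -1 and 1."""
    # Imported lazily so the labelling helper below works without TextBlob installed.
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def label_polarity(score):
    """Map a polarity score to a label.

    Assumption: anything above 0 is positive, anything below 0 is negative,
    and exactly 0 is neutral.
    """
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With TextBlob installed, a single tweet could then be labelled with `label_polarity(polarity_of(tweet_text))`.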
To establish some baselines, I chose five significant dates to break up my analysis into time frames from which to scrape tweets. As a result I had data from the beginning of the year to opening day, opening day to the All-Star break, All-Star break to the trade deadline, trade deadline to the day the Cubs were eliminated from playoff contention, and from that day until the end of the season.
I thought of a few possible search terms for my scraping, including Chicago Cubs, gocubsgo, GoCubsGo, Cubbies, and FlyTheW. I settled on simply ‘Chicago’ and ‘Cubs’ because I did not want my search terms to skew my analysis; I wanted to keep it as general and unbiased as possible. I capped each search at 2,000 tweets and collected over 1,000 tweets in every case. Only my two shortest time frames (All-Star break to trade deadline and playoff elimination to season end) did not hit the cap, and they still yielded 1,269 and 1,227 tweets respectively. Here is a sample of the code I used to scrape tweets for the end-of-season time frame:
import pandas as pd
import snscrape.modules.twitter as sntwitter

tweets_list = []
maxTweets = 2000

# Using TwitterSearchScraper to scrape data and append tweets to list
query = '(Chicago and Cubs) since:2022-09-18 until:2022-10-05'
for i, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
    # Stop once maxTweets tweets have been collected (i starts at 0)
    if i >= maxTweets:
        break
    tweets_list.append([tweet.content])

# Creating a dataframe from the tweets list above
endseason = pd.DataFrame(tweets_list, columns=['Tweets'])
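Since the same search has to be repeated for each of the five time frames, one option is to generate the query strings from a list of boundary dates. This is only a sketch: `build_queries` is a hypothetical helper, and it simply reuses the `since:`/`until:` format from the sample above.

```python
from datetime import date


def build_queries(boundaries, terms="(Chicago and Cubs)"):
    """Turn N boundary dates into N-1 search strings, one per consecutive window."""
    queries = []
    for start, end in zip(boundaries, boundaries[1:]):
        queries.append(f"{terms} since:{start.isoformat()} until:{end.isoformat()}")
    return queries


# The two dates below are the ones from the end-of-season sample;
# the earlier boundaries would be added to the front of the list.
end_of_season = build_queries([date(2022, 9, 18), date(2022, 10, 5)])
```

Each string in the returned list can be passed to the scraper in place of the hard-coded query.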
Results:
The results are probably best summed up through visualizations, but those will come a bit later. First off, positive sentiment prevailed throughout the season. My analysis did not cover only positive and negative sentiment but neutral sentiment as well. Not every tweet is going to sway one way or the other; sometimes people just tweet the score or a non-opinionated post about the team. For most of the season, more than 50% of all tweets were positive. The exceptions were the two middle periods: from the All-Star break until the trade deadline, positive tweets dropped to 47.8% of the total, and from the trade deadline until the Cubs were eliminated from the playoffs they recovered only to 49.9%.
Two things are worth noting. First, the Cubs had a losing record for the entire season. It was not a good year for them, and they finished with a 74–88 record. Second, the Cubs did not make any major moves at the trade deadline. In the results we see negative sentiment rise both in the period leading up to the trade deadline and in the period leading up to playoff elimination. Yet in that second period, between the trade deadline and playoff elimination, positive sentiment rises too. The increase in negative sentiment comes at the cost of neutral sentiment, which declines.
An interesting result is that the team saw its highest proportion of positive sentiment after it was eliminated from playoff contention. This period also had the second-lowest percentage of negative sentiment. As the season progressed, though, the Cubs’ win/loss record did steadily improve. They were never able to break the .500 mark, however.
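For anyone curious how percentages like these can be computed, here is a small sketch. It assumes each tweet has already been given a label of 'positive', 'neutral', or 'negative'; the function name is mine, and the labels in the usage line are toy values, not my data.

```python
from collections import Counter


def sentiment_shares(labels):
    """Return the percentage of tweets carrying each label, rounded to one decimal."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Toy input only, to show the shape of the result.
example = sentiment_shares(["positive", "positive", "neutral", "negative"])
```

Running this once per time frame gives one row of percentages per period, which is what the comparisons above rely on.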
I also looked at the most prevalent words in the positive and negative tweets. The most common word in positive tweets was ‘win’, followed by ‘great’, ‘game’, ‘watch’, and ‘fan’. The most common word in negative tweets was ‘game’, followed by ‘team’, ‘fan’, ‘start’, and ‘mlb’. The words ‘loss’ and ‘lose’ did not appear in the top 25 words for negative tweets, meaning that if those words were mentioned, it was fewer than 20 times. More information on this can be seen in my Colab notebook shared at the end of this article.
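A word ranking like this can be approximated with a simple counter. The sketch below is a simplified stand-in: the stop-word list is deliberately tiny, and it does not reduce words to their root form the way a fuller pipeline would. `top_words` and `STOP_WORDS` are names I made up for this example.

```python
import re
from collections import Counter

# A deliberately small stop-word list for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "in", "is", "it", "for", "on", "at", "was"}


def top_words(tweets, n=25):
    """Count lowercase word tokens across tweets, skipping stop words."""
    counts = Counter()
    for tweet in tweets:
        for word in re.findall(r"[a-z']+", tweet.lower()):
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)
```

Calling it separately on the positive and the negative tweets gives two rankings that can be compared side by side.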
Conclusions:
So are the Chicago Cubs really the lovable losers?
The first observation I want to note is that the Cubs maintained a positive sentiment rate of 50% or above for almost all of the season, and positive sentiment always outweighed negative. The Cubs had low expectations coming into the season and a losing record almost the entire year, but at least on Twitter, people remained positive about them. People are really positive about this team overall, no matter what. Positive sentiment generally outweighed neutral and negative sentiment combined.
The second thing that sticks out is how the sentiment trended upward after the trade deadline passed and continued up despite the team being eliminated from playoff contention. This seems to support the idea that Cubs fans do not necessarily care about the team’s performance. There is a party atmosphere around the Cubs and Wrigley Field specifically though, so maybe once fans knew for sure the team was not going to do anything (playoff-wise), they were able to just sit back, drink, and enjoy the atmosphere.
The dip in positive fan sentiment could be attributed to a few things. I believe most fans were looking for the Cubs to make some more trades leading up to the deadline, but they did not make any. Fans usually want their team to improve and contend, and the Cubs did not do that. The Cubs also did not make any moves for their future. Instead, they did nothing while the general consensus was that they should have done something, either to improve now or in the future. Or is it possible that the team’s poor performance over the ‘dog days of summer’ led to fans being less positive?
A big observation is one that seems to throw everything for a loop. When you compare fan sentiment with the team’s record, you see the Cubs’ win percentage actually improved over the points I was tracking. So technically, the team was performing better. As a result, you would expect fan sentiment to increase as well. After all, more wins should mean happier fans.
One final thing stood out for me, and it was three words: much, fun, tonight. These were three words that came up in positive tweets at the end of the season that did not come up in negative tweets. Despite the Cubs losing more than winning, people were still having fun at the games. So while I would have to say my results are inconclusive, there is some evidence to support the theory that the Cubs are indeed lovable losers. Enough evidence that I am encouraged to do further research on the topic.
Comparison:
As a point of comparison, I quickly ran sentiment analysis for another team using the same time frames that I used for the Chicago Cubs. Unbeknownst to most of the world, Chicago has a second baseball franchise: the Chicago White Sox. They are generally an afterthought in the city and only seem to gain support when they are winning. Expectations for the White Sox for the 2022 season were particularly high with many pundits picking them as potential World Series contenders. They did not have a good season and ended up missing the playoffs. I did not track when the White Sox were eliminated from playoff contention, but instead kept my dates consistent. The results are visualized below. Notice the sharp decline in positive sentiment and rise in negative sentiment towards the end of the season.
Future Research:
In the future, it would make sense to track sentiment each month or week (even game by game if we can get enough data) along with win/loss record to see if there’s any correlation. I believe this would be the ultimate test of the question: Are the Cubs the lovable losers?
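One way that correlation check could look is a plain Pearson correlation between the share of positive tweets and the win percentage in each period. This is a sketch of a possible follow-up, not something I ran for this article, and the function name is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)
```

A value near 1 would suggest sentiment follows the record, while a value near 0 would be more in line with the lovable-losers idea. With only a handful of periods, though, any such number should be read with caution.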
There may be some problems with the dates I selected to do my analysis. While they are major points in the season, they aren’t uniform and could be including other minor events that took place during those larger time frames. Segmenting the analysis further would make the most sense, while still noting which events took place during that time frame.
Also, it would make sense to track multiple years’ worth of data in which the team had varied levels of success, then look at the difference between fan sentiment in highly successful years and in extremely unsuccessful ones. Expanding the search terms to include more tweets would also make sense, but we have to be careful not to include unrelated tweets about, say, the Cub Scouts or actual bear cubs.
It would also make sense to do the same sort of analysis for other MLB teams and compare that to the Cubs. Find a team with a similar season to the Cubs and see how their fan sentiment compares to that of Cubs fans. Do all fans react the same, sentiment-wise, or are Cubs fans unique?
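A simple first pass at screening out unrelated tweets could be a phrase filter like the one below. The phrase list is a made-up starting point rather than a tested one, and `is_probably_baseball` is a hypothetical helper.

```python
# Hypothetical phrases that suggest a tweet is not about the baseball team.
EXCLUDE_PHRASES = ("cub scout", "bear cub", "lion cub")


def is_probably_baseball(text):
    """Return False if the tweet contains any of the excluded phrases."""
    lowered = text.lower()
    return not any(phrase in lowered for phrase in EXCLUDE_PHRASES)
```

A filter this crude will miss plenty of edge cases, so spot-checking a sample of what it keeps and drops would still be needed.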
Final Thoughts:
Well, if you made it this far, I owe you a big thanks. I appreciate you taking the time to read my article and experience my thought process. Please feel free to reach out to me and let me know what you think! Here is a link to my Colab notebook for the project, in case you want to see all the actual code I wrote. Go Cubs!