Global Happiness Predicted by Spotify Trends
By: Leah Mizrachi
About this Project
My music taste varies depending on how happy I am.
My close friends and family can predict my mood by looking at my recently played songs on Spotify.
So, I was curious whether this analysis could work on a larger scale: predicting the happiness level of a country based on its music trends.
If music trends are correlated with happiness levels, the results could help predict or diagnose issues a country is facing by inspecting the music taste of its citizens.
To begin, I downloaded the world happiness data from Kaggle. This data source included information about each country’s Happiness Score and different factors that affect it for 2015–2019.
After cleaning it and restructuring each year’s data to fit a common mold, I merged it into one big data source.
Then, I limited the countries I would analyze to only include areas where Spotify data was available since 2016.
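The restructuring and filtering steps can be sketched roughly as follows. This is a minimal Python illustration, not the code used for the project: the header variants in COLUMN_ALIASES, the sample scores, and the helper names harmonize and merge_years are all assumptions made up for the example.

```python
# Hypothetical header variants mapped onto one common schema.
# The real yearly files may use different column names.
COLUMN_ALIASES = {
    "Country": "country",
    "Country or region": "country",
    "Happiness Score": "score",
    "Happiness.Score": "score",
    "Score": "score",
}


def harmonize(rows, year):
    """Rename known columns, drop the rest, and tag each row with its year."""
    out = []
    for row in rows:
        clean = {COLUMN_ALIASES[k]: v for k, v in row.items() if k in COLUMN_ALIASES}
        clean["score"] = float(clean["score"])
        clean["year"] = year
        out.append(clean)
    return out


def merge_years(tables, spotify_countries):
    """Stack all years and keep only countries that have Spotify data."""
    merged = []
    for year, rows in sorted(tables.items()):
        merged.extend(
            r for r in harmonize(rows, year) if r["country"] in spotify_countries
        )
    return merged


# Made-up sample values, for illustration only.
tables = {
    2015: [
        {"Country": "Norway", "Happiness Score": "7.0"},
        {"Country": "Togo", "Happiness Score": "3.0"},
    ],
    2019: [{"Country or region": "Norway", "Score": "7.1"}],
}
merged = merge_years(tables, spotify_countries={"Norway"})
for row in merged:
    print(row)
```

Mapping every header variant onto one name up front keeps the later steps (plots, regression, merges) independent of which year a row came from.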
The graph above shows the mean happiness score per country. There is clear variation between countries. The red line represents the average happiness score across all of them.
The graph also shows that happiness depends a lot on region: for example, Western European countries tend to be happier than Central and Eastern European ones. To explore this relationship further, I graphed the average happiness by region.
After understanding how happiness varies by region and country, I dove deeper into what factors exactly predict this “happiness index”. For this, I ran a linear model on the happiness score of countries.
The regression revealed which factors are the strongest predictors of happiness. The most significant ones were average life expectancy (health), freedom and the year. I was surprised that GDP per capita (econ) was the least significant.
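For readers who want to see what fitting such a linear model involves, here is a minimal sketch in plain Python. It only estimates the coefficients by ordinary least squares; it does not compute the significance levels discussed above. The function name fit_ols and the numbers are made up for illustration, and only the column labels health and freedom come from the data.

```python
def fit_ols(rows, target, predictors):
    """Least-squares coefficients (intercept first) via the normal equations."""
    # Design matrix with a leading 1.0 for the intercept.
    X = [[1.0] + [float(r[p]) for p in predictors] for r in rows]
    y = [float(r[target]) for r in rows]
    k = len(predictors) + 1

    # Augmented system [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    names = ["intercept"] + list(predictors)
    return dict(zip(names, (A[i][k] for i in range(k))))


# Made-up rows where score = 1 + 2 * health + 3 * freedom exactly.
rows = [
    {"health": 0, "freedom": 0, "score": 1},
    {"health": 1, "freedom": 0, "score": 3},
    {"health": 0, "freedom": 1, "score": 4},
    {"health": 1, "freedom": 1, "score": 6},
    {"health": 2, "freedom": 1, "score": 8},
]
coefficients = fit_ols(rows, target="score", predictors=["health", "freedom"])
print(coefficients)
```

In practice a statistics package would report standard errors and p-values alongside these coefficients, which is what makes it possible to rank factors by significance.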
Once I understood how happiness varied by country, the general regional trends, and which factors were the most significant in predicting the happiness of a country, I was ready to introduce the second aspect of the project: music.
This part proved harder than expected because Spotify’s API no longer exposes extractable data on the top charts by country, nor does it provide an easy way to extract music analytics by song name. So, I had to web-scrape each Spotify top 200 chart by country and date.
I was able to collect the top charts for the last week of each year for each country. I put all that information in a data frame.
Then, I had to gather information on each song. After failed attempts to extract it from the API, I found a data set with 600,000+ songs and their analytics. So, I merged those analytics with my scraped top charts and filtered the result. The outcome was a data frame with the top charts by country and year, combined with the analytics of each song.
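The merge between the scraped charts and the song analytics can be sketched as a join on song name and artist. This is a rough Python illustration under assumptions: the field names (track, artist, name, danceability, loudness), the sample songs, and the choice to match on a normalized (name, artist) pair are hypothetical, not a description of the exact matching used in the project.

```python
def normalize(text):
    """Lower-case and collapse whitespace so small formatting differences match."""
    return " ".join(text.lower().split())


def attach_analytics(chart_rows, analytics_rows):
    """Inner join: keep only chart entries that have a match in the analytics set."""
    lookup = {
        (normalize(a["name"]), normalize(a["artist"])): a for a in analytics_rows
    }
    merged = []
    for row in chart_rows:
        features = lookup.get((normalize(row["track"]), normalize(row["artist"])))
        if features is None:
            continue  # no analytics available, so the song is filtered out
        extra = {k: v for k, v in features.items() if k not in ("name", "artist")}
        merged.append({**row, **extra})
    return merged


# Made-up sample data, for illustration only.
chart_rows = [
    {"country": "Norway", "year": 2019, "track": "Song A ", "artist": "Artist X", "streams": 1000},
    {"country": "Norway", "year": 2019, "track": "Song B", "artist": "Artist Y", "streams": 800},
]
analytics_rows = [
    {"name": "song a", "artist": "artist x", "danceability": 0.7, "loudness": -5.2},
]
songs = attach_analytics(chart_rows, analytics_rows)
print(songs)
```

Because unmatched songs are dropped, the merged data frame is smaller than the scraped charts, which is worth keeping in mind when interpreting the results.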
Connection between music and happiness
First, I ran a linear model on the database containing 30,000+ song entries by country and their happiness score. The goal was to analyze which factors of a song were the most significant in determining a country’s happiness score.
The results were very telling in what song factors correlated with happiness: the number of streams, the popularity level, the song duration, the level of explicit content, danceability, the loudness, the acousticness, the tempo and the streams per capita.
So, I plotted some of these factors individually to see the trends between them and happiness. For each factor, the first graph includes all observations to find the general trend of the data. The second graph gives a more summarized view, plotting the average happiness by region against the factor.
Streams per capita vs. Happiness:
We see a clear positive correlation between the number of streams per capita and happiness.
This suggests that the more music a country listens to, the happier it tends to be. This makes sense because listening to music has been linked to psychological benefits.
Beyond that, streams per capita can also reflect a country’s economic situation. The more streams it has per person, the larger the share of the population that is economically secure enough to subscribe to Spotify.
One of the factors affecting a country’s happiness index is GDP per capita. So, it makes sense that the more Spotify is used in a country, the higher its happiness index. However, the analysis on happiness showed that income was not the most important component of happiness.
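To make the streams-per-capita comparison concrete, here is a small Python sketch that derives the metric and measures its linear correlation with happiness. The country names, stream counts, populations, and happiness values are made up, and the helper names are assumptions for the example.

```python
import math


def streams_per_capita(streams, population):
    """Total chart streams divided by the country's population."""
    return streams / population


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up sample values, for illustration only.
countries = [
    {"country": "A", "streams": 5_000_000, "population": 10_000_000, "happiness": 5.0},
    {"country": "B", "streams": 30_000_000, "population": 20_000_000, "happiness": 6.5},
    {"country": "C", "streams": 12_000_000, "population": 4_000_000, "happiness": 7.4},
]
per_capita = [streams_per_capita(c["streams"], c["population"]) for c in countries]
happiness = [c["happiness"] for c in countries]
r = pearson(per_capita, happiness)
print(f"correlation: {r:.2f}")
```

A positive value of r corresponds to the upward trend in the plots, but it says nothing about whether music, income, or something else drives the relationship.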
Popularity of song vs. Happiness:
If we look at each individual song, we see a negative correlation between the global popularity of a song and the happiness of the country listening to it. One possible explanation is that if a country listens mostly to popular songs and follows global trends, its people may be less inclined to explore their own interests.
If people listen mostly to “niche” songs, it may reflect greater freedom in the country, which was one of the most significant factors affecting happiness. In such a country, each person is more encouraged to have their own tastes and preferences.
However, by region, it is harder to see this correlation between popularity and happiness, and the points seem more random.
Duration of song vs. Happiness:
Both the overall regression and the plot by region show a negative correlation between song duration and happiness.
While this correlation is more difficult to explain, one possible reason could be that countries that listen to shorter songs are listening to “better” music.
A recent study found that some of the best songs are getting shorter and that the median length of Billboard 100 songs is decreasing.
The same study showed that the median length of a song also varies by genre. So, the duration of the songs might indicate the different music genres each country is listening to, which might be the real factor correlated with a country’s happiness.
Danceability of Song vs. Happiness:
If we analyze the general trend, we see that happiness decreases as danceability increases, which seems counterintuitive. However, when we analyze each region separately, it is harder to see this trend. All regions, except Latin America, fall in a similar range of danceability.
So, this suggests that danceability may be more of a cultural factor, and it affects the music of some regions more strongly, like Latin America.
It would therefore make sense that danceability can help predict happiness, because if we see a very high danceability range, we might infer that the happiness level is similar to that of Latin American countries.
Loudness vs. Happiness:
Loudness behaves much like danceability: overall, we see a negative correlation between louder songs and happiness. We also see Latin America as the outlier (just like with danceability).
A recent study showed that listening to loud music can help relieve stress. So it makes sense that in countries with lower happiness indexes, people may turn to loud music to unwind from their daily struggles. The loud music might resonate with their frustrations.
Acousticness vs. Happiness:
Overall, there is a slight positive correlation between the acousticness of music and the happiness of the country. This could be because people in happier countries might be calmer and thus enjoy calmer music.
Interestingly, when we look at specific regions vs. acousticness, we see a very spread-out graph. This could mean the level of acousticness varies a lot by region, not necessarily only because of happiness.
Building prediction models for happiness:
Using the database with the top 30,000+ songs across the world, I built a prediction model to predict whether the country that listened to each song had a happiness score above or below the mean.
In this binary predictor, a “1” signifies that the machine associates the music with a country of above-average happiness. A “0” signifies that the machine associates the music with a country of below-average happiness.
- If every prediction was filled in as “1”, the baseline accuracy was 57%
- However, if we used the prediction model to assign 0s and 1s based on music taste, the accuracy level increased to 95.67%
When I averaged the 0s and 1s that a country received across its top 600 songs, the result correctly matched whether the country was above or below average 94.4% of the time!
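The labelling, the all-ones baseline, and the per-country averaging can be sketched as follows. This is a minimal Python illustration with made-up numbers; the helper names and the 0.5 threshold used to turn a country's averaged predictions into a 0 or 1 are assumptions, since the exact rule is not spelled out above.

```python
from collections import defaultdict
from statistics import mean


def binary_labels(scores):
    """Map country -> 1 if its happiness score is above the mean, else 0."""
    average = mean(scores.values())
    return {country: int(score > average) for country, score in scores.items()}


def baseline_accuracy(true_labels):
    """Accuracy obtained by predicting 1 for every song."""
    return sum(true_labels) / len(true_labels)


def country_level_accuracy(song_predictions, labels, threshold=0.5):
    """Average each country's per-song predictions, then compare to its label."""
    by_country = defaultdict(list)
    for country, prediction in song_predictions:
        by_country[country].append(prediction)
    hits = sum(
        int(mean(preds) >= threshold) == labels[country]
        for country, preds in by_country.items()
    )
    return hits / len(by_country)


# Made-up sample values, for illustration only.
scores = {"A": 7.0, "B": 4.0, "C": 6.0}
labels = binary_labels(scores)

# (country, predicted label) for each song; these would come from the model.
song_predictions = [
    ("A", 1), ("A", 1), ("A", 0),
    ("B", 0), ("B", 1), ("B", 0),
    ("C", 0), ("C", 0), ("C", 1),
]
true_song_labels = [labels[country] for country, _ in song_predictions]

print("baseline:", baseline_accuracy(true_song_labels))
print("country level:", country_level_accuracy(song_predictions, labels))
```

The baseline matters because the classes are unbalanced: a model is only useful if it beats the accuracy of always guessing the more common class.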
Since the accuracy of the binary predictor was so high, I wanted to add further complexity to the model, allowing it to predict which quartile of happiness a country would fall in based on its music taste.
The results were also very significant:
- If every prediction was filled in as “1”, the baseline accuracy was 25%
- However, if we used the prediction model to assign values of 1, 2, 3, 4 based on music taste, the accuracy level increased to 50.4%
Then, I wanted to build a model using the Naive Bayes algorithm, which resulted in an accuracy level of 57% with the testing data!
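As a rough picture of how a Naive Bayes classifier works on song features, here is a small Gaussian Naive Bayes written in plain Python. The Gaussian variant is an assumption, since the exact variant is not stated here, and the feature values (danceability, loudness) and quartile labels are made up for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def fit_gaussian_nb(X, y, eps=1e-9):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(X),
            "means": [mean(col) for col in columns],
            # eps avoids division by zero when a feature is constant in a class.
            "vars": [pvariance(col) + eps for col in columns],
        }
    return model


def predict(model, features):
    """Return the class with the highest log posterior, treating features as independent."""
    def log_posterior(params):
        total = math.log(params["prior"])
        for x, m, v in zip(features, params["means"], params["vars"]):
            total += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return total

    return max(model, key=lambda label: log_posterior(model[label]))


# Made-up training songs: [danceability, loudness] -> happiness quartile.
X = [[0.2, -10.0], [0.3, -9.0], [0.8, -4.0], [0.9, -5.0]]
y = [1, 1, 4, 4]
model = fit_gaussian_nb(X, y)
print(predict(model, [0.25, -9.5]))
print(predict(model, [0.85, -4.5]))
```

The "naive" part is the independence assumption: each song feature contributes to the score separately, which keeps the model simple even though features like loudness and danceability are likely related.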
When I averaged the 1–4 rankings that a country received across its top 600 songs, the result correctly matched the country’s quartile 59.94% of the time!
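The quartile labels and the averaged ranking can be sketched like this in Python. The sample scores are made up, and two details are assumptions for the example: quartile 1 is taken to be the least happy group, and a country's averaged ranking is rounded half up to the nearest quartile.

```python
from bisect import bisect_left
from statistics import mean, quantiles


def quartile_labels(scores):
    """Map country -> quartile 1..4 (1 = least happy, an assumed convention)."""
    cuts = quantiles(list(scores.values()), n=4)  # three cut points
    return {country: 1 + bisect_left(cuts, score) for country, score in scores.items()}


def country_quartile(song_rankings):
    """Average the per-song quartile predictions and round half up (assumed rule)."""
    return min(4, max(1, int(mean(song_rankings) + 0.5)))


# Made-up happiness scores for eight hypothetical countries.
scores = {f"country_{i}": float(i) for i in range(1, 9)}
labels = quartile_labels(scores)
print(labels)

# Per-song quartile predictions for one country; these would come from the model.
print(country_quartile([3, 4, 4, 4]))
```

Because each quartile holds a quarter of the countries, always guessing the same quartile is right about 25% of the time, which is the baseline the model has to beat.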
Music was shown to be a powerful predictor of the overall happiness of a country. We can understand a lot about a population based on what music they are listening to.
The results of this analysis left me curious about other ways music can speak to global or individual issues. For example, could music trends be used to predict who will win an election? Pre-diagnose psychological disorders? The applications seem endless.
Policymakers might benefit from analyzing their countries’ music trends to better understand the populations they lead.