Comparing the Music Tastes of Different Countries

Justin Chen
Published in The Startup
4 min read · Jun 17, 2020

I recently discovered through Reddit that Spotify makes a wealth of data publicly available. Most relevant here, every song in its library is assigned values for its energy, acousticness, and a host of other musical features. By comparing these features across the listening trends of different countries, we can investigate global music taste as detected by Spotify’s algorithms.

For this comparison, I looked at the following 5 features as described by Spotify:

Danceability: How suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity.

Energy: A perceptual measure of intensity and activity. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.

Acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.

Speechiness: A measure of the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value.

Valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track.

For a sample of the music tastes of each country, I took Spotify’s Top 50 playlist for each country, and averaged the features of all the tracks in the playlist. For example, here’s the data for the US:

Since Top 50 playlists are filled mostly with pop, the danceability and energy levels are universally quite high. In the US, the prevalence of electronic/synthesizer beats means that the acousticness and speechiness values are low. However, this is not the case for every country. Overlaying the data for Vietnam, we can see that Vietnamese pop is slightly less danceable, but much more acoustic than American pop.
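The averaging step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code behind the charts: it assumes each track is already available as a dict keyed by the five feature names, and `playlist_profile` and the sample numbers are made up for the example.

```python
from statistics import mean

# The five features compared in this post.
FEATURES = ("danceability", "energy", "acousticness", "speechiness", "valence")


def playlist_profile(tracks):
    """Average each audio feature over all tracks in one playlist."""
    return {name: mean(track[name] for track in tracks) for name in FEATURES}


# Made-up values for illustration only, not real Spotify data.
sample_tracks = [
    {"danceability": 0.8, "energy": 0.6, "acousticness": 0.1,
     "speechiness": 0.05, "valence": 0.5},
    {"danceability": 0.6, "energy": 0.8, "acousticness": 0.3,
     "speechiness": 0.15, "valence": 0.7},
]

profile = playlist_profile(sample_tracks)
print(profile)
```

Running this once per country's Top 50 playlist would give one five-number profile per country, which is the form the comparisons below rely on.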

Measuring Distance

Next, we need a measure of how close or far apart countries’ musical tastes are. Since each country’s data is a point in 5D space, I used the Euclidean distance between the points and took its logarithm to base 1/e (equivalently, the negative natural logarithm) to fit the values on a graph.
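As a rough sketch of that measure, the snippet below computes the Euclidean distance between two five-feature profiles and then takes the logarithm to base 1/e, which works out to the negative natural log. The function names and the two profiles are hypothetical, and treating the result as a "similarity" score (larger means closer) is my reading of the transformation, not something taken from the original code.

```python
import math


def taste_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.dist(a, b)  # requires Python 3.8+


def taste_similarity(a, b):
    """Log base 1/e of the distance, i.e. -ln(distance).

    Smaller distances give larger scores. Identical profiles have
    distance 0, where the logarithm is undefined, so return infinity.
    """
    d = taste_distance(a, b)
    if d == 0:
        return math.inf
    return -math.log(d)


# Made-up profiles in the order: danceability, energy, acousticness,
# speechiness, valence. These are not real Spotify averages.
country_a = (0.7, 0.6, 0.2, 0.1, 0.5)
country_b = (0.6, 0.5, 0.5, 0.05, 0.5)

print(taste_distance(country_a, country_b))
print(taste_similarity(country_a, country_b))
```

Because every feature lies between 0 and 1, most pairwise distances are well below 1, so the negative log spreads out small differences and yields positive scores for close pairs.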

Running the numbers for Spain, the first thing which stands out is a clear gap between Spanish-speaking countries and the rest of the world. It turns out that Spanish-speaking countries all have very similar song tastes.

Looking back at the similarity data for Spain, we can see a smaller drop-off toward the tail end of the graph, signifying the contrast between the music of Spain and that of East and South Asia. Indeed, putting all the countries into a graph and running a clustering algorithm splits global music taste into three clusters.

(Inspired by this post, which clusters by shared tracks. Compare the two graphs!)

Roughly speaking, the three clusters are:

  • The Spanish-speaking countries, whose music is highly danceable
  • The Asian countries, whose playlists are less boisterous and more acoustic
  • Everyone else.

As we previously observed, the Spanish-speaking countries form a close-knit clique among themselves. The Asian countries, on the other hand, are much farther from each other. Unlike the Spanish-speaking countries, they don’t share a language, culture, or history under any sort of unifying empire; this diversity shows in the graph.

Interestingly enough, Israel’s Top 50 Playlist, which consists of a combination of global pop and Hebrew tracks, fits more into the Asian cluster than the global one. On the other hand, the upbeat, high-energy music of Japan puts it into the global cluster, albeit very far from every other country in the sample.

Some other surprises include secret friendships:

And the fact that Jpop is quite good (this is not related to the data). If you happen to live in Belgium, give New Zealand’s music a try! For Americans, go listen to the music of Luxembourg or, if you want something different, try the music of Hong Kong and Vietnam. If nothing else, it’ll be a fresh experience.

Made using the Spotify dev API, matplotlib, and Gephi. Code used here.
