Published in Geek Culture


By: Dara Tan (Project Lead), Jeffrey Yang, Giselle Kurniawan, Austin Pham, Natanael Wijaya

Analyzing Different Music Charts with the Spotify API


As music and the music industry have evolved over the last half-century, so too have the measures used to quantify the success of a song. For instance, Billboard charts, which have existed since July 1940, once ranked songs based on record and sheet music sales, disc jockey plays and jukebox performances, but have since been updated to base rankings on sales, radio airplay, digital downloads and streaming activity instead.

As a group, we were curious about how charts based on different metrics differ from one another. To this end, we focused our analysis on three sources: US and Canadian radio airplay data from Mediabase, Billboard Year-End Hot 100 charts and yearly ‘Top Hits’ charts on Spotify. We also used the Spotify API to pull the audio features of the songs on these charts.

Audio Features

Spotify’s API provides a list of audio features such as acousticness, energy and speechiness which offer insight into a song’s characteristics. Before delving into our analysis, let us first take a look at a correlation matrix to get a better understanding of the audio features:

From the correlation matrix, we observe that:

  • The strongest positive correlation, with a correlation value of 0.75, is observed between energy, which is the ‘perceptual measure of intensity and activity’, and loudness, which is the ‘overall loudness of a track in decibels’.
  • The strongest negative correlation, with a correlation value of -0.52, is observed between acousticness, which is a ‘confidence measure from 0.0 to 1.0 of whether the track is acoustic’, and energy.
  • Valence, which describes the ‘musical positiveness conveyed by a track’, is positively correlated with both energy and danceability, which describes ‘how suitable a track is for dancing’.
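
For readers who want to reproduce a matrix like this, here is a minimal sketch of a Pearson correlation matrix over audio features. It uses only the Python standard library, and the track values are made up for illustration; they are not taken from our dataset. The function names `pearson` and `correlation_matrix` are hypothetical helpers, not part of the Spotify API.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(tracks, features):
    """Return {(feature_a, feature_b): r} for every ordered pair of features."""
    columns = {f: [t[f] for t in tracks] for f in features}
    return {(a, b): pearson(columns[a], columns[b])
            for a in features for b in features}

# Made-up audio features for four tracks (illustration only).
tracks = [
    {"energy": 0.9, "loudness": -4.0, "acousticness": 0.05},
    {"energy": 0.7, "loudness": -6.0, "acousticness": 0.20},
    {"energy": 0.4, "loudness": -9.5, "acousticness": 0.60},
    {"energy": 0.2, "loudness": -13.0, "acousticness": 0.85},
]
matrix = correlation_matrix(tracks, ["energy", "loudness", "acousticness"])
```

In this toy data, `matrix[("energy", "loudness")]` is positive and `matrix[("energy", "acousticness")]` is negative, mirroring the direction (though not the magnitude) of the relationships listed above.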

Mediabase: Radio Airplay

To begin our analysis, let us look at the radio airplay data that we were able to obtain from Mediabase. A total of 128,000 songs were listed as having ever been spun on radio, but we chose to focus only on records with over 100 spins, resulting in a dataset of 17,000 songs.

With this dataset, we first looked at how spins were distributed among various labels. The chart above shows labels which have accumulated at least 250,000 spins, with larger and darker rectangles corresponding to labels with more spins. This chart is telling of the music landscape at large. As we observe from the chart, the three music giants, who, combined, occupy well over half of the market share, are:

  • Universal, which includes Republic, Interscope, Capitol, Geffen, Def Jam, MCA, Mercury, A&M and Island;
  • Sony, which includes Columbia, RCA, Epic and Arista; and
  • Warner, which includes Atlantic and Elektra.
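
The filtering and grouping described above can be sketched roughly as follows. This is an illustration under stated assumptions rather than our actual pipeline: the record layout (`label`, `spins`), the helper names and the sample numbers are all hypothetical, and the `PARENT` mapping covers only a few of the labels listed above.

```python
from collections import defaultdict

# Partial label-to-parent mapping based on the list above (illustrative only).
PARENT = {
    "Republic": "Universal", "Interscope": "Universal",
    "Columbia": "Sony", "RCA": "Sony",
    "Atlantic": "Warner",
}

def spins_by_label(records, min_record_spins=100, min_label_spins=250_000):
    """Sum spins per label, keeping records with over `min_record_spins`
    spins and labels with at least `min_label_spins` spins in total."""
    totals = defaultdict(int)
    for rec in records:
        if rec["spins"] > min_record_spins:
            totals[rec["label"]] += rec["spins"]
    return {label: n for label, n in totals.items() if n >= min_label_spins}

def parent_share(label_totals):
    """Share of the charted spins held by each parent group ('Other' if unmapped)."""
    grand_total = sum(label_totals.values())
    shares = defaultdict(float)
    for label, n in label_totals.items():
        shares[PARENT.get(label, "Other")] += n / grand_total
    return dict(shares)

# Made-up records for illustration.
records = [
    {"label": "Republic", "spins": 300_000},
    {"label": "Republic", "spins": 50},          # dropped: not over 100 spins
    {"label": "Columbia", "spins": 260_000},
    {"label": "Atlantic", "spins": 240_000},
    {"label": "Atlantic", "spins": 200_000},
    {"label": "Small Label", "spins": 1_000},    # label total below 250,000
]
label_totals = spins_by_label(records)
shares = parent_share(label_totals)
```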

Next, we looked at how the number of spins varied over the years. As we can see from the chart above, there has been an exponential increase in the number of spins, with the greatest increase coming after 2015. Given that we only accounted for records with over 100 spins, this could be a result of either an increase in radio stations and coverage, or a consolidation of top played records in recent years.

Finally, we looked at how the audio features of songs played on the radio have changed over the years. In this regard, we observed the most significant trends for loudness, valence, danceability and speechiness. As the chart above shows, songs played on the radio have gotten louder, more negative (lower in valence), more danceable and more speechy over time.
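
A trend like this comes down to averaging each audio feature per year. Below is a minimal sketch, assuming each song is a dictionary with a `year` key plus one key per feature; the helper name `yearly_feature_means` and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def yearly_feature_means(songs, features):
    """Return {year: {feature: mean value}} for the given songs."""
    by_year = defaultdict(list)
    for song in songs:
        by_year[song["year"]].append(song)
    return {
        year: {f: mean(s[f] for s in group) for f in features}
        for year, group in sorted(by_year.items())
    }

# Made-up songs for illustration.
songs = [
    {"year": 2010, "loudness": -7.0, "valence": 0.60},
    {"year": 2010, "loudness": -6.0, "valence": 0.50},
    {"year": 2020, "loudness": -5.0, "valence": 0.40},
    {"year": 2020, "loudness": -4.0, "valence": 0.30},
]
trend = yearly_feature_means(songs, ["loudness", "valence"])
```

Plotting the values in `trend` against the year would give one line per feature, similar in spirit to the chart described above.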

Billboard/Spotify Comparison

Having looked at the trends in our radio airplay dataset, let us move on to our Billboard and Spotify datasets. For both of these charts, we chose to look at the rankings from 2011 to 2020. Below are some interesting findings we obtained by comparing the two charts.


When comparing the songs on the two charts, one audio feature that caught our attention was valence. As can be seen from the graph above, the mean valence for both Billboard and Spotify decreased initially, but in the last three years, the mean valence for Spotify has seen an uptick while that for Billboard has remained roughly constant. Since the valence of a song indicates how positive it is, this suggests that in recent years, Spotify listeners have begun favoring more positive and upbeat songs, relative to radio airplay and sales, which Billboard charts also take into account. Additionally, we observe that Billboard consistently has a larger variance than Spotify, suggesting that the sentiments expressed in Billboard hits tend to be more varied than those expressed by Spotify hits.
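
To make the mean and variance comparison concrete, here is a small sketch that computes both statistics per chart and year. It assumes rows of the form `(chart, year, valence)` and uses the sample variance from Python's `statistics` module; whether the original graph used the sample or the population variance is not something this sketch can settle. The values are made up.

```python
from collections import defaultdict
from statistics import mean, variance

def valence_stats(rows):
    """rows: iterable of (chart, year, valence).
    Returns {(chart, year): (mean, sample variance)}.
    Each group needs at least two songs for the variance to be defined."""
    groups = defaultdict(list)
    for chart, year, valence in rows:
        groups[(chart, year)].append(valence)
    return {key: (mean(vals), variance(vals)) for key, vals in groups.items()}

# Made-up valence values for illustration.
rows = [
    ("Billboard", 2020, 0.2), ("Billboard", 2020, 0.5), ("Billboard", 2020, 0.8),
    ("Spotify", 2020, 0.5), ("Spotify", 2020, 0.6), ("Spotify", 2020, 0.7),
]
stats = valence_stats(rows)
```

In this toy example the Spotify group has the higher mean while the Billboard group has the larger variance, which is the pattern described above for recent years.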

Another audio feature that we would like to highlight is duration. Some music analysts, for instance Charlie Harding and Nate Sloan from the music podcast ‘Switched on Pop’, have commented that the rise of streaming as a ‘dominant force of distributing music’ has led to a trend of shorter songs due to changes in pay structures, wherein artists are now paid a fixed rate for each stream that lasts at least 30 seconds, as opposed to per song, on streaming platforms. Our graph above seems to both (a) confirm this idea and (b) suggest that this effect became most prominent around 2017. Furthermore, it is interesting to note that both Billboard and Spotify showed similar trends in this chart, suggesting that the impact of streaming platform pay structures on song length has permeated charts that are not purely based on streaming, such as Billboard, as well.


Aside from audio features, we were also interested in comparing the artists whose songs made it onto the two charts. To this end, we first compared the proportion of ‘new’ artists per chart, per year. For this analysis, we considered an artist to be ‘new’ to a chart in a specific year if they had yet to have a song on that chart in previous years of our dataset.
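
The definition of a ‘new’ artist above translates fairly directly into code. The sketch below assumes chart entries are available as `(year, artist)` pairs and measures the proportion over distinct artists per year, which is one possible reading of the metric. Note that under this definition every artist in the first year of the dataset counts as new.

```python
def new_artist_proportion(entries):
    """entries: iterable of (year, artist) pairs for one chart.
    Returns {year: share of that year's distinct artists who did not
    appear on the chart in any earlier year of the dataset}."""
    by_year = {}
    for year, artist in entries:
        by_year.setdefault(year, set()).add(artist)

    seen = set()      # artists that charted in earlier years
    result = {}
    for year in sorted(by_year):
        artists = by_year[year]
        result[year] = len(artists - seen) / len(artists)
        seen |= artists
    return result

# Made-up entries for illustration.
entries = [
    (2011, "Artist A"), (2011, "Artist B"),
    (2012, "Artist A"), (2012, "Artist C"),
    (2013, "Artist A"), (2013, "Artist C"), (2013, "Artist D"), (2013, "Artist E"),
]
proportions = new_artist_proportion(entries)
```

Running this once per chart yields two series that can be plotted against each other by year.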

From the graph above, we see that until 2017, the proportion of new artists per year seems relatively comparable between the two charts, but thereafter the proportion of new artists per year on the Billboard chart seems to stagnate while that on the Spotify chart increases and is noticeably higher than the Billboard chart for the remaining three years. This could suggest that in recent years, it has been easier to get a big ‘break’ on streaming platforms than it is through other means such as radio airplay and sales, which Billboard also takes into account.

Next, we looked at the top artists by entries on both charts. In the graphs above, the ‘mean’ records the average number of songs on a chart that an artist is part of per year, while the ‘sum’ records the total number of songs on a chart that an artist is part of over the 10-year period.
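
As a rough sketch of these two statistics, the snippet below counts entries per artist per year and then derives the ‘sum’ and ‘mean’. It assumes the mean is taken over the years in which the artist appears at least once, and that each song carries a list of credited artists; both the data layout and the helper name are hypothetical.

```python
from collections import Counter, defaultdict

def artist_entry_stats(songs):
    """songs: dicts with 'year' and 'artists' (list of credited names).
    Returns {artist: {'sum': total entries, 'mean': entries per charting year}}."""
    per_year = defaultdict(Counter)  # artist -> Counter({year: entries})
    for song in songs:
        for artist in song["artists"]:
            per_year[artist][song["year"]] += 1
    return {
        artist: {"sum": sum(c.values()), "mean": sum(c.values()) / len(c)}
        for artist, c in per_year.items()
    }

# Made-up chart entries for illustration.
chart = [
    {"year": 2019, "artists": ["Artist A", "Artist B"]},
    {"year": 2019, "artists": ["Artist A"]},
    {"year": 2020, "artists": ["Artist A"]},
    {"year": 2020, "artists": ["Artist B"]},
    {"year": 2020, "artists": ["Artist C"]},
]
entry_stats = artist_entry_stats(chart)
```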

We observe that overall, the Spotify charts seem to be more evenly distributed, with the largest ‘mean’ value being about 4 and the largest ‘sum’ value being about 25, as opposed to approximately 6 and 50 for the Billboard charts. This builds on our observation from the previous graph that charts based purely on streaming activity might be easier to break into: even among the top artists, a single artist dominates less on Spotify than on Billboard.

Furthermore, we observe that the ‘mean’ graphs for both charts are dominated by rappers. This makes sense, since rappers tend to be featured on other artists’ hits on top of making their own hits and thus are more likely to churn out more hits per year. However, we do note that the Spotify charts seem to be less dominated by rappers than the Billboard charts.

Finally, we looked at the proportion of collaborations among the top artists. For this analysis, we first homed in on the top 10 artists per chart based on the total number of hits that each artist was involved in. We then labeled each hit as a collaboration (‘TRUE’) if more than one artist was listed on that hit and not a collaboration (‘FALSE’) otherwise.
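
The labeling rule above can be sketched in a few lines. This is a minimal illustration, assuming each hit is a dictionary with an `artists` list; the helper names and sample songs are hypothetical.

```python
from collections import Counter

def top_artists(songs, n=10):
    """Return the n artists credited on the most songs."""
    counts = Counter(artist for song in songs for artist in song["artists"])
    return [artist for artist, _ in counts.most_common(n)]

def collaboration_share(songs, artist):
    """Fraction of an artist's hits that list more than one artist."""
    hits = [s for s in songs if artist in s["artists"]]
    collabs = [s for s in hits if len(s["artists"]) > 1]  # the 'TRUE' label
    return len(collabs) / len(hits)

# Made-up hits for illustration.
songs = [
    {"title": "S1", "artists": ["Artist A", "Artist B"]},
    {"title": "S2", "artists": ["Artist A"]},
    {"title": "S3", "artists": ["Artist A", "Artist C"]},
    {"title": "S4", "artists": ["Artist B"]},
]
top = top_artists(songs, n=2)
shares = {artist: collaboration_share(songs, artist) for artist in top}
```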

From our graphs above, we see a rather substantial overlap of top artists, with seven artists — namely Drake, Rihanna, Nicki Minaj, Justin Bieber, Ariana Grande, Chris Brown and Taylor Swift — being listed in the top 10 of both charts.

We also observe that Taylor Swift is the only artist of these seven to have made the top 10 of both charts without a majority of her hits being collaborations. As shown in the table above, only about 23% of her Billboard hits and 16% of her Spotify hits were collaborations, as compared to about 55% of Billboard hits and 58% of Spotify hits for Ariana Grande, who has the next lowest proportion of collaborations.

Song Recommendation

To conclude our analysis, we explored the use of audio features to recommend songs. We created a simple function that takes a Spotify song URL as input and extracts the song’s audio features. It then finds the song’s nearest neighbors, based on these features, in a dataset of songs that we compiled from multiple artists, and recommends the most similar ones. Below are two examples of the output of our function.
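
For a sense of how such a function might work, here is a simplified sketch; it is not the implementation in our repository. It assumes track URLs of the form `https://open.spotify.com/track/<id>`, leaves out the API call that fetches the audio features, and ranks songs by Euclidean distance over a small subset of features that all lie on a 0 to 1 scale. The helper names, the placeholder track ID and the catalog values are hypothetical.

```python
from math import dist
from urllib.parse import urlparse

# Illustrative subset of features; all three range from 0.0 to 1.0.
FEATURES = ["danceability", "energy", "valence"]

def track_id_from_url(url):
    """Extract the track ID from a URL like https://open.spotify.com/track/<id>."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 2 or parts[-2] != "track":
        raise ValueError(f"not a Spotify track URL: {url}")
    return parts[-1]

def recommend(query_features, catalog, k=2):
    """Return the names of the k catalog songs closest to the query song."""
    q = [query_features[f] for f in FEATURES]
    ranked = sorted(catalog, key=lambda s: dist(q, [s[f] for f in FEATURES]))
    return [s["name"] for s in ranked[:k]]

# Made-up catalog and query features for illustration.
catalog = [
    {"name": "Song A", "danceability": 0.80, "energy": 0.70, "valence": 0.60},
    {"name": "Song B", "danceability": 0.30, "energy": 0.20, "valence": 0.10},
    {"name": "Song C", "danceability": 0.75, "energy": 0.65, "valence": 0.70},
    {"name": "Song D", "danceability": 0.50, "energy": 0.90, "valence": 0.40},
]
query = {"danceability": 0.78, "energy": 0.68, "valence": 0.62}
picks = recommend(query, catalog)
```

If features on different scales are mixed in (loudness, for example, is measured in decibels), they would need to be rescaled first so that no single feature dominates the distance.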

To test our function with other input songs, visit our GitHub repository here.


Overall, our analysis yielded some interesting results. Our work with Mediabase highlighted some key trends in radio airplay, while our comparison of Billboard and Spotify charts over the past decade, both in terms of the audio features of hit songs and the artists whose songs make it onto these charts, brought out some key differences between streaming and more traditional forms of music consumption. We were also able to use the k-nearest neighbor algorithm to identify songs that have similar characteristics to one another. All in all, our analysis provided insight into trends that have arisen across various music charts and metrics today.


