How has Popular Music changed?

A data dive, written as an addendum to our Harvard Intro to Data Science final project analyzing playlist popularity, by Florian Tarlosy, Hugh Schader, and Lucas Chu.

Lucas
10 min read · Dec 11, 2021

Last year, Ekta Negi et al. published data on 169,000 songs from 1921–2020, collected from the Spotify Web API. The dataset contains about 2,000 popular songs from each year of the last century. I analyze these audio features using histograms (bar charts of bin counts) and track how the metrics change over time using binscatters (scatter plots of bin means).
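As a rough reproduction recipe, here is a minimal sketch of both chart types in pandas/matplotlib; the file name and column names are assumptions, not the exact code behind the figures below.

```python
# Minimal sketch of the two chart types, assuming a CSV of the dataset with
# "year" and "danceability" columns (names are illustrative, not exact).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("spotify_tracks_1921_2020.csv")  # hypothetical file name

# Histogram: bar chart of bin counts for one audio feature.
df["danceability"].plot.hist(bins=30)
plt.xlabel("danceability")
plt.show()

# Binscatter: cut the x-axis (year) into bins and scatter the bin means.
year_bins = pd.cut(df["year"], bins=50)
bin_means = df.groupby(year_bins)[["year", "danceability"]].mean()
plt.scatter(bin_means["year"], bin_means["danceability"])
plt.xlabel("year")
plt.ylabel("mean danceability")
plt.show()
```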

Spotify Audio Features

For every track on its platform, Spotify provides data for thirteen audio features. The Spotify Web API developer guide defines them as follows (a short sketch of how to fetch these features yourself appears after the list):

  • Danceability: Describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity.
  • Valence: Describes the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
  • Energy: Represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale.
  • Tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece, and derives directly from the average beat duration.
  • Loudness: The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks.
  • Speechiness: This detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value.
  • Instrumentalness: Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”.
  • Liveness: Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live.
  • Acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic.
  • Key: The estimated overall key of the track. Integers map to pitches using standard Pitch Class notation, e.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on.
  • Mode: Indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0.
  • Duration: The duration of the track in milliseconds.
  • Time Signature: An estimated overall time signature of a track. The time signature (meter) is a notational convention to specify how many beats are in each bar (or measure).
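For readers who want to pull these features themselves, here is a minimal sketch using the spotipy client library; the credentials and the track ID are placeholders, and this illustrates the API rather than the exact collection code behind the dataset.

```python
# Sketch of fetching audio features with spotipy; the client credentials and
# the track ID below are placeholders, not real values from our project.
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(
    auth_manager=SpotifyClientCredentials(
        client_id="YOUR_CLIENT_ID",
        client_secret="YOUR_CLIENT_SECRET",
    )
)

# audio_features takes a list of track IDs (up to 100 per call) and returns
# one dict per track with keys such as "danceability", "valence", "energy",
# "tempo", "loudness", "key", "mode", and "duration_ms".
features = sp.audio_features(["SOME_TRACK_ID"])
print(features[0]["danceability"], features[0]["valence"])
```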

Histograms for your viewing pleasure

Binscatters with analysis

We see that acousticness stays at a relatively stable level of about 0.9, which reflects the recording technology of the time, until the 1950s, when there is a precipitous decline of more than one half, all the way down to about 0.3. It has stayed roughly level there for the past 50 years, so the average top song today is about as acoustic as it was in 1980.

Danceability, an almost polar opposite, tells a very interesting story. When our data starts (there is not much data in the early 1920s), songs are very danceable (about 0.6). There is then a steep decline to a trough of about 0.45 around 1950, followed by an increase through the 1960s and 70s, a quick drop from the mid 1980s to the 90s, a peak just before 2000, and another dip, although the general trend over the century is upward. We do see a precipitous increase in the last 15 or so years, jumping from about 0.55 to almost 0.7, so the average top song has gotten about a quarter more danceable.

We see a similar crash in the 2010s, when the average song duration drops from about 240,000 milliseconds (240 seconds, or four minutes) down to about 200,000 milliseconds, which is three minutes and change (three minutes twenty seconds). So one can say that in the last decade, the average song length has decreased by about 40 seconds, or by roughly one sixth.

Energy follows a similar, but much more continuous, increase than danceability. There is an S-curve with a small bump before an exponential increase that peters off before another bump, forming an upside-down cup. So, surprisingly, there has been a drop in energy in the last 10 years, which could be explained by the increase in minor-key songs. There is a degree of variability here that is not seen in other plots; the other plots with high variation at the start are liveness and mode.

Since popular songs weren’t as explicit in the past, the explicit indicator was 0 (False) for the average top song until 1980, with a few notable outliers. Then the curve moves in three phases: a steep increase, a gentle decrease, and another steep increase. In 2020, half of the top songs are explicit; the rise in explicit songs is almost an exponential skyrocket.

Turning to instrumentalness, we see a Z-shaped discontinuity around 1930. There is a very large drop and recovery, close to a 40% decline followed by a 150% increase, up to a peak around 1950. Then there is an exponential decay (an inverse-like curve) from about 0.5 all the way down to 0.05 in 2020. This huge decline is documented in music journals and worth further study.

Looking at the binscatter for key, it is harder to find insights, since keys are categorical values being averaged together. Each note of the chromatic scale maps to an integer from 0 to 11, so we expect the mean to sit around the middle of that range, and indeed it is a bit less than 6, given that C is mapped to 0. It is interesting that there is an overall decline, then an increase, and then a decline again, which may be too coarse to yield specific insights but could indicate trends in the usage of specific keys, which we can track with a histogram (see the sketch below): C, G, and D take the top three spots.
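Here is a small sketch of that key histogram, assuming the same hypothetical file as before with an integer key column using Spotify's pitch-class mapping.

```python
# Sketch of a key histogram, assuming a DataFrame with an integer "key"
# column using Spotify's pitch-class mapping (0 = C, 1 = C#/Db, ..., 11 = B).
import pandas as pd
import matplotlib.pyplot as plt

PITCH_CLASSES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
                 "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]

df = pd.read_csv("spotify_tracks_1921_2020.csv")  # hypothetical file name
key_counts = df["key"].value_counts().sort_index()
key_counts.index = [PITCH_CLASSES[int(k)] for k in key_counts.index]

key_counts.plot.bar()
plt.ylabel("number of tracks")
plt.show()
```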

The binscatter looks like a mountain range

Another key variable for songs is tempo. Unlike most features, tempo falls, rises, falls, rises, and falls again over the last century. Starting at around 110 BPM in 1930, tempo climbs to almost 125 BPM, peaking just before 1980 and again around 2010.

As pianists are well familiar with, a key factor in music is volume. For loudness, there is a steady increase from 1920 to 2020, from about -18 decibels to about -8. This is interesting because Spotify actually normalizes the loudness of tracks played today, to avoid blasting listeners with sudden jumps in loudness, which can be unpleasant. The normalization adjusts playback to around -14 LUFS (Loudness Units relative to Full Scale), following the International Telecommunication Union's BS.1770 recommendation. But as we can see here, the normalization does not change the underlying recordings: old people are right, songs really have gotten much louder. Since decibels are a logarithmic scale, this roughly 10-decibel change means recorded songs carry about ten times the signal power they did a century ago. Perhaps we shouldn't extrapolate too far into the future.
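A quick back-of-the-envelope check of that arithmetic, using nothing but the definition of the decibel:

```python
# Back-of-the-envelope check: decibels are logarithmic, so a +10 dB change
# corresponds to a 10x ratio in signal power.
delta_db = -8 - (-18)                # change in average loudness, about 10 dB
power_ratio = 10 ** (delta_db / 10)  # convert a dB difference to a power ratio
print(power_ratio)                   # 10.0, i.e. roughly ten times the power
```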

Check out the dataset to find out which songs are speechy!

Speechiness has three strange Orion's-belt-like bands of means between 1920 and 1940; perhaps 1 in 3 songs were speechy in select years. But, as we would expect, most music isn't speechy, until recently, with the exponential rise of rap in the last two decades.

Similar to speechiness, liveness detects human presence, in this case an audience, so a studio recording should score near zero. We find a sharp discontinuity at 2011; the curve looks like an inverted hockey stick. Modern songs are not live.

Mode is the major/minor indicator for songs: 1 stands for major and 0 for minor. We see a sparse start, with a lot of variation before 1950, but then the series quickly tightens and increases all the way to about 0.8. So we can, perhaps surprisingly, characterize postwar pop as very major-key, positive-sounding music. Then there is a sharp decline and an increase again around 1990, plateauing just before 2020.

For positivity, we can also look at valence. This binscatter tells a much more interpretable story, with a trough at 1950, which suggests we can have major-key songs that aren't necessarily positive. Music got about a third sadder and then jumped right back up to where it started over roughly 60 years; optimistically, the increase was much faster than the decline. However, the story isn't over: in 1980 there is a precipitous fall, then a plateau until 2000, when the fall continues all the way down to 0.4 again. Luckily, looking at the last three years, we could be at the start of another positivity rebound.

Finally, when we look at popularity, we see an almost linear increase, which shows a present bias in the Spotify data: modern songs are more popular. Naturally, more recent songs are being played more at the time of the data snapshot, so they register as more popular. But this may also be a function of younger listeners being more active on Spotify.

How do these variables impact the popularity of the song?

Good question! Luckily, this analysis partly answers it, since we are looking at the most popular songs over time. A more in-depth analysis awaits in a future article!

How do these variables impact the popularity of playlists that they are in?

We originally looked at the Million Playlist Dataset, and we saw that three quarters of all playlists have only one follower, leading to charts like these:
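(As an illustration only, since the original charts aren't reproduced here, a plot of that skew might be generated like this; the file and column names are assumptions, not the dataset's exact schema.)

```python
# Illustrative sketch of the follower skew; "playlists.csv" and the
# "num_followers" column are assumed names, not the dataset's exact schema.
import pandas as pd
import matplotlib.pyplot as plt

playlists = pd.read_csv("playlists.csv")
playlists["num_followers"].plot.hist(bins=50, logy=True)
plt.xlabel("followers")
plt.ylabel("number of playlists (log scale)")
plt.show()
```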

Nevertheless, in our final project, we predict the popularity of playlists using clustering, polynomial regression, logistic regression, random forests, and boosting algorithms: https://docs.google.com/document/d/1aiqPGLhWs23gOrQnAjeqwVZ27RDDJCoBJXp_QyYhD78/edit?usp=sharing

We struggle with regressions, but with classification we attain an accuracy of 99% and a ROC AUC of 0.74 using a gradient boosting algorithm with a threshold of 8. My favorite model is to just say that every playlist has one follower, which falls victim to the accuracy paradox.
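To make the accuracy-paradox point concrete, here is a hedged sketch of the setup: if "threshold 8" is read as labeling playlists with more than 8 followers as popular, a gradient boosting classifier and a trivial "predict the majority class" baseline can both reach very high accuracy on such skewed data. The feature columns and schema below are assumptions, not our exact pipeline.

```python
# Hedged sketch of the classification setup; the feature columns and the
# interpretation of "threshold 8" (more than 8 followers = popular) are
# assumptions, not the exact pipeline from the project write-up.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score

playlists = pd.read_csv("playlists.csv")  # hypothetical file name
X = playlists[["danceability", "energy", "valence", "tempo"]]  # assumed features
y = (playlists["num_followers"] > 8).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gbm = GradientBoostingClassifier().fit(X_train, y_train)
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

print("GBM accuracy:", accuracy_score(y_test, gbm.predict(X_test)))
print("GBM ROC AUC:", roc_auc_score(y_test, gbm.predict_proba(X_test)[:, 1]))
print("Baseline accuracy:", accuracy_score(y_test, baseline.predict(X_test)))
# With the vast majority of playlists below the threshold, the baseline's
# accuracy is also very high even though it never flags a popular playlist:
# that is the accuracy paradox.
```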

Concluding thoughts

For many of these graphs, we see a change in the first derivative every 25 years or so, roughly a generation. Often these swings, from peak to trough, represent about a one-quarter change in the metric. So for a few graphs, we could say there is a quarter change every quarter century. Looking at 1950, 2000, and 2020, perhaps these changes, and songs themselves, are indicators of social and economic outlooks.

Through these binscatters, we see that in the last decade songs got faster, shorter, louder, less acoustic and more danceable, more speechy but less live, less major (though still mostly major), a little bit sadder, and a lot more popular. As the people change, so does the music. But luckily, we have the joy of looking back in time to see the best of the past, which stays with us.

Hope you enjoyed this barrage of histograms and binscatters, and learned more about how music has changed.

How do you think music will change in the future?

