Your Top 100 Songs 2020 in Python and Plotly

I looked into the “Your Top Songs 2020” playlist that Spotify generated last year and loaded it into Python. Here’s what I have found so far about my music taste.

Lia Ristiana
Analytics Vidhya
12 min read · Mar 4, 2021


Photo by Mohammad Metri on Unsplash

You have probably noticed that at the end of each year, your friends post their top artists and top songs on Spotify from that year to social media. I did. Whether you liked your friends' music taste (or thought your friends had no taste), you couldn’t help but see #SpotifyWrapped everywhere.

In case you’ve never used Spotify and are not familiar with #SpotifyWrapped, it’s a Spotify feature released at the end of each year that gives you a deep dive into your most memorable listening moments of the year. Spotify also generates a playlist of your most played songs throughout the year. Last year it was called “Your Top Songs 2020”.

Spotify created a playlist containing the 100 most played songs throughout the year for each user.

My Spotify Wrapped

To be honest, I don’t think there is anything to brag about in the music I listen to, other than the fact that I love the most famous band in the world and that I listen to too much sappy & mellow music. Probably. At least, that’s my assumption so far. And here is what Spotify discovered about me last year.

My Spotify Top Artists and Songs in 2020

It’s probably nothing surprising that The Beatles is my top artist; they have always been ever since I discovered them about five or six years ago.

What I actually found quite interesting is that despite being my top artist, none of their songs occupied any of the slots in my top 5 songs. My guess is that even though I listened to The Fab Four a lot, I didn't listen to any particular song on repeat. I listened to their whole discography fair and square.

The Beatles
“Here come the pretty boys.” — The Beatles by The Bob Bonis Archive

So, enough about my #SpotifyWrapped. Let’s get to the point where I put the playlist into Python.

The Code and the Data

Just a heads up that I have put up a notebook on Github containing all the code presented here. You can download and try it yourself. The link is at the end of this article, so scroll down to the bottom.

Also, a quick note, your listening activity during December 2020 wasn’t included in Spotify Wrapped or “Your Top Songs 2020” playlist. So, even though I played a lot of evermore and McCartney III, they don’t count, since those two albums were only released in December.

Using Spotify API

I used a Spotify library called Spotipy to access their API. It’s a lightweight Python library for the Spotify Web API. With Spotipy you get full access to all of the music data provided by the Spotify platform. You can follow the steps in the documentation to install it in your environment.

Before we can start, you have to get your own CLIENT ID by going to the Spotify developer dashboard and creating an application (you have to be logged in first). I named my application ‘music analysis’. If you click on it, you’ll be directed to a page where you can copy your CLIENT ID and CLIENT SECRET for your application.

Spotify for Developer Dashboard

One more thing before we can get things to work. You have to get your Spotify playlist URI, which you can get by going to your Spotify Desktop app and copying it like in the picture below.

Get the URI of the “Your Top Songs 2020” playlist from Spotify Desktop

I hope you find things easy so far. Now that we have got our client id, client secret, and our Spotify playlist URI, we can go straight to the code. Here we go!

Fetching Data

First, you’ll need to set up your variables. Replace the playlist URI with the URI you copied earlier, and CLIENT_ID and CLIENT_SECRET with your own.
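Here is a minimal sketch of what that setup could look like with Spotipy. The placeholder values, the environment-variable fallback, and the helper names `playlist_id_from_uri` and `make_client` are my own assumptions for illustration. It also assumes your playlist can be read with the client credentials flow.

```python
import os

# Placeholders: replace these with your own values from the developer dashboard.
# SPOTIPY_CLIENT_ID / SPOTIPY_CLIENT_SECRET are the variable names Spotipy looks for.
CLIENT_ID = os.environ.get("SPOTIPY_CLIENT_ID", "your-client-id")
CLIENT_SECRET = os.environ.get("SPOTIPY_CLIENT_SECRET", "your-client-secret")
PLAYLIST_URI = "spotify:playlist:your-playlist-id"  # copied from the desktop app


def playlist_id_from_uri(uri):
    """Return the trailing ID of a 'spotify:playlist:<id>' URI."""
    return uri.split(":")[-1]


def make_client(client_id, client_secret):
    """Build a Spotipy client using the client credentials flow."""
    # Imported here so the helper above still works without spotipy installed.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    auth = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
    return spotipy.Spotify(auth_manager=auth)
```

With real credentials in place, `sp = make_client(CLIENT_ID, CLIENT_SECRET)` would give you the client object that the later steps call.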

Then this code below will fetch your playlist data. The data you get will be in JSON format, so to make things easier to read, we convert it to a pandas data frame. What you’ll see is a data frame containing one row, and it’s the data of your playlist.
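A rough sketch of this step, assuming a Spotipy client named `sp`: `sp.playlist(...)` returns one nested dictionary, and `pandas.json_normalize` flattens it into a single row. The helper name `playlist_to_frame` and the hand-made sample dictionary are hypothetical stand-ins so the snippet runs without network access.

```python
import pandas as pd


def playlist_to_frame(playlist):
    """Flatten one playlist object (a dict) into a one-row data frame."""
    return pd.json_normalize(playlist)


# With a real client this would be: playlist = sp.playlist(playlist_id)
# Below is a tiny hand-made stand-in with a similar shape.
sample_playlist = {
    "name": "Your Top Songs 2020",
    "owner": {"display_name": "Spotify"},
    "tracks": {"total": 100},
}

playlist_df = playlist_to_frame(sample_playlist)
print(playlist_df.shape[0])  # number of rows: one playlist, one row
```

Nested keys end up as dotted column names such as `owner.display_name`, which is also where the dotted column names later in this article come from.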

Remember that it’s only your playlist data, not the data of each song in the playlist. To get that, we have to fetch each item (song/track if you want to call it) with the following code.
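One way to fetch the items, sketched under the assumption that `sp` is a Spotipy client: `playlist_items` returns one page of results, and `sp.next(page)` follows the `next` link until there are no more pages. The function name `fetch_playlist_tracks` is my own.

```python
def fetch_playlist_tracks(sp, playlist_id):
    """Collect the track objects of a playlist, following pagination."""
    page = sp.playlist_items(playlist_id)
    items = list(page["items"])

    # Each page carries a 'next' URL until the last one, where it is None.
    while page.get("next"):
        page = sp.next(page)
        items.extend(page["items"])

    # Each item wraps the actual track; skip entries whose track is missing.
    return [item["track"] for item in items if item.get("track")]
```

Usage would be something like `tracks = fetch_playlist_tracks(sp, playlist_id)`, which should give a list of 100 track dictionaries for a Top Songs playlist.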

The result above is still in JSON, so let’s convert it to a data frame and rename several of its columns for easier reading.
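A sketch of the conversion, assuming `tracks` is a list of track dictionaries from the Spotify API. The new column names in `RENAMES` are hypothetical choices of mine, and the sample values (popularity, duration) are made up for illustration.

```python
import pandas as pd

# Hypothetical, friendlier names for a few of the flattened columns.
RENAMES = {
    "name": "track_name",
    "album.name": "album_name",
    "album.release_date": "release_date",
}


def tracks_to_frame(tracks):
    """Flatten a list of track dicts and rename some columns."""
    df = pd.json_normalize(tracks)
    return df.rename(columns=RENAMES)


# Made-up sample with the same shape as a (heavily trimmed) track object.
sample_tracks = [
    {
        "name": "exile",
        "popularity": 80,
        "duration_ms": 285000,
        "album": {"name": "folklore", "release_date": "2020-07-24"},
    }
]

tracks_df = tracks_to_frame(sample_tracks)
print(list(tracks_df.columns))
```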

You can save this data frame to a CSV file if you want, so you don’t need to run the code to fetch the data from Spotify again each time you open your code.
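Saving and reloading could look like the sketch below. The file name and the helper names are assumptions of mine. One thing to keep in mind: columns that hold lists or dictionaries (such as the artists) are written out as plain text, so they come back as strings when you read the CSV again.

```python
import os
import tempfile

import pandas as pd


def save_tracks(df, path):
    """Write the data frame to CSV without the index column."""
    df.to_csv(path, index=False)


def load_tracks(path):
    """Read the data frame back from CSV."""
    return pd.read_csv(path)


# Small made-up frame, written to a temporary folder for demonstration.
demo_df = pd.DataFrame({"track_name": ["exile"], "popularity": [80]})
csv_path = os.path.join(tempfile.gettempdir(), "top_songs_2020.csv")

save_tracks(demo_df, csv_path)
reloaded = load_tracks(csv_path)
```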

If you want to know what data are in this data frame, you can take a look at its columns.

As you see on the screenshot above, this data frame contains information on the album of the track (album name, album artists, release date), the track itself (track duration, track name, popularity, etc), and other relevant information.

Let’s take a look at my top 10 songs here. Artists' data are a little bit trickier to get because they are still in JSON, but here are the track name and the album name.

In case you haven’t noticed, the songs are ordered by how often you played them. A song in a higher position is one you played more than the songs below it. In this case, I played “What If I Never Get Over You” by Lady A (first position) more than I played “More Hearts Than Mine” by Ingrid Andress (fourth position). Both are great songs, by the way!
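Because the playlist order is the ranking, the top 10 is simply the first ten rows. The sketch below assumes the column names `track_name` and `album_name` (my own naming) and uses placeholder rows instead of my real playlist.

```python
import pandas as pd


def top_n(df, n=10):
    """Return the first n tracks with an explicit 1-based rank column."""
    top = df[["track_name", "album_name"]].head(n).copy()
    top.insert(0, "rank", range(1, len(top) + 1))
    return top


# Placeholder rows; in practice this is the 100-row tracks data frame.
demo = pd.DataFrame(
    {
        "track_name": [f"Song {i}" for i in range(1, 13)],
        "album_name": [f"Album {i}" for i in range(1, 13)],
    }
)

top10 = top_n(demo)
print(top10)
```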

Now let’s take a look at the artists' data. Like I said before, it’s tricky because it’s in JSON.

There are two options: you can either get the artist's information from the column ‘album.artists’ or get it from the column ‘artists’. They are slightly different, as you can see on the screenshot above. The song “exile” belongs to the folklore album by Taylor Swift, but the song itself also features Bon Iver. That’s why from the column ‘album.artists’ you only see Taylor’s name, but in the column ‘artists’ you also see Bon Iver’s.
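To make the difference concrete, here is a small sketch with a trimmed-down, hand-written version of the “exile” track object. The helper name `artist_names` is hypothetical; the real objects carry many more fields.

```python
def artist_names(artists):
    """Pull the names out of a list of artist dicts."""
    return [artist["name"] for artist in artists]


# Trimmed-down stand-in for a track object returned by the API.
exile = {
    "name": "exile",
    "artists": [{"name": "Taylor Swift"}, {"name": "Bon Iver"}],
    "album": {"name": "folklore", "artists": [{"name": "Taylor Swift"}]},
}

track_artists = artist_names(exile["artists"])           # everyone on the track
album_artists = artist_names(exile["album"]["artists"])  # the album's artist only
print(track_artists, album_artists)
```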

For simplicity, I am going to get artist information only for the main artist, the one mentioned first. In the example above, that means I am going to only get Taylor Swift and leave out Bon Iver. I will get each artist’s number of followers, music genres, and popularity. You do that for each artist by collecting their URI.
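A sketch of how that lookup could work, assuming `sp` is a Spotipy client (its `artist` method accepts an artist URI) and `tracks` is the list of track dictionaries. The function names and the output keys are my own choices.

```python
def main_artist_uri(track):
    """URI of the first-listed (main) artist of a track."""
    return track["artists"][0]["uri"]


def artist_summary(artist):
    """Keep only the fields of an artist object that we care about here."""
    return {
        "artist_name": artist["name"],
        "followers": artist["followers"]["total"],
        "genres": artist["genres"],
        "popularity": artist["popularity"],
    }


def fetch_main_artists(sp, tracks):
    """One summary per track, in playlist order (artists may repeat)."""
    return [artist_summary(sp.artist(main_artist_uri(track))) for track in tracks]
```

Passing the result to `pandas.DataFrame(...)` gives one row per song. Since this makes one API call per track, you may want to cache results for artists who appear more than once.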

Now you get a neat data frame for each artist in the playlist. I will also add new columns with each song’s name and its popularity score.

You probably see that there are several artists who appear on the playlist multiple times (well, Taylor Swift, The Beatles and their solo work, Pink Floyd, and Queen certainly do). I want a data frame that contains the artist rank and no duplicate information.
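One possible way to do this with pandas is to keep each artist's first appearance and number the remaining rows. Treating the order of first appearance as the “artist rank” is my assumption here, and the column names and follower counts are made up.

```python
import pandas as pd


def unique_artists(artist_df):
    """Keep the first row per artist and add a 1-based artist_rank column."""
    unique = artist_df.drop_duplicates(subset="artist_name", keep="first")
    unique = unique.reset_index(drop=True)
    unique.insert(0, "artist_rank", range(1, len(unique) + 1))
    return unique


# Made-up rows in playlist order, with repeated artists.
demo = pd.DataFrame(
    {
        "artist_name": ["Lady A", "Taylor Swift", "Lady A", "Queen", "Taylor Swift"],
        "followers": [1, 2, 1, 3, 2],
    }
)

ranked = unique_artists(demo)
print(ranked)
```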

Okay, so we’ve got the data we need, now let’s do the fun parts! Visualization!

Plotly to the Rescue!

I’ve been dabbling around with Plotly at work lately, and I want to let others know how awesome this library is. Previously, I was a Matplotlib user for years and years. The shift to Plotly was quite confusing at first, but I finally got used to it. I hope you’ll love it too after trying it out yourself.

How many followers do the artists I listen to have?

My first graphic is about the number of followers my top artists have. You may wonder, do I listen to popular artists or obscure ones? Well, the answer is here.

Most of the artists I listen to have fewer than 5 million followers, and only a few of them have more than 10 million. More followers don't always equal popularity, but it gives you an idea of which artists are more popular.

How about genre distribution?

My initial guess at the question above is that I probably don't listen to a diverse enough range of music genres. I tend to stick to classic rock and country because I can be a little too soft and mellow sometimes.

You collect all genres from each artist to a list like the following code.
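As a sketch of that step: each artist comes with a list of genres, so the lists are flattened into one long list and then counted. The helper name `collect_genres` and the sample genre lists are made up for illustration.

```python
from collections import Counter


def collect_genres(genre_lists):
    """Flatten a list of per-artist genre lists into one list of genres."""
    return [genre for genres in genre_lists for genre in genres]


# Made-up genre lists, one per artist.
demo_genres = [
    ["pop"],
    ["classic rock", "rock"],
    ["classic rock", "album rock", "rock"],
    [],  # some artists have no genres listed
]

all_genres = collect_genres(demo_genres)
genre_counts = Counter(all_genres)
print(genre_counts.most_common(3))
```

The resulting counts are what a bar chart or a word cloud of genres would be built from.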

Here comes the graphic.

I also tried to make a word cloud using the same data.

A word cloud of my most listened to genres, ranging from classic rock, pop, to boy band and Hollywood

Talking about music genres provided by Spotify, their data scientists created a cool website that breaks down and explores each music genre. Take a look at everynoise.com.

Which artists have the most songs in my Top 100?

We already know that my number one artist is The Beatles, but how many of their songs are in my top 100 really?

Apparently, Taylor Swift rules my playlist with 16 songs! Meanwhile, The Beatles only have four songs in my playlist. But considering John Lennon and Paul McCartney both have five songs each (and not to mention Wings too), I guess it’s safe to say I really can’t run away from those boys from Liverpool, huh?
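Counting songs per artist can be sketched with a `Counter` over the main artist of each track. The function name and the short sample list are my own; the real input would be the 100 main-artist names in playlist order.

```python
from collections import Counter


def songs_per_artist(artist_names, top=10):
    """Return (artist, number_of_songs) pairs, most frequent first."""
    return Counter(artist_names).most_common(top)


# Short made-up sample, one entry per song.
demo_artists = [
    "Taylor Swift", "The Beatles", "Taylor Swift",
    "Paul McCartney", "Taylor Swift", "The Beatles",
]

artist_counts = songs_per_artist(demo_artists)
print(artist_counts)
```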

How popular are the songs?

For the next graphic, I want to find out how popular the songs from the top 10 artists I listened to are.

I apparently tend to only listen to their most popular songs when it comes to Taylor Swift’s music, and I admit that I probably listened to Lover and folklore a lot more than others (evermore doesn't count because it only came out in December). For Queen and Lady A, I tend to listen to their more popular songs, though I also explore their less popular ones a bit.

However, when it comes to Fool’s Garden or Paul McCartney, I listened to a broader range of songs in terms of popularity. For example on McCartney’s music, I am sure most people have heard the masterpiece called “Maybe I’m Amazed” from McCartney I, but not many people seem to know “Waterfalls” from McCartney II (the popularity score for this song is less than 30). I know I am late to the party because I only discovered that amazing song last year!

Meanwhile, that little dot at the bottom of Pink Floyd’s line is “Julia Dream”. I didn't know that song was not really popular, as opposed to the other two dots above, “Wish You Were Here” and “Another Brick in The Wall, Pt. 2”. I found it interesting that there was such a huge gap between the songs I listened to from Pink Floyd.

What is My Favorite Year of Music?

Now we are going to make a chart to visualize the release year of the songs that I listened to the most.

You’ll need to get the year first from the album release date and list them.
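A small sketch of the year extraction. Spotify release dates come in day, month, or year precision (for example `2020-07-24`, `1997-05`, or `1980`), so taking the first four characters covers all three. The helper name `release_year` is mine.

```python
def release_year(release_date):
    """Return the year of a 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' string."""
    return int(release_date[:4])


# Sample dates in the three precisions the API can return.
demo_dates = ["2020-07-24", "1997-05", "1980"]
years = [release_year(d) for d in demo_dates]
print(years)  # [2020, 1997, 1980]
```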

Next, we put these data into a bar chart.

This is where I like to point out to my friends, who often accuse me of only listening to older songs, that they are wrong. Most of the songs in my Top 100 playlist are from the year 2020. In summary, even though I love classics and oldies, I listened to a lot of new songs too.

To make it simpler, I will also make a bar chart not by years, but by decade because after all, we like to define the time by the music from that decade, right?
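Grouping by decade only needs integer division: 1997 becomes 1990, 2020 stays 2020. A minimal sketch, with hypothetical helper names and made-up years:

```python
from collections import Counter


def decade_label(year):
    """Map a year to its decade label, e.g. 1997 -> '1990s'."""
    return f"{year // 10 * 10}s"


def songs_per_decade(years):
    """Count songs per decade label."""
    return Counter(decade_label(year) for year in years)


# Made-up release years, one per song.
demo_years = [1967, 1980, 1997, 2019, 2020, 2020]
decade_counts = songs_per_decade(demo_years)
print(sorted(decade_counts.items()))
```

The labels and counts can then go into the same kind of bar chart as the per-year version.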

When you put the data by decade, you see that my top 100 songs are mostly from the 2000s onwards. I gotta tell my friends that my music taste isn't that old.

One small nitpick: some new remasters of songs may carry the year the remaster was released instead of the original year of the song. For example, “Beautiful Night” by Paul McCartney was originally released in 1997 on the album Flaming Pie, but the song was re-released in the 2020 remaster. This particular track was listed as a 2020 song. However, this is a rare occurrence, so I can ignore it. Most of the songs from the 2010s and 2020s are indeed originally from those decades.

I think that’s enough code and charts for one article. I hope you find something helpful in it.

Conclusions

What I can actually conclude from this article is that I need to listen to and explore more music genres! Spotify has been a great tool for me to discover new artists and genres, and I love that. I hope in 2021 you also get to discover many great songs and create new, great memories with them. See you in #SpotifyWrapped2021!

You can see and download the whole code I use in this article by clicking the link to my Github below.
