How to get your favourite artists’ music data from Spotify using the Python library Spotipy

The name of the library couldn’t have been better chosen

By Javier Canales
Published in Analytics Vidhya, Apr 23, 2020 (5 min read)

Have you ever wondered how many albums The Doors produced? Or wanted to know the exact date when Sgt. Pepper’s Lonely Hearts Club Band, by the Beatles, was released? Or maybe you are just a data-music (or music-data) geek who wants to know how to retrieve music data from the Spotify API.

Whatever the reason, this article is a basic tutorial on how to use Spotipy, the lightweight Python library for the Spotify Web API. In particular, I will explain how you can create a programme to retrieve data about an artist, their albums and their songs, and put the information in pandas data frames, ready for data analysis!

1. Getting started with the Spotify API

For those who are not familiar with the term, API stands for Application Programming Interface: a software intermediary that allows two applications to interact with each other.

The Spotify API can be regarded as a ‘bridge’ between a user’s app and the Spotify platform (as shown below). In particular, the API provides several endpoints that return JSON metadata about music artists, albums, and tracks, directly from the Spotify Data Catalogue. To learn more about the API and the different types of endpoints, I highly recommend having a look at the API documentation.

Image obtained from the Spotify API documentation page

So, in order to extract music data from Spotify, first you have to set up a Spotify Developer Account and create an app. It is a very simple and quick process that you can do here.

2. Authentication setup and token

Once you set up your account and your app, you will have to get your credentials in order to connect with the Spotify platform. In particular, you will have to get your Client ID, your Client Secret, as well as your username to request the token that allows you to retrieve music data.

These first two codes can be found in your App dashboard (see image above), whereas your username can be found in the last characters of your profile link (see image below for an easy way to get the link).

Fortunately, Spotipy provides a utility method util.prompt_for_user_token that simplifies the process, allowing you to pass your app credentials directly into the methods as arguments:
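The original snippet is not reproduced here, so what follows is only a minimal sketch of how such a call might look. It assumes the credentials are stored in environment variables: SPOTIPY_CLIENT_ID, SPOTIPY_CLIENT_SECRET and SPOTIPY_REDIRECT_URI are names that Spotipy itself recognises, whereas SPOTIFY_USERNAME, the two helper function names and the fallback redirect URL are hypothetical placeholders chosen for this example. Note also that recent Spotipy releases flag prompt_for_user_token as deprecated in favour of spotipy.oauth2.SpotifyOAuth, so it is worth checking the version you have installed.

```python
import os


def credentials_from_env(env=None):
    """Collect the arguments for the token request from environment variables.

    SPOTIFY_USERNAME and the fallback redirect URL are assumptions made for
    this example; the SPOTIPY_* names are the ones Spotipy also looks for.
    """
    env = os.environ if env is None else env
    return {
        "username": env["SPOTIFY_USERNAME"],
        "client_id": env["SPOTIPY_CLIENT_ID"],
        "client_secret": env["SPOTIPY_CLIENT_SECRET"],
        "redirect_uri": env.get(
            "SPOTIPY_REDIRECT_URI", "http://localhost:8888/callback"
        ),
    }


def request_token(scope, username, client_id, client_secret, redirect_uri):
    """Request a user token through Spotipy's helper (opens a browser on first use)."""
    # Imported lazily so the credential helper above works without Spotipy installed.
    import spotipy.util as util

    return util.prompt_for_user_token(
        username,
        scope=scope,
        client_id=client_id,
        client_secret=client_secret,
        redirect_uri=redirect_uri,
    )
```

With the variables set, request_token("user-library-read", **credentials_from_env()) should return a token string that can then be passed to spotipy.Spotify(auth=token). The redirect URI has to match one registered for your app in the developer dashboard.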

Two additional arguments are the scope and the redirect_uri. The former is required by Spotify to ensure that users of third-party apps share only the information they choose to share. You can learn more about scopes here. The latter is a valid URL to which the Spotify authentication server redirects after a successful login.

3. Extracting music data

Once you have your token, you are ready to get music data! I highly encourage you to read the Spotipy documentation guide, so you can get familiar with its simple syntax. It also contains multiple examples of how to request data using different endpoints.

Before anything else, make sure you have installed the Spotipy package. As usual, this can be easily done using pip, by running pip install spotipy from the command line.

Drawing inspiration from the code used by Ian Annase in his YouTube tutorial about Spotipy, I have built a programme that uses Spotify’s search capabilities to look up an artist and directly display the artist’s entire discography, including the tracks of every album. At the same time, the music data is stored in three separate pandas data frames: the first contains information about the artists searched, the second about the albums and the third about the tracks. I have kept the Spotify ids of the items to make it easier to join the tables if needed.

Below you can see the main body of the code:
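Since the code itself is not reproduced here, the following is a rough sketch of what such a main body might look like, not the original programme. It relies on three Spotipy calls (search, artist_albums and album_tracks); every function name, column name and prompt text is an assumption made for illustration, and only the first page of each response is read, so long discographies would be cut short.

```python
def artist_row(artist):
    """Flatten one artist object from the search response into a table row."""
    return {
        "artist_id": artist["id"],
        "name": artist["name"],
        "followers": artist["followers"]["total"],
        "popularity": artist["popularity"],
        "genres": ", ".join(artist["genres"]),
    }


def fetch_discography(sp, artist_name):
    """Return (artist_row, album_rows, track_rows) for the best search match.

    `sp` is an authenticated spotipy.Spotify client. Only the first page of
    each response is read (an assumption made to keep the sketch short).
    """
    found = sp.search(q=artist_name, type="artist", limit=1)["artists"]["items"]
    if not found:
        return None, [], []
    artist = artist_row(found[0])

    albums, tracks = [], []
    for album in sp.artist_albums(artist["artist_id"], album_type="album")["items"]:
        albums.append({
            "album_id": album["id"],
            "artist_id": artist["artist_id"],  # kept so tables can be joined
            "name": album["name"],
            "release_date": album["release_date"],
            "total_tracks": album["total_tracks"],
        })
        for track in sp.album_tracks(album["id"])["items"]:
            tracks.append({
                "track_id": track["id"],
                "album_id": album["id"],  # kept so tables can be joined
                "name": track["name"],
                "track_number": track["track_number"],
                "duration_ms": track["duration_ms"],
            })
    return artist, albums, tracks


def search_loop(sp, ask=input):
    """Keep asking for artist names until the user types 'q', collecting rows."""
    artists, albums, tracks = [], [], []
    while True:
        name = ask("Artist to search for (q to quit): ").strip()
        if name.lower() == "q":
            break
        if not name:
            continue
        artist, new_albums, new_tracks = fetch_discography(sp, name)
        if artist is None:
            print(f"No artist found for {name!r}")
            continue
        print(f"{artist['name']}: {len(new_albums)} albums, {len(new_tracks)} tracks")
        artists.append(artist)
        albums.extend(new_albums)
        tracks.extend(new_tracks)
    return artists, albums, tracks


def to_dataframes(artists, albums, tracks):
    """Turn the accumulated rows into three pandas data frames."""
    import pandas as pd  # imported lazily; only needed for this final step

    return pd.DataFrame(artists), pd.DataFrame(albums), pd.DataFrame(tracks)
```

Keeping the response parsing in plain functions that return lists of dictionaries means the rows can be inspected before pandas is involved, and the shared artist_id and album_id columns are what would allow the three tables to be joined later, for instance with DataFrame.merge.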

For example, this is what I get when I search for Rosalía, one of the most prominent artists in the current Latin music scene:

The programme is built around a while loop, so you can search for various artists without having to re-run the programme. Once you are done, you just have to break the loop and check the content of your data frames. Below you can find the result of my search:

4. Conclusion

This tutorial was intended to help you understand the Spotify API and how to retrieve music data from the Spotify platform using the Python library Spotipy. If you are interested in the details of the code, you can have a look at my GitHub repository. I hope you find it helpful!


I am a freelance data scientist with a background in Law and Political Science. Learning as a way of life. Solving the climate crisis as a meaning of life.