How to get your favourite artists’ music data from Spotify using the Python library Spotipy
The name of the library couldn’t have been better chosen
Have you ever wondered how many albums The Doors produced? Or wanted to know the exact date when Sgt. Pepper’s Lonely Hearts Club Band, by the Beatles, was released? Or maybe you are just a data-music (or music-data) geek who wants to know how to retrieve music data from the Spotify API.
Whatever the reason, in this article I will give a basic tutorial on how to use Spotipy, the lightweight Python library for the Spotify Web API. In particular, I will explain how you can create a programme to retrieve data about an artist, their albums and their songs, and put the information in pandas data frames, ready for data analysis!
1. Getting started with the Spotify API
For those who are not familiar with the term, API stands for Application Programming Interface: a software intermediary that allows two applications to interact with each other.
The Spotify API can be regarded as a ‘bridge’ between a user’s app and the Spotify platform (as shown below). In particular, the API provides several endpoints that return JSON metadata about music artists, albums, and tracks, directly from the Spotify Data Catalogue. To learn more about the API and the different types of endpoints, I highly recommend having a look at the API documentation.
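To give a feel for what these endpoints return, the snippet below parses a heavily abridged artist object. The field names (`id`, `name`, `genres`, `popularity`, `followers.total`) follow the Web API's artist object, but the values are placeholders made up for illustration, not real catalogue data.

```python
import json

# Abridged artist object; the values are illustrative placeholders, not real data.
raw_response = """
{
  "id": "example-artist-id",
  "name": "Example Artist",
  "type": "artist",
  "genres": ["example genre"],
  "popularity": 50,
  "followers": {"total": 12345}
}
"""

artist = json.loads(raw_response)

# Nested values are reached by chaining keys, as with any Python dict.
summary = {
    "artist_id": artist["id"],
    "artist_name": artist["name"],
    "followers": artist["followers"]["total"],
    "genres": ", ".join(artist["genres"]),
}
print(summary)
```

Spotipy does the HTTP request and the JSON decoding for you, so in practice you work with the resulting dictionaries directly.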
So, in order to extract music data from Spotify, first you have to set up a Spotify Developer Account and create an app. It is a very simple and quick process that you can do here.
2. Authentication setup and token
Once you set up your account and your app, you will have to get your credentials in order to connect with the Spotify platform. In particular, you will have to get your Client ID, your Client Secret, as well as your username to request the token that allows you to retrieve music data.
These first two codes can be found in your App dashboard (see image above), whereas your username can be found in the last characters of your profile link (see image below for an easy way to get the link).
Fortunately, Spotipy provides a utility method, util.prompt_for_user_token, that simplifies the process, allowing you to pass your app credentials directly into the method as arguments:
from spotipy import util

token = util.prompt_for_user_token(username,
                                   scope,
                                   client_id='your-spotify-client-id',
                                   client_secret='your-spotify-client-secret',
                                   redirect_uri='your-app-redirect-url')
Two additional arguments are the scope and the redirect_uri. The former is required by Spotify to ensure that users of third-party apps share only the information they choose to share. You can learn more about scopes here. The latter is a valid URL to which the Spotify authentication server redirects after a successful login.
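If you would rather not hard-code your credentials, one option is to read them from environment variables. The following is a minimal sketch, not a definitive setup: `load_credentials` and `make_client` are hypothetical helper names, the default scope `user-library-read` is just one example of a valid scope, and the variable names are the ones Spotipy itself looks for.

```python
import os


def load_credentials(env=os.environ):
    """Collect the three app settings, failing early if one is missing."""
    names = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET", "SPOTIPY_REDIRECT_URI")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    return {
        "client_id": env["SPOTIPY_CLIENT_ID"],
        "client_secret": env["SPOTIPY_CLIENT_SECRET"],
        "redirect_uri": env["SPOTIPY_REDIRECT_URI"],
    }


def make_client(username, scope="user-library-read", env=os.environ):
    """Request a token and wrap it in a Spotipy client."""
    # Imported lazily so load_credentials can be used without Spotipy installed.
    import spotipy
    from spotipy import util

    token = util.prompt_for_user_token(username, scope, **load_credentials(env))
    if not token:
        raise RuntimeError("could not obtain a token for " + username)
    return spotipy.Spotify(auth=token)
```

Calling `make_client('your-username')` should open the Spotify login page the first time and hand back a client object that the data requests can go through.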
3. Extracting music data
Once you have your token, you are ready to get music data! I highly encourage you to read the Spotipy documentation guide, so you can get familiar with its simple syntax. It also contains multiple examples of how to request data using different endpoints.
Obviously, before anything, make sure you have installed the Spotipy package. As usual, this can be easily done using pip:
pip install spotipy
Drawing inspiration from the code used by Ian Annase in his YouTube tutorial about Spotipy, I have built a programme that uses Spotify’s search capabilities to look up an artist and directly display their entire discography, including the tracks of every album. At the same time, the music data is stored in three separate pandas data frames: the first contains information about the artists searched, the second about the albums, and the third about the tracks. I have kept the Spotify IDs of the items to make it easier to join the tables if needed.
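The idea described above can be sketched roughly as follows. This is a simplified illustration rather than the programme from my repository: `collect_discography` is a hypothetical helper, the selection of columns is an assumption, and `sp` is expected to behave like a `spotipy.Spotify` client.

```python
def collect_discography(sp, artist_name):
    """Return (artist_rows, album_rows, track_rows) for the top search hit."""
    hits = sp.search(q=artist_name, type="artist", limit=1)
    items = hits["artists"]["items"]
    if not items:
        return [], [], []

    artist = items[0]
    artist_rows = [{
        "artist_id": artist["id"],
        "artist_name": artist["name"],
        "followers": artist["followers"]["total"],
    }]

    album_rows, track_rows = [], []
    page = sp.artist_albums(artist["id"], limit=50)
    while page:
        for album in page["items"]:
            album_rows.append({
                "album_id": album["id"],
                "album_name": album["name"],
                "release_date": album["release_date"],
                "total_tracks": album["total_tracks"],
                "artist_id": artist["id"],  # kept so albums can be joined to artists
            })
            # Only the first page of tracks is read here; an album with more
            # tracks than one page holds would need the same paging as below.
            for track in sp.album_tracks(album["id"])["items"]:
                track_rows.append({
                    "track_id": track["id"],
                    "track_name": track["name"],
                    "track_number": track["track_number"],
                    "duration_ms": track["duration_ms"],
                    "album_id": album["id"],  # kept so tracks can be joined to albums
                })
        # Album results are paginated; sp.next returns None after the last page.
        page = sp.next(page)
    return artist_rows, album_rows, track_rows
```

Each of the three lists can then be passed to `pd.DataFrame(...)`. Because every track row carries its `album_id`, something like `pd.DataFrame(track_rows).merge(pd.DataFrame(album_rows), on="album_id")` should join tracks to their albums.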
Below you can see the main body of the code:
For example, this is what I get when I search for Rosalía, one of the most prominent artists in the current Latin music scene:
Artist's name: rosalia
ROSALÍA
ALBUM: El Mal Querer
1: MALAMENTE - Cap.1: Augurio
2: QUE NO SALGA LA LUNA - Cap.2: Boda
3: PIENSO EN TU MIRÁ - Cap.3: Celos
4: DE AQUÍ NO SALES - Cap.4: Disputa
5: RENIEGO - Cap.5: Lamento
6: PRESO - Cap.6: Clausura
7: BAGDAD - Cap.7: Liturgia
8: DI MI NOMBRE - Cap.8: Éxtasis
9: NANA - Cap.9: Concepción
10: MALDICIÓN - Cap.10: Cordura
11: A NINGÚN HOMBRE - Cap.11: Poder
ALBUM: Los Ángeles
12: Si Tú Supieras Compañero
13: De Plata
14: Nos Quedamos Solitos
15: Catalina
16: Día 14 De Abril
17: Que Se Muere Que Se Muere
18: Por Mi Puerta No Lo Pasen
19: Te Venero
20: Por Castigarme Tan Fuerte
21: La Hija De Juan Simón
22: El Redentor
23: I See A Darkness
ALBUM: Dolerme
24: Dolerme
ALBUM: Juro Que
25: Juro Que
ALBUM: A Palé
26: A Palé...
The programme is built as a while loop, so you can search for various artists without having to re-run it. Once you are done, you just have to break the loop and check the content of your data frames. Below you can find the result of my search:
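A stripped-down version of such a loop might look like the sketch below. It makes a few assumptions of its own: an empty input is what breaks the loop, `run_search_loop` is a hypothetical name, and `lookup(name)` stands for any function that returns a list of (album name, list of track names) pairs for an artist. Tracks are numbered continuously across albums, as in the output shown earlier.

```python
def run_search_loop(lookup, read=input, write=print):
    """Keep asking for artist names until the user enters an empty line."""
    searched = []
    while True:
        name = read("Artist's name: ").strip()
        if not name:  # an empty line breaks the loop
            break
        searched.append(name)
        counter = 0  # running track number, not reset between albums
        for album_name, track_names in lookup(name):
            write("ALBUM: " + album_name)
            for track_name in track_names:
                counter += 1
                write(str(counter) + ": " + track_name)
    return searched
```

Passing `read` and `write` as parameters is only a convenience for trying the loop without a keyboard; with the defaults it uses `input` and `print` as usual.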
4. Conclusion
This tutorial was intended to help you understand the Spotify API and how to retrieve music data from the Spotify platform using the Python library Spotipy. If you are interested in the details of the code, you can have a look at my GitHub repository. I hope you find it helpful!