How to Leverage Spotify API + Genius Lyrics for Data Science Tasks in Python

Maaz Khan
Jan 21 · 5 min read

Spotify has grown into the most popular music streaming platform in the world, surpassing Apple Music, Pandora, Tidal, and others. Its popularity stems from a deep artist catalog combined with a friendly UI and UX. With the platform also pushing podcasts in 2021, the sky seems to be the limit for what Spotify can achieve.

Spotify offers a Web API that lets developers access certain metadata from the platform or build applications on Spotify’s infrastructure. In Python, it is commonly accessed through the Spotipy library. The data available from the API includes information about songs, albums, playlists, and more. Akin to “Spotify Wrapped”, an analysis can be performed on users’ top artists, songs, albums, and genres.

This tutorial will focus on three objectives: connecting to the Spotify API, extracting album data from the API into a pandas data frame, and attaching song lyrics from Genius to the data frame via web scraping.

We’ll be using Python 3.8.7.

Step 0: Install Libraries

Below are the libraries needed for this tutorial. If a dependency is not already installed, try “conda install package_name” if you are working in an Anaconda environment, or “pip3 install package_name” otherwise.
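The original library listing is not reproduced in this text, so the following is only a sketch. It assumes the tutorial relies on spotipy, pandas, requests, and beautifulsoup4 (imported as bs4), and it reports which of those cannot be imported so you know what to install. The dependency list and the helper name missing_packages are assumptions, not the author's code.

```python
import importlib.util

# Assumed dependency list: import name -> name used when installing.
REQUIRED_PACKAGES = {
    "spotipy": "spotipy",
    "pandas": "pandas",
    "requests": "requests",
    "bs4": "beautifulsoup4",
}


def missing_packages(required=REQUIRED_PACKAGES):
    """Return the install names of packages that cannot be imported."""
    return [
        install_name
        for import_name, install_name in required.items()
        if importlib.util.find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print a ready-to-copy install command for whatever is absent.
        print("pip3 install " + " ".join(missing))
    else:
        print("All assumed dependencies are installed.")
```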

Step 1: Connecting To Spotify API

All information regarding Spotipy, the Python library for the Spotify Web API, can be found in the docs here. This tutorial will highlight some of the features Spotipy offers, such as capturing all the songs within an album along with their audio features.

First, we need to register for a Spotify developer account, which can be created for free here. After registering, you will have access to credentials that allow Spotipy to connect to the API: a client ID and a client secret. Do not share these credentials, as they are meant to be kept private. Below is the code necessary to establish a connection through Spotipy.
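A minimal sketch of the connection step, assuming the client ID and client secret are stored in the environment variables SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET rather than pasted into the notebook. The helper names load_credentials and connect are hypothetical. Spotipy is imported inside connect so the credential check can run even before the library is installed.

```python
import os


def load_credentials(env=None):
    """Read the client ID and secret from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    client_id = env.get("SPOTIPY_CLIENT_ID")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before connecting."
        )
    return client_id, client_secret


def connect(env=None):
    """Build a Spotipy client using the client credentials flow."""
    # Imported lazily so load_credentials works without spotipy installed.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    client_id, client_secret = load_credentials(env)
    auth = SpotifyClientCredentials(
        client_id=client_id, client_secret=client_secret
    )
    return spotipy.Spotify(auth_manager=auth)
```

With the two variables exported in your shell, sp = connect() should return a client that the later steps can use. Keeping the secret out of the source file also makes it harder to share it by accident.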

Step 2: Extracting Data From Spotify API

Once we have established a connection through Spotipy, we need a function that takes all the songs from any given album and inserts the relevant information into a pandas data frame. All it needs is the album URI, which can be found by clicking on the three dots in Spotify.

Here is where an album URI is located on Spotify’s desktop application.
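One way the album function could look is sketched below. It is not the author's original code. It assumes sp is a connected Spotipy client and uses its album_tracks and next methods to walk through every page of results. The function name album_track_rows is hypothetical, and it returns plain dictionaries so it can be checked without pandas.

```python
def album_track_rows(sp, album_uri):
    """Collect one dict per track on the album, following pagination."""
    rows = []
    page = sp.album_tracks(album_uri, limit=50)
    while page is not None:
        for item in page["items"]:
            rows.append(
                {
                    "uri": item["uri"],
                    "name": item["name"],
                    "duration_ms": item["duration_ms"],
                    "explicit": item["explicit"],
                    "track_number": item["track_number"],
                }
            )
        # sp.next() fetches the following page; stop when there is none.
        page = sp.next(page) if page.get("next") else None
    return rows
```

Passing the result to pd.DataFrame(album_track_rows(sp, album_uri)) gives a data frame with one row per track and the five columns above.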

Below is the output of the function above. Notice how we now have the URI for each track within the album along with the track name, duration (milliseconds), explicit (boolean), and track number.

Running the above function on ‘Blonde’ by Frank Ocean.

Next, we need to create a function that will take a data frame of all the songs in the album we want to analyze (this is the output of the above function) and attach features such as danceability, energy, key, and loudness per track. This is done by looking up each song by its URI.
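The feature lookup could be sketched as follows, assuming sp is a connected Spotipy client and using its audio_features method, which accepts a batch of track URIs. The name audio_feature_rows and the choice of columns are assumptions, not the author's function. Spotify may restrict this endpoint for some newer apps, so the call can fail depending on your app's access.

```python
# Columns kept from each audio-features object (an assumed selection).
FEATURE_KEYS = ("uri", "danceability", "energy", "key", "loudness")


def audio_feature_rows(sp, track_uris, batch_size=100):
    """Return one dict of audio features per track, skipping missing ones."""
    rows = []
    for start in range(0, len(track_uris), batch_size):
        batch = track_uris[start:start + batch_size]
        for features in sp.audio_features(batch):
            # The API returns None for tracks without feature data.
            if features is None:
                continue
            rows.append({key: features[key] for key in FEATURE_KEYS})
    return rows
```

The uri column is kept on purpose so these rows can later be matched back to the track rows.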

Output of running the get_track_info function.

Next, we need to merge the data frames together. This can easily be done with the function below. This method can also be performed manually without the use of a custom function.
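As a rough illustration of the merge, the sketch below joins track rows and feature rows on a shared uri key. It works on plain dictionaries so it runs without pandas, and the function name left_join_rows is hypothetical. The pandas call that should be equivalent for data frames is shown in the trailing comment.

```python
def left_join_rows(track_rows, feature_rows, key="uri"):
    """Attach matching feature values to each track row (left join on key)."""
    features_by_key = {row[key]: row for row in feature_rows}
    merged = []
    for track in track_rows:
        combined = dict(track)
        # Tracks with no match keep only their own columns.
        combined.update(features_by_key.get(track[key], {}))
        merged.append(combined)
    return merged


# With pandas data frames that both have a "uri" column, the same join is:
#     merged_df = tracks_df.merge(features_df, on="uri", how="left")
```

A left join keeps every track even when no features were returned for it, which avoids silently dropping songs from the album.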

Step 3: Attaching Song Lyrics From Genius.com

Lastly, we will scrape Genius to attach the lyrics of the songs to our data frame, using the Beautiful Soup library. Because this tutorial is more about the Spotify API than web scraping, I have attached a video tutorial here that goes into more depth on Beautiful Soup. It is a broad topic that deserves a tutorial of its own. Luckily, we only need a few lines to capture our song lyrics.

Function one (scrape_lyrics) is called inside function two (lyrics_onto_frame). As long as you have run the code for both, you only need to call the second function.

Note: Song titles and artist names with special characters (+, -, *, ~, etc.) will not be scraped properly.
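The scraping code itself is not reproduced in this text, so here is a hedged sketch of how the two functions might fit together. It assumes Genius song pages live at https://genius.com/Artist-name-song-title-lyrics and that the lyrics sit in div elements carrying a data-lyrics-container="true" attribute. Both are assumptions about the site's current layout and may change. The names genius_url and attach_lyrics are hypothetical, and scrape_lyrics here is a stand-in rather than the author's version.

```python
import re


def genius_url(artist, title):
    """Build the assumed Genius URL for a song from artist and title."""
    text = f"{artist} {title}".lower().replace("'", "").replace("’", "")
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text).strip("-")
    return f"https://genius.com/{slug.capitalize()}-lyrics"


def scrape_lyrics(artist, title, timeout=10):
    """Fetch the lyrics text, or None if the page or container is missing."""
    # Imported lazily so genius_url can be used without these installed.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(genius_url(artist, title), timeout=timeout)
    if response.status_code != 200:
        return None
    soup = BeautifulSoup(response.text, "html.parser")
    containers = soup.select('div[data-lyrics-container="true"]')
    if not containers:
        return None
    return "\n".join(c.get_text(separator="\n") for c in containers).strip()


def attach_lyrics(track_rows, artist, fetch=scrape_lyrics):
    """Return copies of the track rows with a 'lyrics' value added."""
    return [{**row, "lyrics": fetch(artist, row["name"])} for row in track_rows]
```

Returning None instead of raising lets the rest of the album go through when a single title does not map to a valid page, which is the situation the note about special characters describes.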

Summary

Here is a summary of all the functions we defined above in a working example. It is not as daunting as it looks. To recap: we ran a function to pull all the songs from our desired album into a data frame via the Spotify URI. Then we created a second function to attach the audio features of those songs to the data frame. Lastly, we created a function to scrape the lyrics of every song on the album and attached them to our pandas data frame.
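The recap can be pictured as one small pipeline. The sketch below is an illustration under stated assumptions rather than the original working example: the three steps are passed in as functions (fetch_tracks, fetch_features, fetch_lyrics, all hypothetical names), each track row is assumed to carry uri and name keys, and the output is a list of dictionaries ready for pd.DataFrame.

```python
def build_album_rows(sp, album_uri, artist,
                     fetch_tracks, fetch_features, fetch_lyrics):
    """Run the three steps in order and return one combined dict per track."""
    # Step 1: basic track information for the album.
    tracks = fetch_tracks(sp, album_uri)

    # Step 2: audio features, indexed by track URI for easy lookup.
    feature_rows = fetch_features(sp, [track["uri"] for track in tracks])
    features_by_uri = {row["uri"]: row for row in feature_rows}

    # Step 3: merge the two and attach the scraped lyrics.
    combined = []
    for track in tracks:
        row = {**track, **features_by_uri.get(track["uri"], {})}
        row["lyrics"] = fetch_lyrics(artist, track["name"])
        combined.append(row)
    return combined
```

Passing the steps in as arguments keeps the pipeline independent of how each one is implemented, so any of them can be swapped out or stubbed while testing.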

Step 4: Song Popularity (Extra Credit)

If you want the popularity of the songs, I have provided an extra function to do just that. Unfortunately, I have not found another way to do this, but if you come across an alternative solution, please let me know!
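The popularity function is not reproduced in this text, so the following is one possible approach and not necessarily the author's. It assumes sp is a connected Spotipy client and uses its tracks method, which returns full track objects (including a popularity score) for a batch of up to 50 URIs. The name popularity_rows is hypothetical.

```python
def popularity_rows(sp, track_uris, batch_size=50):
    """Return a dict with 'uri' and 'popularity' for each track found."""
    rows = []
    for start in range(0, len(track_uris), batch_size):
        batch = track_uris[start:start + batch_size]
        for track in sp.tracks(batch)["tracks"]:
            # Entries can be None when a URI cannot be resolved.
            if track is None:
                continue
            rows.append({"uri": track["uri"], "popularity": track["popularity"]})
    return rows
```

Because the rows carry the track URI, they can be joined onto the album data frame on that column.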

Analysis Ideas

An example of analysis that can be performed given the data set we created.

This dataset is versatile in terms of analysis. The first ideas that come to mind are natural language processing (NLP) on the song lyrics and a machine learning model to predict popularity. Another possibility is comparing the audio features of different albums by the same artist. In our case, we can compare Frank Ocean’s first studio album, Channel Orange, with his second studio album, Blonde. Let me know if you can think of other creative ways to extract insights from this data set. I’d love to hear all about it!
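As a starting point for the album comparison idea, the sketch below averages a few audio features per album. It assumes each album has already been collected as a list of row dictionaries containing numeric danceability, energy, and loudness values. The function names and the choice of features are illustrative only.

```python
from statistics import mean

# Assumed set of numeric features to compare.
DEFAULT_FEATURES = ("danceability", "energy", "loudness")


def mean_features(rows, feature_names=DEFAULT_FEATURES):
    """Average each named feature across the tracks of one album."""
    return {name: mean(row[name] for row in rows) for name in feature_names}


def compare_albums(albums, feature_names=DEFAULT_FEATURES):
    """Map each album title to its per-feature averages."""
    return {
        title: mean_features(rows, feature_names)
        for title, rows in albums.items()
    }
```

Calling compare_albums({"Channel Orange": orange_rows, "Blonde": blonde_rows}) would give two small dictionaries that can be compared side by side or plotted, where orange_rows and blonde_rows stand for the rows gathered for each album.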

The Startup
