Getting Started with Spotify’s API & Spotipy

A data scientist’s quick start guide to navigating Spotify’s Web API and accessing data using the Spotipy Python library.

Max Tingle
6 min read · Oct 3, 2019

If you are not familiar with APIs, an Application Programming Interface (API) is a defined set of requests that a program can make to another system to get data from it or send data to it. In the case of Spotify, the company gives software and app developers access to some of its data about users, playlists, and artists through a Web API.

Spotify’s Web API is optimized to support app integrations, but it can also be used for data analysis. This article walks through the basics of:

  1. Setting up your Spotify Developer account,
  2. Exploring Spotify Web API resources, and
  3. Using Spotipy, a Python library for Spotify API.

Setting Up Your Spotify Developer Account

Step 1: Log In to Spotify Developer

Connect Spotify Developer to your Spotify account by logging in or creating a free Spotify account here.

Step 2: Create a Client ID

Once in your dashboard, click the green “Create a Client ID” button to fill out the form to create an app or hardware integration.

If you are a data scientist, you are probably not trying to create an app. However, to get a Client ID and access data, you have to fill out this form.

Step 3: Retrieve Client ID and Client Secret

On your developer dashboard page, click on the new app you just created. On the app’s dashboard page, you will find your Client ID just under the name of your app. Click “Show Client Secret” to reveal your Client Secret, the second credential you will need alongside the Client ID. If you scroll down on this page, you will also see stats about your app, including the number of requests you make each day.

Note: If you are interested in requesting Spotify user data (profile, playlists, etc.), you have to register your application. For the purposes of this article, I will be exploring Spotify Web API endpoints that are open and return data without requiring registration.
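For context on how the two credentials are used: with Spotify's Client Credentials flow, the Client ID and Client Secret are exchanged for a short-lived access token. Spotipy handles this for you later in the article, so the sketch below is only illustrative. It builds the pieces of that token request without sending it; the function name `build_token_request` and the placeholder credentials are assumptions made for this example.

```python
import base64

# Spotify's token endpoint for the Client Credentials flow.
TOKEN_URL = "https://accounts.spotify.com/api/token"


def build_token_request(client_id, client_secret):
    """Return (url, headers, form_data) for a token request.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    # The two credentials are joined with a colon and Base64-encoded
    # to form an HTTP Basic Authorization header.
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    headers = {
        "Authorization": f"Basic {encoded}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    form_data = {"grant_type": "client_credentials"}
    return TOKEN_URL, headers, form_data


# Placeholder values, not real credentials.
url, headers, form_data = build_token_request("my-client-id", "my-client-secret")
print(url)
print(form_data)
```

Because the Client Secret is all that stands between your Client ID and a valid token, treat it like a password and keep it out of shared notebooks and repositories.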

Exploring Spotify Web API Resources

After setting up your Spotify Developer account, I recommend browsing a few areas of the website to familiarize yourself with:

  1. What you need to know in order to use the API,
  2. The data endpoints you have access to, and
  3. Testing data endpoints to preview JSON output.

Explore: Spotify’s Web API Documentation

Spotify for Developers provides detailed and user-friendly documentation for its Web API. After clicking the “DOCS” tab in the header of the website, click the Web API icon to access:

  1. Web API overview documentation,
  2. A more in-depth quick start tutorial,
  3. Web API guides, and
  4. Links to wrapper libraries for various languages.

Note: To get started quickly, I recommend reviewing the first link to the Web API overview documentation and returning to the others should you be using a language other than Python and/or looking to access Spotify user data.

Explore: Spotify’s Web API Reference

At the time of writing this article, Spotify was rolling out the beta version of their new Web API Reference. This tool is particularly useful in exploring the data endpoints you have access to as a developer and accessing related documentation.

I have put together a table of data endpoints that are NOT user related to give you an idea of the data you can access without registering your app and without having to collect user information/permissions.

Explore: Spotify’s Web API Console

After taking a moment to explore available data and methods in the Spotify Web API Reference (Beta), I recommend trying out the Spotify Web API Console to test various methods and preview the JSON output.

If you are still logged into your developer account, you can easily request tokens using the green “Get Token” button and even automatically fill the query with sample search parameters. You can then preview the JSON output on the right side of the screen.
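To make the JSON preview more concrete, here is a minimal sketch of navigating a search response in Python. The sample below is heavily trimmed and its values are made up for illustration; real responses contain many more fields. The keys it uses (`tracks`, `items`, `name`, `id`, `popularity`, `artists`) are the same ones the code later in this article relies on.

```python
import json

# Illustrative, trimmed-down search response. The values are invented;
# only the nesting mirrors what the search endpoint returns.
sample_response = """
{
  "tracks": {
    "items": [
      {
        "id": "track-id-1",
        "name": "Example Track",
        "popularity": 55,
        "artists": [{"name": "Example Artist"}]
      }
    ],
    "limit": 50,
    "offset": 0
  }
}
"""

data = json.loads(sample_response)

# Tracks live under data["tracks"]["items"]; each track lists its artists.
first_track = data["tracks"]["items"][0]
summary = {
    "track_id": first_track["id"],
    "track_name": first_track["name"],
    "artist_name": first_track["artists"][0]["name"],
    "popularity": first_track["popularity"],
}
print(summary)
```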

Using Spotipy, a Python Library for Spotify API

Now that you are familiar with Spotify Web API, it is time to retrieve data. Again, I have approached this article from a data scientist perspective, which is why I am using Spotipy, a lightweight Python library for Spotify Web API.

Spotipy has pretty succinct documentation that I highly recommend reading before using the library. I will walk you through setting up Spotipy and using one of its built-in methods. I learned the following steps from this GitHub repository and found them to be pretty easy to implement.

Step 1: Install Spotipy

Run the code below in your terminal or a Jupyter Notebook to install Spotipy.

pip install spotipy

Step 2: Import and Set Up Spotipy

You will need to copy the Client ID and Client Secret that we set up earlier in this article into the code below to start querying the API’s endpoints.

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

cid = 'Your Client ID'
secret = 'Your Client Secret'
client_credentials_manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
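If you plan to share your notebook, one option is to keep the credentials out of the code and read them from environment variables instead. The sketch below assumes you have set `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` (the variable names Spotipy itself looks for) in your shell; the helper names `load_credentials` and `make_client` are made up for this example.

```python
import os


def load_credentials(environ=os.environ):
    """Read the Client ID and Client Secret from environment variables.

    Hypothetical helper: raises a clear error if either value is missing.
    """
    cid = environ.get("SPOTIPY_CLIENT_ID")
    secret = environ.get("SPOTIPY_CLIENT_SECRET")
    if not cid or not secret:
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before running."
        )
    return cid, secret


def make_client():
    """Build a Spotipy client from the environment (requires spotipy)."""
    # Imported here so load_credentials() works even without spotipy installed.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    cid, secret = load_credentials()
    manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
    return spotipy.Spotify(client_credentials_manager=manager)
```

With the variables set, `sp = make_client()` gives you the same kind of client object as the hardcoded version.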

Step 3: Retrieve Data from Spotify Web API

Spotify methods to access Artist, Track, and Album data endpoints all require you to know their individual Spotify ID. I started with the search endpoint, exemplified below, because it does not require a Spotify ID.

The following code collects up to 10,000 Track IDs (200 requests of 50 tracks each) and their associated track name, artist name, and popularity score.

artist_name = []
track_name = []
popularity = []
track_id = []
for offset in range(0, 10000, 50):
    # Each request returns at most 50 tracks, so page through the results.
    track_results = sp.search(q='year:2018', type='track', limit=50, offset=offset)
    for t in track_results['tracks']['items']:
        artist_name.append(t['artists'][0]['name'])
        track_name.append(t['name'])
        track_id.append(t['id'])
        popularity.append(t['popularity'])

Note: Spotify has set the maximum offset to 10,000. Because sp.search() returns at most 50 results per query, the code uses a nested for loop: the outer loop advances the offset 50 at a time, and the inner loop reads the tracks in each page of results.
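The paging pattern in the loop above can also be written as a small reusable function that stops early once a page comes back empty. This is a minimal sketch rather than part of Spotipy: `collect_tracks` and `fake_search_page` are hypothetical names, and the fake function stands in for the API so the example runs without credentials.

```python
def collect_tracks(search_page, page_size=50, max_offset=10000):
    """Page through search results and return one dict per track.

    search_page(offset, limit) must return a list of track dicts.
    """
    rows = []
    for offset in range(0, max_offset, page_size):
        items = search_page(offset, page_size)
        if not items:
            # No more results, so skip the remaining requests.
            break
        for t in items:
            rows.append({
                "artist_name": t["artists"][0]["name"],
                "track_name": t["name"],
                "track_id": t["id"],
                "popularity": t["popularity"],
            })
    return rows


def fake_search_page(offset, limit):
    """Stand-in for the API that pretends only 120 tracks exist."""
    total = 120
    return [
        {
            "id": f"id-{n}",
            "name": f"Track {n}",
            "popularity": n % 100,
            "artists": [{"name": "Example Artist"}],
        }
        for n in range(offset, min(offset + limit, total))
    ]


rows = collect_tracks(fake_search_page)
print(len(rows))  # 120
```

To use it against the real API, you could pass a function such as `lambda offset, limit: sp.search(q='year:2018', type='track', limit=limit, offset=offset)['tracks']['items']`.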

Step 4: Load Data into DataFrame for Exploratory Data Analysis

import pandas as pd

track_dataframe = pd.DataFrame({'artist_name': artist_name, 'track_name': track_name, 'track_id': track_id, 'popularity': popularity})
print(track_dataframe.shape)
track_dataframe.head()

The code above uses Pandas to create a DataFrame from the lists created in the previous step. This step only brings the data into a DataFrame. Next comes the exploration, cleaning, and analysis processes.
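As a small taste of what a first exploratory pass might look like, the sketch below drops repeated Track IDs and ranks tracks by popularity. The sample rows are invented so the example runs on its own; with real data you would apply the same calls to the DataFrame built in this step.

```python
import pandas as pd

# Made-up sample rows standing in for the real search results.
track_dataframe = pd.DataFrame({
    "artist_name": ["Artist A", "Artist B", "Artist A", "Artist C"],
    "track_name": ["Song 1", "Song 2", "Song 1", "Song 3"],
    "track_id": ["id1", "id2", "id1", "id3"],
    "popularity": [70, 85, 70, 40],
})

# Keep one row per Track ID in case the same track appears more than once.
deduped = track_dataframe.drop_duplicates(subset="track_id")

# Rank tracks from most to least popular and renumber the rows.
ranked = deduped.sort_values("popularity", ascending=False).reset_index(drop=True)

print(ranked)
print(ranked["popularity"].describe())
```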

I am not going to get into those processes in this article, but I hope these steps were able to illustrate the ease of getting data from the Spotify Web API using Spotipy. I also hope that in combination with the table of data endpoints and methods I provided above, you can start to imagine the scope of the data that can be collected from Spotify and insights that could follow when employing more advanced data science.

After some research, I did find a Kaggle dataset of ~80,000 Spotify artists and their associated Spotify IDs that I am starting to use to retrieve Artist, Track, and Album data. More to come in a future article! But for now I hope this was a short, clear, and useful introduction to getting started with Spotify API and Spotipy.

Max Tingle

Data Engineering Specialist at DC Public Charter School Board in Washington, DC.