APIs are one cool thing I learned about in my data science bootcamp. API stands for Application Programming Interface, and one can be used to retrieve or manipulate data from the web inside a Python environment. The data you request usually comes back as JSON (JavaScript Object Notation), and once it is parsed in Python it behaves like nested dictionaries and lists, so you can access any part of the response in your program! APIs are the perfect tools to get data from the web, and a bunch of popular websites, like Twitter or Reddit, already have one.
I wanted to write a Python script to access song lyrics just by typing the name of a song and the name of an artist. I found a very cool API from Musixmatch, an Italian music data company with a database of 14 million lyrics in many languages! Let’s check it out:
First, I had to import the requests library to submit the HTTP request in Python. I should mention that I had to register with Musixmatch in order to obtain a key. You need a key to use their API, and some of the functions are not available for free. Luckily for me, I had access to the get_lyrics function of the API.
The first thing I did was assign the API link to a variable, “url”, and the key I received to another one, “key”. The next step was to create a request variable containing the actual request: get(url + parameters). Each API has its own set of parameters, which are defined to get a specific part of the data or to perform a specific action. To get the lyrics as I intended, I only needed to set my key as the first parameter, followed by “q_track”, the name of the song, and “q_artist”, the name of the artist. I used Prince as an example.
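Here is a minimal sketch of that request. The endpoint URL and the “apikey” parameter name are assumptions on my part (they are not spelled out above), so check them against the Musixmatch documentation, and the key is only a placeholder. Instead of gluing the parameters onto the url by hand, the sketch passes them as a dictionary and lets the library do the encoding.

```python
from urllib.parse import urlencode

# Assumed endpoint: the text above only calls it "the API link".
url = "https://api.musixmatch.com/ws/1.1/matcher.lyrics.get"
key = "YOUR_API_KEY"  # placeholder, use the key you got when registering


def build_params(song, artist, api_key=key):
    """Collect the query parameters: the key first, then song and artist."""
    return {"apikey": api_key, "q_track": song, "q_artist": artist}


def send_request(params, api_url=url):
    """Submit the GET request and return the response object."""
    # Imported here so the helper above can be tried without the library.
    import requests

    return requests.get(api_url, params=params, timeout=10)


params = build_params("Soft and Wet", "Prince")
# Preview of the full request URL, without sending anything yet.
print(url + "?" + urlencode(params))
```

Calling `req = send_request(params)` is what actually contacts the API.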
Then the next step was to request the JSON for Prince’s “Soft and Wet”, which contains the lyrics, with the line ‘req.json()’:
Intimidating? Not so much… The first piece of good news is that I have data! The second is the JSON format. Inside Python, it is just nested dictionaries, so I was able to access the part I was interested in, “lyrics_body”, by indexing the dictionary:
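A small sketch of that indexing step. The dictionary below is a hand-written stand-in for the output of req.json(), with placeholder text instead of real lyrics, and the nesting (message, body, lyrics, lyrics_body) is my assumption about the response layout. The helper name get_lyrics_body is hypothetical too.

```python
# Hand-written stand-in for req.json(); a real response has more fields.
data = {
    "message": {
        "header": {"status_code": 200},
        "body": {"lyrics": {"lyrics_body": "first line\nsecond line\n..."}},
    }
}

# Direct indexing, one key per level of nesting.
lyrics = data["message"]["body"]["lyrics"]["lyrics_body"]
print(lyrics)


def get_lyrics_body(data):
    """Same lookup, but returns None instead of raising if a level is missing."""
    body = data.get("message", {}).get("body")
    if not isinstance(body, dict):
        # Covers a missing body or a body that is not a dictionary.
        return None
    return body.get("lyrics", {}).get("lyrics_body")
```

Direct indexing is fine for exploring, while the safer helper is handy once the song or artist might not be found.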
Yes! These are the correct lyrics. I had to do some cleaning with strip() and replace(), but these are the lyrics. Partial lyrics… I discovered that the free options of the Musixmatch API are limited. Now that I knew how to access the lyrics of a Prince song, I decided to write a function that prompts me for a song name and an artist name and returns the lyrics. Fun stuff:
I basically repeated all the steps from above inside a function, setting the url and key variables. I created the request variable and returned the cleaned JSON data. The only difference is in the parameter values: I first created two variables with input(), song and artist, and used them as the values for the parameter keys “q_track” and “q_artist”. Now, if I call this function, I get asked to enter a song and an artist; those values are then passed in the parameters, so it directly returns the correct lyrics! Let’s try this:
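The function described above could look roughly like this. It is a sketch, not my exact code: the endpoint URL, the “apikey” parameter name, the response nesting, and the helper names (clean_lyrics, get_lyrics, ask_for_lyrics) are all assumptions. The optional http_get argument is only there so the function can be tried with a fake request instead of a real one.

```python
# Assumed endpoint, check it against the Musixmatch documentation.
URL = "https://api.musixmatch.com/ws/1.1/matcher.lyrics.get"


def clean_lyrics(raw, unwanted=()):
    """Remove unwanted fragments with replace(), then trim with strip()."""
    for fragment in unwanted:
        raw = raw.replace(fragment, "")
    return raw.strip()


def get_lyrics(song, artist, key, http_get=None):
    """Request the lyrics for one song and return them cleaned, or None."""
    if http_get is None:
        import requests  # only needed for a real request

        http_get = requests.get
    params = {"apikey": key, "q_track": song, "q_artist": artist}
    req = http_get(URL, params=params, timeout=10)
    req.raise_for_status()  # stop early on an HTTP error
    body = req.json().get("message", {}).get("body")
    if not isinstance(body, dict):
        return None  # no usable body, for example when nothing matched
    lyrics = body.get("lyrics", {}).get("lyrics_body")
    return clean_lyrics(lyrics) if lyrics else None


def ask_for_lyrics(key):
    """Prompt for a song and an artist, then return the lyrics."""
    song = input("Song name: ")
    artist = input("Artist name: ")
    return get_lyrics(song, artist, key)
```

With a real key, `print(ask_for_lyrics("YOUR_API_KEY"))` asks the two questions and prints whatever the API returns.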
There we go!! And the lyrics are in Jamaican patois!! The only thing left to do is to test this function with more songs!
I think APIs are great. As I learn more about data science, I realize that getting the data and cleaning it is a huge part of the job. Still, it really amazes me to be able to pull data from anywhere on the web using web scraping and APIs. It gives you the freedom to tackle any problem, and I love it. Now I have one question for you: which songs were used in this article??