SONGS RECOMMENDATION SYSTEM

Rahul Araveti
Web Mining [IS688, Spring 2021]
5 min read · May 7, 2021

Isn’t it fascinating how music service providers like Apple Music and Spotify give us a list of recommendations based on just a few songs we search for in those applications? I was keen to figure out how these applications work, which is why I chose this topic.

There are millions of songs available, and a lifetime is not enough to listen to all of them. Nobody can go through the entire catalog to find the songs they like, so there is a very high chance of missing many songs that could have ended up on your favorites list. The aim of this project is to overcome that problem by building a recommendation system that suggests songs based on your interests.

For any music service provider to be successful, it should be in a position to make the most appropriate recommendations with respect to each customer's individual taste. To build such a system, machine learning algorithms are applied to data gathered from multiple sources (which is nothing but lists of songs) and the analysis produces a list of songs that the customer might like.

Recommendation systems can be categorized into three types:

1. Content based

2. Collaborative

3. Popularity

A content-based recommendation system works on historical data: it uses the customer's past likes to suggest new songs similar to those he or she has already listened to.

A collaborative recommendation system uses data gathered from other, similar customers to build the recommended list for the targeted person.

A popularity-based recommendation system is one of the easiest models to work with and is based on the concept of trends. It recommends songs that are popular among all users. Its drawback is that it cannot provide a personalized list, even when the history and behavioral patterns of a particular individual are known.

Source of the data set

The data set is downloaded directly from the following website: http://millionsongdataset.com.

The full data set contains a million songs and their artists, so we chose a subset that contains only 10,000 songs.

song_df.head()

Various methods used for recommending songs:

I used three different methods to suggest a list of the most relevant songs to the listener based on his or her interests. They are based on popularity, artist, and song. The following sections explain each of them.

Recommendation based on popularity

This method recommends a list of top-ranked songs based on popularity. I first split the data set into training and test sets in an 80:20 ratio and train the recommender model on the training set. The next step is to count the number of people who listened to each song and call that count the recommendation score. The list is then sorted in descending order of recommendation score, and a rank is generated from these scores. The last step is to print the top 10 songs by rank. The output is shown in the image that follows the code block.
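As a rough illustration of these steps (this is not the code used in the project), here is a minimal sketch in plain Python. The sample records and the names `split_train_test` and `popularity_recommendations` are assumptions made for the example. A simple shuffle stands in for whatever splitting routine was actually used.

```python
import random
from collections import Counter

# Hypothetical (user_id, song) listening records standing in for the real data.
records = [
    ("u1", "Song A"), ("u2", "Song A"), ("u3", "Song A"), ("u4", "Song A"),
    ("u1", "Song B"), ("u2", "Song B"), ("u3", "Song B"),
    ("u1", "Song C"), ("u4", "Song C"),
    ("u2", "Song D"),
]

def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out test_fraction of them (80:20 by default)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def popularity_recommendations(train_rows, top_n=10):
    """Rank songs by the number of distinct users who listened to them."""
    # set() drops repeated (user, song) pairs so each listener counts once.
    listeners = Counter(song for _, song in set(train_rows))
    # Sort by score descending, then by title so ties have a stable order.
    ordered = sorted(listeners.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, song, score)
        for rank, (song, score) in enumerate(ordered[:top_n], start=1)
    ]

train, test = split_train_test(records)
for rank, song, score in popularity_recommendations(train):
    print(rank, song, score)
```

Every listener receives the same ranked list, which is exactly why this approach cannot personalize its output.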

Recommendation based on artist

This method recommends a list of top-ranked songs based on relevance to the artist. As in the first method, the data set is split into training and test sets in an 80:20 ratio. I first get the list of all unique songs for a particular user. The next step is to get the list of all unique songs present in the training set. An item co-occurrence matrix is then constructed from this information and used to make the recommendations, again through recommendation scores and ranks. The last step is to print the top 10 songs by rank. The output is shown in the image that follows the code block.
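The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code. The toy `listens` data and the function names are hypothetical. The score used here (the Jaccard overlap of listeners, averaged over the user's songs) is one common choice that I am assuming, since the scoring formula is not spelled out above.

```python
# Hypothetical training data: user -> set of songs they listened to.
listens = {
    "u1": {"Song A", "Song B"},
    "u2": {"Song A", "Song B", "Song C"},
    "u3": {"Song B", "Song C"},
    "u4": {"Song C", "Song D"},
}

def listeners_by_song(user_songs):
    """Invert the user -> songs mapping into song -> set of listeners."""
    index = {}
    for user, songs in user_songs.items():
        for song in songs:
            index.setdefault(song, set()).add(user)
    return index

def recommend_for_user(user, user_songs, top_n=10):
    """Score each unheard song by its average overlap with the user's songs."""
    index = listeners_by_song(user_songs)   # all unique songs in the training data
    heard = user_songs[user]                # unique songs for this user
    if not heard:
        return []
    scores = {}
    for candidate in index:
        if candidate in heard:
            continue
        # One co-occurrence entry per (heard song, candidate) pair,
        # normalised by the union of listeners (assumed Jaccard index).
        overlaps = [
            len(index[h] & index[candidate]) / len(index[h] | index[candidate])
            for h in heard
        ]
        scores[candidate] = sum(overlaps) / len(overlaps)
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, song, round(score, 3))
        for rank, (song, score) in enumerate(ordered[:top_n], start=1)
    ]

print(recommend_for_user("u1", listens))
```

Unlike the popularity ranking, the result depends on which user is asked about, because only songs that share listeners with that user's history score above zero.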

Recommendation based on song

This method recommends a list of top-ranked songs based on relevance to a given song. As in the first method, the data set is split into training and test sets in an 80:20 ratio. I first get the list of all unique songs in the training set. An item co-occurrence matrix is then constructed from this information and used to make the song-based recommendations through recommendation scores and ranks. The last step is to print the top 10 songs by rank. The output is shown in the image that follows the code block.
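A minimal sketch of the song-based lookup is shown here, again as an illustration rather than the project's code. It computes only the single row of the co-occurrence matrix that belongs to the seed song. The toy data and the name `similar_songs` are hypothetical, and using the raw count of shared listeners as the score is an assumption.

```python
from collections import Counter

# Hypothetical training data: user -> set of songs they listened to.
listens = {
    "u1": {"Song A", "Song B"},
    "u2": {"Song A", "Song B", "Song C"},
    "u3": {"Song B", "Song C"},
    "u4": {"Song C", "Song D"},
}

def similar_songs(seed_song, user_songs, top_n=10):
    """Rank other songs by how many users listened to both them and the seed."""
    shared = Counter()
    for songs in user_songs.values():
        if seed_song in songs:
            # Every other song in this history co-occurs once with the seed.
            shared.update(songs - {seed_song})
    ordered = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, song, count)
        for rank, (song, count) in enumerate(ordered[:top_n], start=1)
    ]

print(similar_songs("Song B", listens))
```

Songs that never share a listener with the seed song do not appear in the result at all.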

Software Used:

scikit-learn (sklearn) is used for training on the data set, and a co-occurrence matrix is used to measure similarity. A co-occurrence matrix has specific entities in its rows (ER) and columns (EC). Its purpose is to record the number of times each ER appears in the same context as each EC. Consequently, to use a co-occurrence matrix, you first have to define your entities and the context in which they co-occur.
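To make the definition concrete, here is a small sketch that builds such a matrix in plain Python. I am assuming that the entities are songs and that the context is one user's listening history. The function name `cooccurrence_matrix` and the sample histories are made up for the example.

```python
def cooccurrence_matrix(contexts):
    """Return (labels, matrix); matrix[i][j] counts the contexts that
    contain both labels[i] (row entity) and labels[j] (column entity)."""
    labels = sorted(set().union(*contexts))
    position = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for context in contexts:
        for row_entity in context:
            for col_entity in context:
                if row_entity != col_entity:  # leave the diagonal at zero
                    matrix[position[row_entity]][position[col_entity]] += 1
    return labels, matrix

# Hypothetical contexts: each set is one user's listening history.
histories = [{"A", "B"}, {"A", "B", "C"}, {"B", "C"}]
labels, matrix = cooccurrence_matrix(histories)

print("  ", *labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

The matrix is symmetric because "A appears with B" and "B appears with A" describe the same event. With a different choice of entities or context (for example words within a sentence), the same function yields a different kind of co-occurrence matrix.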

Conclusion:

In this project I studied and compared different methods of building a recommendation system. I first retrieved the top 10 popular songs and then recommended songs based on the artist; the results were as expected. I then performed recommendation based on a given song, which returns songs similar to that song. All recommendations were ranked from 1 to 10, and the results looked accurate.

References:

  1. https://www.pythonprogramming.in/how-to-calculate-a-word-word-co-occurrence-matrix.html
