How to Build a Music Preference Prediction Model with Machine Learning

4 min read · Oct 10, 2021

In this article, you will learn how to build a model that recommends a music genre based on a person’s age and gender. An online music store could use such a model to increase sales: when a user creates a profile, the model recommends music for that person. The model learns patterns from a sample dataset, and we can then ask it to make predictions. Building this machine learning model takes five steps.

Import Data
Clean the Data
Build the Model
Make Predictions
Check the Accuracy of the Model

Import Data

First, let’s import our dataset into the Jupyter notebook. I’m going to import a CSV file that contains people’s music preferences by age and gender. (Please note that this is random made-up data, not real data.)

You can write the following Python code to import your dataset into the Jupyter notebook. (First, upload your dataset to the folder where your working notebook is located.)
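A minimal sketch of the import step, assuming pandas is installed. The file name `music.csv` and the column names `age`, `gender`, and `genre` are assumptions, since the article does not show them. So that the snippet runs on its own, it reads a few made-up stand-in rows from an in-memory string instead of a file.

```python
import io

import pandas as pd

# Made-up stand-in rows (not the article's dataset); column names are assumed.
SAMPLE_CSV = """age,gender,genre
20,1,HipHop
23,1,HipHop
25,1,HipHop
26,1,Jazz
29,1,Jazz
30,1,Jazz
20,0,Dance
21,0,Dance
25,0,Dance
26,0,Acoustic
27,0,Acoustic
30,0,Acoustic
"""

# With a real file in the notebook folder, this would be:
#   music_data = pd.read_csv("music.csv")   # hypothetical file name
music_data = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(music_data.shape)
```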

After loading the data, you can check the first few rows of the dataset using the following code.
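One way to peek at the first rows is the pandas `head` method. The sketch below uses made-up stand-in rows with assumed column names (`age`, `gender`, `genre`) rather than the article's CSV file.

```python
import pandas as pd

# Made-up stand-in rows (not the article's dataset); column names are assumed.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 20, 21, 25, 26, 27, 30],
    "gender": [1] * 6 + [0] * 6,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})

# head() returns the first five rows unless another count is passed.
first_rows = music_data.head()
print(first_rows)
```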

In the gender column, “Male” is represented by 1 and “Female” by 0.

You can also get a summary of your dataset using the following code.
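The article does not say which summary call it uses; a common choice is the pandas `describe` method, which reports the count, mean, standard deviation, minimum, quartiles, and maximum of each numeric column. The rows below are made-up stand-ins with assumed column names.

```python
import pandas as pd

# Made-up stand-in rows (not the article's dataset); column names are assumed.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 20, 21, 25, 26, 27, 30],
    "gender": [1] * 6 + [0] * 6,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})

# By default describe() summarises only the numeric columns (age, gender).
summary = music_data.describe()
print(summary)
```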

Clean the Data

Normally, we would clean the dataset before moving on to the next step. This dataset does not contain any errors, so the only preparation we need is to split it into an input dataset and an output dataset.

In our case, the input dataset will be age and gender. The output dataset will be the genre.

We can split the dataset using the following code.
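A minimal sketch of the split, assuming the columns are named `age`, `gender`, and `genre`. The names `X` and `y` follow the usual scikit-learn convention and are not taken from the article, and the rows are made-up stand-ins.

```python
import pandas as pd

# Made-up stand-in rows (not the article's dataset); column names are assumed.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 20, 21, 25, 26, 27, 30],
    "gender": [1] * 6 + [0] * 6,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})

# Input dataset: every column except the one we want to predict.
X = music_data.drop(columns=["genre"])
# Output dataset: the genre column only.
y = music_data["genre"]

print(X.columns.tolist())
print(y.name)
```

`drop` returns a new DataFrame, so `music_data` itself is left unchanged.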

Build the Model

Now let’s build the model. We can use scikit-learn, one of the most popular machine learning libraries for Python. From scikit-learn, we’re going to use the decision tree algorithm.

We can create a new instance of the decision tree class with the following code.
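In scikit-learn, the decision tree classifier lives in `sklearn.tree`. A minimal sketch, with `model` as an assumed variable name:

```python
from sklearn.tree import DecisionTreeClassifier

# Create an untrained decision tree classifier with default settings.
model = DecisionTreeClassifier()
print(type(model).__name__)
```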

Now we need to fit the decision tree model to our input and output datasets. We can do that using the following code.
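A sketch of the fitting step on made-up stand-in rows with assumed column names. The `random_state=0` argument is an addition for reproducibility and is not mentioned in the article.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up stand-in rows (not the article's dataset); column names are assumed.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 20, 21, 25, 26, 27, 30],
    "gender": [1] * 6 + [0] * 6,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})
X = music_data.drop(columns=["genre"])  # input: age and gender
y = music_data["genre"]                 # output: genre

# random_state is an assumption added so repeated runs build the same tree.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)  # learn the age/gender -> genre patterns

# After fitting, the model knows which genres it can predict.
print(model.classes_)
```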

Make Predictions

Now we can ask our model to make predictions. Let’s see what music genre it recommends for a 22-year-old female. We can pass in our input data using the following code and see the result.
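A sketch of the prediction call for a 22-year-old female (age 22, gender 0), trained on made-up stand-in rows with assumed column names. The query is passed as a DataFrame with the same column names as the training input, which avoids the feature-name warning that recent scikit-learn versions emit for plain lists.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up stand-in rows (not the article's dataset); column names are assumed.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 20, 21, 25, 26, 27, 30],
    "gender": [1] * 6 + [0] * 6,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]

model = DecisionTreeClassifier(random_state=0)  # random_state is an added assumption
model.fit(X, y)

# One row per person to predict for: a 22-year-old female (gender 0).
query = pd.DataFrame({"age": [22], "gender": [0]})
predictions = model.predict(query)
print(predictions)
```

`predict` returns one genre per query row, so several people can be scored in a single call by adding rows to `query`.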

You can see that the result is “Dance”.

Check the Accuracy of the Model

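To check accuracy, we train the model on one part of the dataset and test it on the rest, which the model has not seen. A common rule of thumb is to allocate 80% of the dataset to training and 20% to testing. We can use scikit-learn’s train_test_split function to divide the dataset into these two sets.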
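A sketch of the 80/20 split and accuracy check on made-up stand-in rows with assumed column names. Using `accuracy_score` to compare the predictions with the held-out labels is an assumption here, since the text only names `train_test_split`.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Made-up stand-in rows (not the article's dataset); column names are assumed.
music_data = pd.DataFrame({
    "age": [20, 23, 25, 26, 29, 30, 20, 21, 25, 26, 27, 30],
    "gender": [1] * 6 + [0] * 6,
    "genre": ["HipHop"] * 3 + ["Jazz"] * 3 + ["Dance"] * 3 + ["Acoustic"] * 3,
})
X = music_data.drop(columns=["genre"])
y = music_data["genre"]

# Hold out 20% of the rows for testing; the split is random on every run.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = DecisionTreeClassifier()
model.fit(X_train, y_train)          # train only on the 80% portion
predictions = model.predict(X_test)  # predict genres for the unseen 20%

# Fraction of test rows whose predicted genre matches the true genre.
score = accuracy_score(y_test, predictions)
print(score)
```

Because the split is random, the score changes from run to run, and with a dataset this small it can swing widely. More rows generally give a steadier estimate.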