Week #6 — Rock or Not? ♫

☞ This sure does.

Defne Tunçer
bbm406f18
5 min read · Jan 6, 2019

--

We are Defne Tunçer & Kutay Barçin, and this is the sixth article in the series on our Machine Learning course project: Music Genre Classification.

GitHub

Last week, we talked about how we improved our Support Vector Machine and Logistic Regression models and, as a bonus, examined two more methods: Stochastic Gradient Descent and the Ridge Classifier. This week, we’re diving deeper into feature and model selection!

FEATURE SELECTION

We believe feature selection matters here because our feature set consists of 11 top-level feature groups (chroma-cens, chroma-cqt, chroma-stft, mfcc, rmse, spectral-bandwidth, spectral-centroid, spectral-contrast, spectral-rolloff, tonnetz and zcr), which together form a rich 518-dimensional set. We evaluated candidate subsets with k-fold cross-validation, and it turned out that each model we selected called for a different feature set.
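To make the search concrete, here is a minimal sketch of how scoring feature-group subsets with k-fold cross-validation might look. It is not our exact script: it assumes the features live in a pandas DataFrame `X` whose top-level column index holds the group names (as in the FMA features.csv layout) and that `y` holds the genre labels; both names are illustrative.

```python
from itertools import combinations

from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

GROUPS = ['chroma_cens', 'chroma_cqt', 'chroma_stft', 'mfcc', 'rmse',
          'spectral_bandwidth', 'spectral_centroid', 'spectral_contrast',
          'spectral_rolloff', 'tonnetz', 'zcr']

def score_groups(X, y, groups, k=5):
    """Mean k-fold CV accuracy using only the given feature groups.

    Assumes X is a DataFrame whose top-level column index holds the
    group names, so X[list(groups)] selects all of their columns.
    """
    model = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
    return cross_val_score(model, X[list(groups)].values, y, cv=k).mean()

# Scoring all 2**11 - 1 subsets is too slow with an SVM, so we sweep
# small combinations first and grow the promising ones.
scores = {g: score_groups(X, y, g)
          for r in (1, 2, 3) for g in combinations(GROUPS, r)}
print(max(scores.items(), key=lambda kv: kv[1]))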

As for Support Vector Machines, we investigated SVM with the Radial Basis Function (RBF) kernel further. Testing through all the features, we realized that mfcc, spectral-centroid and spectral-contrast combined work best for the RBF kernel. In the end, we reduced our feature dimension from 518 to 196.

Moving on to Logistic Regression, finding the right features was less time-consuming than for SVM, since LR trains faster. In the end, the combination of chroma-cens, chroma-cqt, mfcc, spectral-contrast, spectral-rolloff and zcr fit our Logistic Regression approaches best. Thus, for LR, we reduced the feature dimension from 518 to 329.

Although finding the right features cost us a lot of time, it made training faster, and both models achieved slightly higher accuracy!

MODEL SELECTION

For model selection, having a subset of all features from feature selection was quite useful in minimizing classification error. To make the right choices, we again relied on cross-validation.

When training a Support Vector Machine with the RBF kernel, there are two parameters to consider: C and gamma. The parameter C, common to all SVM kernels, trades off misclassification of training examples against simplicity of the decision surface: a low C makes the decision surface smooth, while a high C aims at classifying all training examples correctly. The gamma parameter, on the other hand, defines how much influence a single training example has: the larger gamma is, the closer other examples must be to be affected.

After applying cross-validation, we concluded that the best parameters for our data are C = 1.2 and gamma = 0.1. With all these feature and model selections, we achieved 82.81% training accuracy and 67.80% test accuracy!
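For reference, this kind of cross-validated parameter search takes only a few lines with scikit-learn. A sketch, not our exact script: `X_train`/`y_train` stand for the 196-dimensional SVM feature subset and its labels, and the grid values are illustrative ones bracketing the best parameters we found.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe = Pipeline([('scale', StandardScaler()),
                 ('svc', SVC(kernel='rbf'))])

# Illustrative grid around the values that worked best for us.
param_grid = {'svc__C': [0.5, 0.8, 1.0, 1.2, 1.5, 2.0],
              'svc__gamma': [1e-3, 1e-2, 1e-1, 1.0]}

search = GridSearchCV(pipe, param_grid, cv=5, n_jobs=-1)
search.fit(X_train, y_train)         # X_train: 196-dim SVM feature subset
print(search.best_params_)           # e.g. {'svc__C': 1.2, 'svc__gamma': 0.1}
print(search.score(X_test, y_test))  # accuracy on the held-out test set
```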

As we mentioned last week, Logistic Regression has several parameters to tune. For the liblinear solver (A Library for Large Linear Classification), penalty = l2, C = 0.15 and the OvR scheme resulted in 70.84% training accuracy and 65.96% test accuracy, the highest we have achieved with Logistic Regression. The same solver with penalty = l1 and C = 0.6 under the OvR scheme achieved a similar score: 71.00% training accuracy and 65.88% test accuracy.

The Newton solver (newton-cg) gave its best results with penalty = l1, C = 0.12 and the OvR scheme: 70.96% training accuracy and 65.80% test accuracy.

For Stochastic Average Gradient (sag) and its variant (saga), solver = sag with penalty = l2, C = 0.1 and the multinomial approach resulted in 70.36% training accuracy and 65.70% test accuracy.
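These solver experiments boil down to a small loop over configurations. Here is a sketch with illustrative names (`X_lr` for the 329-dimensional LR feature subset, `y` for the labels). One caveat: scikit-learn only allows certain pairings, so the combinations below are the ones the library actually accepts (liblinear is OvR-only, and newton-cg and sag take only the l2 penalty).

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Solver/penalty/C combinations like the ones above, restricted to the
# pairings scikit-learn supports.
configs = [
    dict(solver='liblinear', penalty='l2', C=0.15),
    dict(solver='liblinear', penalty='l1', C=0.6),
    dict(solver='newton-cg', penalty='l2', C=0.12),
    dict(solver='sag', penalty='l2', C=0.1, multi_class='multinomial'),
]

for cfg in configs:
    clf = LogisticRegression(max_iter=1000, **cfg)
    acc = cross_val_score(clf, X_lr, y, cv=5).mean()  # X_lr: 329-dim subset
    print(cfg, f'{acc:.4f}')
```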

The Limited-memory Broyden–Fletcher–Goldfarb–Shanno solver (lbfgs), on the other hand, was outperformed by the other solvers on our data. Even with parameter tuning, it couldn’t get beyond the baseline score of Logistic Regression.

To conclude feature and model selection: with the help of scaling, we improved all of our test accuracy ratings by 1–3%, and even lowered some of our training accuracies (a sign of less overfitting)!
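If you want to see what scaling buys you, a quick check looks like this (again a sketch; `X` and `y` are the feature matrix and labels as before). Fitting the scaler inside the pipeline keeps the test folds out of the mean/variance estimates, so there is no leakage across cross-validation folds.

```python
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same model with and without standardization of the features.
for name, model in [
    ('unscaled', SVC(kernel='rbf', C=1.2, gamma=0.1)),
    ('scaled', make_pipeline(StandardScaler(),
                             SVC(kernel='rbf', C=1.2, gamma=0.1))),
]:
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f'{name}: {acc:.4f}')
```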

In addition to this week’s feature and model selection, we also played with and altered our dataset! Since we have 15 music genres, some of which even a human ear can’t tell apart, we decided to see how our models behave if we simply remove some genres from the dataset. This also lets us observe how our models work on a more balanced dataset.

To begin, we removed the 7 genres whose sample counts were lowest in both the training and test sets, keeping the 8 most frequent top-genres. This also gave us a better-balanced dataset. We ended up with 88.43% training accuracy and 75.82% test accuracy!
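Filtering the dataset down to the most frequent genres takes only a few lines of pandas. A sketch, with `keep_top_genres` as a hypothetical helper and `X`, `y` the features and labels as before; the same call with n=4 produces the 4-genre setup used further below.

```python
def keep_top_genres(X, y, n=8):
    """Keep only tracks whose genre is among the n most frequent.

    X: feature DataFrame, y: pandas Series of genre labels (aligned index).
    """
    top = y.value_counts().nlargest(n).index
    mask = y.isin(top)
    return X[mask], y[mask]

X8, y8 = keep_top_genres(X, y, n=8)  # top-8 experiment
X4, y4 = keep_top_genres(X, y, n=4)  # top-4 experiment (see below)
```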

Confusion Matrix of 8 top-genre prediction with SVM RBF kernel

As seen from the confusion matrix above, having a balanced dataset might really help us in the future!

As the results improved, we decided to alter our dataset one more time and use only the 4 top-genres, in other words, the 4 genres that occur most often in both the training and test samples. We achieved 92.65% training accuracy and 88.20% test accuracy! We surely can’t do better with our own ears..

Confusion Matrix of 4 top-genre prediction with SVM RBF kernel

From these two motivating examples, we conclude that with good algorithms and larger, better-balanced datasets, music genre classification can actually go beyond what the human ear can do.

Next week, we aim to build a Neural Network and do further feature and model selection on kNN, SVM with a linear kernel, and Stochastic Gradient Descent.

Here is a song that we discovered while getting lost in our dataset and that our model classified correctly — yep, it’s rock!

P.S. As we are approaching the end of our project, a short video presentation of our project is on its way..!
