[Week #3 — Guess The Genre]

Mark Defined
BBM406 Spring 2021 Projects
3 min read · May 2, 2021

We’re Atalay Gürel, Muhammed Fidan & Ozan Postacı, students in the Computer Engineering Department at Hacettepe University. This is the third article about our Machine Learning course project on Music Genre Classification.

Two features that we use

Chroma Features

A chroma vector is typically a 12-element feature vector indicating how much energy of each pitch class, {C, C#, D, D#, E, …, B}, is present in the signal.
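To make the idea concrete, here is a minimal sketch of how spectral energy can be folded into 12 pitch classes. It is an illustration only, not the extraction code used in our project: it assumes equal temperament with A4 tuned to 440 Hz, and the helper names `pitch_class` and `chroma_vector` and the toy spectrum are made up for this example.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(freq_hz, a4_hz=440.0):
    """Return the index (0 = C ... 11 = B) of the pitch class nearest to a frequency."""
    # MIDI convention: A4 is note 69 and each semitone is a factor of 2 ** (1 / 12).
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)
    return int(round(midi)) % 12


def chroma_vector(components):
    """Fold (frequency_hz, energy) pairs into a normalized 12-element chroma vector."""
    chroma = [0.0] * 12
    for freq_hz, energy in components:
        chroma[pitch_class(freq_hz)] += energy
    total = sum(chroma)
    return [c / total for c in chroma] if total > 0 else chroma


# Toy spectrum (made-up energies): A3, A4 and E5.
components = [(220.0, 1.0), (440.0, 2.0), (659.26, 1.0)]
chroma = chroma_vector(components)

for name, value in zip(PITCH_CLASSES, chroma):
    if value > 0:
        print(name, round(value, 2))
```

Both A notes land in the same bin even though they are an octave apart, which is exactly what makes chroma features insensitive to octave.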

MFCC

Mel-frequency cepstral coefficients (MFCCs) are the coefficients that collectively make up a Mel-frequency cepstrum (MFC). They are derived from a type of cepstral representation of the audio clip (a nonlinear “spectrum-of-a-spectrum”). The difference between the cepstrum and the Mel-frequency cepstrum is that in the MFC, the frequency bands are equally spaced on the Mel scale, which approximates the human auditory system’s response more closely than the linearly spaced frequency bands used in the normal cepstrum.
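The sketch below shows what "equally spaced on the Mel scale" means in practice. It assumes one commonly used Hz-to-Mel formula, mel = 2595 · log10(1 + f / 700); other variants exist, and audio libraries may use a different one. The function names and the 0–8000 Hz range are chosen for illustration only.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to Mel using one common formula (other variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands, f_min_hz, f_max_hz):
    """Center frequencies (in Hz) of bands that are equally spaced on the Mel scale."""
    m_min, m_max = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_bands)]


centers = mel_band_centers(8, 0.0, 8000.0)
# Distance in Hz between neighbouring band centers.
gaps = [b - a for a, b in zip(centers, centers[1:])]

print([round(c) for c in centers])
print([round(g) for g in gaps])
```

The centers are evenly spaced in Mel, but in Hz the gaps grow with frequency, so low frequencies get finer resolution than high ones.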

MFCCs are a highly compressed representation, often using just 13 or 20 coefficients instead of the 32–64 bands of a Mel spectrogram. MFCCs are also more decorrelated, which can be beneficial for simpler models such as Gaussian Mixture Models. With lots of data and strong classifiers like Convolutional Neural Networks, the Mel spectrogram can often perform better.
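The compression step can be sketched as a discrete cosine transform (DCT) applied to log Mel band energies, keeping only the first few coefficients. This is a simplified illustration, not a full MFCC pipeline: the 40 band energies below are made up, the DCT-II is written out by hand without the scaling that audio libraries usually apply, and `dct_ii` is a name chosen for this example.

```python
import math


def dct_ii(values, n_coeffs):
    """Return the first n_coeffs coefficients of the unnormalized DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n_coeffs)
    ]


# Made-up log energies for 40 Mel bands (a smooth, decaying spectral envelope).
n_bands = 40
log_mel = [math.log(1.0 + 10.0 * math.exp(-i / 8.0)) for i in range(n_bands)]

# Keep 13 coefficients instead of 40 band values.
mfcc_like = dct_ii(log_mel, 13)

print(len(log_mel), "band values ->", len(mfcc_like), "coefficients")
```

Low-order coefficients describe the broad shape of the spectrum, which is why a smooth envelope can be summarized with far fewer numbers than there are bands.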

Algorithms that we use

KNN

K-Nearest Neighbors (KNN) is one of the simplest algorithms used in Machine Learning for regression and classification problems. KNN stores the training data and classifies a new data point based on a similarity measure (e.g. a distance function): the point is assigned the class chosen by a majority vote among its k nearest neighbors.

We trained KNN with different k values and measured the test accuracy for each. Test accuracy for KNN is generally between 35% and 40%.
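As a rough illustration of the majority vote and of trying several k values, here is a small self-contained sketch. It is not our project code: the two-dimensional points, the "rock" and "jazz" labels, and the helper names are all invented for the example, and it uses plain Euclidean distance.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return hits / len(test_X)


# Made-up 2-D feature vectors; real inputs would be MFCC or chroma features.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["rock", "rock", "rock", "jazz", "jazz", "jazz"]
test_X = [(1.1, 1.0), (3.9, 4.1)]
test_y = ["rock", "jazz"]

# Try several k values, as described above.
scores = {k: accuracy(train_X, train_y, test_X, test_y, k) for k in (1, 3, 5)}
print(scores)
```

The toy data is cleanly separated, so every k gives the same score here; on real audio features the choice of k matters much more.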

Logistic Regression

Logistic regression is a statistical model that in its basic form uses a logistic function to model a binary dependent variable, although many more complex extensions exist. In regression analysis, logistic regression (or logit regression) means estimating the parameters of a logistic model (a form of binary regression).
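The basic binary form can be sketched in a few lines. The example below fits a logistic model with plain batch gradient descent on a made-up, centered one-dimensional feature. It is an illustration under those assumptions, not the setup used in our experiments, where the problem has more than two genres and a library implementation would normally be used. The learning rate, epoch count, and function names are arbitrary choices.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Estimated probability that sample x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Estimate weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, target in zip(X, y):
            err = predict_proba(w, b, x) - target  # gradient of the log loss w.r.t. the logit
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


# Made-up, centered 1-D feature with binary labels.
X = [(-2.5,), (-1.5,), (-0.5,), (0.5,), (1.5,), (2.5,)]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(X, y)
print(round(predict_proba(w, b, (-2.5,)), 3), round(predict_proba(w, b, (2.5,)), 3))
```

Predicting class 1 whenever the probability exceeds 0.5 turns the model into a classifier.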

Our logistic regression model reaches 42% accuracy on the test data.

Conclusion

We could also measure test accuracy using 10 other features, but we chose to focus on MFCC. Next week we’ll try to improve accuracy. Meanwhile, we’ll also try different algorithms such as Decision Trees, SVM, etc.

Stay tuned with your music and let us guess the genre.
