Week 5 — Tune It Up

Adnan Fidan
BBM406 Spring 2021 Projects
May 16, 2021

Hello world,
We are Fidan Samet, Oğuz Bakır and Adnan Fidan. As part of the Fundamentals of Machine Learning course project, we are working on music genre transfer and prediction. We are writing a blog series about our progress throughout the project, and this is the fifth post. It covers this week’s developments in music genre prediction and music genre transfer. So let’s get started!

Previously on Tune It Up…

Timeline of Tune It Up

Last week, since we had changed the domain of our project, we discussed our new dataset, the results of machine learning algorithms on music genre prediction, and the baseline results of music genre transfer. You can find last week’s blog here. This week, we examine the baseline results of an additional algorithm for music genre prediction and the CycleGAN model for music genre transfer.

Music Genre Transfer

CycleGAN, which we decided to use in the previous post, is a technique for training unsupervised image-to-image translation models with the GAN architecture. It has been applied to many image-to-image translation tasks with impressive results, and we are trying to apply it to music genre transfer. An example result from the original work is shown below.

Example of Object Transfiguration from Horses to Zebras and Zebras to Horses.¹

The architecture of CycleGAN consists of two GANs arranged in a cycle. Each generator network contains two stride-2 convolutions, several residual blocks, and two fractionally strided convolutions with stride 1/2. The architecture is shown in the figure below.

The architecture of CycleGAN
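To make the role of the strided layers concrete, here is a minimal sketch of how the spatial size of an input changes as it passes through the generator's resampling layers. The kernel size, padding, and output padding used below are assumptions chosen for illustration (the text above only states the strides), and the helper names are hypothetical.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of a strided convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def transposed_conv_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output length of a fractionally strided (transposed) convolution."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def generator_sizes(size):
    """Trace the spatial size through the generator's resampling layers."""
    sizes = [size]
    # Two stride-2 convolutions: each one halves the spatial size.
    for _ in range(2):
        sizes.append(conv_out(sizes[-1]))
    # The residual blocks keep the size unchanged, so they are skipped here.
    # Two stride-1/2 (transposed) convolutions: each one doubles the size.
    for _ in range(2):
        sizes.append(transposed_conv_out(sizes[-1]))
    return sizes


print(generator_sizes(256))  # [256, 128, 64, 128, 256]
```

Under these assumed settings the output has the same spatial size as the input, which is what lets a generator's output be fed straight into the other generator to close the cycle.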

At this stage, we have modified the input and output formats of the CycleGAN model. With this change, the model takes MIDI files as input and produces output files in the same format. We are now training this modified model to obtain baseline results. As expected for a deep learning method, training takes time.
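One common way to feed note data to an image-to-image model is a piano roll: a 2D grid of time steps by pitches. The sketch below illustrates that idea only; it is not necessarily the representation used in this project. The note tuples `(pitch, start_step, end_step)` and both function names are hypothetical, and reading or writing actual MIDI files is left out.

```python
def notes_to_piano_roll(notes, n_steps, n_pitches=128):
    """Build a binary grid where roll[t][p] == 1 if pitch p sounds at step t."""
    roll = [[0] * n_pitches for _ in range(n_steps)]
    for pitch, start, end in notes:
        for t in range(start, min(end, n_steps)):
            roll[t][pitch] = 1
    return roll


def piano_roll_to_notes(roll):
    """Recover (pitch, start_step, end_step) tuples from a binary grid."""
    notes = []
    n_steps = len(roll)
    n_pitches = len(roll[0]) if roll else 0
    for pitch in range(n_pitches):
        start = None
        for t in range(n_steps):
            active = roll[t][pitch] == 1
            if active and start is None:
                start = t                          # note begins
            elif not active and start is not None:
                notes.append((pitch, start, t))    # note ends
                start = None
        if start is not None:                      # note held until the end
            notes.append((pitch, start, n_steps))
    return sorted(notes, key=lambda n: (n[1], n[0]))


# Hypothetical example: a C major chord split over eight time steps.
notes = [(60, 0, 4), (64, 0, 4), (67, 4, 8)]
roll = notes_to_piano_roll(notes, n_steps=8)
recovered = piano_roll_to_notes(roll)
print(recovered == notes)  # True
```

A binary grid like this drops note velocity and merges back-to-back repetitions of the same pitch, so a real pipeline would need to handle those cases separately.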

Music Genre Prediction

Up to this stage, we had tried three techniques for music genre prediction: Naive Bayes, k-Nearest Neighbors, and Random Forest. This week, we added the Multi-layer Perceptron² (MLP) classifier. MLP is a supervised learning algorithm that classifies inputs using multiple layers of neurons: an input layer, one or more hidden layers, and an output layer. The neurons within a layer can be computed in parallel, and accuracy generally improves as the number of training iterations increases.

Loss of Different Epochs in MLP
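For readers who want to reproduce a loss curve like this one, here is a minimal sketch using scikit-learn's `MLPClassifier`. Everything in it is an assumption made for illustration: the data is synthetic (generated with `make_classification`) rather than our music features, and the hidden layer size and iteration limit are placeholder values, not our settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for genre features: 4 classes, 20 features per sample.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# MLPs are sensitive to feature scale, so standardize using training statistics.
scaler = StandardScaler().fit(X_train)

# One hidden layer of 64 neurons (placeholder values, not tuned).
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=200, random_state=0)
clf.fit(scaler.transform(X_train), y_train)

# loss_curve_ holds the training loss recorded after each epoch.
print("epochs run:", clf.n_iter_)
print("first loss:", clf.loss_curve_[0], "last loss:", clf.loss_curve_[-1])

test_accuracy = clf.score(scaler.transform(X_test), y_test)
print("test accuracy:", test_accuracy)
```

Plotting `clf.loss_curve_` against the epoch index gives the kind of curve shown in the figure above.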

After tuning the model parameters, we obtained a best accuracy of 85%. This is higher than the accuracy of any other technique we have tried so far, so we plan to proceed with the MLP method.

Best Test Accuracies Obtained with Different Algorithms
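Parameter tuning of this kind can be sketched with scikit-learn's `GridSearchCV`. The grid below is purely an assumption for illustration: the parameters, their candidate values, and the synthetic data are placeholders, not the settings that produced the 85% result.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for genre features (not the project's dataset).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=4, random_state=0
)

# Scaling lives inside the pipeline so it is refit on every training fold.
pipeline = make_pipeline(
    StandardScaler(), MLPClassifier(max_iter=300, random_state=0)
)

# Hypothetical search space: hidden layer width and L2 penalty strength.
param_grid = {
    "mlpclassifier__hidden_layer_sizes": [(32,), (64,)],
    "mlpclassifier__alpha": [1e-4, 1e-2],
}

# 3-fold cross-validation over every combination in the grid.
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```

The cross-validated score reported here is an estimate taken on the training data, so a held-out test set is still needed for a final accuracy figure.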

That is all for this week. Thank you for reading and we hope to see you next week!

References

[1] Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2223–2232).

[2] Multilayer perceptron — Wikipedia. https://en.wikipedia.org/wiki/Multilayer_perceptron
