Continuing from Part 1 of our Lukthung (music genre) Classification series, this post goes over the audio model. As in the last post, you can get the full details of our work from our paper here. Now let’s get started!

For each song in our dataset, we extracted a 10-second audio spectrogram from the chorus using the parameters below. We used the Python library Librosa to perform the extraction. For background, you can refer to this blog post: Using LibROSA to extract audio features.

  • Sampling rate: 22050 Hz, i.e. one sample every 1/22050 ≈ 4.54e-5 s
  • Frequency range: 300–8000 Hz
  • Number of Mel bins…
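To make the parameters above concrete, here is a minimal sketch of how such an extraction could look with Librosa. It is not our exact code: the function name `extract_chorus_melspec` and the arguments `path`, `chorus_start_s`, and `n_mels` are hypothetical, and the number of Mel bins is deliberately left for the caller to supply rather than assumed here. Converting to decibels at the end is also an assumption, chosen only because it is a common way to prepare a spectrogram for a model.

```python
SAMPLE_RATE_HZ = 22050             # sampling rate from the list above
CLIP_SECONDS = 10.0                # length of the chorus excerpt
FMIN_HZ, FMAX_HZ = 300.0, 8000.0   # frequency range from the list above


def sample_period_s(sr=SAMPLE_RATE_HZ):
    """Time between two consecutive samples, in seconds."""
    return 1.0 / sr


def clip_num_samples(sr=SAMPLE_RATE_HZ, seconds=CLIP_SECONDS):
    """Number of audio samples in one excerpt."""
    return int(round(sr * seconds))


def extract_chorus_melspec(path, chorus_start_s, n_mels):
    """Return a Mel spectrogram (in dB) of a 10-second excerpt.

    `chorus_start_s` (where the chorus begins, in seconds) and `n_mels`
    (number of Mel bins) are hypothetical parameters of this sketch;
    both must be supplied by the caller.
    """
    # Imported lazily so the helpers above work without Librosa installed.
    import numpy as np
    import librosa

    # Load only the excerpt, resampled to the target sampling rate.
    y, sr = librosa.load(
        path, sr=SAMPLE_RATE_HZ, offset=chorus_start_s, duration=CLIP_SECONDS
    )
    # Mel-scaled power spectrogram restricted to the chosen frequency range.
    S = librosa.feature.melspectrogram(
        y=y, sr=sr, n_mels=n_mels, fmin=FMIN_HZ, fmax=FMAX_HZ
    )
    # Convert power to decibels relative to the loudest bin (an assumption).
    return librosa.power_to_db(S, ref=np.max)
```

At 22050 Hz, a 10-second excerpt holds `clip_num_samples()` = 220500 samples, which is what `librosa.load` returns for a file long enough to cover the excerpt.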

About

KK EZ

interested in machine learning and vision