HitHub — a hit list predictor #Week 5
Hi! Last week, we tried several models on our dataset. Although the results on the development set were good, the results on our custom dataset were poor. This week we did not build anything new; instead, we discussed why the custom dataset gave such bad results.
First, let us explain what the Billboard Hot 100 is: a chart based on sales, radio airplay, and streaming in the United States. In other words, it is not a global chart. There are also separate charts such as the Billboard Japan Hot 100 and the Billboard Canada Hot 100. In our training and development datasets, however, the target is 1 if a song entered the Billboard Hot 100 at any time and 0 otherwise, so a song that charted on the Canada Hot 100 but not on the Hot 100 is labeled 0. We can conclude that the measure of popularity depends on the region.
The popularity of a song in Turkey does not guarantee that the same song would enter the Billboard Hot 100 if it were released in the United States. In this case, our models are testing whether a song released in Turkey would be popular enough to enter the Billboard Hot 100 had it been released in the United States. In this context, the accuracy our models achieve on the custom dataset is not meaningful. Therefore, we concluded that we need to separate our datasets on a regional basis.
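To make the idea of a regional split concrete, here is a minimal sketch rather than our actual pipeline. It assumes each song is a dictionary with hypothetical fields `region`, `hit` (1 if the song entered that region's own chart, 0 otherwise) and `predicted` (the model's output). The field names and the toy rows are invented for illustration only.

```python
from collections import defaultdict


def split_by_region(songs):
    """Group songs by their (hypothetical) `region` field."""
    by_region = defaultdict(list)
    for song in songs:
        by_region[song["region"]].append(song)
    return dict(by_region)


def accuracy(songs):
    """Fraction of songs whose prediction matches the label; None if empty."""
    if not songs:
        return None
    correct = sum(1 for s in songs if s["predicted"] == s["hit"])
    return correct / len(songs)


def accuracy_by_region(songs):
    """Compute accuracy separately for every region in the data."""
    return {region: accuracy(group)
            for region, group in split_by_region(songs).items()}


# Toy rows, made up for this example (not real chart data).
songs = [
    {"title": "A", "region": "US", "hit": 1, "predicted": 1},
    {"title": "B", "region": "US", "hit": 0, "predicted": 0},
    {"title": "C", "region": "TR", "hit": 1, "predicted": 0},
    {"title": "D", "region": "TR", "hit": 0, "predicted": 0},
]

print(accuracy_by_region(songs))  # {'US': 1.0, 'TR': 0.5}
```

Reporting accuracy per region like this would show a gap between regions directly, instead of hiding it inside a single number for the whole custom dataset.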
In the coming weeks, we will plan our work around this idea and try to obtain accuracy figures that are actually meaningful.
See you next week! 🎧🎧🎧