TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Analyzing Music Video Trends on YouTube Using Python

6 min read · Nov 15, 2020

Photo by Jakob Owens on Unsplash

Music videos play a huge role in the music industry. Beyond promotion, they shape an artist’s image and give a visual interpretation of the songs.

In this digital era, YouTube has become one of the most dominant streaming platforms worldwide, and it hosts music videos in a wide range of genres. One way to gauge popularity in the music industry is therefore to look at music video trends on YouTube.

Have you ever been curious about which music videos are searched for the most on YouTube? I have. That’s why I tried to explore YouTube search trends using Pytrends. Pytrends is an unofficial API for Google Trends that lets you retrieve trend data for Google search properties, including YouTube. In this tutorial, I will show you how to get insights into YouTube search trends with Python.

Project Set-up

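The first thing you need to do is install the Pytrends and Folium packages via pip. Pandas, Seaborn, and Matplotlib are also used in this tutorial, so install them the same way if you don’t already have them.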

pip install pytrends
pip install folium

After installing the packages, open your Jupyter notebook and import the necessary libraries, including:

  1. Pandas to handle the DataFrame,
  2. Seaborn and Matplotlib to create the charts,
  3. Folium to create the map visualization.

from pytrends.request import TrendReq
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import folium

The next steps are connecting to Google and specifying what kind of data we want to get from Pytrends.

pytrends = TrendReq(hl='en-US', tz=360)
kw_list = ["music video", "mv"]
pytrends.build_payload(kw_list, cat=0, timeframe='today 5-y', geo='', gprop='youtube')

In this case, I used “music video” and “mv” as the main keywords, and set the timeframe to 'today 5-y' to retrieve data from the last 5 years. Setting gprop='youtube' restricts the results to YouTube search. For the other parameter options, see the Pytrends documentation.
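Besides relative values such as 'today 5-y', the timeframe parameter also accepts an explicit date range written as 'YYYY-MM-DD YYYY-MM-DD'. The helper below is a minimal sketch of how such a string could be built from two dates. The function name date_range_timeframe is hypothetical and not part of Pytrends.

```python
from datetime import date


def date_range_timeframe(start: date, end: date) -> str:
    """Build a 'YYYY-MM-DD YYYY-MM-DD' timeframe string (hypothetical helper)."""
    if start > end:
        raise ValueError("start must not be later than end")
    return f"{start:%Y-%m-%d} {end:%Y-%m-%d}"


# Example: a fixed five-year window ending on the article's publication date.
timeframe = date_range_timeframe(date(2015, 11, 15), date(2020, 11, 15))
print(timeframe)

# The string could then be passed on, for example:
# pytrends.build_payload(kw_list, cat=0, timeframe=timeframe, geo='', gprop='youtube')
```

A fixed range like this makes the analysis reproducible, because 'today 5-y' shifts every day the notebook is rerun.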

Related Queries

To see the terms people are most interested in when searching for music videos, we can use Related Queries. In Pytrends, this is a function that returns the terms other users searched for alongside your keywords.

df_queries = pytrends.related_queries()

There are two kinds of result metrics, Top and Rising.

Top: the most popular search queries, ranked by their Google Trends score (0–100).

Rising: the queries with the biggest increase in search frequency over the timeframe. Look at this metric if you want to see recent hits.
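To make the two metrics more concrete, here is a rough sketch of the idea behind them. Google does not publish the exact computation, so this only illustrates relative scaling (the most popular item gets 100) and percentage growth. The function names and the input numbers are made up for illustration.

```python
def scale_to_100(values):
    """Rescale values so that the largest one becomes 100 (illustrative only)."""
    peak = max(values)
    if peak == 0:
        return [0 for _ in values]
    return [round(100 * v / peak) for v in values]


def percent_increase(previous, current):
    """Growth of `current` over `previous`, as a whole-number percentage."""
    return round(100 * (current - previous) / previous)


# Made-up search shares for three queries, not real Google Trends data.
top_scores = scale_to_100([50, 200, 100])
rising_score = percent_increase(40, 100)
print(top_scores, rising_score)
```

The takeaway is that both metrics are relative: a Top score of 100 marks the most popular query in the result set, not an absolute number of searches.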

I combined the Top results for “music video” and “mv” and selected the 5 highest values with the code below.

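# Top queries for each keyword (DataFrames with 'query' and 'value' columns)
top_music_video = df_queries.get("music video").get("top")
top_mv = df_queries.get("mv").get("top")
# Combine both tables and keep the 5 highest-scoring queries
df_top = pd.concat([top_music_video, top_mv])
df_top.sort_values(['value'], ascending=False).head(5).reset_index(drop=True)

Top Queries of Music Video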

I also selected the top 5 Rising results with the code below.

# Rising queries for each keyword
rising_music_video = df_queries.get("music video").get("rising")
rising_mv = df_queries.get("mv").get("rising")
# Combine both tables and keep the 5 fastest-growing queries
df_rising = pd.concat([rising_music_video, rising_mv])
df_rising.sort_values(['value'], ascending=False).head(5)

Rising Queries of Music Video
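Because the two keywords can return the same query twice, and a keyword may have no table at all for a metric, the combining step can be made a little more defensive. The sketch below assumes the result has the same shape as the output of related_queries(): a dictionary keyed by keyword, holding "top" and "rising" DataFrames with 'query' and 'value' columns. The helper name combine_related is hypothetical, and the sample values are placeholders rather than real Google Trends results.

```python
import pandas as pd


def combine_related(related, metric, n=5):
    """Merge one metric ('top' or 'rising') across keywords and keep the n best rows."""
    frames = []
    for keyword, tables in related.items():
        table = (tables or {}).get(metric)
        if table is not None and not table.empty:
            # Remember which keyword each row came from.
            frames.append(table.assign(keyword=keyword))
    if not frames:
        return pd.DataFrame(columns=["query", "value", "keyword"])
    combined = pd.concat(frames, ignore_index=True)
    return (
        combined.sort_values("value", ascending=False)
        .drop_duplicates(subset="query")  # keep the highest score per query
        .head(n)
        .reset_index(drop=True)
    )


# Placeholder data that mimics the assumed structure.
sample = {
    "music video": {
        "top": pd.DataFrame({"query": ["artist a", "artist b"], "value": [100, 60]}),
        "rising": None,
    },
    "mv": {
        "top": pd.DataFrame({"query": ["artist a", "artist c"], "value": [80, 90]}),
        "rising": None,
    },
}
top5 = combine_related(sample, "top")
print(top5)
```

With real data, the call would look like combine_related(df_queries, "top") or combine_related(df_queries, "rising").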

Analysis

From the Top results, we can see that music video trends over the last 5 years are dominated by the K-pop industry.

Photo by Joel Muniz on Unsplash

Blackpink (a K-pop girl group) and BTS (a K-pop boy band) are the artists whose music videos are searched for the most on YouTube. The Rising results show that one of BTS’s music videos, Dynamite, became a huge hit and was searched for by a lot of internet users.

There is also Gacha Life, a game that promotes creativity and storytelling in kids. Gacha Life users like to create music videos based on Gacha animation, and many YouTube users are interested in them.

Interest Over Time

From the related queries, we found 4 main terms related to “music video” or “mv”: BTS, Blackpink, Gacha Life, and Dynamite.

To see how the popularity of those terms has grown on YouTube, we can use Interest Over Time.

kw_list = ["Gacha Life", "Dynamite", "Blackpink", "BTS"]
pytrends.build_payload(kw_list, cat=0, timeframe='today 5-y', geo='', gprop='youtube')
df_interest = pytrends.interest_over_time().drop(columns='isPartial')

After getting the data as a DataFrame, we can visualize it using Seaborn and Matplotlib as shown below.

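# Set the style first so it applies to the figure created afterwards
sns.set(style="darkgrid", palette='rocket')
plt.figure(figsize=(16, 8))
# One line per keyword, with the date index on the x-axis
ax = sns.lineplot(data=df_interest)
ax.set_title('Music Video Trends Over Time', fontsize=20)
# Rotate the date labels after plotting so the setting is kept
plt.xticks(rotation=45)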

Analysis

Among those four terms, “BTS” has been popular since 2016 and has stayed popular until now, followed by “Blackpink” in second place. “Dynamite” started booming in the middle of 2020, while “Gacha Life” began gaining popularity in late 2018.
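Observations like these can also be checked numerically instead of only by reading the chart. The sketch below uses a small placeholder DataFrame with a date index, the same layout that interest_over_time() returns, to find the date on which each term peaked and to smooth the series with a rolling mean. The column names and values are invented for illustration.

```python
import pandas as pd

# Placeholder weekly scores, not real Google Trends data.
dates = pd.date_range("2020-01-05", periods=8, freq="7D")
df_sample = pd.DataFrame(
    {
        "term_a": [10, 12, 15, 40, 100, 80, 60, 55],
        "term_b": [30, 30, 31, 29, 30, 32, 31, 30],
    },
    index=dates,
)

# Date on which each term reached its highest score.
peak_dates = df_sample.idxmax()

# 4-week rolling mean to smooth out week-to-week noise.
smoothed = df_sample.rolling(window=4, min_periods=1).mean()

print(peak_dates)
print(smoothed.tail())
```

Applied to the real data, df_interest.idxmax() would show when each of the four terms peaked.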

Interest by Region

From the result above, I became more interested in BTS, because their music videos are the most popular of the four terms. So I wanted to know how BTS’s audience on YouTube is spread across the world.

kw_list = ["BTS"]
pytrends.build_payload(kw_list, cat=0, timeframe='today 5-y', geo='', gprop='youtube')
df_interest_region = pytrends.interest_by_region(resolution='COUNTRY', inc_low_vol=False)
df = df_interest_region.sort_values(by='BTS', ascending=False).head(10).reset_index()

I created a chart of the 10 countries most interested in BTS music videos.

fig = plt.figure(figsize=(15, 5))
# One bar per country, shaded from light to dark pink
plt.bar(
    df['geoName'],
    df['BTS'],
    color=['#fa9d96', '#fa928a', '#ff828a', '#ff728a', '#ff628a',
           '#ff528a', '#ff428a', '#ff328a', '#ff228a', '#ff028a'],
)
plt.xticks(rotation=45, ha='right')
plt.title('Countries Most Interested in BTS Music Videos', y=1.1, fontsize=20)
plt.xlabel('Country')
plt.ylabel('Value')

Bar Chart
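The bar colors above are hard-coded, which only works for exactly 10 bars. As a possible alternative, the sketch below generates a gradient of any length by interpolating linearly between two hex colors. The helper hex_gradient is hypothetical and uses only the standard library, so the resulting shades differ slightly from the hand-picked ones.

```python
def hex_gradient(start, end, steps):
    """Return `steps` hex colors interpolated linearly from `start` to `end`."""

    def to_rgb(color):
        color = color.lstrip("#")
        return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [start.lower()]
    first, last = to_rgb(start), to_rgb(end)
    colors = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at the first color, 1.0 at the last
        rgb = tuple(round(a + (b - a) * t) for a, b in zip(first, last))
        colors.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return colors


# Ten shades between the first and last colors used in the chart.
bar_colors = hex_gradient("#fa9d96", "#ff028a", 10)
print(bar_colors)

# The list could then be passed to the chart, for example:
# plt.bar(df['geoName'], df['BTS'], color=hex_gradient('#fa9d96', '#ff028a', len(df)))
```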

If you would like to see a map visualization, you can create a choropleth map with Folium. Choropleth maps present interval data as color shades, where lighter shades represent lower numbers and darker shades represent higher numbers. Try the code below.

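url = 'https://raw.githubusercontent.com/python-visualization/folium/master/examples/data'
country_shapes = f'{url}/world-countries.json'
the_map = folium.Map(tiles="cartodbpositron")
# folium.Choropleth replaces the older Map.choropleth method,
# which newer Folium releases no longer provide
folium.Choropleth(
    geo_data=country_shapes,
    name='choropleth',
    # geoName is the index of df_interest_region, so turn it into a column
    data=df_interest_region.reset_index(),
    columns=['geoName', 'BTS'],
    key_on='properties.name',
    fill_color='Purples',
    nan_fill_color='white',
    fill_opacity=0.8,
    line_opacity=0.5,
).add_to(the_map)
the_map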

Then, let’s look at the result.

Choropleth Map
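A choropleth can only color a country when the name in the data matches the name in the GeoJSON file exactly, so countries whose names differ between Google Trends and the shapes file show up as missing (white here). The sketch below shows one way to spot and fix such mismatches. The helper names, the sample lists, and the alias mapping are all assumptions for illustration and should be checked against the actual files.

```python
def unmatched_names(trend_names, geojson_names):
    """Return the trend names that have no exact match in the GeoJSON names."""
    known = set(geojson_names)
    return sorted(name for name in trend_names if name not in known)


def apply_aliases(trend_names, aliases):
    """Replace names using an alias mapping, leaving other names untouched."""
    return [aliases.get(name, name) for name in trend_names]


# Illustrative names only; the real lists would come from
# df_interest_region.index and the 'name' property of each GeoJSON feature.
trend_names = ["Brunei", "Philippines", "United States"]
geojson_names = ["Brunei", "Philippines", "United States of America"]
aliases = {"United States": "United States of America"}  # hypothetical mapping

missing_before = unmatched_names(trend_names, geojson_names)
missing_after = unmatched_names(apply_aliases(trend_names, aliases), geojson_names)
print(missing_before, missing_after)
```

With a DataFrame, the same mapping could be applied through df_interest_region.rename(index=aliases) before building the map.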

Analysis

As we can see from the bar chart above, the largest BTS audience is in Brunei, followed by other Asian countries such as the Philippines, Mongolia, Malaysia, and Myanmar. The map supports this, with the darker shades appearing in Asia. Other continents, such as America and Australia, are also part of BTS’s audience, though their shades are not as dark as those of Asian countries. Note that Google Trends scores are relative to each region’s total search volume, so a high score means a high share of searches rather than the largest absolute number of viewers.

Conclusion

As one of the most popular video search platforms, YouTube can give us insight into the kinds of music videos people around the world are most curious about. Pytrends makes it possible to play around with that data and, together with Seaborn, Folium, and Matplotlib, to visualize it. Thanks to those tools, we now know that over the past 5 years K-pop has dominated music video searches on YouTube worldwide. Will this trend keep going? We’ll see.

Published in TDS Archive

Tazki Anida Asrul