Latest Way For Scraping YouTube Comments Using Python

Handhika Yanuar Pratama
Data Folks Indonesia
4 min read · Oct 26, 2021
Photo by Firmbee.com on Unsplash

“Data! Data! Data! he cried impatiently. I can’t make bricks without clay.”
~ Arthur Conan Doyle

Data is essential today. Every day, a single person can create hundreds or thousands of data points without realizing it. Everyone, from hard workers to slackers, produces data on a daily basis. Companies, from big ones such as Microsoft, Google, and Apple to small ones we have never heard of, create data to improve their products.

In a machine learning project, collecting data is the toughest step that every data scientist faces. Why? Because good data gives a good model, and bad data makes the model suck. There are many ways to collect data. We can gather it manually by visiting a site, reading reports, making observations, web scraping, and so on.

Web scraping is the process of gathering data from a site and saving it locally. Every site has its own terms regarding scraping. This tutorial shows you how to scrape comments from YouTube onto your local drive. Using this method, you will get the comments, likes, times, users, and user links.

Other articles have covered the same topic, such as here and here. But I tested them, and neither works anymore. Okay, without much more introduction, let’s jump into the code.

Preparation

Actually, Python has a library built to do this job. It’s named youtube-comment-scraper-python 1.0.0. This library was released on April 12, 2021, by DataKund, and it fetches YouTube comments using browser automation. It works only on Windows. You can read the complete documentation here. To install the module, run this command.

pip install youtube-comment-scraper-python

The module depends on two other modules, requests and bot_studio. Both are installed automatically by the command above. Bot Studio handles the browser automation: it opens a browser and scrapes the comments of the video.
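Before running anything, it can help to confirm that the environment matches the requirements described above. The following is only a minimal sketch: the import names `youtube_comment_scraper_python` and `bot_studio` are assumptions derived from the package names, and `missing_modules` and `is_windows` are hypothetical helpers, not part of the library.

```python
import importlib.util
import platform

# Assumed import names for the library and its two dependencies.
REQUIRED = ("youtube_comment_scraper_python", "bot_studio", "requests")


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def is_windows():
    """Return True when running on Windows, the only platform the library supports."""
    return platform.system() == "Windows"


# Report anything that would stop the scraper from starting.
if not is_windows():
    print("Warning: the library is described as Windows-only.")
for name in missing_modules():
    print(f"Missing module: {name} (try the pip command above)")
```

If nothing is printed, the platform and the modules look fine, and you can move on to the script.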

Coding

The module is well documented and shows you how to collect the data. I want to improve it a little bit so that the data is saved as a CSV file that you can use for your research as a machine learning engineer. Here is the code that I created.
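The steps involved (ask for a link and an output name, open the video, fetch the comments, write a CSV file) can be sketched roughly as below. Treat it as a minimal sketch rather than the exact script: it assumes the library exposes a `youtube` object with an `open(url)` call next to the `video_comments()` call used later in this article, and it writes the file with the standard `csv` module instead of pandas so that it has no extra dependency. The helper names `fetch_comments`, `save_comments`, and `main` are hypothetical.

```python
import csv


def fetch_comments(url):
    """Open the video in the automated browser and return one batch of comments."""
    # Imported here so the rest of the file works without the library installed.
    from youtube_comment_scraper_python import youtube  # assumed import path

    youtube.open(url)  # assumed call that opens the video page
    response = youtube.video_comments()
    return response["body"]  # a list of dicts, one per comment


def save_comments(rows, path):
    """Write a list of comment dicts to a CSV file and return the row count."""
    # Collect column names in order of first appearance, so no key is assumed.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def main():
    """Prompt for the link and output name, then scrape and save."""
    url = input("YouTube link: ")
    name = input("Output name: ")
    saved = name if name.endswith(".csv") else name + ".csv"
    count = save_comments(fetch_comments(url), saved)
    print(f"Saved {count} comments to {saved}")
```

Calling `main()` starts the prompts. Keeping the saving step in its own function means you can test it with made-up rows before opening any browser.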

The script first prompts you for the YouTube link and the output name. For example, I tried to scrape this news link. So, I put the link into the prompt and filled in the output name like this.

After a while, a new browser window will pop up and do the scraping job. Don’t panic. It will control your screen for just a while.

By default, it scrolls only once, so check whether your scraping was successful.

If you get output like the picture above, it means your scraping is working successfully.

If you want to get more data, you can loop over the response and data part of your code like this.

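import pandas as pd

all_data = []
for i in range(10):  # It will scroll 10 times
    response = youtube.video_comments()
    data = response['body']
    all_data.extend(data)
# Build the DataFrame from every batch, not only the last one
df = pd.DataFrame(all_data)
df.to_csv(saved)  # saved is the output file name from the script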
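It is not stated here whether each call to `youtube.video_comments()` returns only the newly loaded comments or everything visible so far. If it is the latter, the collected list will contain repeats. A minimal sketch of a clean-up step follows, assuming each comment is a dict and that the keys `user`, `Comment`, and `Time` identify it; both the key names and the `dedupe_comments` helper are hypothetical, so adjust them to the columns you actually get.

```python
def dedupe_comments(rows, key_fields=("user", "Comment", "Time")):
    """Return rows with repeated comments removed, keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        # Build a hashable identity for the comment from the chosen fields.
        key = tuple(row.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Example with made-up rows: the second and third entries are the same comment.
sample = [
    {"user": "A", "Comment": "First!", "Time": "1 day ago"},
    {"user": "B", "Comment": "Nice video", "Time": "2 days ago"},
    {"user": "B", "Comment": "Nice video", "Time": "2 days ago"},
]
print(len(dedupe_comments(sample)))
```

You would run this on the collected list right before building the DataFrame, for example `dedupe_comments(all_data)`.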

Conclusion

Finally, we come to the end. I hope you now understand how to scrape YouTube comments and can use them for work such as building a sentiment analysis model or other projects. Go experiment on your own, and don’t forget to have a nice code. ✌

Are you Indonesian and looking for a data scientist group? Become a member of Data Folks Indonesia by joining today.

