How to skip YouTube ads using Python?

Published in Analytics Vidhya · 4 min read · Feb 2, 2020

YouTube videos have become part of our everyday life. Unlike television, YouTube lets us watch the content we are actually interested in.

I also use YouTube extensively for entertainment and educational purposes.

Advertising is a major source of revenue for youtube.com, so it now shows more ads than ever before.

One of the problems I encountered is that I have a habit of watching television while eating my meal. Now, on YouTube, I always had to click the skip ad button whenever an advertisement came up while I was watching a show or video.

Also, I used to meditate while listening to the teachings of a spiritual guru. My meditation was disturbed every time an advertisement came on.

Therefore, I decided to create a machine learning model that can recognize whether there is an advertisement on the screen and click the skip ad button.

I started collecting the data set using the Windows Snipping Tool. As neural networks require a large amount of data to train, I collected around 600 images: almost 300 contained the skip ad button and another 300 belonged to the other class.

I realized that it would be very difficult to collect images manually. I knew that I could record the screen using any screen-recording software on the market. Therefore, I searched the internet for a way to separate the frames of a video so that I could get each frame as an individual image.

OpenCV had an option to do that, so I learned about it and separated each frame. Data collection became much faster, and I now had around 1800 images to train a model.

I realized that large images are difficult to train on, even on Google Colaboratory. I reduced the image size, but even then the data took up so much space that the Google Colab notebook crashed every time I trained a model.

Feature Engineering

I tried another technique in which I kept only the part of the image where the skip button was present, using NumPy index slicing to select it.
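As a small illustration of that slicing step, the sketch below crops a rectangle out of an image array. The helper name crop_region and the coordinates are hypothetical; the real values depend on where the button sits in the recorded frames.

```python
import numpy as np

def crop_region(image, top, bottom, left, right):
    """Return image[top:bottom, left:right] (rows first, then columns)."""
    return image[top:bottom, left:right]

# Dummy 720x1280 grayscale frame standing in for a real screenshot.
frame = np.zeros((720, 1280), dtype=np.uint8)

# Hypothetical coordinates, chosen only for the example.
patch = crop_region(frame, top=560, bottom=640, left=1000, right=1280)
print(patch.shape)  # (80, 280)
```

The crop is a view on the original array, so it costs almost no extra memory until it is copied or saved.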

As expected, the training went smoothly and the model reached around 99 percent accuracy.

I used the pyautogui library to click at a particular point on the screen.

Soon I realized that there was a problem: different screens can have different resolutions, and the skip button can be in different places depending on the zoom level, full-screen mode, or theater view.

Change in strategy

While I was learning about OpenCV, I came across SIFT features and the Template Matching feature in OpenCV.

I found template matching in OpenCV to be the exact solution to the problem I was trying to solve.

I created various templates for the various resolutions and zoom levels of the screen. I tested the script thoroughly on two different laptops and at various resolutions.

The application works fine across platforms.

Surprisingly, it also works on another video streaming site, voot.com, where some Indian television shows are streamed.

Various Libraries used

import cv2
import numpy as np
import pyautogui
import time

NumPy is used to convert the Python image object into a NumPy array so that it can be passed to OpenCV's template matching method.

Reading the templates

# reading the templates
template3 = cv2.imread('template3.png', 0)
template4 = cv2.imread('template4.png', 0)
template5 = cv2.imread('template5.png', 0)
template6 = cv2.imread('template6.png', 0)

The templates are read in grayscale, as indicated by the flag ‘0’. I created four templates, based on different zoom levels, after thorough testing.

These templates cover all the cases (resolution, zoom and full screen).

Thresholding

# setting the threshold for confidence in template matching
threshold = 0.7

I have set the threshold to 0.7 so that a point is considered only when the match score between that region of the image and the template is quite high.
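The effect of the threshold can be sketched with NumPy alone. The helper points_above and the tiny hand-made score map are illustrative assumptions; in the real script the score map would come from cv2.matchTemplate.

```python
import numpy as np

def points_above(result, threshold=0.7):
    """Return (x, y) positions in a score map whose score is >= threshold."""
    rows, cols = np.where(result >= threshold)
    return [(int(x), int(y)) for x, y in zip(cols, rows)]

# Hand-made score map: 2 rows (y) by 3 columns (x).
scores = np.array([[0.10, 0.20, 0.95],
                   [0.30, 0.72, 0.40]])

print(points_above(scores))  # [(2, 0), (1, 1)]
```

Note that np.where returns row indices first, so they are swapped into (x, y) order before being used as screen coordinates.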

Stopping criteria

# alert box for stopping criteria
pyautogui.alert(text='Keep the mouse pointer on the top left corner of screen to stop the program', title='Stopping Criteria')

At the beginning of the program, an alert box pops up with a message explaining how to stop the script.

I have not used a GUI to start and stop the script. Instead, I use a hack: the program checks for a particular condition and breaks out of the loop when it holds.

Continuous Loop

While loop to check for template matching and clicking on the skip button

This loop checks the templates one by one, and as soon as it finds a match, it clicks at the pixel location where the match was found.

# Stopping criteria (this check runs inside the while loop)
if pyautogui.position() == (0, 0):
    pyautogui.alert(text='Adskipper is Closed', title='Adskipper Closed')
    break

After checking all the templates, the loop checks whether the mouse pointer is at (0,0). If so, it shows an alert box saying that the program is closed and then breaks out of the loop.

Further improvements

We can use deep learning based image segmentation techniques such as U-Net, SegNet, etc. These are pixel-level image segmentation techniques. This way, we would not have to create different templates, and corner cases are less likely to be missed, as the approach is entirely different.

We can scale it to cell phones as well.

We can also create a GUI so that the user can turn it on or off easily.

The whole source code is available on my GitHub profile: https://github.com/1993jayant/youtube_adskipper