Video Summarizer made easy using NLP

Jan 27, 2019

I want to share how I built a video summarizer using Natural Language Processing (NLP). It works on a video’s subtitles rather than on the frames themselves. Awesome, right? I could have trained a machine learning model directly on the video, but that approach has a major disadvantage: it needs high-performance hardware, because training on data sets that contain videos can take a very long time. Even training on data sets of images takes a fair amount of time, and a video is just a sequence of frames, each of which is an image, with a standard video typically having 24 or more frames per second. For this reason I was hesitant to build the video summarizer that way.

Obviously, I can’t explain the whole process of building the video summarizer in a single article; that would be difficult for both of us. Instead, this is a brief write-up of how I developed it. Just the steps! 😉

Flow chart representation of the whole process

1. Sorting out the challenges to be faced

Some of the challenges I faced while developing the summarizer were:

Not all videos come with subtitles, especially the ones on YouTube.
Some videos have no sound at all, such as security camera footage. How do we summarize them?
How effective can the summarization of a video be?
How do we summarize a video down to a particular duration?

These were the major challenges I faced.

2. Using speech recognition to generate subtitles

One of the major challenges in making the video summarizer was generating subtitles for videos that didn’t have any. For this, I used Wit.ai by Facebook for speech recognition, which let me generate reasonably accurate subtitles. For more information on how to do speech recognition in Python, see this article: https://realpython.com/python-speech-recognition/
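To make this step concrete, here is a minimal sketch of turning recognized speech into subtitle entries. It assumes the audio has already been extracted from the video as a WAV file, that the SpeechRecognition package from the article above is installed, and that you have a Wit.ai key. The function names, the fixed 10-second chunking, and the SRT output format are my own illustration, not necessarily what the project does.

```python
from typing import List, Tuple

Entry = Tuple[float, float, str]  # (start_seconds, end_seconds, text)


def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, total_ms = divmod(total_ms, 3_600_000)
    minutes, total_ms = divmod(total_ms, 60_000)
    secs, millis = divmod(total_ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(entries: List[Entry]) -> str:
    """Render (start, end, text) entries as the body of an .srt file."""
    blocks = []
    for index, (start, end, text) in enumerate(entries, start=1):
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


def transcribe_in_chunks(wav_path: str, wit_key: str, total_seconds: float,
                         chunk_seconds: float = 10.0) -> List[Entry]:
    """Send fixed-length audio chunks to Wit.ai (needs network access and a key)."""
    # Third-party package (pip install SpeechRecognition), imported here so the
    # formatting helpers above can be used without it.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    entries: List[Entry] = []
    start = 0.0
    while start < total_seconds:
        with sr.AudioFile(wav_path) as source:
            audio = recognizer.record(source, offset=start, duration=chunk_seconds)
        try:
            text = recognizer.recognize_wit(audio, key=wit_key)
        except sr.UnknownValueError:
            text = ""  # nothing intelligible in this chunk
        if text:
            entries.append((start, min(start + chunk_seconds, total_seconds), text))
        start += chunk_seconds
    return entries
```

Fixed-length chunks are the simplest way to get timestamps, but they can cut words in half; splitting on silence would probably give cleaner subtitles.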

3. Applying Natural Language Processing to the subtitles

Now I have a video and its subtitles. What do I do with them? I applied automatic NLP-based summarization algorithms to the subtitles to generate the summary. Basically, I converted the subtitles to a text document and then ran a summarization algorithm on it. There is a Python library called sumy that summarizes a text document down to the number of sentences you specify as an argument. The library offers several summarization algorithms, but I used only five: Latent Semantic Analysis (LSA), Luhn’s, Edmundson, TextRank, and LexRank. For more information about the library, see: https://github.com/miso-belica/sumy
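As a rough sketch of this step, the code below parses an .srt file into timed entries and passes the joined text to sumy. The helper names (parse_srt, summarize) are hypothetical and the parsing is deliberately simple; the sumy calls follow the usage shown in the library’s README. Edmundson is left out because it also needs bonus, stigma, and null word lists to be set first.

```python
import re
from typing import List, Tuple

Subtitle = Tuple[float, float, str]  # (start_seconds, end_seconds, text)

TIME_LINE = re.compile(
    r"(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*(\d+):(\d+):(\d+)[,.](\d+)"
)


def parse_srt(srt_text: str) -> List[Subtitle]:
    """Parse SRT text into (start, end, text) tuples, skipping empty entries."""
    subtitles: List[Subtitle] = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIME_LINE.search(line)
            if not match:
                continue
            h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
            start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
            end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
            # Everything after the timing line is the subtitle text.
            text = " ".join(part.strip() for part in lines[i + 1:])
            if text:
                subtitles.append((start, end, text))
            break
    return subtitles


def summarize(text: str, sentence_count: int, algorithm: str = "lsa") -> List[str]:
    """Return the sentences sumy selects from the text."""
    # Third-party package (pip install sumy); its tokenizer also needs NLTK's
    # "punkt" data. Imported here so parse_srt works without sumy installed.
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.parsers.plaintext import PlaintextParser
    from sumy.summarizers.lex_rank import LexRankSummarizer
    from sumy.summarizers.lsa import LsaSummarizer
    from sumy.summarizers.luhn import LuhnSummarizer
    from sumy.summarizers.text_rank import TextRankSummarizer

    summarizers = {
        "lsa": LsaSummarizer,
        "luhn": LuhnSummarizer,
        "text_rank": TextRankSummarizer,
        "lex_rank": LexRankSummarizer,
    }
    parser = PlaintextParser.from_string(text, Tokenizer("english"))
    summarizer = summarizers[algorithm]()
    return [str(sentence) for sentence in summarizer(parser.document, sentence_count)]
```

A call would look like summarize("\n".join(text for _, _, text in parse_srt(srt_text)), 10). Note that sumy splits sentences with its own tokenizer, so the selected sentences may not line up one-to-one with subtitle entries and may need to be matched back to their timestamps.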

4. Fitting the duration the user provides

With sumy, we can rank each sentence (each subtitle, in our case). Each subtitle covers a certain duration of the video. To fit the duration the user provides, I found the average duration per subtitle by dividing the total duration of the video by the number of subtitles. Dividing the requested duration by this average gives the approximate number of sentences needed for the summarized video. The summarization then includes the top-ranked subtitles. If the total duration of the selected subtitles is longer than requested, we drop the lowest-ranked one; if it is shorter, we add the next-ranked one. In this way, we can fit the video to the time the user provides.
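The fitting logic described above can be sketched roughly as follows. This is only an illustration under my own assumptions: subtitles are (start, end, text) tuples already ordered from best to worst rank, and the function names are made up for this example.

```python
from typing import List, Tuple

Subtitle = Tuple[float, float, str]  # (start_seconds, end_seconds, text)


def estimate_sentence_count(total_duration: float, subtitle_count: int,
                            target_duration: float) -> int:
    """Approximate how many subtitles fit, using the average time per subtitle."""
    if subtitle_count == 0 or total_duration <= 0:
        return 0
    average = total_duration / subtitle_count
    return max(1, min(subtitle_count, round(target_duration / average)))


def total_length(items: List[Subtitle]) -> float:
    """Sum of the on-screen durations of the given subtitles."""
    return sum(end - start for start, end, _ in items)


def fit_to_duration(ranked: List[Subtitle], total_duration: float,
                    target_duration: float) -> List[Subtitle]:
    """Pick top-ranked subtitles whose combined length fits the target."""
    count = estimate_sentence_count(total_duration, len(ranked), target_duration)
    chosen = ranked[:count]

    # Too long: drop the lowest-ranked subtitle until the selection fits.
    while len(chosen) > 1 and total_length(chosen) > target_duration:
        chosen.pop()

    # Too short: add the next-ranked subtitle for as long as it still fits.
    while len(chosen) < len(ranked):
        start, end, _ = ranked[len(chosen)]
        if total_length(chosen) + (end - start) > target_duration:
            break
        chosen.append(ranked[len(chosen)])

    # Return in playback order so the clips can be cut in sequence.
    return sorted(chosen, key=lambda subtitle: subtitle[0])
```

For a 40-second video with 4 subtitles and a 10-second target, the average is 10 seconds per subtitle, so the first estimate is a single subtitle; the second loop then adds more as long as the real durations still fit.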

5. Creating the final summarized video

Now that the subtitles are summarized, I have to generate the summarized video. For this I used the Python module MoviePy, an awesome module that lets you trim, cut, merge, and otherwise edit video files. Using the timestamps of the summarized subtitles, I cut the video into segments and then merge them to create the final summarized video. To learn more about how MoviePy works, you can read this article by Tejas Bobhate: https://medium.com/@TejasBob/moviepy-is-awesome-part1-f90e91fffbb9
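Here is a minimal sketch of the cutting and merging, assuming the MoviePy 1.x API (moviepy.editor, subclip); newer MoviePy releases reorganized some of these names, so check the version you have installed. The helper names and the idea of merging overlapping time ranges before cutting are my own additions for illustration.

```python
from typing import List, Tuple

TimeRange = Tuple[float, float]  # (start_seconds, end_seconds)


def merge_ranges(ranges: List[TimeRange], max_gap: float = 0.0) -> List[TimeRange]:
    """Sort time ranges and join those that overlap or lie within max_gap seconds."""
    merged: List[TimeRange] = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the previous range instead of starting a new clip.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def write_summary(video_path: str, ranges: List[TimeRange], output_path: str) -> None:
    """Cut the given ranges out of the video and write them as one file."""
    # Third-party package (pip install moviepy), MoviePy 1.x import path.
    # Imported here so merge_ranges can be used without MoviePy installed.
    from moviepy.editor import VideoFileClip, concatenate_videoclips

    video = VideoFileClip(video_path)
    try:
        clips = [video.subclip(start, end) for start, end in merge_ranges(ranges)]
        final = concatenate_videoclips(clips)
        final.write_videofile(output_path)
    finally:
        video.close()  # release the underlying file reader
```

Merging adjacent ranges first avoids cutting and re-joining at the same point, which keeps the number of clips small when consecutive subtitles are both selected.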

Screenshot taken at the same time for the same video summarized in all 4 methods

The 5 steps above are only a brief explanation of how I created the video summarizer. It takes a lot more work than you might think. The whole code is available on GitHub. The link to the repository is: https://github.com/aswanthkoleri/VideoMash

The video summarizer was also my semester project. You can see the project report here: https://drive.google.com/open?id=1yCiLhFdN2GzZYEHGeu9TneTqm4MkN9PX

About me: I’m a B.Tech IT student at the Indian Institute of Information Technology, Allahabad, currently in my 3rd year. Developing web apps is one of the things I love to do. I love open source 😍. You can follow me on Twitter, Facebook, and GitHub.