Exploring Per-Shot Transcoding For Video Streaming

Fredrik Lundkvist
Published in The SVT Tech Blog, May 30, 2022

Today’s blog post is a little bit different. Instead of writing about an interesting topic ourselves, we would like to use our platform to showcase one of the many university collaborations we took part in during the spring of 2022. This particular project was carried out by students from Uppsala University, and we are letting the students themselves write about it.

We at SVT had the great joy of acting in a supporting role during the project, providing test material, specific advice and scientific guidance. However, it should be stressed that all of the hard work of designing and implementing the solution was done by the students themselves.

Now without further rambling on my part, I am happy to hand it over to the project group:

Post and Project by: Douglas Gådin, Fanny Hermanson, Anton Marhold, Joel Sikström, and Johan Winman.

Introduction

The goal of this collaboration with Team Video Core at SVT was to explore the feasibility of a relatively new method in video transcoding called per-shot transcoding, popularized by Netflix as per-shot encoding. We did this by building a system that divides a video into smaller segments called shots and transcodes them separately, in order to optimize the bitrate for each shot while maintaining consistent visual quality. Reducing the bitrate of a video reduces file sizes, which in turn leads to lower distribution costs and a smaller carbon footprint.

The result of this project is a web application that takes a mezzanine file, divides it into shots, transcodes the shots using Encore, and reassembles the transcoded shots.

The frontend is implemented using the JavaScript framework React. The backend systems, consisting of a web server and a video processor, are built in Python. The video processor uses FFmpeg. MySQL is used to store data about the videos processed by the system.

System architecture

The first step in the process is to separate the audio and video. This is done to prevent synchronization problems that might occur later in the process when everything is put back together.
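As a rough illustration of this step, the sketch below builds two FFmpeg command lines, one that keeps only the video stream and one that keeps only the audio streams. It is not the project's actual code: the function name `demux_commands`, the output file naming and the choice to keep the input's container are all assumptions made for the example.

```python
from pathlib import Path


def demux_commands(source, workdir):
    """Build FFmpeg commands that split a file into video-only and audio-only parts.

    Hypothetical helper: the naming scheme and container choice are assumptions.
    Streams are copied as-is (no re-encoding) in both commands.
    """
    src = Path(source)
    out = Path(workdir)
    video_only = out / f"{src.stem}_video{src.suffix}"
    audio_only = out / f"{src.stem}_audio{src.suffix}"

    # -an drops all audio; -c:v copy keeps the video stream untouched.
    video_cmd = ["ffmpeg", "-i", str(src), "-map", "0:v:0",
                 "-c:v", "copy", "-an", str(video_only)]
    # -vn drops all video; -c:a copy keeps the audio streams untouched.
    audio_cmd = ["ffmpeg", "-i", str(src), "-map", "0:a",
                 "-c:a", "copy", "-vn", str(audio_only)]
    return video_cmd, audio_cmd
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed.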

In order to encode each shot separately, the video is analyzed to find where shot changes occur and is then split into individual shots. A shot change occurs when two consecutive frames differ by a large enough amount, for example in color or motion. An example of shot changes can be seen in the video below.

Sequence of shot changes
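To give a feel for the idea of comparing consecutive frames, here is a minimal sketch of threshold-based shot detection. It is a simplification and not the detector used in the project: frames are represented as flat lists of luma values (0 to 255), and the function names and the default threshold of 0.3 are assumptions chosen for illustration.

```python
def frame_difference(a, b):
    """Mean absolute difference between two equally sized frames, scaled to 0..1."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (255 * len(a))


def find_shot_boundaries(frames, threshold=0.3):
    """Return every index i where frame i differs enough from frame i-1
    to be treated as the first frame of a new shot."""
    return [
        i for i in range(1, len(frames))
        if frame_difference(frames[i - 1], frames[i]) > threshold
    ]


def shots_from_boundaries(boundaries, frame_count):
    """Turn boundary indices into (start, end) frame ranges, end exclusive."""
    starts = [0] + list(boundaries)
    ends = list(boundaries) + [frame_count]
    return list(zip(starts, ends))
```

A real detector has to cope with fades, flashes and fast motion, so a single fixed threshold on pixel differences is only a starting point.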

Each shot is encoded using SVT’s transcoding system, Encore. In Encore, each shot is transcoded using x264 with the CRF method to ensure the same visual quality. This differs from the 2-pass encoding that SVT uses in production. With CRF, each shot gets only the bitrate it needs to reach the target quality, so some shots end up at a lower bitrate than others, which reduces the overall bitrate. When all shots have been encoded separately, they are merged back into a single video. Lastly, the video and audio are muxed together so that the video file is complete.
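The encode, merge and mux steps can be sketched with plain FFmpeg command lines. Note that this bypasses Encore entirely and is not how the project invokes it; the helper names, the default CRF value of 23 and the use of FFmpeg's concat demuxer for merging are assumptions made for this example.

```python
from pathlib import Path


def crf_encode_command(shot, output, crf=23):
    """x264 encode of one shot at constant quality (CRF); audio is excluded."""
    return ["ffmpeg", "-i", str(shot), "-c:v", "libx264",
            "-crf", str(crf), "-an", str(output)]


def concat_list(shots):
    """Contents of the list file read by FFmpeg's concat demuxer, one shot per line."""
    return "".join(f"file '{Path(s).as_posix()}'\n" for s in shots)


def concat_command(list_file, output):
    """Merge the encoded shots listed in list_file without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", str(list_file), "-c", "copy", str(output)]


def mux_command(video, audio, output):
    """Combine the merged video with the audio that was separated earlier."""
    return ["ffmpeg", "-i", str(video), "-i", str(audio),
            "-map", "0:v:0", "-map", "1:a", "-c", "copy", str(output)]
```

Merging with stream copy only works when all shots share the same codec settings, which holds here as long as every shot is encoded with the same encoder configuration and only the resulting bitrate differs.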

Results

For testing, we used a variety of test videos from SVT containing different types of content. As we expected, the results vary depending on the content of the videos, with visually brighter content generally reaching lower bitrates. The bitrate of visually darker content, on the other hand, increased slightly. While we do not know for sure why this is, our theory is that the encoder settings we used perform slightly worse for dark content with CRF encoding.
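For readers who want to make this kind of comparison themselves, the change in bitrate between a baseline encode and a per-shot encode can be computed from file size and duration. The sketch below is only an illustration with hypothetical function names; it is not the evaluation code from the project.

```python
def average_bitrate_kbps(size_bytes, duration_seconds):
    """Average bitrate of a file in kilobits per second."""
    return size_bytes * 8 / duration_seconds / 1000


def bitrate_reduction_percent(baseline_kbps, per_shot_kbps):
    """Percentage by which the per-shot encode lowers the bitrate.

    A positive value means the per-shot encode is smaller than the baseline.
    """
    return 100.0 * (baseline_kbps - per_shot_kbps) / baseline_kbps
```

A negative result means the per-shot encode ended up with a higher bitrate than the baseline, which is what we saw for the darker content.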

Conclusions

In conclusion, developing the system was fun, and the group learned a lot about both video streaming and building a full stack web application. The per-shot transcoding method performed differently for different types of video content. To be able to draw more substantial conclusions regarding how different content is affected by this method, more videos with different contents should be evaluated. If the system can consistently reach an average bitrate reduction of 2.5% while maintaining visual quality, adding per-shot transcoding into SVT’s video pipelines could bring significant improvements.
