Feature Teardown: Netflix’s “Skip Intro”

Eugene Leychenko
Published in The Startup
4 min read · Jan 3, 2018

Netflix rolled out an amazing feature that lets viewers skip the intro of their favorite shows. Each skip saves only about 30 seconds, but that adds up over a binge-watching session. The interesting question is: how did they pull this off at scale? Let’s dive in and look at a few possible solutions.

Human Tagging

Perhaps there is a sad employee who watches every Netflix show and writes down when the intro starts and ends. Highly unlikely. As of 2017, Netflix has 110M members and a little under 7k titles. That would make for a really sad intern and would probably result in arthritis.

Machine Learning

Netflix awarded a $1 million prize to a developer team in 2009 for an algorithm that increased the accuracy of the company’s recommendation engine by 10 percent. The Netflix Prize was an open competition for the best collaborative filtering algorithm to predict user ratings for films, based on previous ratings without any other information about the users or films, i.e. without the users or the films being identified except by numbers assigned for the contest.

Because Netflix is an amazing technology firm, it could use the wisdom of crowds to learn when the intro starts and ends. It could analyze the 1 billion hours of video watched each week to see when people skip over the intro. Based on the trend, Netflix could approximate where the intro begins (where people start scrubbing) and where it ends (where they drop the marker).
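To make the crowd idea concrete, here is a minimal sketch in Python. It is not Netflix's actual method, which is not public. The function name `estimate_intro`, the event format, and the `min_jump` and `max_start` thresholds are all assumptions made for illustration.

```python
from statistics import median

def estimate_intro(seek_events, min_jump=10.0, max_start=300.0):
    """Estimate (intro_start, intro_end) in seconds from viewer seeks.

    seek_events: iterable of (from_sec, to_sec) pairs, one per seek.
    min_jump and max_start are illustrative thresholds, not known values.
    """
    # Keep only forward jumps that begin near the start of the episode;
    # these are the seeks most likely to be someone skipping the intro.
    skips = [
        (src, dst)
        for src, dst in seek_events
        if dst - src >= min_jump and src <= max_start
    ]
    if not skips:
        return None
    # The median resists outliers, such as a viewer jumping to the finale.
    start = median(src for src, _ in skips)
    end = median(dst for _, dst in skips)
    return start, end

# Hypothetical seek log for one episode: most viewers jump from ~5s to ~36s.
events = [(5, 35), (6, 36), (4, 34), (7, 38), (5, 1200), (900, 880)]
print(estimate_intro(events))  # (5, 36)
```

The median is used rather than the mean so that one viewer jumping twenty minutes ahead does not drag the estimated end of the intro with them.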

Screen Scraping

Tunity allows users to hear any TV, even if it’s muted. The technology scans the TV that you are watching and matches it with the sound of the channel being broadcast. The magic is the recognition: Tunity compares the few seconds of video that you transmit to its servers against the video of all supported channels.

The Office’s Opening Frame.

Another way Netflix could have leveraged ML is computer vision. Take, for example, the wonderful show The Office. The Office’s intro lasts ~30 seconds and always has the same opening frame.

If the algorithm looks for this frame in every episode, it can calculate the time T at which the intro begins and then knows to drop the viewer off at T+30s.
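The frame-matching idea can be sketched in a few lines of Python. This is a toy under stated assumptions: frames are flat lists of grayscale pixel values, the comparison is a plain mean absolute difference, and the names `frame_distance` and `find_intro` and the `tolerance` value are hypothetical. A production system would need something far more robust to compression and scaling.

```python
def frame_distance(a, b):
    """Mean absolute difference between two equally sized grayscale frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def find_intro(frames, reference, fps, intro_seconds=30.0, tolerance=5.0):
    """Return (start_sec, end_sec) of the intro, or None if no frame matches.

    frames: decoded frames, each a flat list of 0-255 pixel values.
    reference: the known opening frame of the intro, same size as each frame.
    intro_seconds and tolerance are illustrative assumptions.
    """
    for index, frame in enumerate(frames):
        if frame_distance(frame, reference) <= tolerance:
            start = index / fps           # time T at which the intro begins
            return start, start + intro_seconds
    return None

# Tiny 4-pixel "frames" standing in for real video.
reference = [200, 10, 10, 200]
cold_open = [[0, 0, 0, 0]] * 48       # 2 seconds of unrelated footage at 24 fps
intro = [[198, 12, 9, 201]] * 24      # slightly noisy copy of the opening frame
print(find_intro(cold_open + intro, reference, fps=24))  # (2.0, 32.0)
```

The tolerance is there because a re-encoded frame is never pixel-identical to the reference, so an exact equality check would miss it.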

Audio Recognition

Another component of the intro is the music, which is the same for every episode of a show. All the algorithm would have to do is recognize the acoustic fingerprint. This is the way that Shazam works. Below, find the House of Cards intro fingerprint.

House of Cards Intro

And here is the Office intro.

The Office Intro
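For a rough feel of how audio matching could work, here is a deliberately simplified Python sketch. It is not Shazam's algorithm and not anything Netflix has described. It reduces each short window of audio to its single strongest frequency bin and then searches the episode for the intro's sequence of bins. The window size, the function names, and the synthetic test tones are all assumptions, and the toy only finds intros that start on a window boundary.

```python
import cmath
import math

WINDOW = 32  # samples per analysis window (illustrative)

def dominant_bin(window):
    """Index of the strongest frequency bin in one window (0 for silence)."""
    n = len(window)
    best_bin, best_mag = 0, 1e-6
    for k in range(1, n // 2):
        # Naive discrete Fourier transform coefficient for bin k.
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(window))
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin

def fingerprint(samples):
    """Sequence of dominant bins, one per non-overlapping window."""
    return [dominant_bin(samples[i:i + WINDOW])
            for i in range(0, len(samples) - WINDOW + 1, WINDOW)]

def find_intro_offset(episode, intro):
    """Sample offset where the intro's fingerprint first appears, or None."""
    ep, target = fingerprint(episode), fingerprint(intro)
    if not target:
        return None
    for w in range(len(ep) - len(target) + 1):
        if ep[w:w + len(target)] == target:
            return w * WINDOW
    return None

def tone(bin_index, windows=1):
    """Synthetic sine wave whose energy sits exactly in one frequency bin."""
    return [math.sin(2 * math.pi * bin_index * i / WINDOW)
            for i in range(WINDOW * windows)]

intro = tone(3) + tone(7) + tone(5)                   # a three-note "theme"
episode = tone(2, windows=4) + intro + tone(9, windows=4)
print(find_intro_offset(episode, intro))  # 128
```

Real fingerprinting systems are considerably more robust to noise, volume changes, and timing offsets than this, but the core idea is the same: turn audio into a compact signature and look for that signature elsewhere.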

In conclusion, there are several ways for Netflix to recognize the beginning and end of show intros. The approaches above leverage the wisdom of crowds, computer vision, or audio fingerprinting.


If you liked the overall message of this post, feel free to get in touch with us. We do speaking engagements: http://www.citadinesgroup.com/#contact
