Using Open Source to Create a Video Thumbnails Service

Flávio Ribeiro
Jul 31, 2017


Originally published at blog.flavioribeiro.com on July 31, 2017. This post lacks some demonstrations that are only available there.

Last week The New York Times hosted the 2017 edition of Makers Week, an entire week dedicated to working on projects and ideas employees want to test, build and innovate on.

There are no boundaries to projects, nor specific scope requirements. You can use your time to do research on new topics or disciplines, contribute to open source projects, fix bugs or create products from scratch. It’s definitely not a new thing, and I believe most companies are doing this now, so I will not dwell on it. If you want to know more about how this works here at our office, you can read this article from last year’s edition.

Increasing our click-through rate

One of our goals as a video team this year is to increase our click-through rate when our users see our video player. We usually do a pretty good job of selecting thumbnails for all of our videos, often using photos taken by our own photojournalists. However, after watching a talk from the JWPlayer folks and reading a research article from Netflix, it became clear to me that we can’t just assume a thumbnail is good for a given video. We should actually use some data and run A/B tests to come up with the best one.

To improve our thumbnails, we’d need to be able to create them in a cleverer and faster way. So I thought that creating something to generate and serve thumbnails for any of our videos at any time, on the fly (at the time of an HTTP request), would be a great project for my Makers Week. When working at Globo.com, I saw some amazing open source projects being created and maintained by some wonderful engineers I met over there, so the scope of the project would be nothing more than putting some of those projects together.

Lumberjack

Before starting, I invited Francisco Souza to help me. He’s a specialist when it comes to application deployment and all things related to Docker/Kubernetes/Google Cloud Service. I’m glad he accepted; things were much easier with his help.

As I said before, Lumberjack is a combination of open source projects that allows thumbnail extraction from videos. It leverages Thumbor, NGINX, the nginx-thumbextractor module (or simply módulo do Wanden, "Wanden's module", as we like to say in Portuguese) and the Lua programming language to extract and generate pictures for any given moment of any given video in our video library.

Architecture

We are deploying three services in different containers in a Kubernetes pod:

  • NGINX + Video Thumbnails Extractor Module: Responsible for scanning the video on a given mountpoint, extracting the frame at a given timecode and returning it on the fly. We used gcsfuse to mount our production GCS bucket, where our video library resides, allowing the module to go there and get the requested frame.
  • Thumbor: Responsible for applying filters, crops and resizing to images. Thumbor relies on different engines, including OpenCV, to apply functions passed as part of the HTTP request. It includes smart cropping, face and asset detection and a bunch of other cool stuff. You should take a look at this powerful project and see what it can do for you. The Docker image we are using for this service is available here.
  • NGINX + Lua Application: A simple Lua script that runs inside NGINX with this Lua Module. Given a VideoID, the script fetches the right MP4 video asset from our internal API and sends it to the thumbnails extractor. The application is also responsible for reading the parameters the user passes on the URL, such as filters and resolution, sending them to Thumbor and returning the final picture to the user.
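The request flow through the three containers above can be sketched roughly as follows. Python is used here for illustration only; the real application is a Lua script inside NGINX, and this is not its code. The host names, the `lookup_mp4_path` helper, the MP4 path layout and the extractor's `second` query parameter are all assumptions made for this sketch (the extractor's URL shape depends on how its NGINX location is configured). The Thumbor path follows Thumbor's documented unsigned layout, `/unsafe/WIDTHxHEIGHT/smart/filters:.../image-url`.

```python
from typing import Sequence
from urllib.parse import quote, urlencode

# Hypothetical in-pod addresses; the real ones are not given in the post.
EXTRACTOR_HOST = "http://localhost:8080"
THUMBOR_HOST = "http://localhost:8888"


def lookup_mp4_path(video_id: str) -> str:
    """Stand-in for the internal API that maps a video ID to its MP4 asset."""
    return f"/videos/{video_id}/master.mp4"  # made-up path layout


def extractor_url(mp4_path: str, second: int) -> str:
    """URL of the frame at `second`; the parameter name is an assumption."""
    return f"{EXTRACTOR_HOST}{mp4_path}?{urlencode({'second': second})}"


def thumbor_url(image_url: str, width: int, height: int,
                filters: Sequence[str] = (), smart: bool = False) -> str:
    """Build an unsigned Thumbor URL: /unsafe/WxH[/smart][/filters:a():b()]/image."""
    parts = ["unsafe", f"{width}x{height}"]
    if smart:
        parts.append("smart")
    if filters:
        parts.append("filters:" + ":".join(filters))
    # Percent-encode the source URL so its own query string stays intact.
    parts.append(quote(image_url, safe=""))
    return f"{THUMBOR_HOST}/" + "/".join(parts)


def thumbnail_url(video_id: str, second: int, width: int, height: int,
                  filters: Sequence[str] = (), smart: bool = False) -> str:
    """Chain the steps: video ID -> MP4 path -> frame URL -> Thumbor URL."""
    frame = extractor_url(lookup_mp4_path(video_id), second)
    return thumbor_url(frame, width, height, filters, smart)


if __name__ == "__main__":
    print(thumbnail_url("video-id", 30, 900, 1600, ["grayscale()"], smart=True))
```

Keeping each hop as a plain URL builder mirrors the division of labour in the pod: the extractor only knows about files and timecodes, while Thumbor only knows about images and transformations.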

We decided not to open source Lumberjack. At the end of the day, the whole project is just some business logic around the open source projects I mentioned in this post. So really all of the credit belongs to the guys who created and maintain Thumbor and the thumbextractor module. If you want something similar to what we did, you can just deploy them.

Some Use Cases

Imagine we want to crop a frame from this video at 30 seconds to use as the cover of another vertical video on mobile phones (9:16), in grayscale, with smart cropping. We just need to pass the parameters below and the service will generate it:

lumberjack.nyt.net/video-id?timecode=30&filters=grayscale()&resolution=900x1600&smartcrop=true
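As a rough illustration of how such a URL breaks down, the sketch below parses the example above with Python's standard library and checks that 900x1600 really is a 9:16 frame. The `ThumbRequest` type and the `parse_request` and `aspect_ratio` helpers are hypothetical names for this sketch, not part of Lumberjack, and treating `resolution` as mandatory is an assumption.

```python
from dataclasses import dataclass
from math import gcd
from typing import Tuple
from urllib.parse import parse_qs, urlsplit


@dataclass
class ThumbRequest:
    video_id: str
    timecode: int      # seconds into the video
    width: int
    height: int
    filters: str       # passed through as-is, e.g. "grayscale()"
    smartcrop: bool


def parse_request(url: str) -> ThumbRequest:
    """Split a Lumberjack-style URL into typed fields."""
    # urlsplit needs a scheme or a leading "//" to recognise the host part.
    parts = urlsplit(url if "://" in url else "//" + url)
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    # Assumption: `resolution` is required; a missing one raises KeyError.
    width, height = (int(n) for n in query["resolution"].split("x"))
    return ThumbRequest(
        video_id=parts.path.lstrip("/"),
        timecode=int(query.get("timecode", "0")),
        width=width,
        height=height,
        filters=query.get("filters", ""),
        smartcrop=query.get("smartcrop", "false") == "true",
    )


def aspect_ratio(width: int, height: int) -> Tuple[int, int]:
    """Reduce a resolution to its simplest ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


if __name__ == "__main__":
    req = parse_request(
        "lumberjack.nyt.net/video-id"
        "?timecode=30&filters=grayscale()&resolution=900x1600&smartcrop=true"
    )
    print(req, aspect_ratio(req.width, req.height))
```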

Another great feature supported by the thumbnails extractor is the generation of sprites or tiles. We can, for example, generate a sprite map with one thumbnail per second and use it as a moving cover when the user hovers over it:

lumberjack.nyt.net/video-id?sprite=true&size=100&interval=1
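On the player side, a moving cover boils down to picking the right tile out of the sprite for a given playback time. The sketch below shows that arithmetic under assumptions the post does not confirm: tiles have a fixed size, are laid out row by row in a grid with a known number of columns, and tile `i` holds the frame at `i * interval` seconds. The function names and the tile dimensions in the example are made up for illustration.

```python
import math
from typing import Tuple


def tile_count(duration: float, interval: float) -> int:
    """Number of tiles needed to cover `duration` seconds."""
    return math.ceil(duration / interval)


def tile_offset(t: float, interval: float, tile_w: int, tile_h: int,
                columns: int) -> Tuple[int, int]:
    """Pixel offset (x, y) of the tile showing time `t` in a row-major grid."""
    index = int(t // interval)           # which tile covers time t
    row, col = divmod(index, columns)    # position of that tile in the grid
    return col * tile_w, row * tile_h


if __name__ == "__main__":
    # A 95-second video sampled every second, 10 tiles per row.
    print(tile_count(95, 1))
    # Hovering at 12.4 s: tile 12 sits in row 1, column 2.
    print(tile_offset(12.4, 1, tile_w=178, tile_h=100, columns=10))
```

A player can then shift the sprite by the negative of that offset (for example through a CSS background position) to show the matching frame.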

Future

The Thumbnails Service is now part of our Q4 roadmap and Lumberjack is already in production. We haven’t presented it to the newsroom yet, as we are still facing some performance issues with caching and the GCS bucket. The NGINX locations are not optimized, so we’ll need to revisit them as well. To sum up, we’ll need to revisit a lot of stuff that we did in a rush during Makers Week (I don’t even need to mention that we have zero tests for the Lua script too).

For the future, we want to be able to detect perfectly looping GIFs to use on social channels and also detect highlights of a video based on the audio and closed captions. That way we can suggest thumbnails to our newsroom editors within our CMS. Finally, we’d love to integrate thumbnail tests with our A/B framework the same way Netflix did.



Flávio Ribeiro

Brazilian, previously Globo.com, NYTimes & ViacomCBS. Engineering Manager at Netflix.