How We Built a Video Templating System Capable of Producing a Million Videos a Month

Many of our customers have thousands of products to sell, and creating advertisement videos for each of them by hand does not scale. In this blog post, I will explain how we created a video templating system capable of rendering millions of automatically generated videos.

Our video rendering pipeline consists of the video editor, which is used to create the video templates, and the video rendering service, which renders the videos. Smartly.io has had support for Image Templates for a while now, so Video Templates were a logical next step to help our customers automate their creative work. Although we have already rendered 3 million videos over the past six months, we still have a long way to go to reach the stunning 145 million renders a day that our best-selling Image Templates feature currently handles.

The tech

The Smartly.io product lives in the browser, so our video editor had to run in the browser, too. Our team chose to build the video editor using React with Redux. Smartly.io already had an initiative to move from Angular to React, so React was an obvious choice for the frontend. For our backend rendering service we used Node.js and PostgreSQL.

We decided to use TypeScript for both our backend and frontend because this allows us to reuse a lot of code. As an example, we define the types shared by the frontend and backend only once, in a package that is a dependency of both our rendering service and our editor, instead of defining them separately for each. We are also able to share much of the video rendering code between the frontend and backend.
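To make the shared-types idea concrete, here is a minimal sketch of what such a package could contain. The type names, fields, and the describeLayer helper are assumptions made up for illustration, not our actual definitions.

```typescript
// Hypothetical contents of a shared types package. In a real package these
// declarations would be exported and imported by both the editor and the
// rendering service; all names and fields here are illustrative assumptions.
interface BaseLayer {
  id: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

interface TextLayer extends BaseLayer {
  kind: "text";
  text: string; // may contain feed macros
}

interface MediaLayer extends BaseLayer {
  kind: "image" | "video";
  src: string;
}

type Layer = TextLayer | MediaLayer;

interface VideoTemplate {
  id: string;
  durationSeconds: number;
  layers: Layer[];
}

// Logic written against the shared types can run unchanged in the
// browser editor and in the Node.js rendering service.
function describeLayer(layer: Layer): string {
  switch (layer.kind) {
    case "text":
      return `text "${layer.text}"`;
    case "image":
    case "video":
      return `${layer.kind} from ${layer.src}`;
  }
}

const template: VideoTemplate = {
  id: "demo",
  durationSeconds: 6,
  layers: [
    { id: "title", kind: "text", text: "Hello", x: 0, y: 0, width: 200, height: 40 },
    { id: "photo", kind: "image", src: "product.png", x: 0, y: 50, width: 200, height: 200 },
  ],
};

const descriptions = template.layers.map(describeLayer);
```

Because Layer is a discriminated union, the compiler flags any place in either codebase that forgets to handle a newly added layer kind.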

We built our rendering service as a small, independent microservice to enable fast iteration and development, and to enable horizontal scaling. A benefit of the microservice architecture is that it allows us to use whatever tools and languages we deem best suited for the job. Leveraging the microservice architecture has allowed us to experiment, especially when the service was still early in development, without fear of breaking anything else in the product.

The video editor is released as an npm package to the Smartly.io private npm registry. Deployments to production are done by releasing our updated video-editor to the registry, and then updating the version number of the video-editor package in our main frontend repository. Having a small, isolated codebase keeps our development lightning fast: tests, builds and CI are speedy, as we only need to handle our section of the code. This also helps cultivate great ownership of our section of the Smartly.io codebase within the development team, as everybody in our team knows our whole codebase inside out.

As our frontend is an independent npm package, local development is typically done using Storybook, a library that provides a development environment in which we can run our video editor on its own. When developing in Storybook, we are able to easily mock away external dependencies like remote resource fetching and interactions with the main Smartly.io app, as long as these mocks reflect reality. A benefit of this approach is that we are able to quickly simulate various cases, such as failing and succeeding network requests, and see how our app responds to them.
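One way to make this kind of mocking possible is to pass the resource fetcher in as a dependency, so a story can swap in a succeeding or failing fake. The sketch below is an assumption about how that could look. It does not use any Storybook API, and ResourceFetcher, AssetState, and loadAsset are hypothetical names.

```typescript
// Hypothetical abstraction over remote resource fetching.
type ResourceFetcher = (url: string) => Promise<string>;

type AssetState =
  | { status: "ready"; url: string; data: string }
  | { status: "error"; url: string; message: string };

// The editor code depends only on the fetcher it is given,
// so it never needs to know whether the network is real.
async function loadAsset(fetchResource: ResourceFetcher, url: string): Promise<AssetState> {
  try {
    const data = await fetchResource(url);
    return { status: "ready", url, data };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { status: "error", url, message };
  }
}

// Mock that simulates a successful network request.
const succeedingFetcher: ResourceFetcher = async (url) => `fake-bytes-for:${url}`;

// Mock that simulates a failing network request.
const failingFetcher: ResourceFetcher = async () => {
  throw new Error("network unavailable");
};
```

A story for the happy path would hand succeedingFetcher to the editor, while a story for the error state would hand it failingFetcher, with no real network involved in either case.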

A multi-repository architecture does not come without its disadvantages. For example, if someone is developing a new feature outside of the video editor repository that requires the video editor to do something new, they first have to make a pull request to the video editor repository and then update the dependencies in their own repository. Only then are they able to use the new functionality. In a monorepo they could just update the piece of code as a part of the new functionality they are developing. Despite the drawbacks, we feel that a multi-repository architecture has been the correct choice for our team.

The benefit of a multi-repository architecture is that it allows us to work on our repository in isolation, in the way that is most effective for our team and with the technologies we consider most suitable. It also gives us the opportunity to introduce breaking changes to the API, as the package is versioned. Anyone depending on deprecated functionality can simply update their dependencies when they are ready for it.

Building a powerful video editor that is easy to use

From the get-go, our plan was to build an editor that was not only easy to use, but powerful enough that users would not need external programs, like After Effects or other video editors. It also had to be capable of using dynamic content from a product feed. All of our feature additions are considered carefully to maintain the ease of use of the editor. At the same time, we try to make sure it is flexible enough to meet the needs of more advanced power users. In most of our balancing we have leaned toward ease of use, as most of our users are not professional video editors but online marketers.

One interesting balance consideration we had to make was our video editor’s animation system. Most video editors do animations using keyframes that define the path a layer travels on the canvas. We implemented animations using preset animations and eases from the GreenSock animation library, which are more user-friendly for someone who is less familiar with video editing. Our layers are built on the idea that a layer is always brought on stage by its intro animation and removed from stage by its outro animation.
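The intro and outro model can be sketched as a small function that answers how far on stage a layer is at a given time. This is only an illustration under assumed names (TimedLayer, AnimationPreset, stageProgress) with a hand-written ease. It is not the GreenSock API, and it is not our actual implementation.

```typescript
type Ease = (t: number) => number;

// A simple hand-written ease-out curve, used here instead of a library ease.
const easeOutQuad: Ease = (t) => 1 - (1 - t) * (1 - t);

interface AnimationPreset {
  name: string;
  durationSeconds: number;
  ease: Ease;
}

interface TimedLayer {
  startSeconds: number; // when the intro animation begins
  endSeconds: number; // when the outro animation finishes
  intro: AnimationPreset;
  outro: AnimationPreset;
}

// Returns 0 when the layer is off stage, 1 when it is fully on stage,
// and an eased value in between while the intro or outro is playing.
function stageProgress(layer: TimedLayer, t: number): number {
  if (t <= layer.startSeconds || t >= layer.endSeconds) return 0;
  const introEnd = layer.startSeconds + layer.intro.durationSeconds;
  const outroStart = layer.endSeconds - layer.outro.durationSeconds;
  if (t < introEnd) {
    return layer.intro.ease((t - layer.startSeconds) / layer.intro.durationSeconds);
  }
  if (t > outroStart) {
    return 1 - layer.outro.ease((t - outroStart) / layer.outro.durationSeconds);
  }
  return 1;
}

const fade: AnimationPreset = { name: "fade", durationSeconds: 1, ease: easeOutQuad };
const titleLayer: TimedLayer = { startSeconds: 0, endSeconds: 4, intro: fade, outro: fade };
```

The progress value could then drive opacity or position. The user only picks a preset and a duration per layer and never has to place individual keyframes.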

The editor supports many kinds of layers, including image, video, and text layers. It also supports alpha video layers, which means we can avoid building some of the most complex features of other video editors like After Effects, as users can simply import ready-made videos. The layers in our editor are just plain HTML elements, like divs and images, which are manipulated and styled using CSS and JavaScript. We also support uploading your own custom fonts, which was a very popular request from our users, as many companies have very strict brand guidelines requiring them to use their own fonts.

From a product feed to the video canvas

Customers need a feed of their products to use our video rendering service. In its simplest form, a product feed is a CSV or JSON file containing product data, available somewhere on the internet. The feed is updated by the customer, and our service periodically fetches the updated data and creates new videos for the added products.
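The "create new videos for the added products" step can be sketched as a comparison between two fetches of the feed. The sketch assumes a JSON feed that is an array of objects, each with a unique id field. That field, the Product shape, and both function names are assumptions made for illustration.

```typescript
// Assumed product shape: a unique id plus arbitrary feed fields.
interface Product {
  id: string;
  [field: string]: string | number;
}

// Parses the body of a JSON feed and keeps only entries that have a string id.
function parseJsonFeed(body: string): Product[] {
  const parsed: unknown = JSON.parse(body);
  if (!Array.isArray(parsed)) throw new Error("feed must be a JSON array");
  return parsed.filter(
    (entry): entry is Product =>
      typeof entry === "object" && entry !== null && typeof (entry as Product).id === "string"
  );
}

// Returns the products present in the current fetch but not in the previous one.
function findAddedProducts(previous: Product[], current: Product[]): Product[] {
  const knownIds = new Set(previous.map((product) => product.id));
  return current.filter((product) => !knownIds.has(product.id));
}

const previousFeed = parseJsonFeed('[{"id":"1","name":"Shoe","price":49}]');
const currentFeed = parseJsonFeed(
  '[{"id":"1","name":"Shoe","price":49},{"id":"2","name":"Hat","price":19}]'
);
const added = findAddedProducts(previousFeed, currentFeed);
```

Only the products in added would need new videos. A product whose data changed, rather than being added, would need a separate comparison that this sketch does not cover.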

This dynamic feed data is used to populate layers in the template. For example, a typical video template might have the product price, discount percentage, name and the image of the product from the feed.

We decided to use the Liquid template language to reference the data from the feed, because of its versatility and customizability. For example, if the product feed contains a product with { image_url: "www.cool.img/super-cool.png" }, it can be referenced in the layer using the Liquid macro {{ image_url }}. If we need to add a currency to the price, we can do so with the Liquid macro {{ price | append: "€" }}.
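To show how such macros resolve against a feed item, here is a toy interpreter for a tiny subset of Liquid: variable output plus the append and prepend filters with a double-quoted argument. It is a hand-written sketch for illustration only, not a real Liquid implementation and not the code our service runs, and renderMacro is a hypothetical name.

```typescript
type FeedItem = Record<string, string | number>;

// The only filters this toy subset understands; real Liquid has many more.
const filters: Record<string, (value: string, arg: string) => string> = {
  append: (value, arg) => value + arg,
  prepend: (value, arg) => arg + value,
};

// Replaces every {{ variable | filter: "arg" }} macro with its value.
// Limitation: a "|" inside a filter argument is not supported.
function renderMacro(template: string, item: FeedItem): string {
  return template.replace(/\{\{\s*(.*?)\s*\}\}/g, (_match, expression: string) => {
    const [variable, ...filterParts] = expression.split("|").map((part) => part.trim());
    // Like Liquid's default behavior, an unknown variable renders as an empty string.
    let value = String(item[variable] ?? "");
    for (const part of filterParts) {
      const parsed = /^(\w+)\s*:\s*"(.*)"$/.exec(part);
      if (!parsed) throw new Error(`Unsupported filter syntax: ${part}`);
      const [, name, arg] = parsed;
      const filter = filters[name];
      if (!filter) throw new Error(`Unknown filter: ${name}`);
      value = filter(value, arg);
    }
    return value;
  });
}

const product: FeedItem = { image_url: "www.cool.img/super-cool.png", price: "19.99" };
const imageSource = renderMacro("{{ image_url }}", product);
const priceLabel = renderMacro('Now only {{ price | append: "€" }}', product);
```

With the product above, imageSource is the image URL from the feed and priceLabel is "Now only 19.99€".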

Combining a video template and product feed into a video

Our rendering service takes the feed with the dynamic data and renders a video for each item in it. With our self-made rendering system, we ensure that what is displayed in the editor comes out the other side as a video that looks exactly like what the editor showed. This is really important, as it spares us and our customers from having to verify each and every video separately.

Rendering videos can be quite resource intensive, so we have built many systems to monitor and lighten the rendering load. The most important factor in lightening the load of our rendering system is heavy use of caching. We calculate a hash for each pair of video template and products array and store it in a Redis cache. If the same video template with the same products is asked for again, we simply return the result from the cache. After all, the cheapest renders are those you do not have to do at all!
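The caching idea can be sketched as follows. This is an illustration under several assumptions: an in-memory Map stands in for Redis, the cache stores a string such as a video URL, everything is synchronous for brevity (a real Redis client and renderer would be asynchronous), and stableStringify, renderCacheKey, and RenderCache are hypothetical names. The only library call is createHash from Node's built-in crypto module.

```typescript
import { createHash } from "node:crypto";

// Serializes JSON-compatible values with sorted object keys, so that two
// objects with the same content always produce the same string.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const entries = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(record[key])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// One hash per (template, products) pair.
function renderCacheKey(template: unknown, products: unknown[]): string {
  return createHash("sha256").update(stableStringify({ template, products })).digest("hex");
}

class RenderCache {
  // In-memory stand-in for Redis: cache key -> rendered video location.
  private readonly store = new Map<string, string>();

  getOrRender(template: unknown, products: unknown[], render: () => string): string {
    const key = renderCacheKey(template, products);
    const cached = this.store.get(key);
    if (cached !== undefined) return cached; // cache hit: no render needed
    const result = render();
    this.store.set(key, result);
    return result;
  }
}

let renderCount = 0;
const fakeRender = (): string => {
  renderCount += 1;
  return `video-${renderCount}.mp4`;
};

const cache = new RenderCache();
const first = cache.getOrRender({ id: "t1", fps: 30 }, [{ id: "1", price: 49 }], fakeRender);
// Same content with a different key order still hits the cache.
const second = cache.getOrRender({ fps: 30, id: "t1" }, [{ price: 49, id: "1" }], fakeRender);
// A changed product produces a different key and triggers a new render.
const third = cache.getOrRender({ id: "t1", fps: 30 }, [{ id: "1", price: 59 }], fakeRender);
```

Sorting object keys before hashing is a design choice in this sketch: it keeps a feed that merely reorders its fields from causing unnecessary re-renders.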

Wrapping up

This has been a quick look into how we built our video templating system. Building a full-fledged video editor with web technologies continues to be an interesting endeavour. There are multiple challenges that we need to keep tackling, for example making sure the editor stays performant, which is not a given as we are using React to render a video canvas. We also need to keep the editor accessible to everyone while making it powerful enough for advanced users.

We have been able to scale our video rendering system vertically up to this point, and we have been quite happy with how it is performing. When the usage of our video templates increases, I am sure that we will have a host of new challenges to face related to scaling and optimizing our rendering logic.


Would you like to work with us? We’re hiring! Learn more at www.smartly.io/developer.