How we launched the search for products through videos

Sarafan Technology Inc.
All about Sarafan.AI
Jun 14, 2019

The story of how we taught Sarafan.AI to recognize products in videos

A little background

In 2016, we developed an AI for recognizing products in photos. A year ago, we started teaching the technology to work with videos, and in May the shop-now-video functionality was officially launched. You can already see how it works on the site of our partner, Glamor TV. Below is a screenshot from a scene.

The way it works

The whole process takes place in three stages.

First stage

First of all, the system classifies the objects that appear in the video. No matter how small the object is or how few milliseconds it stays in the frame, the AI analyzes everything: people, glasses of wine, rooftops, and even a potted cactus.

For each object, the system computes a classification confidence, expressed as a percentage: how sure it is of its choice. Frames with high-confidence detections are sent to the second stage of the analysis.
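As a rough illustration of this hand-off between stages, here is a minimal sketch in Python. The `Detection` record, the function name, and the 0.8 threshold are assumptions made for the example; the article does not say which cut-off Sarafan.AI actually uses.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One classified object in one frame (hypothetical record layout)."""
    frame_index: int
    label: str
    confidence: float  # classifier confidence in the range [0, 1]


def select_confident_frames(detections, threshold=0.8):
    """Return sorted indices of frames with at least one detection at or above the threshold."""
    return sorted({d.frame_index for d in detections if d.confidence >= threshold})


detections = [
    Detection(0, "person", 0.95),
    Detection(0, "wine glass", 0.40),
    Detection(1, "rooftop", 0.55),
    Detection(2, "cactus", 0.88),
]
print(select_confident_frames(detections))  # [0, 2]
```

Frame 1 is dropped because its only detection is below the assumed threshold, while frame 0 is kept thanks to the confident "person" detection.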

We understand that it is hard to take on trust what stays behind the scenes, so we have attached a GIF of recognition running on a live news channel stream, where the AI classifies objects in real time.

Classification of objects in the live stream

Second stage

The system filters the selected frames: blurry pictures and duplicates (frames in which practically nothing changes) are removed from the sample. In the remaining keyframes, the AI examines each object in more detail. For a person, for example, it determines gender, age, and hair and skin color; for a dress, it determines color, length, style, sleeve length, and print.
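The frame filtering described here could look roughly like the sketch below. It is not Sarafan.AI's actual method: it assumes small grayscale frames stored as nested lists, uses the variance of a simple Laplacian as a common sharpness heuristic, and treats a frame as a duplicate when its mean pixel difference from the last kept frame is small. Both thresholds are made-up values.

```python
from statistics import pvariance


def laplacian_variance(frame):
    """Sharpness score: variance of the 4-neighbour Laplacian over interior pixels.

    Expects a grayscale frame of at least 3x3 pixels.
    """
    h, w = len(frame), len(frame[0])
    responses = [
        frame[y - 1][x] + frame[y + 1][x] + frame[y][x - 1] + frame[y][x + 1]
        - 4 * frame[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def mean_abs_diff(a, b):
    """Average absolute pixel difference between two frames of equal size."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def select_keyframes(frames, min_sharpness=100.0, min_change=10.0):
    """Return indices of frames that are sharp and differ from the last kept frame."""
    kept, last = [], None
    for i, frame in enumerate(frames):
        if laplacian_variance(frame) < min_sharpness:
            continue  # too blurry (almost no edges)
        if last is not None and mean_abs_diff(frame, last) < min_change:
            continue  # practically nothing changed since the last keyframe
        kept.append(i)
        last = frame
    return kept


checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]
inverted = [[255 - p for p in row] for row in checker]
flat = [[128] * 4 for _ in range(4)]

print(select_keyframes([checker, checker, flat, inverted]))  # [0, 3]
```

The second frame is rejected as a duplicate of the first, and the flat gray frame is rejected as blurry because it contains no edges.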

Third stage

The system finds all products in the catalogs of online stores that are at least remotely similar, and then filters them by degree of similarity. The most similar products are pulled into a virtual storefront, each with a link to the store.
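One common way to implement this kind of "find, then filter by similarity" step is to compare feature vectors with cosine similarity. The sketch below assumes each product and each recognized object has already been turned into a numeric vector; the vectors, the 0.5 cut-off, and the function names are illustrative, and the article does not state which similarity measure Sarafan.AI uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_products(query, catalog, min_similarity=0.5, top_k=3):
    """Return up to top_k product names, most similar first, above the similarity cut-off."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in catalog.items()]
    kept = sorted((pair for pair in scored if pair[0] >= min_similarity), reverse=True)
    return [name for _, name in kept[:top_k]]


# Hypothetical 3-dimensional feature vectors for catalog items.
catalog = {
    "red midi dress": [0.9, 0.1, 0.0],
    "red maxi dress": [0.8, 0.3, 0.1],
    "blue jeans": [0.0, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]  # vector for the dress recognized in the keyframe

print(rank_products(query, catalog))  # ['red midi dress', 'red maxi dress']
```

The jeans are found in the catalog but filtered out, because their similarity to the query falls far below the assumed cut-off.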

Where is it used?

We have developed four advertising formats for platforms that work with video, plus video analytics for offline points of sale.

1. A virtual showcase for media and blogs

A sidebar with a virtual showcase and a “View products” button is embedded in the site's video player. While watching the video, if you like something, click the button: the sidebar pops up with similar products, and the video pauses automatically.

All products are shown with photos and brief information. By agreement with the site, we can also display prices and discounts (if any).

2. Shop-now-functionality for streaming platforms

On streaming platforms, the principle is similar: the AI analyzes the video feed and pulls up the goods recognized in the frame, such as headphones, a computer mouse, a T-shirt, or a mug.

A banner for your advertiser can also be added to the stream page. There is no pre-moderation or pre-recognition: the search in a frame takes 0.1 seconds. Thanks to this speed, we can recognize any live broadcast.
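To get a feel for what 0.1 seconds per frame means for a live stream, here is a small back-of-the-envelope sketch. It assumes a single worker that processes frames one after another, which is our simplification and not something the article specifies; the function name is hypothetical.

```python
import math


def frame_stride(stream_fps, seconds_per_search=0.1):
    """Analyze every N-th frame so that one sequential worker keeps up with the stream."""
    # round() guards against floating-point noise such as 30 * 0.1 == 3.0000000000000004
    frames_arriving_per_search = round(stream_fps * seconds_per_search, 9)
    return max(1, math.ceil(frames_arriving_per_search))


for fps in (5, 25, 30, 60):
    print(fps, "fps -> analyze every", frame_stride(fps), "frame(s)")
```

Under this assumption, a 30 fps stream can be followed in real time by analyzing every third frame, which is still about ten searches per second.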

An example of work on the streaming platform

3. Smart TV

Smart TV made us rack our brains a little. We really wanted to reduce the process to a minimum number of steps. Here is what we finally arrived at: to see the goods, press the “Info” button on the remote. Then use the arrows to scroll through the product feed. Select the desired product with the “OK” button. A form with a field for a phone number pops up. Enter the phone number and receive a free SMS with a link to the product.

An example of working on Smart TV

In any format, the user doesn’t need to download anything. Everything is pre-configured with the video.

4. Content tagging

Sarafan.AI targets ads to video content: the system analyzes the content, identifies the dominant objects, and selects relevant ads for them. For example, for a video where a dog appears in 60% of the frames, a banner or video ad for dog food is pulled up. This format is designed for programmatic advertising.
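The "dominant object" idea can be sketched as counting the share of frames in which each label appears. The 60% figure from the dog example is used here as the cut-off purely for illustration; the label-to-ad mapping and the function name are hypothetical.

```python
from collections import Counter


def dominant_labels(frame_labels, min_share=0.6):
    """Return labels that appear in at least min_share of the frames, sorted by name."""
    total = len(frame_labels)
    if total == 0:
        return []
    # set(labels) ensures a label is counted once per frame, even if detected twice.
    counts = Counter(label for labels in frame_labels for label in set(labels))
    return sorted(label for label, n in counts.items() if n / total >= min_share)


# Hypothetical mapping from a dominant object to an ad creative.
AD_FOR_LABEL = {"dog": "dog food banner"}

frames = [{"dog", "person"}, {"dog"}, {"dog", "sofa"}, {"person"}, {"sofa"}]
labels = dominant_labels(frames)
ads = [AD_FOR_LABEL[label] for label in labels if label in AD_FOR_LABEL]

print(labels)  # ['dog']
print(ads)     # ['dog food banner']
```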

Tagging can be customized for specific advertising campaigns: in this case, the advertiser specifies tags, that is, certain objects or situations next to which their advertising should be placed. The system finds these objects in the video and embeds the advertising alongside them. For example, in a video where a person is cooking, the AI adds advertising for food products or kitchen appliances.
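Campaign-specific tagging can be pictured as a set intersection between the labels detected in a video and the tags each advertiser has chosen. The campaign names, tags, and function below are invented for the example and only show the matching logic.

```python
def match_campaigns(detected_labels, campaigns):
    """Return names of campaigns with at least one tag among the detected labels."""
    detected = set(detected_labels)
    return sorted(name for name, tags in campaigns.items() if detected & set(tags))


# Hypothetical campaigns: campaign name -> tags chosen by the advertiser.
campaigns = {
    "kitchen appliances": {"stove", "oven", "blender"},
    "dog food": {"dog"},
}
cooking_video_labels = ["person", "frying pan", "stove"]

print(match_campaigns(cooking_video_labels, campaigns))  # ['kitchen appliances']
```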

Example of a pop-up ad

5. Video analytics for offline

It's not only about advertising: we also offer video analytics for points of sale. The system automatically processes video from cameras and reads visitors' taste preferences.

How to monetize?

All formats are monetized through advertising. The site itself chooses the model that suits it best: CPC (pay per click), CPA (pay per action), CPM (pay per 1,000 impressions), or CPL (pay per lead). On Glamor TV, for example, the functionality is monetized on the CPM model: the advertiser pays per 1,000 impressions of their product in the widget.

Video analytics is monetized on a subscription model, where the payment covers servicing and reports.
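For readers less familiar with these pricing models, the arithmetic behind CPM and CPC is shown in the short sketch below. The rates and volumes are made-up numbers, not Sarafan.AI's or Glamor TV's actual prices.

```python
def cpm_cost(impressions, rate_per_thousand):
    """CPM: the advertiser pays a fixed rate for every 1,000 impressions."""
    return impressions / 1000 * rate_per_thousand


def cpc_cost(clicks, rate_per_click):
    """CPC: the advertiser pays a fixed rate for every click."""
    return clicks * rate_per_click


# Hypothetical campaign: 250,000 widget impressions at a rate of 2.0 per thousand.
print(cpm_cost(250_000, 2.0))  # 500.0
# Hypothetical campaign: 1,200 clicks at a rate of 0.25 per click.
print(cpc_cost(1_200, 0.25))   # 300.0
```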

And finally, a little about competitors

Similar functionality already exists on Facebook and Instagram: you can add tags with brief information and a link to purchase products in uploaded videos. But no AI is involved in this work; the tags are set manually by the owners of the video.


