Visual Brand Detection with Azure Video Indexer

Published in

Microsoft Azure

8 min read, May 15, 2020

TL;DR: This post shows how to use Azure Video Indexer, the Computer Vision API, and the Custom Vision service to extract key frames and detect custom image tags in indexed videos.

All code for the tutorial can be found in the notebook below. This code can be extended to support almost any image classification or object detection task.

aribornstein/AzureVideoIndexerVisualBrandDetection

github.com

The tutorial requires an Azure subscription, but everything can be done on the free tier. If you are new to Azure, you can get a free subscription here.

Create your Azure free account today | Microsoft Azure

Test and deploy enterprise apps Use Azure Virtual Machines, Managed Disks, and SQL databases while providing high…

azure.microsoft.com

What is Azure Video Indexer?

Azure Video Indexer automatically extracts metadata, such as spoken words, written text, faces, speakers, celebrities, emotions, topics, brands, and scenes, from video and audio files. Developers can then access the data within their application or infrastructure, make it more discoverable, and use it to create new over-the-top (OTT) experiences and monetization opportunities.

Use the Video Indexer API — Azure Media Services

Video Indexer consolidates various audio and video artificial intelligence (AI) technologies offered by Microsoft into…

docs.microsoft.com

Often, we wish to extract useful tags from video content. These tags are often the differentiating factor for successful engagement on social media services such as Instagram, Facebook, and YouTube.

This tutorial will show how to use Azure Video Indexer, Computer Vision API, and Custom Vision service to extract key frames and custom tags. We will use these Azure services to detect custom brand logos in indexed videos.

Step #1 Download A Sample Video with the pyTube API

The first step is to download a sample video to be indexed. We will download an episode of Azure Mythbusters on Azure Machine Learning, by my incredible co-worker Amy Boyd, using the open source pyTube library!

Installation:

pyTube can be installed with pip

!pip install pytube3 --upgrade

Code:

from pytube import YouTube
from pathlib import Path

video2Index = YouTube('https://www.youtube.com/watch?v=ijtKxXiS4hE').streams[0].download()
video_name = Path(video2Index).stem
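
Note that `streams[0]` takes whichever stream pyTube happens to list first. A minimal sketch of a more explicit choice follows, assuming pyTube's `StreamQuery` methods `filter`, `order_by`, `desc`, and `first` behave as its documentation describes. The helper names `pick_stream` and `download_video` are hypothetical and not part of the original notebook.

```python
from pathlib import Path


def pick_stream(streams):
    """Return the first progressive MP4 stream when ordered by resolution, descending.

    `streams` is expected to be a pytube StreamQuery (YouTube(url).streams).
    Progressive streams carry audio and video in a single file.
    Returns None when no stream matches.
    """
    return (streams
            .filter(progressive=True, file_extension="mp4")
            .order_by("resolution")
            .desc()
            .first())


def download_video(url, output_dir="."):
    """Download `url` and return (local_path, file_name_without_extension)."""
    # Imported lazily so pick_stream can be exercised without pytube installed.
    from pytube import YouTube

    stream = pick_stream(YouTube(url).streams)
    if stream is None:
        raise ValueError("No progressive MP4 stream available for " + url)
    local_path = stream.download(output_path=output_dir)
    return local_path, Path(local_path).stem
```

With this in place, `video2Index, video_name = download_video('https://www.youtube.com/watch?v=ijtKxXiS4hE')` would yield the same two variables used in the rest of the tutorial.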

Step #2 Create An Azure Video Indexer Instance

Navigate to https://www.videoindexer.ai/ and follow the instructions to create an account.

For the next steps, you will need your Video Indexer:

Subscription Key
Location
Account Id

These can be found on the account settings page of the Video Indexer website pictured above. For more information, see the documentation below. Feel free to comment below if you get stuck.

Use the Video Indexer API — Azure Media Services

Video Indexer consolidates various audio and video artificial intelligence (AI) technologies offered by Microsoft into…

docs.microsoft.com

Step #3 Use the Unofficial Video Indexer Python Client to Process our Video and Extract Key Frames

To interact with the Video Indexer API, we will use the unofficial Python client.

Installation:

pip install video-indexer

Code:

Initialize Client:

from video_indexer import VideoIndexer

vi = VideoIndexer(vi_subscription_key='SUBSCRIPTION_KEY',
                  vi_location='LOCATION',
                  vi_account_id='ACCOUNT_ID')
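
Hard-coding keys in a notebook makes them easy to leak. One possible alternative, sketched below, reads the three settings from environment variables. The variable names and the helper `load_video_indexer_settings` are hypothetical; only the three keyword arguments come from the client call above.

```python
import os


def load_video_indexer_settings(env=None):
    """Collect the three Video Indexer settings from environment variables.

    Returns a dict keyed by the keyword arguments used when creating the client.
    """
    env = os.environ if env is None else env
    # Hypothetical variable names; choose whatever fits your setup.
    names = {
        "vi_subscription_key": "VIDEO_INDEXER_KEY",
        "vi_location": "VIDEO_INDEXER_LOCATION",
        "vi_account_id": "VIDEO_INDEXER_ACCOUNT_ID",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[var] for kwarg, var in names.items()}


# Intended usage (requires the video-indexer package and real credentials):
# vi = VideoIndexer(**load_video_indexer_settings())
```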

Upload Video:

video_id = vi.upload_to_video_indexer(
              input_filename = video2Index,
              video_name=video_name, #must be unique
              video_language='English')

Get Video Info

info = vi.get_video_info(video_id, video_language='English')
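
Indexing runs asynchronously, so the insights may not be complete the first time the video info is requested. The sketch below polls until indexing finishes. It assumes the returned dictionary carries a top-level `state` field with values such as `Processing`, `Processed`, and `Failed`; verify this against the response you actually receive. `wait_until_processed` is a hypothetical helper, not part of the client library.

```python
import time


def wait_until_processed(get_info, poll_seconds=30, max_attempts=40, sleep=time.sleep):
    """Call get_info() repeatedly until the index reports state 'Processed'.

    get_info: zero-argument callable returning the video index as a dict.
    sleep:    injectable for testing; defaults to time.sleep.
    """
    for attempt in range(max_attempts):
        info = get_info()
        state = info.get("state")  # assumed field name, see lead-in
        if state == "Processed":
            return info
        if state == "Failed":
            raise RuntimeError("Video Indexer reported a failed indexing job")
        if attempt < max_attempts - 1:
            sleep(poll_seconds)
    raise TimeoutError("Video was not processed after {} attempts".format(max_attempts))


# Intended usage with the client from this tutorial:
# info = wait_until_processed(lambda: vi.get_video_info(video_id, video_language='English'))
```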

Extract Key Frame Ids

keyframes = []
for shot in info["videos"][0]["insights"]["shots"]:
    for keyframe in shot["keyFrames"]:
        keyframes.append(keyframe["instances"][0]['thumbnailId'])
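
To make the nesting that this loop walks through easier to see, here is a small sketch that runs the same traversal over a made-up sample. The sample values and the `start` field are assumptions for illustration, and `collect_keyframes` is a hypothetical helper. Unlike the loop above, it tolerates missing keys and visits every entry under `videos`.

```python
def collect_keyframes(info):
    """Return (thumbnail_id, start) pairs for the first instance of each keyframe."""
    results = []
    for video in info.get("videos", []):
        shots = video.get("insights", {}).get("shots", [])
        for shot in shots:
            for keyframe in shot.get("keyFrames", []):
                # Only the first instance is used, as in the loop above.
                for instance in keyframe.get("instances", [])[:1]:
                    results.append((instance["thumbnailId"], instance.get("start")))
    return results


# Trimmed, invented sample in the shape the traversal relies on.
sample_info = {
    "videos": [{
        "insights": {
            "shots": [
                {"keyFrames": [{"instances": [{"thumbnailId": "thumb-1", "start": "0:00:00.2"}]}]},
                {"keyFrames": []},
                {"keyFrames": [{"instances": [{"thumbnailId": "thumb-2", "start": "0:00:07.5"}]}]},
            ]
        }
    }]
}

print(collect_keyframes(sample_info))
# [('thumb-1', '0:00:00.2'), ('thumb-2', '0:00:07.5')]
```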

Get Keyframe Thumbnails

for keyframe in keyframes:
    img_str = vi.get_thumbnail_from_video_indexer(video_id,    
                                                  keyframe)
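
The loop above overwrites `img_str` on every pass. If you want to inspect the thumbnails or reuse them for labeling, one option is to write them to disk. This is a sketch under the assumption that the returned bytes are JPEG data; `save_thumbnails` and its parameters are hypothetical names.

```python
from pathlib import Path


def save_thumbnails(fetch_thumbnail, keyframe_ids, out_dir):
    """Write each keyframe thumbnail into out_dir and return {keyframe_id: path}.

    fetch_thumbnail: callable taking a keyframe id and returning image bytes.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = {}
    for keyframe_id in keyframe_ids:
        image_bytes = fetch_thumbnail(keyframe_id)
        # Assumption: the bytes are JPEG data; change the suffix if they are not.
        target = out_path / "{}.jpg".format(keyframe_id)
        target.write_bytes(image_bytes)
        saved[keyframe_id] = target
    return saved


# Intended usage with the client from this tutorial:
# save_thumbnails(lambda k: vi.get_thumbnail_from_video_indexer(video_id, k),
#                 keyframes, "thumbnails")
```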

Step #3 Use the Azure Computer Vision API to Extract Popular Brands from Key Frames

Out of the box, Azure Video Indexer uses optical character recognition and the transcript generated by speech-to-text to detect references to popular brands.

Now that we have extracted the key frames, we are going to use the Computer Vision API to extend this functionality and check whether any known brands appear in the key frames.

Brand detection - Computer Vision - Azure Cognitive Services

Brand detection is a specialized mode of object detection that uses a database of thousands of global logos to identify…

docs.microsoft.com

First, we will have to create a Computer Vision API key. There is a free tier that is sufficient for this demo, and a key can be generated by following the instructions in the documentation link below. Once done, you should have a Computer Vision subscription key and endpoint.

Create a Cognitive Services resource in the Azure portal - Azure Cognitive Services

Use this quickstart to start using Azure Cognitive Services. After creating a Cognitive Service resource in the Azure…

docs.microsoft.com

After we have our Azure Computer Vision subscription key and endpoint, we can then use the Client SDK to evaluate our video’s keyframes:

Installation:

pip install --upgrade azure-cognitiveservices-vision-computervision

Code:

Initialize Computer Vision Client

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

# endpoint and subscription_key are the values from your Computer Vision resource
computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

Send Keyframe To Azure Computer Vision Service to Detect Brands

import io
import time

timeout_interval, timeout_time = 5, 10.0
image_features = ["brands"]

for index, keyframe in enumerate(keyframes):
    if index % timeout_interval == 0:
        print("Trying to prevent exceeding request limit waiting {} seconds".format(timeout_time))
        time.sleep(timeout_time)
    # Get KeyFrame Image Byte String From Video Indexer
    img_str = vi.get_thumbnail_from_video_indexer(video_id, keyframe)
    # Convert Byte Stream to Image Stream
    img_stream = io.BytesIO(img_str)
    # Analyze with Azure Computer Vision
    cv_results = computervision_client.analyze_image_in_stream(img_stream, image_features)
    print("Detecting brands in keyframe {}: ".format(keyframe))
    if len(cv_results.brands) == 0:
        print("No brands detected.")
    else:
        for brand in cv_results.brands:
            print("'{}' brand detected with confidence {:.1f}% at location {}, {}, {}, {}".format(brand.name, brand.confidence * 100, brand.rectangle.x, brand.rectangle.x + brand.rectangle.w, brand.rectangle.y, brand.rectangle.y + brand.rectangle.h))
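
The pause-every-few-requests pattern in this loop can be factored into a small reusable generator. The sketch below is one way to do it; `throttled` is a hypothetical helper, and the batch size and pause length are carried over from the loop above rather than taken from any documented service limit. Unlike that loop, it does not wait before the very first request.

```python
import time


def throttled(items, batch_size=5, pause_seconds=10.0, sleep=time.sleep):
    """Yield items unchanged, pausing between batches of `batch_size` items.

    sleep is injectable so the behaviour can be tested without waiting.
    """
    for index, item in enumerate(items):
        # Pause before the first item of every batch except the first batch.
        if index > 0 and index % batch_size == 0:
            sleep(pause_seconds)
        yield item


# Intended usage:
# for keyframe in throttled(keyframes):
#     ...send the keyframe to the vision service...
```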

Azure Computer Vision API — General Brand Detection

Quickstart: Computer Vision client library — Azure Cognitive Services

Get started with the Computer Vision client library. Follow these steps to install the package and try out the example…

docs.microsoft.com

Step #4 Use the Azure Custom Vision Service to Extract Custom Logos from Keyframes

The Azure Computer Vision API can detect many of the world's most popular brands, but sometimes a brand is more obscure. In this last section, we will use the Custom Vision Service to train a custom logo detector that finds the Azure Developer Relations mascot, Bit, in the keyframes extracted by Video Indexer.

This tutorial assumes you know how to train a Custom Vision Service object detection model for brand detection. If not, check out the documentation below for a tutorial.

Tutorial: Use custom logo detector to recognize Azure services - Custom Vision - Azure Cognitive…

In this tutorial, you'll explore a sample app that uses Custom Vision as part of a larger scenario. The AI Visual…

docs.microsoft.com

Instead of deploying to mobile, however, we will use the Python client API for the Azure Custom Vision Service. All the information you’ll need can be found in the settings menu of your Custom Vision project.

Installation:

pip install azure-cognitiveservices-vision-customvision

Code:

Initialize Custom Vision Service Client

from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient

prediction_threshold = .8
prediction_key = "Custom Vision Service Key"
custom_endpoint = "Custom Vision Service Endpoint"
project_id = "Custom Vision Service Model ProjectId"
published_name = "Custom Vision Service Model Iteration Name"

# Connect to the prediction endpoint of the Custom Vision resource
predictor = CustomVisionPredictionClient(prediction_key, endpoint=custom_endpoint)

Use Custom Vision Service Model to Predict Key Frames

import io
import time

timeout_interval, timeout_time = 5, 10.0

for index, keyframe in enumerate(keyframes):
    if index % timeout_interval == 0:
        print("Trying to prevent exceeding request limit waiting {} seconds".format(timeout_time))
        time.sleep(timeout_time)
    # Get KeyFrame Image Byte String From Video Indexer
    img_str = vi.get_thumbnail_from_video_indexer(video_id, keyframe)
    # Convert Byte Stream to Image Stream
    img_stream = io.BytesIO(img_str)
    # Analyze with the Custom Vision Service
    cv_results = predictor.detect_image(project_id, published_name, img_stream)
    # Keep only predictions above the confidence threshold
    predictions = [pred for pred in cv_results.predictions if pred.probability > prediction_threshold]
    print("Detecting brands in keyframe {}: ".format(keyframe))
    if len(predictions) == 0:
        print("No custom brands detected.")
    else:
        for brand in predictions:
            print("'{}' brand detected with confidence {:.1f}% at location {}, {}, {}, {}".format(brand.tag_name, brand.probability * 100, brand.bounding_box.left, brand.bounding_box.top, brand.bounding_box.width, brand.bounding_box.height))
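
The bounding box values printed here are, as far as I know, fractions of the image size rather than pixels. If you want to draw or crop the detected region, a conversion like the following sketch may help. It assumes the values are normalized to the 0 to 1 range, and `to_pixel_box` is a hypothetical helper.

```python
def to_pixel_box(left, top, width, height, image_width, image_height):
    """Convert a normalized box to integer pixel corners (x1, y1, x2, y2)."""
    def scale(fraction, size):
        # Round to the nearest pixel and clamp into the image bounds.
        return max(0, min(round(fraction * size), size))

    return (scale(left, image_width),
            scale(top, image_height),
            scale(left + width, image_width),
            scale(top + height, image_height))


print(to_pixel_box(0.25, 0.5, 0.5, 0.25, image_width=640, image_height=360))
# (160, 180, 480, 270)
```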

Conclusion

And there we have it! I am able to find all the frames in my video that contain either the Microsoft brand or the Cloud Advocacy Bit logo.
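
If you collect the detections instead of printing them, a small index from brand to keyframes makes that lookup explicit. The sketch below assumes you have gathered `(keyframe_id, brand_name, confidence)` tuples from either service; the function name and the tuple layout are hypothetical.

```python
from collections import defaultdict


def frames_by_brand(detections, min_confidence=0.8):
    """Group keyframe ids by brand, keeping detections at or above min_confidence.

    detections: iterable of (keyframe_id, brand_name, confidence) tuples.
    """
    index = defaultdict(list)
    for keyframe_id, brand_name, confidence in detections:
        # Skip weak detections and avoid listing a keyframe twice per brand.
        if confidence >= min_confidence and keyframe_id not in index[brand_name]:
            index[brand_name].append(keyframe_id)
    return dict(index)


# Invented sample data for illustration.
sample = [
    ("k1", "Microsoft", 0.95),
    ("k2", "Bit", 0.85),
    ("k3", "Microsoft", 0.40),
    ("k4", "Microsoft", 0.90),
]
print(frames_by_brand(sample))
# {'Microsoft': ['k1', 'k4'], 'Bit': ['k2']}
```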

Next Steps

You now have all you need to extend the Azure Video Indexer Service with your own custom computer vision models. Below is a list of additional resources that will help you take your integration with Video Indexer to the next level.

Offline Computer Vision

In a production system, a large volume of requests may lead to throttling. In this case, the Azure Computer Vision service can be run offline in a container.

How to install and run containers - Computer Vision - Azure Cognitive Services

Containers enable you to run the Computer Vision APIs in your own environment. Containers are great for specific…

docs.microsoft.com

The Custom Vision model can also be run locally.

Tutorial - Deploy Custom Vision classifier to a device using Azure IoT Edge

Azure IoT Edge can make your IoT solution more efficient by moving workloads out of the cloud and to the edge. This…

docs.microsoft.com

Video Indexer + Zoom Media

Azure-Samples/media-services-video-indexer

Update June 17, 2019: Added parameter for setting the Zoom Media language separately. Also now using the new PowerShell…

github.com

Creating an Automated Video Processing Flow in Azure

Creating an automated video processing flow in Azure

Processing video is that kind of scenario that fits perfectly for cloud computing, right? I mean, it usually requires a…

fabriciosanchez-en.azurewebsites.net

About the Author

Aaron (Ari) Bornstein is an AI researcher with a passion for history, engaging with new technologies, and computational medicine. As an Open Source Engineer on Microsoft’s Cloud Developer Advocacy team, he collaborates with the Israeli Hi-Tech Community to solve real-world problems with game-changing technologies that are then documented, open sourced, and shared with the rest of the world.

Written by Aaron (Ari) Bornstein