Virtual Background With Video Processing APIs of Amazon Chime SDK

dannadori
Published in The Startup
4 min read · Dec 29, 2020

Note:
This article is also available in Japanese here:
https://cloud.flect.co.jp/entry/2020/12/29/152543

Introduction

Amazon Chime SDK JS version 2.3.0 was released on December 21, 2020.
This release added a wonderful new feature: the Video Processing APIs!

https://github.com/aws/amazon-chime-sdk-js/blob/master/CHANGELOG.md

The Video Processing APIs let you edit your own camera's video frames in a video conference before they are transmitted. They can be used, for example, to build a virtual background that blurs or replaces the camera's background.

Reading the official documentation, the approach looks similar to the one I introduced before. It's finally official! I'm excited.

Now let’s try to create a virtual background using the Video Processing API.

How to use Video Processing APIs

As described in the documentation, the Video Processing APIs edit video by pipelining the frames from the input device through one or more VideoFrameProcessors. The APIs provide an interface called VideoTransformDevice that wraps this pipeline, so you create an instance of it by passing an input device and an array of VideoFrameProcessors to the constructor.

An instance of VideoTransformDevice can act as a virtual video input device, so pass it to Amazon Chime with chooseVideoInputDevice.

That is, we can create the instance as below.

const transformDevice = new DefaultVideoTransformDevice(
  logger,
  deviceId,
  [new SomeVideoFrameProcessorA(), ...] // VideoFrameProcessors implement the editing
);

Then call meetingSession.audioVideo.chooseVideoInputDevice(transformDevice) instead of meetingSession.audioVideo.chooseVideoInputDevice(deviceId).

This time, we will create a single VideoFrameProcessor that replaces the background with another image to achieve the virtual background functionality. Here is how to create a VideoTransformDevice.

const transformDevice = new DefaultVideoTransformDevice(
  logger,
  deviceId,
  [new VirtualBackground()] // VideoFrameProcessor implementing the virtual background
);

VideoFrameProcessor for Virtual Background

VideoFrameProcessor is an interface with a process method that edits the image of each frame. In our case, process will separate the person from the background and replace the background with another image. To segment person and background, we use BodyPix, as described in the previous article.
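Concretely, the overall shape of such a processor class might be sketched as follows. To keep the sketch self-contained, minimal local mirrors of the SDK types are declared here; in a real app you would import VideoFrameBuffer and VideoFrameProcessor from amazon-chime-sdk-js instead.

```typescript
// Minimal local mirrors of the SDK types, for illustration only.
// In a real app: import { VideoFrameBuffer, VideoFrameProcessor } from 'amazon-chime-sdk-js';
interface VideoFrameBuffer {
  asCanvasElement?(): HTMLCanvasElement | null;
  destroy(): void;
}

interface VideoFrameProcessor {
  process(buffers: VideoFrameBuffer[]): Promise<VideoFrameBuffer[]>;
  destroy(): Promise<void>;
}

class VirtualBackground implements VideoFrameProcessor {
  async process(buffers: VideoFrameBuffer[]): Promise<VideoFrameBuffer[]> {
    // Edit each frame here (background replacement), then pass the buffers on.
    return buffers;
  }

  async destroy(): Promise<void> {
    // Release the segmentation model, offscreen canvases, etc.
  }
}
```

The pipeline calls process on each batch of frames and expects the (possibly edited) buffers back; destroy is your hook to free resources when the transform device is stopped.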

The process method receives an array of frame data (VideoFrameBuffer) as its argument; we extract the image data from it and process it. Roughly, it looks like the following.

async process(buffers: VideoFrameBuffer[]): Promise<VideoFrameBuffer[]> {
  for (const f of buffers) {
    const canvas = f.asCanvasElement(); // extract image data as a canvas
    // (1) edit the image
  }
  return buffers;
}

In part (1) of the source code above, BodyPix is used to separate the person from the background and swap in the replacement background. The specific processing is described in the previous article, so it is omitted here.
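The compositing step itself boils down to a per-pixel choice driven by the segmentation mask. Here is a hedged, self-contained sketch of that idea, operating on raw RGBA pixel arrays rather than a canvas; compositeWithMask is a hypothetical helper, and the binary mask is assumed to have the shape of the data field that BodyPix's segmentPerson returns (one 0/1 entry per pixel).

```typescript
// Blend a camera frame over a background image using a person mask.
// frame/background: RGBA pixel data (4 bytes per pixel), same dimensions.
// mask: one entry per pixel; 1 = person, 0 = background.
// Returns a new RGBA array with the background pixels replaced.
function compositeWithMask(
  frame: Uint8ClampedArray,
  background: Uint8ClampedArray,
  mask: ArrayLike<number>
): Uint8ClampedArray {
  const out = new Uint8ClampedArray(frame.length);
  for (let i = 0; i < mask.length; i++) {
    const src = mask[i] === 1 ? frame : background; // keep person pixels
    const p = i * 4;
    out[p] = src[p];         // R
    out[p + 1] = src[p + 1]; // G
    out[p + 2] = src[p + 2]; // B
    out[p + 3] = 255;        // fully opaque
  }
  return out;
}
```

Inside the processor, you would obtain the frame pixels from the canvas with getImageData, run BodyPix to get the mask, composite as above, and write the result back with putImageData.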

That's all there is to it. How easy is that!

Run it!

Now, let’s see how it works. The background has been replaced with an image.

Repository

The source code for this project can be found in the following repository.

Please refer to the README of the repository for how it works.


Summary

That's how we implemented a virtual background using the new Video Processing APIs in the Amazon Chime SDK. It's quite easy to implement, and we could drop many parts we had previously struggled to build ourselves. It's a mixed feeling to see that work made obsolete, but since the feature is now officially maintained, it's a total blessing.

The repository below also includes a version with chat and whiteboard features in addition to the ones introduced here, as well as Cognito integration. Please take a look.

Acknowledgements

The video in the text is from this site.

https://pixabay.com/ja/videos/
