Utilizing Nvidia Triton for video processing on Hivecell

Dasha Korotkykh
Hivecell
Published in
2 min read · Jul 1, 2020

Once again we are back to the video recognition case study, this time testing heavy-load processing with Nvidia’s Triton Inference Server (known as TensorRT Inference Server before release 20.03).

Setup:

a) Traditionally, a Raspberry Pi camera

b) Hivecell node equipped with MQTT Proxy, Kafka Broker, Kafka Connect, Confluent Replicator, a Kafka Streams app, and the already mentioned Triton.
Apart from the core apps shown on the scheme, Hivecell runs Java 11 (for its HTTP client support), JavaCV, and ND4J, which is responsible for reshaping and normalizing (converting the three-dimensional image data to a two-dimensional array reduces the app’s workload).

c) Confluent Cloud (it could be any other cloud storage) for the recognized-frames topic.
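The reshape-and-normalize step from item (b) can be illustrated with a short sketch. The actual app does this in Java with ND4J; the NumPy version below is only an analogous stand-in, and the frame dimensions are taken from the demo resolutions mentioned later in the post.

```python
import numpy as np

def preprocess(frame: np.ndarray) -> np.ndarray:
    """Illustrative stand-in for the ND4J step: an H x W x 3 uint8
    frame becomes a flat float32 row vector with values in [0, 1]."""
    normalized = frame.astype(np.float32) / 255.0  # scale pixel values
    return normalized.reshape(1, -1)               # 3-D -> 2-D (batch, features)

# A dummy 640x480 RGB frame, standing in for a camera capture:
frame = np.zeros((480, 640, 3), dtype=np.uint8)
flat = preprocess(frame)
```

Flattening to a single 2-D array means downstream code moves one contiguous buffer around instead of a nested 3-D structure, which is the workload reduction the setup notes refer to.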

The full source code is available at the project repository.

Demo scheme

The demo inputs were scaled up to 50 fps, producing much larger volumes of raw data. These images are sent as byte arrays to a pi-cam topic through MQTT Proxy, from where the Kafka Streams app reads, pre-processes, and normalizes them.
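For reference, the inference call itself is a plain HTTP request. The sketch below assembles a request body in the KServe-style v2 format that current Triton releases expose over HTTP; the model name (`detector`) and input tensor name (`input_0`) are assumptions for illustration, not the names used in the project, and releases from the era of this post may use an older HTTP API.

```python
import json

MODEL = "detector"  # hypothetical model name
URL = f"http://localhost:8000/v2/models/{MODEL}/infer"  # Triton's default HTTP port

def build_infer_request(pixels):
    """Assemble a v2/KServe-style inference request body for one
    flattened, normalized frame."""
    return json.dumps({
        "inputs": [{
            "name": "input_0",           # assumed input tensor name
            "shape": [1, len(pixels)],   # (batch, features)
            "datatype": "FP32",
            "data": pixels,
        }]
    })

body = build_infer_request([0.0, 0.5, 1.0])
```

The body would then be POSTed to `URL`; in the demo this is done from the Kafka Streams app via Java 11’s HTTP client.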

26 fps (1280×960 resolution)
50 fps (640×480 resolution)

After inference is run over HTTP, Triton’s result is post-processed and parsed. That is the moment when the image becomes JSON containing a list of recognized objects, reducing the required storage size by up to 1000 times.
The eventual result is written to the recognized-frames topic and then replicated to the cloud by Confluent Replicator.
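The quoted reduction is easy to sanity-check with back-of-the-envelope numbers. The raw frame size follows from the 1280×960 RGB resolution used in the demo; the few-kilobyte JSON size is an assumption, not a measured value.

```python
# One uncompressed 1280x960 RGB frame, 1 byte per channel:
raw_frame_bytes = 1280 * 960 * 3   # 3,686,400 bytes, ~3.7 MB

# Assumed size of a JSON list of recognized objects:
json_result_bytes = 4 * 1024       # 4 KB

reduction = raw_frame_bytes / json_result_bytes  # 900.0, on the order of 1000x
```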

One could ask: why would someone want to parse video at 26 fps with 1280×960 resolution? This might be overkill for common security and manufacturing routines, but as a demonstration of performance it sets a new benchmark for video processing capabilities with Hivecell: this setup could process five streaming feeds simultaneously with the same efficiency.
