An experiment to test the multi-stream neural network inference performance of DeepStream on Jetson Nano.

Jetson Nano. (Source)

Deploying complex deep learning models onto small embedded devices is challenging. Even with hardware optimized for deep learning such as the Jetson Nano and inference optimization tools such as TensorRT, bottlenecks can still present themselves in the I/O pipeline. These bottlenecks can compound if the model has to deal with complex I/O pipelines with multiple input and output streams. Wouldn’t it be great to have a tool that can take care of all bottlenecks in an end-to-end fashion?

Say Hello to DeepStream

It turns out there is an SDK that attempts to mitigate this problem. DeepStream is an SDK that is optimized for NVIDIA…

An elaborate discussion on the various Components, Loss Functions and Metrics used for Super Resolution using Deep Learning.

Written by Bharath Raj with feedback from Yoni Osin.

Photo by Jeremy Thomas on Unsplash


Super Resolution is the process of recovering a High Resolution (HR) image from a given Low Resolution (LR) image. An image may have a “lower resolution” because of a smaller spatial resolution (i.e. size) or as a result of degradation (such as blurring). We can relate the HR and LR images through the following equation: LR = degradation(HR)
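To make the relation LR = degradation(HR) concrete, here is a minimal sketch in Python. It assumes one particular degradation, averaging each block of pixels, which blurs and downsamples in a single step. The function name `degradation`, the `scale` parameter, and the sample image are illustrative assumptions, not taken from the article.

```python
def degradation(hr, scale=2):
    """Hypothetical degradation: average each scale x scale block of a 2D image.

    `hr` is a list of rows of pixel intensities. Real degradations may also
    include noise or compression; this sketch models only blur + downsampling.
    """
    # Crop so both dimensions are divisible by the scale factor.
    rows = len(hr) - len(hr) % scale
    cols = len(hr[0]) - len(hr[0]) % scale
    lr = []
    for r in range(0, rows, scale):
        lr_row = []
        for c in range(0, cols, scale):
            # Collect the pixels of one block and replace them by their mean.
            block = [hr[r + i][c + j] for i in range(scale) for j in range(scale)]
            lr_row.append(sum(block) / len(block))
        lr.append(lr_row)
    return lr


# A toy 4x4 "HR image" becomes a 2x2 "LR image".
hr = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
lr = degradation(hr)
print(lr)  # [[2.5, 4.5], [10.5, 12.5]]
```

Super Resolution attempts the inverse mapping: recovering `hr` from `lr`. Because many different HR images average down to the same LR image, that inverse is not unique.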

An introduction to the techniques used in Human Pose Estimation based on Deep Learning.

Written by Bharath Raj with feedback from Yoni Osin.

Photo by Alain Pham on Unsplash

A Human Pose Skeleton represents the orientation of a person in a graphical format. Essentially, it is a set of coordinates that can be connected to describe the pose of the person. Each coordinate in the skeleton is known as a part (or a joint, or a keypoint). A valid connection between two parts is known as a pair (or a limb). Note that not all part combinations give rise to valid pairs. A sample human pose skeleton is shown below.
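In code, the parts and pairs described above could be represented roughly as follows. This is a sketch only: the part names, coordinates, and helper functions are hypothetical and do not follow any specific dataset's keypoint format.

```python
# Hypothetical parts: name -> (x, y) image coordinate, for illustration only.
parts = {
    "head": (50, 10),
    "neck": (50, 25),
    "left_shoulder": (35, 30),
    "right_shoulder": (65, 30),
}

# Only some part combinations are valid pairs (limbs).
valid_pairs = [
    ("head", "neck"),
    ("neck", "left_shoulder"),
    ("neck", "right_shoulder"),
]


def is_valid_pair(a, b):
    """Return True if parts a and b are joined by a limb, in either order."""
    return (a, b) in valid_pairs or (b, a) in valid_pairs


def limb_segments():
    """Return each limb as a pair of endpoint coordinates, ready for drawing."""
    return [(parts[a], parts[b]) for a, b in valid_pairs]


print(is_valid_pair("neck", "head"))           # True
print(is_valid_pair("head", "left_shoulder"))  # False
```

Keeping the coordinates and the connectivity separate means the same `valid_pairs` list can be reused for every detected person, with only `parts` changing.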

An in-depth tutorial on creating Deep Learning models for Multi-Label Classification.

By now you would have heard about Convolutional Neural Networks (CNNs) and their efficacy in classifying images. The accuracy of CNNs in image classification is quite remarkable, and their real-life applications through APIs are quite profound.

A comprehensive review of Classical and Deep Learning methods for Semantic Segmentation

Written by Bharath Raj with feedback from Noy Shulman and Rotem Alaluf.

Photo by JFL on Unsplash

Semantic Segmentation is the process of assigning a label to every pixel in the image. This is in stark contrast to classification, where a single label is assigned to the entire picture. Semantic segmentation treats multiple objects of the same class as a single entity. On the other hand, instance segmentation treats multiple objects of the same class as distinct individual objects (or instances). Typically, instance segmentation is harder than semantic segmentation.
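The difference between the two outputs can be illustrated with a toy example. The sketch below starts from a semantic mask, where every pixel of a class shares one label, and derives per-object ids by labelling 4-connected regions. Connected-component labelling is only a stand-in used here for illustration; it is not how instance segmentation models work, and it fails when objects of the same class touch. The function name and the sample mask are assumptions.

```python
from collections import deque


def instances_from_semantic(mask, target):
    """Give each 4-connected region of `target` pixels its own instance id.

    Returns (ids, count), where ids[r][c] is 0 for non-target pixels and a
    positive instance id otherwise.
    """
    rows, cols = len(mask), len(mask[0])
    ids = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != target or ids[r][c]:
                continue
            # Found an unlabelled target pixel: flood-fill its region.
            count += 1
            ids[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == target and not ids[ny][nx]:
                        ids[ny][nx] = count
                        queue.append((ny, nx))
    return ids, count


# Semantic mask: both objects of class 1 carry the same label.
semantic = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
instance_ids, count = instances_from_semantic(semantic, target=1)
print(count)         # 2
print(instance_ids)  # [[1, 1, 0, 0, 2], [1, 1, 0, 0, 2], [0, 0, 0, 0, 0]]
```

In `semantic`, the two objects are indistinguishable. In `instance_ids`, each one has its own id, which is the extra information instance segmentation has to produce.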

A comprehensive review of techniques used to estimate depth using Machine Learning and classical methods.

Written by Bharath Raj with feedback from Rotem Alaluf.

Photo by Osman Rana on Unsplash

Conventional displays are two-dimensional. A picture or a video of the three-dimensional world is encoded to be stored in two dimensions. Needless to say, we lose the information corresponding to the third dimension, which carries depth.

2D representation is good enough for most applications. However, there are applications that require information to be provided in three dimensions. An important application is robotics, where information in three dimensions is required to accurately move the actuators. …

A summary of the latest advances in Generative Adversarial Networks

Written by Bharath Raj with feedback from Rotem Alaluf

Art by Lønfeldt on Unsplash

Generative Adversarial Networks are a powerful class of neural networks with remarkable applications. They essentially consist of a system of two neural networks — the Generator and the Discriminator — dueling each other.

Exploring techniques used to fit neural networks in memory-constrained edge settings

Deploying memory-hungry deep learning algorithms is a challenge for anyone who wants to create a scalable service. Cloud services are expensive in the long run. Deploying models offline on edge devices is cheaper, and has other benefits as well. The main disadvantage is that edge devices have limited memory and compute power.

This blog explores a few techniques that can be used to fit neural networks in memory-constrained settings. Different techniques are used for the “training” and “inference” stages, and hence they are discussed separately.


Certain applications require online learning. That is, the model improves based on feedback or…

This article is a quick tutorial for implementing a surveillance system using Object Detection based on Deep Learning. It also compares the performance of different Object Detection models using GPU multiprocessing for inference, on Pedestrian Detection.

Surveillance is an integral part of security and patrol. For the most part, the job entails extended periods of looking out for something undesirable to happen. It is crucial that we do this, but it is also a very mundane task.

Wouldn’t life be much simpler if there were something that could do the “watching and waiting” for us? Well, you’re in luck. With…

An overview of performing Deep Learning on mobile and edge devices.

Photo by Alexandre Debiève on Unsplash

Scalable Deep Learning services are contingent on several constraints. Depending on your target application, you may require low latency, enhanced security or long-term cost effectiveness. Hosting your Deep Learning model on the cloud may not be the best solution in such cases.

Bharath Raj

Exploring Computer Vision and Machine Learning
