Computer Vision for Busy Developers

Detecting Objects

Vinny DaSilva
14 min read · Aug 3, 2020

This article is part of a series introducing developers to Computer Vision. Check out other articles in this series.

In my career, object detection and tracking have been among the hottest topics in Computer Vision. I wish I could dive right into what makes all of it possible, but I learned that object detection and tracking rely on a whole lot of other concepts, most of which we’ve already covered in previous articles. First, let’s define what these terms mean so that there’s no confusion. Object Detection is the ability to determine whether a predetermined object is contained within a given image and, if so, its location within the overall image (2D space). Often, but not always, Object Detection can also determine the position, orientation and scale of the object within the 3D space that is represented by the image. The term pose is used to describe the position, orientation and (sometimes) scale of an object in 3D space.
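To make these terms concrete, here is a minimal sketch in Python of what a detection result could look like. The names `BoundingBox`, `Pose` and `Detection` are hypothetical, chosen only for illustration and not taken from any particular library, and storing orientation as three Euler angles is just one assumed representation among several.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class BoundingBox:
    """Location of the object within the 2D image, in pixels."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    def center(self) -> Tuple[float, float]:
        """Return the center point of the box in image coordinates."""
        return (self.x + self.width / 2, self.y + self.height / 2)


@dataclass
class Pose:
    """Position, orientation and (sometimes) scale in 3D space."""
    position: Tuple[float, float, float]
    orientation_deg: Tuple[float, float, float]  # assumption: Euler angles in degrees
    scale: Optional[float] = None                # scale is not always known


@dataclass
class Detection:
    """Result of looking for one predetermined object in one image."""
    label: str
    box: BoundingBox             # where the object sits in the 2D image
    pose: Optional[Pose] = None  # often, but not always, available

    def has_pose(self) -> bool:
        return self.pose is not None
```

A detector that only reports 2D location would leave `pose` as `None`, while one that also estimates 3D placement would fill it in, for example `Detection("poster", BoundingBox(10, 20, 100, 50), Pose((0.0, 0.0, 1.5), (0.0, 90.0, 0.0)))`.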

Image Classification is a similar concept, where we are trying to determine whether a class of objects is present within an image. The big difference is that Image Classification can detect a much broader set of objects and does so in the 2D space of the image. For example, does this image contain a “dog” or a “cat” (regardless of breed)? At the time of this writing, determining 3D pose for image classification is still a hard, unsolved problem. We’ll go into more detail on Image Classification when we cover Machine Learning…
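As a rough illustration of the “regardless of breed” idea, the sketch below groups fine-grained labels into broader classes. Everything here is an assumption made for the example: the `BREED_TO_CLASS` mapping, the function names, the per-label scores and the 0.5 threshold are hypothetical and do not describe how any real classifier works internally.

```python
from typing import Dict

# Hypothetical mapping from fine-grained labels to a broader class.
BREED_TO_CLASS = {
    "beagle": "dog",
    "poodle": "dog",
    "siamese": "cat",
    "persian": "cat",
}


def class_scores(label_scores: Dict[str, float]) -> Dict[str, float]:
    """Sum per-label scores into per-class scores, ignoring unknown labels."""
    totals: Dict[str, float] = {}
    for label, score in label_scores.items():
        cls = BREED_TO_CLASS.get(label)
        if cls is not None:
            totals[cls] = totals.get(cls, 0.0) + score
    return totals


def contains_class(label_scores: Dict[str, float], cls: str,
                   threshold: float = 0.5) -> bool:
    """Answer 'does this image contain a <cls>?' for an assumed threshold."""
    return class_scores(label_scores).get(cls, 0.0) >= threshold
```

Given made-up scores such as `{"beagle": 0.4, "poodle": 0.3, "siamese": 0.1}`, `contains_class(scores, "dog")` is true even though no single breed is a confident match, which is the sense in which the question is about the class rather than the specific object.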

Vinny DaSilva

Developer Relations Engineer at Google. Passionate about AR & VR. Previously at Lenovo ThinkReality, Samsung NEXT, Vuforia