Deep Learning & Computer Vision: A Hybrid Future

Published in Egen Engineering & Beyond · 5 min read · Aug 12, 2019

In the past few years, Artificial Intelligence has gained significant attention, with institutions & companies racing to leverage its capabilities. The hype surrounding this field has made concepts such as Deep Learning & Computer Vision part of the mainstream conversation, but most people who use these buzzwords fail to appreciate the intricacies behind machine intelligence.

For instance, Person A might say, “Our company uses Deep Learning to classify something” or “our device uses Computer Vision techniques for object detection”, providing a very broad overview of these concepts. Speaking in such general terms fails to convey the value behind these AI applications.

So what exactly is Deep Learning?

Let’s dive right in!

Deep Learning

In Artificial Intelligence (AI), Deep Learning refers to a subset of Artificial Neural Network models. Neural networks consist of stacked layers that learn patterns in data and recognize characteristics so that a machine can classify objects, progressively refining the data as it passes through each layer.

Deep Learning models are commonly trained in one of two ways: Supervised Learning and Unsupervised Learning. For simplicity's sake, we focus only on Supervised Learning in this article. Within Neural Network theory, we can build a simple classifier using fewer than five layers, and in cases like these we don't consider the method deep learning. The number of layers (and the "neurons" within them) a system needs before it is classified as "deep" is very subjective, which is part of the reason so many people are unsure what differentiates Deep Learning.
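To make the idea of a "simple classifier with only a few layers" concrete, here is a minimal sketch of a shallow network with one hidden layer. The weights are hand-picked for illustration (an assumption, not learned from data, and not taken from any particular library), and the helper names `dense` and `predict` are hypothetical. The network classifies whether exactly one of two binary inputs is on.

```python
def relu(x):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return max(0.0, x)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: a weighted sum per neuron, then an activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Hand-picked (not learned) weights, for illustration only.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]  # two hidden neurons, two inputs each
HIDDEN_B = [0.0, -1.0]
OUT_W = [[1.0, -2.0]]                # one output neuron
OUT_B = [0.0]

def predict(x1, x2):
    """Return 1 if exactly one input is on, otherwise 0."""
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B, relu)
    (score,) = dense(hidden, OUT_W, OUT_B, lambda v: v)
    return 1 if score >= 0.5 else 0

predictions = [predict(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

In a supervised setting, the weights would be adjusted automatically from labeled examples rather than set by hand, and a "deep" model would stack many more such layers.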

In simple terms, Deep Learning allows you to analyze large amounts of complex data while continuously refining how that data is processed. Its use cases are numerous, and its potential continues to grow.

Now, let’s take a quick look at what Computer Vision is and why we need to know it!

Image Source: Carnegie Mellon University

Computer Vision In A Nutshell

Computer Vision is a subfield of Computer Science that focuses on how computers process visual data. Computer Vision methods are applied to images & videos to process, analyze, and extract real-world information; in practice, much of this work is referred to as Image Processing.

Let's look at an example: even simple actions like converting the original image to grayscale or flipping it are part of Image Processing.
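As a rough sketch of those two operations, the snippet below converts a tiny image to grayscale and mirrors it. It uses plain Python lists so it runs without extra packages; the 2x2 sample image and the function names are assumptions for illustration, and real projects would typically use a library such as OpenCV or Pillow instead.

```python
# A tiny 2x2 RGB image as nested lists of (R, G, B) tuples, values 0-255.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def to_grayscale(rgb_image):
    """Convert each pixel to one luminance value using the ITU-R BT.601 weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

def flip_horizontal(img):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in img]

gray = to_grayscale(image)        # one brightness value per pixel
flipped = flip_horizontal(image)  # same pixels, mirrored columns
```

The weighted sum reflects that the human eye is more sensitive to green than to red or blue, which is why green contributes most to the grayscale value.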

Now that we understand Computer Vision & Deep Learning as distinct concepts, let's take a look at how these technologies can be leveraged together.

Face Recognition

Deep Learning & Computer Vision may sound like highly complex engineering concepts, but they contribute to technology that touches our daily lives. For example, during Facebook's early years, tagging people in photos meant manually marking each person in the image. Nowadays, when you tag someone in a Facebook photo, Facebook's software automatically detects the people in it and generates a tag for each one, telling us that this is a person.

This feature was made possible by combining Deep Learning and Computer Vision methods. Using facial characteristics and shapes it has already learned, the software continuously updates its interpretation of what it sees, and when it decides it has found a human face (i.e. a correct classification), it carries out a given task. Reaching high accuracy requires a lot of training. It takes plenty of trial-and-error learning, but over time the algorithm should increase its True Positive Rate and decrease its False Positive Rate.
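For reference, the two rates mentioned above can be computed from labeled predictions as sketched below. The labels are made-up sample data (1 means "face", 0 means "not a face"), and the function name `tpr_fpr` is a hypothetical helper, not part of any specific library.

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # faces found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # faces missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Made-up ground truth and model output for eight image regions.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tpr, fpr = tpr_fpr(y_true, y_pred)
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")  # prints "TPR=0.75, FPR=0.25"
```

A better-trained detector would push the first number toward 1 and the second toward 0 on held-out data.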

Lane Detection

The rise of Autonomous Vehicles would not have been possible without Deep Learning & Image Processing methods. Companies such as Tesla are aiming for minimal human intervention on the road, which requires responsive, real-time machine analysis.

To achieve this goal, objects such as pedestrians, other cars, and roadside obstacles must be detected in real time, as reliably as a human driver would detect them. Lane detection is another key feature, enabling the machine to decide whether it should go straight or turn based on the lane curvature. A lot of progress has been made in improving autonomous vehicle capabilities, but much work remains before this technology becomes generally available.

In the following image, the machine attempted to identify the lane and create an area of interest indicating the right path to follow. Although not 100% accurate, a combination of Deep Learning and Image Processing techniques such as Threshold, Gamma Correction, and Morphological Operators helped the machine get pretty close!
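The three Image Processing techniques named above can be sketched on a toy grayscale frame as follows. This is an illustrative pipeline under stated assumptions, not the one used to produce the image: the 4x5 frame, the gamma value of 2.0, the cutoff of 128, and the function names are all made up, and production code would normally rely on a library such as OpenCV.

```python
def gamma_correct(img, gamma):
    """Apply out = 255 * (in / 255) ** (1 / gamma); gamma > 1 brightens dark pixels."""
    return [[round(255 * (p / 255) ** (1 / gamma)) for p in row] for row in img]

def threshold(img, cutoff):
    """Binarize: 1 where the pixel is at least `cutoff`, otherwise 0."""
    return [[1 if p >= cutoff else 0 for p in row] for row in img]

def dilate(mask):
    """3x3 binary dilation: a pixel becomes 1 if it or any neighbor is 1."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx]:
                        out[y][x] = 1
    return out

# Toy frame: a dim, dashed lane marking (value 90) on dark asphalt (value 20).
frame = [
    [20, 20, 90, 20, 20],
    [20, 20, 20, 20, 20],
    [20, 20, 90, 20, 20],
    [20, 20, 20, 20, 20],
]

bright = gamma_correct(frame, 2.0)  # lift the dim marking above the cutoff
mask = threshold(bright, 128)       # keep only the marking pixels
lane = dilate(mask)                 # morphological step joins the dashes
```

In this sketch, gamma correction makes the faint marking separable from the road, thresholding isolates it, and dilation (one of the basic morphological operators) merges the dashes into a continuous band that could serve as a region of interest.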

Lung Nodule Detection

Deep Learning & Computer Vision have applications across many different industries. For example, applying these technologies in healthcare could help save lives. Doctors receive electronic medical records (EMRs), which include vital information such as a patient's X-ray & MRI history. Doctors can generally diagnose whether a tumor is benign or malignant through manual review, but combining Deep Learning and Computer Vision capabilities allows us to automate this process and reduce the possibility of human error. To reach its full potential and produce efficient, accurate analyses, comprehensive training of the model is essential.

For example, this classification identifies the Region of Interest (ROI) and tells us that it is a lung nodule. Further analysis should then generate a benign or malignant classification. If the regions are misclassified, the classifier must be trained again until it reaches an acceptable level of confidence.

As you can see, the number of use cases for Deep Learning & Computer Vision is growing rapidly. Industry leaders are distinguished by a data-first approach that leverages emerging technology in new, innovative ways, no matter the industry. The potential applications of these hybrid approaches are far-reaching & point to a promising future.
