An Executive Primer to Deep Learning

An Executive Primer to Deep Learning

Circa 1997, the reigning world chess champion Garry Kasparov was against an unknown opponent. The opponent was formidable. Garry was not playing a human. He was playing the game with IBM’s behemoth supercomputer, Deep Blue.

Garry had beaten the opponent in the last few games. However, the game played on 11th May 1997 game was different. Garry lost the game. Deep Blue made history:

This game was significant for many reasons. It caught the world imagination. It laid the foundation for many possibilities that will shape the world of AI. Like explorers, data scientists and software engineers embarked on the relatively unchartered territory of Deep Learning.

Fast forward 2018, Deep Learning is a buzz word. According to Gartner, Deep Learning has already crossed the innovation trigger stage. It has reached the stage of the peak of inflated expectation.

It will be another few years before this technology goes mainstream. However, the applications of deep learning have already permeated in our lives.

This article is a primer for deep learning. It attempts to provide a simple explanation of the fundamental concepts. It discusses the reason for its rise and touches upon few applications of Deep Learning.

Let us first classify deep learning in the world of Artificial Intelligence.

As depicted in the figure above, deep learning is a sub-set of Machine Learning. Machine Learning is in itself a subset of Artificial Intelligence (AI).

That intelligence can manifest in many ways. Let us understand how deep learning systems manifests itself.

Imagine a system that enables to identify customer churn for a telco organization. One way to design this system is to craft rules about how to determine who will churn. A series of business rules are hand-crafted for a specific purpose, i.e., identify customers who will churn. Creating lot of rules is an arduous task. There are a lot of factors and their permutations. Rules are also prone to frequent changes. As the customer profile changes or the business model changes, these rules need to be altered.

Another way to identify customer churn would create statistical learning models. They learn it from past churn information. These models take some inputs a.k.a features. These features impact customer churn. They predict if customer churns or not.

These models are Machine Learning models. They learn from the past data and input features. They adapt to the characteristics of input data changes.

Note that these machine learning models rely on humans to provide the input features. For the model to be effective, the input feature needs to be useful. They rely on the intuition and domain knowledge of the modeler. The modeler will have to feed the machine learning model with the correct features. It asks for the right representation of the data.

A traditional machine learning model works fine as long as the representation of the data is congruous to the expected output. However, when the number of potential features grows, identifying right input features becomes a challenge. Machine Learning practitioner also call this challenge the curse of dimensionality. In a traditional machine learning model development, a lot of time is spent on feature engineering.

In the example of customer churn, a lot more features impact customer churn.

Posted on