Analytics Automation

Enterprise applications of Artificial Intelligence, Cognitive Computing & Deep Learning

Mo Patel
Making sense
Jul 28, 2016 · 8 min read


The hype machine of Artificial Intelligence (AI) and Cognitive Computing is at full throttle, fueled by vendors, investors, industry analysts, journalists and researchers. As with most over-hyped movements, the truth lies somewhere between the peak and the trough. No, we have not achieved Artificial Intelligence of the sentient kind, technology that learns about the physical world on its own and makes sense of it all. The truth is that significant progress is occurring in academia and industry on the journey toward AI. Some examples of that progress: AlphaGo beating Lee Sedol, machines outperforming humans at labeling objects in images, and machines learning to play Atari games. Enterprises shouldn’t just find a warm place at the basecamp, they should be part of the AI expedition. The remainder of this article is an introduction for enterprise decision makers to the topic of Analytics Automation, a stepping stone toward AI.

Enterprises shouldn’t just find a warm place at the basecamp, they should be part of the AI expedition.

Analytics Software Revenue

Many articles and books have been written, and many talks given, on the topic of analytics-led enterprise transformation. The first stage is typically basic analysis built on techniques such as counting, filtering and aggregating; one would commonly refer to this as business intelligence. The second stage begins with the introduction of complex relational algebra and statistical analysis; it culminates in data science: data mining and traditional machine learning. I want to pause here and note that I do not intend to trivialize the first or the second stage. They are critical milestones in the analytics journey of an enterprise and have delivered tremendous business value to corporations worldwide. Most organizations have some capability in the first stage, and many are well on their way in the second. In this post I want to discuss the next step in the analytics evolution of the enterprise. The third stage is Analytics Automation.

Tesla Factory (Source: NY Times)

What is Analytics Automation?

Today companies face many challenges with analytics: long time to insight, overwhelming volumes of data, and a complex, crowded tools marketplace. Automation is needed in enterprise analytics to drive the next phase of value creation from data. Techniques and architectures used in AI research, mainly Deep Learning, can help organizations achieve Analytics Automation by speeding up the cumbersome process of feature engineering. Analytics Automation means implementing systems that find the signals in massive troves of data and provide the predictions and prescriptions that transform an enterprise.

Why Analytics Automation?

Several key themes put us on the doorstep of Analytics Automation: continued exponential growth in high-dimensional data, challenges in the current data science process, and the inefficiency of existing data mining and machine learning algorithms.

Every 60 seconds, 150 hours of video are uploaded to YouTube! According to the networking giant Cisco, 85% of Internet traffic is pixel based (images and videos); add to that data generated from 3D environments, measured in voxels. The raw material arriving for data scientists is increasingly high dimensional. The combination of growing interconnectedness around the world and explosive growth in mobile devices and sensors, capturing not only pixel data but also data from biological, chemical and physical objects, has brought us to the era of high-dimensional data: the much-hyped Internet of Things (IoT). According to research firm Gartner, these networked sensors are estimated to double the human population and reach 13.2 billion by the year 2020. A simple way to think about high dimensionality is a single observation having a large number of attributes. Combine the number of devices emitting observations of the world around them with the high-dimensional nature of those observations, and it quickly becomes a major challenge for data scientists.
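As a back-of-the-envelope illustration (my own numbers, not from the sources above): if every pixel channel counts as an attribute, even a single video frame is an extremely high-dimensional observation.

```python
# Dimensionality of one video frame, treating each pixel channel
# as a separate attribute of the observation.
width, height, channels = 1920, 1080, 3  # one Full HD RGB frame
attributes_per_frame = width * height * channels
print(attributes_per_frame)  # 6220800 attributes in a single frame

# At 30 frames per second, one minute of raw video yields:
frames_per_minute = 30 * 60
attributes_per_minute = attributes_per_frame * frames_per_minute
print(attributes_per_minute)  # 11197440000 raw values per minute
```

Classical feature engineering over billions of raw values per minute is exactly the bottleneck the rest of this article is about.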

Source: Unsplash

Speaking of data scientists, let’s take a step back and review the current data science process. In loose terms, it typically starts with transforming raw data for feature engineering, followed by testing the newly derived feature(s) with various algorithms to improve on baseline model performance, and finally collaborating with data engineering to update the production model with the new feature(s). The problem with this process when working with high-dimensional data is that feature engineering becomes difficult and model complexity increases. Using current machine learning and data mining techniques, data scientists will lose sleep worrying about their models drifting astray due to variability in new data. It becomes difficult to separate the signal that has predictive power from the noise. Even when the exercise is not futile and good models can be generated, feature engineering is surely time consuming for the highly sought-after data scientist.
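To make the feature-engineering step concrete, here is a minimal hand-rolled sketch; the data and column names are hypothetical, purely for illustration. A data scientist manually derives a ratio feature from raw columns because the raw values alone do not separate the classes well.

```python
# Hypothetical raw observations: (total_spend, num_purchases, churned)
raw = [
    (500.0, 50, 0),
    (480.0,  4, 1),
    (300.0, 30, 0),
    (290.0,  2, 1),
]

# Hand-engineered feature: average spend per purchase.
# Deriving ratios, lags and aggregates like this, one hypothesis
# at a time, is the slow manual step that feature learning
# aims to automate.
def avg_spend(row):
    total, count, _ = row
    return total / count

for row in raw:
    print(avg_spend(row), "churned" if row[2] else "retained")
```

In this toy data the derived ratio separates the classes cleanly (low average spend per purchase goes with retention), while total spend alone overlaps across classes; finding such a transformation by hand is the expensive part.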

Given this challenging analytics environment in the enterprise, automation is needed to offload some of the burden back to the machines. To do so, we need to understand how humans are able to analyze high-dimensional data at high speed, learn new concepts, detect patterns and make decisions. To mimic this human-like behavior in machines, computer scientists set out on the quest for Artificial Intelligence many decades ago.

How to implement Analytics Automation?

One of the first ANN — Mark 1 Perceptron (Source: Wikipedia)

AI researchers looked to build a system that resembles, or is perhaps even a replica of, the human neural network, giving birth to the concept of Artificial Neural Networks. Conceptually there were many breakthroughs in the first few decades, but no viable practical applications resulted from the research, driving the movement into hibernation: the period popularly known as the AI Winter. Winter may last forever in a certain fictional series of books and its TV adaptation, but it is safe to say the AI Winter is over! Three key developments have ushered in this spring of Artificial Intelligence: the curse of high dimensionality turning into a blessing thanks to the abundance of data available to feed these algorithms, computing economies of scale, and public-private collaboration on practical applications (loosely speaking, the open source movement). All three can be credited as dividends of the Internet era, which has led to the rise of large technology companies, increased sharing of knowledge among researchers and lower computing costs.

Deep Learning Network for Image Labeling (Source: Stanford Course CS231N)

To drive Analytics Automation in this AI spring, the area known as Deep Learning has the nearest-term application. Deep Learning is the general umbrella under which most current AI research is conducted. To understand Deep Learning, let’s take a quick detour through a simple neural network. In a simple neural network there is a stimulus, processing of the stimulus, and a resulting output reaction. Simple examples of sensory stimulus are a tickle generating a laughter response, or touching a hot cup of coffee generating a pain response. The process of converting a stimulus to an output can be called a layer in a neural network; it may be as simple as one layer of stimulus, processing and response, or it could be many layers filtering a complex stimulus down to a simplified, understandable response. Deep Learning, simply speaking, is applying multiple layers of processing to input data (the stimulus) to understand the meaning of that data. It is a way to model the physical world and extract meaning from complex, high-dimensional physical-world inputs. Returning from the detour to the why of Analytics Automation: applying deep learning techniques to high-dimensional data to reduce or eliminate feature engineering can be a significant upgrade to the current data science process. Deep Learning enables a move toward feature learning for model development.
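The stimulus-to-response picture above can be sketched as a stack of layers, each applying a weighted transformation followed by a nonlinearity. This is a toy forward pass in plain Python; the weights are arbitrary illustrative values (no training happens here), and real networks have far more layers and neurons.

```python
import math

def layer(inputs, weights, biases):
    """One layer: a weighted sum of inputs per neuron, then a
    sigmoid nonlinearity (the 'processing' of the stimulus)."""
    out = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        out.append(1.0 / (1.0 + math.exp(-z)))  # squash to (0, 1)
    return out

# A tiny "deep" network: stimulus -> hidden layer -> response.
stimulus = [0.9, 0.1]  # toy-sized stand-in for high-dimensional input
hidden = layer(stimulus, [[1.5, -2.0], [-1.0, 3.0]], [0.0, 0.5])
response = layer(hidden, [[2.0, -1.5]], [-0.2])
print(response)  # a single output "reaction" between 0 and 1
```

Stacking such layers is what lets a deep network learn its own intermediate representations of the input, which is precisely the feature learning that replaces manual feature engineering.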

In order to accomplish Analytics Automation via Deep Learning, there are three key requirements for an organization:

  • Availability of massive amounts of data
  • Computing hardware for acceleration of complex tasks
  • Access to deep learning software and methods

TensorFlow.org

Let’s start with the last one. In what can be described as the complete opposite of any game theory or competitive economic theory, there is significant collaboration, and perhaps a friendly rivalry, among the leading Deep Learning researchers, many of whom are employed by corporations most would see as competitors. This collaborative environment has resulted in openly available software libraries, guides and tutorials that lower the barriers to entry into Deep Learning. Some notable libraries and packages are Google’s TensorFlow, Facebook-backed Torch, Microsoft’s CNTK, Amazon’s DSSTNE, UC Berkeley’s Caffe and Université de Montréal’s Theano. These libraries perform the complex tasks that make it easier to build Deep Learning models.

NVIDIA Tesla P-100 GPU

Deep Learning also requires new computing architectures; the one-size-fits-all computing model of CPUs (Central Processing Units) is not a good fit for the large-scale learning performed by the libraries and packages mentioned above. Most of the thirst of this memory-bandwidth-hungry computation is quenched by cutting-edge GPUs (Graphics Processing Units), but Application-Specific Integrated Circuit (ASIC) chips are on the rise; Google has built such a chip, called the Tensor Processing Unit (TPU), to meet its Deep Learning needs. Organizations looking to enter the age of Analytics Automation have several options: build machines with off-the-shelf GPUs, buy pre-built systems with GPUs, or, the easiest way to test out use cases, rent the infrastructure from one of the major cloud vendors.

That leaves enterprise AI adopters with quite possibly the hardest challenge on the road to Analytics Automation: availability of large datasets. Many businesses are evolving their business models around massive data; those that are not can look to simulation, designing agents that generate the data needed to build the learning models required for Analytics Automation.

In a series of articles, I will dive into further details on how companies can tackle each of the three Analytics Automation enablers mentioned above.

Just as the first two stages of enterprise analytics have accelerated growth and disrupted industries, the era of Analytics Automation should not be dismissed by companies. There is an enormous amount of hype in the fields of Artificial Intelligence and Cognitive Computing; however, a measured approach to Analytics Automation can significantly alter the future outlook of an organization.
