Artificial Intuition at Work: Part 5.6

Objective: To explain how AI makes predictions

Sandeep Jain
6 min read · Apr 12, 2018

Good news: you are about to leapfrog a lot of laypeople who hype AI without understanding it. Without technical jargon, this blog will help you conceptualize how neural networks make predictions.

Advice: After absorbing the conceptualizing allegory, spend a little while in the last section, connecting the dots.

Beasts in the Raw

By improving the way each floor collaborates, the Oompa Kicchu tribe was able to add more floors to the original tower. Today, a tall tower of more than 100 councils classifies the beast threat. With that many councils, the scouts reverted to taking snapshots (unstructured data) in the Forest of Previously Unseen Beasts. Snapshots lose less information than scout-interpreted features (structured data). Raw data enables sophisticated councils to identify nuanced features and improves prediction accuracy.

The final floor of the Tower has the task of producing the classification: threat or no threat. If there is a threat, a flare is blasted out from the top to warn the scouts.

Tower of Propagated Wisdom

How does the Tower of Wisdom ingest the data provided by scouts and make predictions?

How do the councils on each floor work?

The Propagation of Wisdom

In the tower, each floor has a team of council members. These folk gained their wisdom through many grueling years at the Institute of Deep Learning. Each member scours the input for the unique pattern they are specialized to detect. Their role, relative to all the other council members on their floor, is fixed for life, unique, and indispensable to the proper functioning of the tower.

Let’s step through the flow from snapshot to prediction, from ground floor to the top.

Ground Floor

The original photo from the scouts is provided to each and every council member on the ground floor. Each member applies their own individual perspective to the photo to form an opinion.

Each person’s perspective is necessarily different from the others’. Moreover, peers on the same floor do not consult each other to form opinions. Therefore, for any given photo, each team member’s individual opinion concerns a different aspect. Lots of independent thinking and thinking different. No redundancy. It’s a good way to divide and conquer all the nuances that could eventually imply a threat.

Their collective opinion of the photo is the sole deliverable to the council on the next floor.

Intrinsic, Individual Perspectives; Collective Opinions (source)

First Floor

In turn, the first floor does exactly the same thing, EXCEPT that the input is the collective opinion of the ground floor team, instead of the original snapshot.

So, going through it again:

Each council member on the first floor applies their unique perspective to the ground floor’s collective opinion to form their own opinion.

Horizontal Analogy: Opinions Propagate like Falling Dominoes

Sophisticated Pattern Recognition To the Top

This collective opinion, in turn, is presented to the next council, and so on, all the way to the top council. Perspectives are applied to the input to form opinions on every floor. How they are applied can differ for sets of consecutive floors.

In other words, the wisdom of each floor, manifested in opinions, is propagated to the top floor.

The lower floors might find coarse patterns, like sharp geometric corners, which the higher floors may combine into more sophisticated features, like sharp teeth.

The opinion of the final council is the prediction: threat or no threat. A threat prediction triggers the flare.

Threat Classification

That’s it. That’s how neural networks work, and you can connect the dots in just a little bit.

The Graduating Class of Councils

How does this propagation result in the right prediction?

Each council is tuned to give opinions that can be precisely interpreted by, and are meaningful to, the councils above. This tuning occurred at the Institute of Deep Learning.

Most of the councils are trained together. They belong to the same graduating class of the Institute of Deep Learning.

Tuned to Interpret

As wisdom (manifested in collective opinions) propagates, the final council’s perspectives are uniquely tuned to interpret the second-to-last council’s opinions and make the prediction. The second-to-last council is one sophisticated bunch of Oompa Kicchu: they receive the wisdom propagated through all the floors below.

Since the prediction is binary, threat or no threat, the last council has 2 members. If the tower had instead been trained to classify the beast’s species, and there were 10 species, then the last council would have 10 members.

The number of council members on each floor is also chosen carefully during training.

So, how does each council member’s perspective result in a correct collective opinion of the council?

Perspectives develop through training. That’s the topic for the next blog: The Institute of Deep Learning.

Allegory Unveiled

The Tower has many floors of councils, a council has numerous council members as well as a collective opinion, and each council member has a perspective used to contribute an opinion.

Connecting the dots

Archetype diagram for neural networks (source)
  • The Tower of Propagated Wisdom = A Neural Network, post training.
  • Flare = The output of the neural network is a prediction of how strongly the input belongs to each class. In this case, there are just 2 classes: threat or no threat. Typically, this is expressed as a probability, and if the probability of threat is greater than some threshold, like 85%, a flare is sent out. In face recognition, there could be 1,000 different people, and as many classes and probabilities.
  • Council member = A neuron. There are multiple neurons in each layer. A neuron is defined by a fixed number of weights. These weights are learnt.
  • Perspective = A set of numeric weights representing the pattern a single neuron is specialized to detect. The neuron’s job is to see whether previously unseen data (like a snapshot of a beast) contains this pattern. It does this by applying its perspective to the previous layer’s collective opinion. Recall that real-world neural networks have millions of weights. (Part 4)
  • Opinion = A number per neuron that represents how excited the neuron is about finding its unique pattern in the input. That number is called the neuron’s activation. Given the input from the previous layer, each neuron applies its weights to the whole input and then applies a non-linear activation function to produce a number (see the sketch just after this list). How it applies its weights to the input can range from simple multiplication to complex convolutions. Indeed, convolutional neural networks are core to breathtaking prediction accuracy in recognizing a cat in an image.
  • Council = A layer of the neural network. A set of neurons makes a layer, and these layers are stacked to form a neural network. A neural network with more than 2 layers is called a deep neural network; it implies greater sophistication in pattern recognition and is said to do “deep learning”.
  • Collective Opinion = The set of activations from all the neurons in a layer. It is the output of that layer and the input to the next layer.
  • Top Floor = Output layer. The input (snapshot) and output (prediction) layers are called ‘visible’ layers, while the rest are called ‘hidden’. What the hidden layers do is not easily explainable in words.
  • Propagated Wisdom = A neuron’s activation moving from one layer to the next is officially called “forward propagation”. “Backward propagation” is used for learning.
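
To make the neuron-level mapping concrete, here is a minimal sketch in Python with NumPy. The names and sizes (relu, council_opinion, a 4-number snapshot, a 3-member council) are illustrative assumptions, not anything specified in the allegory: each row of the weight matrix plays the role of one member’s perspective, and the resulting vector of activations is the floor’s collective opinion.

```python
import numpy as np

def relu(x):
    # A common non-linear activation function: negative values become zero.
    return np.maximum(0, x)

def council_opinion(perspectives, biases, previous_opinion):
    """One floor (layer): each member (neuron) weighs the previous floor's
    collective opinion with its own row of weights (its perspective), and a
    non-linear function turns the result into that member's activation."""
    return relu(perspectives @ previous_opinion + biases)

# Illustrative sizes: a "snapshot" of 4 numbers feeding a council of 3 members.
rng = np.random.default_rng(0)
snapshot = rng.normal(size=4)           # raw input from the scouts
perspectives = rng.normal(size=(3, 4))  # one row of weights per council member
biases = np.zeros(3)

collective_opinion = council_opinion(perspectives, biases, snapshot)
print(collective_opinion)               # 3 activations: the floor's collective opinion
```

With random, untrained weights these activations mean nothing; training, the topic of the next blog, is what tunes the perspectives so that the opinions become useful.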

Propagation connects neurons across the layers, hence the term — ‘neural networks’.

There are many types of neural networks. Across most of them, the propagation of activations using weights and a non-linear function is the cornerstone of their predictive power.
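
As a rough illustration of that cornerstone, the sketch below continues the snippet above with made-up layer sizes and random, untrained weights: it stacks several councils into a tower, applies a softmax on the top floor to turn activations into class probabilities, and fires the flare when the threat probability crosses a threshold. Only the shape of the computation is meant to be faithful; a trained network would use learned weights.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    # Turn the top floor's activations into probabilities that sum to 1.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def forward(snapshot, layers):
    """Forward propagation: each floor's collective opinion becomes the
    next floor's input, all the way to the top of the tower."""
    opinion = snapshot
    for weights, biases in layers[:-1]:
        opinion = relu(weights @ opinion + biases)      # hidden floors
    top_weights, top_biases = layers[-1]
    return softmax(top_weights @ opinion + top_biases)  # top floor: class probabilities

rng = np.random.default_rng(1)
sizes = [4, 5, 5, 2]   # snapshot size, two hidden floors, 2 classes at the top
layers = [(rng.normal(size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

probabilities = forward(rng.normal(size=4), layers)
threat_probability = probabilities[0]   # index 0 stands for "threat" in this sketch
print("Send up the flare!" if threat_probability > 0.85 else "No flare needed.")
```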

You’ve just learnt how neural networks make predictions. In addition to image-related predictions, these concepts apply to audio (speech recognition) and text (handling ambiguity in language) as well.

The next blog will give you the “a-ha” moment for how a machine could do a very human thing — to learn.
