# THE BEGINNER’S GUIDE TO DEEP LEARNING

A SCRAMBLED RUBIK’S CUBE

Deep learning, in its simplest form, is a branch of machine learning that learns representations from data, with an emphasis on learning successive layers of increasingly meaningful representations. The “deep” does not refer to how deep the process is or how deep an understanding it achieves. Instead, it refers to the number of successive layers through which the data is filtered, each producing a new representation of the input data.

It uses neural networks, which act as stacks of distillation layers: information passes through successive filters and comes out increasingly purified.

Let’s take a good look at the image below:

Above is a representation of a neural network and of how deep learning works. Here, the input data is an image of a handwritten digit (4). The task is to find a meaningful representation of the image and so classify it correctly as the digit it represents.

For each layer the input data passes through, there is a new representation of it. After the last layer’s operation, the output is compared with the expected result. On the first try, of course, we would expect the output to differ greatly from the expected result. We can treat this difference as the loss, which can be written as Y - Y’, where Y is the expected result and Y’ is the output of our network.
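To make the comparison concrete, here is a minimal sketch of measuring the difference between Y and Y’ for the handwritten 4. The one-hot target, the untrained output values, and the choice of averaging the squared difference into a single number are all assumptions made for illustration; the article itself only describes the loss as the difference between the two.

```python
import numpy as np

# Assumed target: the digit 4 written as a one-hot vector over the classes 0-9.
y_true = np.zeros(10)
y_true[4] = 1.0

# Hypothetical output of an untrained network: the same score for every class.
y_pred = np.full(10, 0.1)

# Element-wise difference Y - Y'.
difference = y_true - y_pred

# One common way (an assumption here) to reduce the difference to a single
# number: the mean of the squared differences.
loss = np.mean(difference ** 2)
print("loss:", loss)
```

The further the network’s output is from the target, the larger this number becomes, which is what makes it usable as a measure during training.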

Then, how do we reduce this loss?

First, we need to understand that every layer has its own basic properties: a weight, W (a tensor that starts out with random values), and a bias vector, b.

Mathematically, the output Y’ of each layer is the rectified linear unit (relu) applied to the dot product of the layer’s weight, W (a tensor), and the input data, x (a tensor), plus the layer’s bias vector, b.

Let our output be Y’,

Y’ = relu(dot(W, x) + b)

Note: relu(X) = max(X, 0), the rectified linear unit

Written out in full,

Y’ = maximum(dot(W, x) + b, 0)

where x is our input data, W is the layer’s weight property, and b is the layer’s bias property.
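The formula above can be sketched in a few lines of NumPy. The sizes used here (a flattened 28x28 image with 784 values, a layer with 16 outputs) and the scale of the random weights are assumptions chosen for illustration, and `dense_forward` is a hypothetical helper name, not a function from any library.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: keeps positive values, replaces negatives with 0.
    return np.maximum(x, 0)

def dense_forward(W, x, b):
    # Y' = relu(dot(W, x) + b)
    return relu(np.dot(W, x) + b)

rng = np.random.default_rng(0)

x = rng.random(784)                         # assumed input: a flattened 28x28 image
W = rng.normal(scale=0.1, size=(16, 784))   # random weights, one row per output
b = np.zeros(16)                            # bias vector, one entry per output

y_pred = dense_forward(W, x, b)
print(y_pred.shape)
```

Note that W has one column for every element of x, so that the dot product is defined, and one row for every value the layer outputs.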

It is also very important to note that the input data is a tensor representation of the original data, be it an image, a video, or otherwise. Tensors are simply multidimensional arrays of data, almost always numerical.
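As a rough illustration, NumPy arrays can stand in for tensors with different numbers of dimensions. The shapes below (a 28x28 grayscale image, a batch of 32 images) are assumed examples, not values taken from the article.

```python
import numpy as np

scalar = np.array(4)                # 0 dimensions: a single number
vector = np.array([1.0, 2.0, 3.0])  # 1 dimension
image = np.zeros((28, 28))          # 2 dimensions: e.g. a grayscale image
batch = np.zeros((32, 28, 28))      # 3 dimensions: e.g. a batch of 32 images

for name, tensor in [("scalar", scalar), ("vector", vector),
                     ("image", image), ("batch", batch)]:
    # ndim is the number of axes, shape is the size along each axis.
    print(name, tensor.ndim, tensor.shape)
```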

Back to our previous question: how do we reduce the loss?

This is done by adjusting the layer’s weight, using the loss as a measure of how far off the output is. The weight starts out as a tensor of random elements whose shape is compatible with our input data, so there is plenty of room to improve it.

This is the training loop: it is repeated a sufficient number of times until the output, Y’, gets as close as possible to the target result, Y. We will cover this in more detail in the next chapter of this series.
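A minimal sketch of such a loop is shown below, under several assumptions the article has not introduced: the weights are adjusted by gradient descent (one common choice), the loss is the mean squared difference, the layer is a single linear layer with the relu left out for simplicity, and the data is a made-up toy set generated from a known rule (`true_W`, `true_b`).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 100 inputs with 3 features each,
# with targets produced by a known linear rule.
X = rng.normal(size=(100, 3))
true_W = np.array([1.5, -2.0, 0.5])
true_b = 0.3
Y = X @ true_W + true_b

# The layer starts with random weights and a zero bias.
W = rng.normal(size=3)
b = 0.0
learning_rate = 0.1  # assumed step size

def mse(W, b):
    # Mean squared difference between the targets Y and the outputs Y'.
    return np.mean((Y - (X @ W + b)) ** 2)

initial_loss = mse(W, b)

for step in range(200):
    Y_pred = X @ W + b                     # forward pass: compute Y'
    error = Y_pred - Y                     # how far Y' is from Y
    grad_W = 2 * X.T @ error / len(X)      # gradient of the loss w.r.t. W
    grad_b = 2 * np.mean(error)            # gradient of the loss w.r.t. b
    W -= learning_rate * grad_W            # nudge W to reduce the loss
    b -= learning_rate * grad_b            # nudge b to reduce the loss

final_loss = mse(W, b)
print("loss before:", initial_loss, "loss after:", final_loss)
```

Each pass through the loop measures the loss and moves the weights a small step in the direction that lowers it, so the output drifts toward the target as the steps accumulate.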

This is what deep learning is all about.

Another clear way to explain deep learning to a layman is my example of a scrambled Rubik’s Cube.

Here, we can take the scrambled Rubik’s Cube to be our input data and its different colors to be the classes in a classification task. The job of the neural network is to find a geometric transformation of the cube that will unscramble it.

In deep learning, this is achieved through a series of simple steps in 3D space, much like the moves you would make with your fingers. Hence, it’s all about finding simple representations for highly complex data.
