# Introduction to **Deep Learning**

Deep learning is a subset of machine learning inspired by the structure and function of the human brain; its core building block is the artificial neural network (ANN).

It is not a new concept: it dates back to 1943, when Warren McCulloch and Walter Pitts developed a computational model based on the workings of the neural networks of the human brain. They used a combination of mathematics and algorithms, which they called threshold logic, to mimic the thought process. Since then, deep learning has evolved slowly and steadily. Along the way, a ground-breaking technique called backpropagation was developed, which relies on the well-known chain rule of calculus.

Deep learning has not yet reached a full-bloomed stage, but it is now evolving very quickly, and we see new developments coming up every month.

Now that we have some idea about the history of deep learning, let us understand where it stands relative to the parent term Artificial Intelligence (AI). The relationship among AI, machine learning, and deep learning can be pictured as a diagram of concentric circles.

The outermost circle represents AI which is defined as follows:

**“The science and engineering of making computers behave in ways that until recently, we thought required human intelligence.”**

AI is a very large area of study of how a machine learns through experience and time. The second circle inside AI is machine learning, a major subset of artificial intelligence; inside that lies deep learning, a subset of machine learning built on artificial neural networks (ANNs).

Deep learning is the widely used and more approachable name for the artificial neural network; the "deep" in deep learning refers to the depth of the network, that is, the number of layers of neurons it contains. In the 1990s, researchers and scientists were unable to realize the full potential of deep neural networks for various reasons. Some of those reasons are stated below:

1. Lack of availability of big data.

2. Limited compute power and resources.

3. Far less published research compared to the plethora of work available today.

As deep learning is a huge area, with many research works and studies still in progress, we will focus on the most common neural network model in the field: the multi-layer perceptron (MLP).

The multi-layer perceptron (MLP) is used for solving complex problems in many industries, including stock analysis, image identification, spam detection, anomaly detection, and face recognition.

What is a **Perceptron**? According to DeepAI, a perceptron is an algorithm used for supervised learning of binary classifiers. It is basically a single-layer neural network that takes numerical inputs, weights them, and produces a binary output.

Many perceptrons come together to form a complex network of perceptrons, also known as a multi-layer perceptron. This is how we capture linear and even non-linear patterns and trends in our data.
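To make the idea concrete, here is a minimal sketch of a single perceptron in Python. The function name, weights, and bias values are illustrative choices, not from any particular library: it computes a weighted sum of the inputs plus a bias and applies a step function to produce a binary output.

```python
# Minimal perceptron sketch (illustrative names and values).
def perceptron(inputs, weights, bias):
    # Weighted sum of inputs, shifted by a bias term.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Step activation: fire (1) if the sum is positive, else 0.
    return 1 if weighted_sum > 0 else 0

# Example: hand-picked weights that make the perceptron act as logical AND.
and_weights = [1.0, 1.0]
and_bias = -1.5
print(perceptron([1, 1], and_weights, and_bias))  # 1
print(perceptron([1, 0], and_weights, and_bias))  # 0
```

Note that a single perceptron like this can only separate data with a straight line; capturing non-linear patterns is exactly why perceptrons are stacked into multi-layer networks.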

**Working of MLP**

When many neurons are interconnected to create a network, information is transferred among them. This is quite similar to how the human brain works, which consists of billions of neurons and is often said to be the greatest creation of all time. In an MLP there are three types of layers:

**Input Layer**

This is the initial layer of the network; it receives the input data and passes it forward toward the output.

**Hidden Layers**

Hidden layers are the layers where most of the action takes place. Each connection between neurons is assigned a weight, and each neuron applies an *activation function*, which is simply a mathematical function used to compute that neuron's output.
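As a sketch of what an activation function looks like, here are two common choices, the sigmoid and the ReLU (the specific functions are standard examples, not ones named by this article):

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    # Passes positive values through unchanged, zeroes out negatives.
    return max(0.0, z)

print(sigmoid(0.0))  # 0.5
print(relu(-2.0))    # 0.0
print(relu(3.0))     # 3.0
```

Without a non-linear activation like these, stacking layers would still only compute a linear function of the input.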

**Output Layer**

In the output layer, the neural network makes the final decision and gives the result.

The MLP is a feedforward neural network, which means the data we input moves in the forward direction only: from the input layer, through the hidden layers, to the output layer.
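The feedforward flow can be sketched for a tiny network with 2 inputs, 2 hidden neurons, and 1 output neuron. All the weights and the choice of sigmoid activation here are made-up illustrative values, not taken from the article:

```python
import math

def sigmoid(z):
    # Standard sigmoid activation, output in (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

inputs = [0.5, 0.8]
hidden_weights = [[0.1, 0.4],    # weights into hidden neuron 0
                  [0.3, -0.2]]   # weights into hidden neuron 1
output_weights = [0.6, 0.9]      # weights into the single output neuron

# Data flows strictly forward: input layer -> hidden layer -> output layer.
hidden = [sigmoid(sum(x * w for x, w in zip(inputs, ws)))
          for ws in hidden_weights]
output = sigmoid(sum(h * w for h, w in zip(hidden, output_weights)))
print(0.0 < output < 1.0)  # True: the sigmoid keeps the output in (0, 1)
```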

As discussed above, the connections between two layers are assigned **weights**. A weight carries very useful information about how strongly one neuron influences another, and it is crucial to the learning process of the MLP.

The dark lines connecting the neurons of the network depict that the network is densely connected, meaning that every neuron in one layer is connected to every neuron in the next layer.

For example, the calculation for node **h** could be:

**h = i2·w4 + i3·w5**

This simple linear equation shows how a node's value is computed from its inputs and their weights; in practice, a bias term and an activation function are also applied.
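The node computation above translates directly into code. The variable names follow the equation (i2, i3, w4, w5); the numeric values are made up for illustration:

```python
# Computing node h from its two inputs and their connection weights.
i2, i3 = 0.5, 0.8       # input values
w4, w5 = 0.4, -0.2      # connection weights

h = i2 * w4 + i3 * w5   # weighted sum feeding node h
print(round(h, 4))      # 0.04
```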

**Backpropagation**

This is another ground-breaking technique, used to optimize the weights of an MLP by propagating the error at the output backwards through the network.

In a freshly initialized multi-layer perceptron, random weights are assigned to the connections, so the output we arrive at tends to differ from the expected or actual output. The difference between the two values is called the **error**. The error is what we need to reduce in order to reach an output value that is closest to the expected one.

Here backpropagation comes into play. It is the process of going backwards through the network, from the output to the input, and readjusting the weights. Eventually we achieve a result that is closest to the desired one; at that point we say that the weights in the network are optimal and the error is minimized.

To do this we make use of gradients: the derivative of each node's output with respect to the nodes and weights that feed into it, chained together layer by layer. The weights are updated repeatedly until the output is sufficiently close to the correct one.
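Here is a minimal gradient-descent sketch for a single weight, which captures the essence of the update rule (the network, data point, and learning rate are all invented for illustration). We fit y = w·x to one example using the squared-error loss L = (w·x − y)², whose gradient is dL/dw = 2·(w·x − y)·x by the chain rule:

```python
# One-weight gradient descent sketch (illustrative values).
x, y = 2.0, 6.0          # one training example: the target relation is y = 3*x
w = 0.0                  # initial weight (stands in for a random weight)
learning_rate = 0.1

for _ in range(50):
    prediction = w * x
    error = prediction - y           # difference from the expected output
    gradient = 2 * error * x         # dL/dw via the chain rule
    w -= learning_rate * gradient    # step against the gradient

print(round(w, 3))  # close to 3.0, the weight that makes the error vanish
```

A real MLP repeats exactly this kind of update for every weight in every layer, with the chain rule carrying the gradient backwards through the activations.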

You can dive deeper into the workings of backpropagation by reading a few blogs on Medium or by taking some of the courses that are readily available if you Google them. Thank you very much for sticking with me on this sweet journey of understanding what deep learning is and getting a very basic understanding of the workings of the multi-layer perceptron.

References