Understanding The Neural Network Model

In this blog we will look at what a neural network architecture looks like and how information is passed within each layer of a neural network model.

Picture: a basic neural network architecture with an input layer x followed by a hidden layer (blue) and an output layer (pink).

The above picture represents a basic neural network architecture with an input layer x followed by a hidden layer (blue) and an output layer (pink). Now let's take a closer look at what's happening inside the hidden layer.

Inside the Hidden Layer:

Let us say that we are trying to predict whether a customer will buy a specific product or not. The input values come from the input layer (x). For now we take an example of 4 values [197, 184, 136, 214]. Assume that these values are 4 different features that affect the customer's decision, for example: product price, product quality, brand value and the satisfaction level of the customer. This set of values makes a vector (you can think of a vector as a list of numbers), which is then passed to our first hidden layer as an input. Now consider the first neuron of the hidden layer: it takes two parameters, w₁ and b₁, and outputs some activation value a. For now we can say that the first neuron applies some function (the sigmoid function in this case) to the given inputs and outputs a value. The subscript 1 in w₁ and b₁ indicates that we are talking about the first neuron in the layer.

g(w·x + b) = g(z) = 1 / (1 + e^(-z)). This mathematical formula represents the sigmoid function. It is what each neuron in the layer uses to compute its activation value from the given input vector.
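To make this concrete, here is a minimal sketch in Python of what a single hidden-layer neuron computes: the weighted sum z = w·x + b followed by the sigmoid g(z). The weights w1 and bias b1 below are made-up illustrative values, not trained parameters.

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([197.0, 184.0, 136.0, 214.0])   # the 4 input features from the input layer
w1 = np.array([0.01, -0.02, 0.015, 0.005])   # hypothetical weights of the first neuron
b1 = -1.0                                    # hypothetical bias of the first neuron

z1 = np.dot(w1, x) + b1   # weighted sum of the inputs plus the bias
a1 = sigmoid(z1)          # activation value of the first neuron
print(a1)
```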

In this example, the hidden layer has three such neurons, each with its own parameters w and b. They compute their values using the sigmoid function and output 0.3, 0.7, and 0.2. This vector of three numbers becomes the vector of activation values a, which is then passed to the final output layer of this neural network.
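The whole hidden layer can be computed in one step by stacking the weights of the three neurons into a matrix. The weight matrix W1 and bias vector b1 in the sketch below are hypothetical; in a real network they would be learned during training, and the example's values 0.3, 0.7 and 0.2 would come from such trained parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([197.0, 184.0, 136.0, 214.0])    # 4 input features

# One row per hidden neuron (3 neurons x 4 inputs); these numbers are made up.
W1 = np.array([[ 0.010, -0.020,  0.015,  0.005],
               [-0.005,  0.010,  0.002,  0.001],
               [ 0.020, -0.010, -0.015, -0.002]])
b1 = np.array([-1.0, 0.5, -0.3])              # one bias per hidden neuron

a1 = sigmoid(W1 @ x + b1)   # vector of 3 activation values, one per neuron
print(a1)
```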

Now, when you build neural networks with multiple layers, it'll be useful to give the layers different numbers. By convention, the hidden layer (blue) is called layer 1 of the neural network and the output layer (pink) is called layer 2 of the neural network. The input layer is also sometimes called layer 0.

In order to introduce notation that helps us distinguish between the different layers, I'm going to use the superscript [1] to index into different layers. The parameters of each neuron now carry w^[1], indicating that these parameters belong to layer 1. Similarly, we can add superscripts in square brackets to denote that the following terms are the activation values of the hidden units of layer 1 of this neural network. Since we already have a vector of activation values as the output of layer 1 (the hidden layer), we will now move on to the computations of layer 2, the output layer.

Here the output generated from layer 1 acts as the input for layer 2. Layer 2 functions similarly to the previous layer and generates 1 output. Note that our previous layer had 3 neurons, hence the output we received was a vector of 3 activation values. Now, since we have only one neuron in the output layer, we will receive only 1 value as an output. Note that the parameters and the activation values have the superscript [2], denoting that these belong to the second layer of our neural network model.
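As a rough sketch, the output-layer computation looks like the single-neuron step from before, except that its input is the activation vector a^[1] from layer 1. The weights w2 and bias b2 here are again made-up values for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

a1 = np.array([0.3, 0.7, 0.2])     # activation vector a^[1] from the hidden layer

w2 = np.array([1.5, 2.0, -0.8])    # hypothetical weights of the single output neuron
b2 = -0.2                          # hypothetical bias

a2 = sigmoid(np.dot(w2, a1) + b2)  # a^[2]: a single number between 0 and 1
print(a2)
```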

Once the neural network has computed a^[2], there is one final, optional step that you can choose to implement or not: if you want a binary prediction, 1 or 0 (will the customer buy the product? Yes or no?), you can take the number 0.84 that we computed and threshold it at 0.5. If it is greater than 0.5, you predict y hat equals 1 (for example, we predict that the user is going to buy this product), and if it is less than 0.5, you predict y hat equals 0 (the user will not buy this product).
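In code, this optional thresholding step is just a comparison against 0.5:

```python
a2 = 0.84                       # the output of layer 2 from the example above

y_hat = 1 if a2 > 0.5 else 0    # threshold at 0.5 for a binary prediction
print(y_hat)                    # 1 -> we predict the customer buys the product
```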

So that's how a neural network works. Every layer takes in a vector of numbers, applies a bunch of logistic regression units to it, and computes another vector of numbers that then gets passed from layer to layer until you reach the final output layer's computation, which is the prediction of the neural network. Then you can either threshold at 0.5 or not to come up with the final prediction.
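Putting it all together, here is a short, hypothetical sketch of that layer-to-layer flow: each layer turns the previous layer's vector into a new vector of activations. The randomly initialized parameters are placeholders; a real network would learn them from data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, parameters):
    # Pass the input vector through each layer in turn.
    a = x
    for W, b in parameters:       # one (W, b) pair per layer
        a = sigmoid(W @ a + b)    # each layer outputs a new vector of activations
    return a

# Placeholder parameters: layer 1 has 3 neurons, layer 2 has 1 neuron.
rng = np.random.default_rng(0)
params = [
    (rng.normal(scale=0.01, size=(3, 4)), np.zeros(3)),
    (rng.normal(scale=0.01, size=(1, 3)), np.zeros(1)),
]

x = np.array([197.0, 184.0, 136.0, 214.0])
a2 = forward(x, params)
y_hat = int(a2[0] > 0.5)          # optional thresholding step
print(a2, y_hat)
```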

I hope this blog provided you with a solid understanding of how a neural network works. Let me know if it helped you, and I will be writing more helpful blogs in the future, InshAllah. Constructive feedback is much needed and appreciated.

Arigato!
