Deep Neural Networks (DNN) Part 1: Representation and Forward Propagation

Parth Dhameliya
6 min read · Apr 3, 2020


A Deep Neural Network is an algorithm used for both classification and regression. For example, in image classification of cats and dogs, the machine must predict whether a given image is a cat or a dog. Neural networks are widely used today for image classification, speech recognition, object detection, text classification, and so on.

There are many types of Neural Networks :

  1. Deep Neural Networks (DNN)
  2. Convolutional Neural Networks (CNN)
  3. Recurrent Neural Networks (RNN)
  4. Generative Adversarial Networks (GAN)

This article series will have seven parts:

  1. Deep Neural Networks (DNN) Part 1: Representation and Forward Propagation.
  2. Deep Neural Networks (DNN) Part 2: Activation Functions and Initialization of Weights and Biases.
  3. Deep Neural Networks (DNN) Part 3: Back Propagation.
  4. Deep Neural Networks (DNN) Part 4: Implementation from Scratch.
  5. Deep Neural Networks (DNN) Part 5: Dealing with Over-fitting and Under-fitting.
  6. Deep Neural Networks (DNN) Part 6: Optimization Algorithms and Hyperparameter Tuning.
  7. Deep Neural Networks (DNN) Part 7: DNN Using TensorFlow 2.0.

In DNN Part 1: Representation and Forward Propagation, we will cover the basic notation used throughout this series and walk through forward propagation.

Before starting DNN Part 1, I recommend reviewing logistic regression, as it is the basic one-layer, one-neuron neural network. To get a better idea of logistic regression, you can visit this link: https://medium.com/@pdhameliya3333/logistic-regression-implementation-from-scratch-3dab8cf134a8

Let's go through it step by step.

Representation:

  1. One-layer, one-neuron neural network architecture (logistic regression):

Suppose we want to predict whether a patient has diabetes based on the given data.

Here the output is ŷ ∈ {0, 1}: 1 if the person has diabetes and 0 if not.

For example, suppose we have patient diabetes data. For easier understanding, I have used only 3 features.

Notations:

Simple one-layer, one-neuron network (logistic regression)

X = Features or input variables [Glucose, BloodPressure, Insulin] -> [X1, X2, X3]

y = Target or output variable [Outcome] -> [y]

ŷ = Predictions

W = Weights [W1, W2, W3]

b = Bias

Now our prediction function is given as follows:

Z = X1(i)*W1 + X2(i)*W2 + X3(i)*W3 + b (linear equation)

a = σ(Z) (Sigmoid activation)

ŷ = a (prediction function), where a is the predicted probability; thresholding it (e.g., at 0.5) gives the class ŷ ∈ {0, 1}
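To make this concrete, here is a tiny Python sketch of this per-example computation (the feature values, weights, and bias below are made up purely for illustration):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# One patient: [Glucose, BloodPressure, Insulin] (made-up values)
x = [148.0, 72.0, 0.0]
w = [0.01, -0.02, 0.005]        # W1, W2, W3 (arbitrary for illustration)
b = 0.1

z = sum(xi * wi for xi, wi in zip(x, w)) + b   # Z = X1*W1 + X2*W2 + X3*W3 + b
a = sigmoid(z)                                  # a = sigma(Z)
y_hat = 1 if a >= 0.5 else 0                    # threshold the probability
print(a, y_hat)
```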

Now we will use vectorization to convert these equations into matrix form:

Z = W*X + b (linear equation)

a = σ(Z) (Sigmoid activation)

ŷ = a (prediction function), thresholded as above to obtain ŷ ∈ {0, 1}

Here the dimension of X will be (n_x, m), where n_x = the number of features (3 here) and m = the number of training examples. Correspondingly, W has dimension (1, n_x), so Z = W*X + b has dimension (1, m).
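As a rough sketch of the vectorized version (assuming NumPy; the shapes follow the dimensions above, with W as a (1, n_x) row vector so that W*X is a valid matrix product):

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

n_x, m = 3, 5                  # 3 features, 5 training examples (toy sizes)
X = np.random.randn(n_x, m)    # X has shape (n_x, m)
W = np.random.randn(1, n_x)    # weights as a row vector, shape (1, n_x)
b = 0.0

Z = W @ X + b                  # Z = W*X + b, shape (1, m)
A = sigmoid(Z)                 # one prediction per training example
print(A.shape)                 # (1, 5)
```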

In this article we focus only on the representation and forward propagation of a neural network. We will discuss what the weights (W) and bias (b) are, and their values, in detail in DNN Parts 2 and 3.

For those who have read my logistic regression article: there, the weights were (Θ1, Θ2, Θ3, …, Θn) and the bias was Θ0.

2. Deep Neural Network architecture:

Deep Neural Network

To make things easier, we will use the following notation to represent the neural network.

Layers: You can use any number of hidden layers in a neural network. For example, in the image above there are 3 hidden layers. The input layer is not counted as a hidden layer.

Nodes, neurons, or hidden units (n): the number of nodes in a hidden layer.

For example, in layer L = 1 there are 3 nodes or hidden units, so n[1] = 3. The input layer is the X matrix, and the output layer gives ŷ = a1[3] ∈ {0, 1}; for now, think only about binary classification. We will discuss multi-class classification in later parts.

For easier understanding, we will take a small neural network with 2 layers.

Here we have 2 layers: layer L = 1 (hidden) contains three nodes or hidden units, n[1] = 3, and layer L = 2 (the output layer) contains 1 node.

Now let's talk about weights and biases. Initially the weights and biases are set to random values, and with the help of optimization algorithms better values are found as the error decreases toward its minimum; we will discuss this in detail in the back-propagation part. For now, don't think about the values of the weights and biases, just about how we represent them.

Let's see how we can simplify the computation using vectorization to convert the equations into matrices, as we did for the one-layer, one-neuron architecture (logistic regression).

Activation of node a1 of layer 1

The image above shows how we compute node a1 of layer 1.

Similarly for a2 of layer 1.

Activation of node a2 in layer 1

Similarly for a3 of layer 1.

Activation of node a3 in layer 1

Similarly for a1 of the output layer, i.e., layer 2.

Activation of node a1 of layer 2
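Since the images above carried the actual equations, here they are written out in the same notation (a reconstruction, so the subscript convention is my assumption: Wjk[l] is the weight from node k of layer l−1 to node j of layer l):

z1[1] = W11[1]*X1 + W12[1]*X2 + W13[1]*X3 + b1[1] , a1[1] = σ(z1[1])

z2[1] = W21[1]*X1 + W22[1]*X2 + W23[1]*X3 + b2[1] , a2[1] = σ(z2[1])

z3[1] = W31[1]*X1 + W32[1]*X2 + W33[1]*X3 + b3[1] , a3[1] = σ(z3[1])

z1[2] = W11[2]*a1[1] + W12[2]*a2[1] + W13[2]*a3[1] + b1[2] , ŷ = a1[2] = σ(z1[2])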

Now we stack all these variables into matrix form.
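In matrix form, those four sets of equations collapse to Z[1] = W[1]*X + b[1], A[1] = σ(Z[1]), Z[2] = W[2]*A[1] + b[2], ŷ = A[2] = σ(Z[2]). A minimal NumPy sketch of this two-layer forward pass (random toy values, sizes matching the network above):

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

n_x, m = 3, 5                  # 3 input features, 5 examples (toy sizes)
X = np.random.randn(n_x, m)

W1 = np.random.randn(3, n_x)   # layer 1 has n[1] = 3 nodes -> shape (3, n_x)
b1 = np.random.randn(3, 1)
W2 = np.random.randn(1, 3)     # output layer has 1 node -> shape (1, 3)
b2 = np.random.randn(1, 1)

Z1 = W1 @ X + b1               # shape (3, m): rows are z1, z2, z3 of layer 1
A1 = sigmoid(Z1)
Z2 = W2 @ A1 + b2              # shape (1, m)
A2 = sigmoid(Z2)               # y-hat: one prediction per example
```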

We initialize the weights W and biases b with random values. To create the random weight matrix and bias vector for each layer, we pass the following dimensions:

The dimension of W[L], the weight matrix of layer L, is (number of nodes in layer L, number of nodes in layer L−1):

W[L] has dimension (n[L], n[L-1])

The dimension of b[L], the bias vector of layer L, is (number of nodes in layer L, 1):

b[L] has dimension (n[L], 1)

Dimensions of the weight matrix and bias vector
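In code, creating those random parameters could look like this (a sketch; np.random.randn is one simple choice here, and better initialization schemes are covered in Part 2):

```python
import numpy as np

layer_dims = [3, 3, 1]   # [n_x, n[1], n[2]] for the 2-layer network above

params = {}
for l in range(1, len(layer_dims)):
    # W[l] has shape (n[l], n[l-1]); b[l] has shape (n[l], 1)
    params["W" + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1])
    params["b" + str(l)] = np.random.randn(layer_dims[l], 1)
```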

For L-layer computation:
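The general pattern for any layer l is Z[l] = W[l]*A[l-1] + b[l] and A[l] = σ(Z[l]), with A[0] = X. A sketch of that loop (reusing sigmoid and the params dictionary from the snippets above):

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def forward_propagation(X, params, L):
    """Run forward prop through L layers; params holds W1..WL and b1..bL."""
    A = X                                                     # A[0] = X
    for l in range(1, L + 1):
        Z = params["W" + str(l)] @ A + params["b" + str(l)]   # Z[l] = W[l]A[l-1] + b[l]
        A = sigmoid(Z)                                        # A[l] = sigma(Z[l])
    return A                                                  # A[L] = y-hat

# Usage with the 2-layer params built earlier:
# y_hat = forward_propagation(np.random.randn(3, 5), params, L=2)
```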

So far we have seen how forward propagation works and how to represent a neural network.

We used the sigmoid activation in every layer. In the next part of this DNN series, I will show different activation functions that can be used in neural networks and that work better than the sigmoid function.

We will also discuss different methods for initializing the random values of the weights and biases.
