Neural Network 03 — Neural Network Representation
If you missed my previous lessons, or if you are interested in the basics and a little more intuition about Logistic Regression, forward propagation, and back propagation, you can visit them using the following links.
- Lesson 01 — Prerequisites
- Lesson 02 — Logistic Regression is a solid base
Let’s start our new lesson. If you have been following my previous lessons, you already have a solid understanding of the intuitions behind the forward propagation and back propagation steps in a neural network. That knowledge will definitely help you move forward with the upcoming lessons in this series.
A simple neural network with a single hidden layer
This is a two-layer neural network (we do not count the input layer). In a neural network, we can only see the input layer and the output layer. The other layers inside the network are called hidden layers, because we usually cannot observe what is going on in them.
Useful notations:
The graph above contains a lot of information, so please study it carefully. In particular, note how the dimensions of the parameter matrices are determined by the numbers of units/neurons in the current layer and the previous layer.
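As a quick, hedged illustration of that rule in NumPy (the layer sizes of 3 input features, 4 hidden units, and 1 output unit below are only assumed for the example): the weight matrix of a layer has shape (units in this layer, units in the previous layer), and the bias has shape (units in this layer, 1).

```python
import numpy as np

# Hypothetical layer sizes: 3 input features, 4 hidden units, 1 output unit.
n_x, n_h, n_y = 3, 4, 1

W1 = np.random.randn(n_h, n_x) * 0.01  # (4, 3): rows = units in layer 1, columns = units in layer 0 (the inputs)
b1 = np.zeros((n_h, 1))                # (4, 1): one bias per unit in layer 1
W2 = np.random.randn(n_y, n_h) * 0.01  # (1, 4): rows = units in layer 2, columns = units in layer 1
b2 = np.zeros((n_y, 1))                # (1, 1)
```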
Computing a Neural Network output
Each node/unit/neuron of a Neural Network layer computes two things (a small sketch of both steps follows the list below).
1. Compute the linear part z = wᵀx + b
2. Apply a non-linearity to get the output a = σ(z)
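Here is a minimal NumPy sketch of those two steps for a single unit. The values of x, w, and b are hypothetical and only chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# One unit of a layer, for a single example x of 3 features (shape (3, 1)).
x = np.array([[1.0], [2.0], [3.0]])   # hypothetical input features
w = np.array([[0.1], [0.2], [-0.3]])  # hypothetical weight vector of this unit
b = 0.5                               # hypothetical bias

z = np.dot(w.T, x) + b  # step 1: the linear part z = wᵀx + b
a = sigmoid(z)          # step 2: apply the non-linearity (sigmoid here)
```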
If we consider Logistic Regression, we can use the following diagram to visualize it.
If we expand this to the entire layer…
These computations are the same for the other units of the layer as well.
Computations corresponding to every node can be represented as follows.
When doing these computations, vectorization is important, as we discussed in the previous lesson. This is the vectorized representation of the above computations.
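As a sketch of that vectorized form for a single example (still assuming 3 input features and 4 hidden units), stacking the units’ weight vectors as the rows of W[1] lets us compute every unit of the layer with one matrix product.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Assumed sizes again: 3 input features, 4 hidden units (hypothetical values).
x = np.random.randn(3, 1)
W1 = np.random.randn(4, 3)  # row k holds the weight vector wₖᵀ of hidden unit k
b1 = np.zeros((4, 1))

z1 = W1 @ x + b1  # (4, 1): all four zₖ = wₖᵀx + bₖ computed in one matrix product
a1 = sigmoid(z1)  # (4, 1): activations of the whole hidden layer
```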
Neural Network learning steps
If we consider our 2-layer neural network, each layer computes z = wᵀx + b and then a = σ(z) as output.
The diagram above shows that the output of layer 1, a¹, feeds into layer 2 as its input.
It is important to understand the dimensions of the outputs of these functions as well.
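Here is a minimal sketch of the full two-step computation for both layers of this network, with the shape of each result noted in the comments. The layer sizes of 3, 4, and 1 are assumed for illustration, not fixed by the lesson.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Assumed sizes: 3 input features, 4 hidden units, 1 output unit.
x = np.random.randn(3, 1)
W1, b1 = np.random.randn(4, 3), np.zeros((4, 1))
W2, b2 = np.random.randn(1, 4), np.zeros((1, 1))

z1 = W1 @ x + b1   # (4, 1)
a1 = sigmoid(z1)   # (4, 1): output of layer 1
z2 = W2 @ a1 + b2  # (1, 1): layer 2 takes a1 as its input
a2 = sigmoid(z2)   # (1, 1): the network's output ŷ
```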
Vectorizing across multiple examples
So far we have considered three features x₁, x₂, and x₃ in a single training example. Now let’s compute the neural network’s output for the entire training set of m examples.
These are the 4 equations to be computed.
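In the notation used above, for a single training example x⁽ⁱ⁾ they are:
- z[1]⁽ⁱ⁾ = W[1]x⁽ⁱ⁾ + b[1]
- a[1]⁽ⁱ⁾ = σ(z[1]⁽ⁱ⁾)
- z[2]⁽ⁱ⁾ = W[2]a[1]⁽ⁱ⁾ + b[2]
- a[2]⁽ⁱ⁾ = σ(z[2]⁽ⁱ⁾)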
Let’s understand it through the non-vectorized version first: we compute all four steps for every training example using a for-loop.
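A minimal NumPy sketch of that loop, assuming 3 features, 4 hidden units, 1 output unit, and m = 5 training examples (all of these sizes and values are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

m = 5
X = np.random.randn(3, m)  # each column x(i) is one training example
W1, b1 = np.random.randn(4, 3), np.zeros((4, 1))
W2, b2 = np.random.randn(1, 4), np.zeros((1, 1))

# Non-vectorized: repeat the four computations for every training example.
A2 = np.zeros((1, m))
for i in range(m):
    x_i = X[:, i:i + 1]     # keep the column shape (3, 1)
    z1 = W1 @ x_i + b1
    a1 = sigmoid(z1)
    z2 = W2 @ a1 + b2
    a2 = sigmoid(z2)
    A2[:, i:i + 1] = a2     # store the prediction for example i
```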
Now let’s see how the vectorized version of the above computations looks. Remember, all letters corresponding to the vectorized matrices are capitalized.
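A sketch of the same computation in vectorized form, using the same hypothetical sizes as in the loop above; the only change is that all m examples, stacked as the columns of X, go through the network at once.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

m = 5
X = np.random.randn(3, m)
W1, b1 = np.random.randn(4, 3), np.zeros((4, 1))
W2, b2 = np.random.randn(1, 4), np.zeros((1, 1))

Z1 = W1 @ X + b1   # (4, m); b1 is broadcast across the m columns
A1 = sigmoid(Z1)   # (4, m)
Z2 = W2 @ A1 + b2  # (1, m)
A2 = sigmoid(Z2)   # (1, m): column i is the prediction for example i
```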
Bored… 😩? Don’t worry! 😀 I will explain 💯
Now things get a little bit complicated, don’t they? So I think it’s a good time to explain what’s going on in this vectorized implementation.
Explanation for vectorized implementation
If we can understand how the Z[1] matrix is formed, then we can understand the others as well.
These are the four vectors we have in the Z[1] matrix.
For simplicity, let’s take the w¹xⁱ part and walk through it.
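A tiny NumPy check (with hypothetical random numbers) that column i of W[1]X really is W[1]x⁽ⁱ⁾, which is the whole idea behind stacking the training examples as columns:

```python
import numpy as np

W1 = np.random.randn(4, 3)  # hypothetical layer-1 weights
X = np.random.randn(3, 5)   # 5 training examples stacked as columns

stacked = W1 @ X            # all examples at once
one_by_one = np.column_stack([W1 @ X[:, i] for i in range(X.shape[1])])

print(np.allclose(stacked, one_by_one))  # True: column i of W1 @ X equals W1 @ x(i)
```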
Recap…
This is the end of lesson 03. I hope that you now have a solid understanding of Neural Network representation. My next lesson will be about Activation Functions. Good luck! Keep learning! 😃