Only Numpy: Why we need Activation Function (Non-Linearity), in Deep Neural Network — With Interactive Code
So for today, I don’t want to do anything too complicated, just a simple proof with code. Let’s get right into it: why do we need an activation function?
That’s it: we need an activation function for the reason above. If you want to see why in detail, read the blog post.
Network Architecture + Forward Feed Process
So there are two things to note here,
1. We are using the IDEN activation function, which, as seen above, just returns whatever the input was, and whose derivative is just 1. Below is a Python implementation.
2. As seen on the right, it is a standard neural network, nothing fancy.
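A minimal sketch of what such an identity activation and its derivative look like (the function names are my assumptions, not taken from the original code):

```python
import numpy as np

def IDEN(x):
    # Identity activation: returns the input unchanged
    return x

def d_IDEN(x):
    # The derivative of the identity function is 1 everywhere
    return np.ones_like(x)
```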
As seen above, we are using the L2 cost function.
Also, as seen above, the dimensions of our three weight matrices are (3*4), (4*10), and (10*1), and our input matrix is shown below.

So if we do the math:
Layer_1 = x.dot(w1) → (4*3)(3*4) → (4*4)
Layer_1_act = IDEN(Layer_1) → (4*4)
Layer_2 = Layer_1_act.dot(w2) → (4*4)(4*10) → (4*10)
Layer_2_act = IDEN(Layer_2) → (4*10)
Layer_3 = Layer_2_act.dot(w3) → (4*10)(10*1) → (4*1)
Layer_3_act = IDEN(Layer_3) → (4*1)
Cost = np.square(Layer_3_act-Y).sum() * 0.5
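The forward feed above can be sketched as follows. The input matrix and targets here are random placeholders (my assumption, since the original data is in the linked code), but the shapes match the walkthrough:

```python
import numpy as np

np.random.seed(0)
x = np.random.randn(4, 3)   # input matrix: 4 samples, 3 features (assumed values)
Y = np.random.randn(4, 1)   # targets (assumed values)

# Three weight matrices with the dimensions from the text
w1 = np.random.randn(3, 4)
w2 = np.random.randn(4, 10)
w3 = np.random.randn(10, 1)

def IDEN(z):
    # Identity activation
    return z

Layer_1 = x.dot(w1)            # (4*3)(3*4) -> (4*4)
Layer_1_act = IDEN(Layer_1)    # (4*4)
Layer_2 = Layer_1_act.dot(w2)  # (4*4)(4*10) -> (4*10)
Layer_2_act = IDEN(Layer_2)    # (4*10)
Layer_3 = Layer_2_act.dot(w3)  # (4*10)(10*1) -> (4*1)
Layer_3_act = IDEN(Layer_3)    # (4*1)

Cost = np.square(Layer_3_act - Y).sum() * 0.5  # L2 cost
```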
This is standard back propagation with vectorization, nothing special. However, please do note the places I marked in red: those are where we take the derivative of our activation function IDEN, and it gives 1.
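The back propagation can be sketched as below; because the derivative of IDEN is 1, every activation-derivative term is a matrix of ones. The variable names, random data, and learning rate are my assumptions, not the original code:

```python
import numpy as np

np.random.seed(0)
x = np.random.randn(4, 3)   # assumed input
Y = np.random.randn(4, 1)   # assumed targets
w1, w2, w3 = np.random.randn(3, 4), np.random.randn(4, 10), np.random.randn(10, 1)

def d_IDEN(z):
    # Derivative of the identity activation is 1 everywhere
    return np.ones_like(z)

# Forward feed (identity activations)
Layer_1 = x.dot(w1);           Layer_1_act = Layer_1
Layer_2 = Layer_1_act.dot(w2); Layer_2_act = Layer_2
Layer_3 = Layer_2_act.dot(w3); Layer_3_act = Layer_3

# Back propagation: each *_part_2 term is d_IDEN(...), i.e. all ones
grad_3_part_1 = Layer_3_act - Y                        # dCost/dLayer_3_act
grad_3_part_2 = d_IDEN(Layer_3)
grad_3 = Layer_2_act.T.dot(grad_3_part_1 * grad_3_part_2)

grad_2_part_1 = (grad_3_part_1 * grad_3_part_2).dot(w3.T)
grad_2_part_2 = d_IDEN(Layer_2)
grad_2 = Layer_1_act.T.dot(grad_2_part_1 * grad_2_part_2)

grad_1_part_1 = (grad_2_part_1 * grad_2_part_2).dot(w2.T)
grad_1_part_2 = d_IDEN(Layer_1)
grad_1 = x.T.dot(grad_1_part_1 * grad_1_part_2)

learning_rate = 0.001  # assumed value
w1 -= learning_rate * grad_1
w2 -= learning_rate * grad_2
w3 -= learning_rate * grad_3
```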
Forward Feed Version 2
As seen above, since our activation function is linear, we can use a neat trick to turn the WHOLE network into one simple line of math!
Here is the link to the code.
Now let’s take a look at each part, one by one.
Above is the standard forward feed and back propagation, nothing special, and below are the results.
As seen, 100% accuracy (when rounded). Now let’s get into the fun stuff: let’s calculate the K value.
As seen above, we can calculate the K value with a simple dot product, then just perform x.dot(k), and we get the same results!
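The collapse into a single matrix K can be sketched like this; the trained weights are replaced with random ones here for illustration, since with identity activations the whole network reduces to x @ w1 @ w2 @ w3 regardless of the weight values:

```python
import numpy as np

np.random.seed(1)
x = np.random.randn(4, 3)  # assumed input of shape (4*3)
w1, w2, w3 = np.random.randn(3, 4), np.random.randn(4, 10), np.random.randn(10, 1)

# With linear (identity) activations, the three weight matrices
# collapse into one matrix K of shape (3*1)
K = w1.dot(w2).dot(w3)

full_network = x.dot(w1).dot(w2).dot(w3)  # layer-by-layer forward feed
collapsed = x.dot(K)                      # one dot product

print(np.allclose(full_network, collapsed))  # True: identical outputs
```

This is exactly why non-linearity matters: without it, any depth of network is equivalent to a single linear layer.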