What is Activation Function in Neural Network?

Ting-Hao Chen
Machine Learning Notes
2 min read · Nov 9, 2017

An activation function is a non-linear function that transforms a neuron's input into a different output.

There are two perspectives from which to look at the activation function: forward propagation and back propagation.

First, the activation function is also called a “transfer function”. As the name suggests, in forward propagation the function transforms the input data into a different output. Let’s say we use sigmoid as our activation function, and our data ranges from -inf to inf. After we apply the activation function, the data is squeezed into the range 0 to 1. The input has therefore been transformed (or squashed) into something different. Why would we want to do that? Well, different activation functions have different reasons. In this case, it squeezes the data into a specific range, which is great for predicting probabilities. If we apply ReLU as our activation function instead, we get rid of the negative values. OK! Let’s take a look at the second perspective.
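
To make the squashing concrete, here is a minimal sketch in Python (NumPy isn’t mentioned in the article; it’s just a convenient choice for the demo):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Zeros out negative values, keeps positive values unchanged
    return np.maximum(0.0, x)

x = np.array([-100.0, -1.0, 0.0, 1.0, 100.0])
print(sigmoid(x))  # ~[0.0, 0.269, 0.5, 0.731, 1.0] -- squeezed into (0, 1)
print(relu(x))     # [0.0, 0.0, 0.0, 1.0, 100.0]   -- negatives are gone
```

No matter how extreme the input, sigmoid maps it into (0, 1), while ReLU simply discards everything below zero.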

Second, the activation function, just as its name suggests, is used for activating the neurons in a neural network. In back propagation, we need to update the weights and the biases so that the neural network can “learn”. For that to happen, the derivative of the activation function should be non-zero, so the weights and the biases can be updated. However, wherever the derivative of the activation function is zero, the weights and the biases won’t update; it is as if those neurons were dead. In this sense, the activation function serves as a threshold for which neurons stay active.
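
As a rough illustration (a sketch, not from the original article), comparing the derivatives of sigmoid and ReLU shows where gradients stay alive and where they go to zero:

```python
import numpy as np

def sigmoid_derivative(x):
    # Non-zero everywhere, so some gradient always flows
    # (though it shrinks toward zero for large |x|)
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

def relu_derivative(x):
    # Exactly zero for negative inputs: a neuron stuck there
    # receives no gradient and stops updating ("dead" neuron)
    return (x > 0).astype(float)

x = np.array([-5.0, -1.0, 0.5, 5.0])
print(sigmoid_derivative(x))  # ~[0.0066, 0.1966, 0.2350, 0.0066] -- small but non-zero
print(relu_derivative(x))     # [0.0, 0.0, 1.0, 1.0] -- zero gradient means no update
```

Where the derivative is non-zero, gradient descent can adjust the weights; where it is zero, that neuron contributes nothing to the update.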
