Understanding Non-Linear Activation Functions in Neural Networks

Emma Amor
ML Cheat Sheet

--

When I first got deep into the field of AI, I trained machine learning models using state-of-the-art networks like LeNet, VGG, AlexNet, and Inception, but I had difficulty understanding what was happening inside the network. So I decided to look into one network’s architecture, and I noticed that there are so many activation functions. Why do we need them? Which one works best?

Now that I am very comfortable with Neural Nets, I thought it would be a good idea to write an article explaining activation functions without getting deep into the mathematics, so that readers at different levels of expertise can understand activation functions and why networks need them.

Activation Function

It turns out that to compute interesting functions, a Neural Network needs a non-linear activation function.

What does non-linearity mean?

It means that the neural network can approximate functions that are not linear, or correctly predict the class of data points separated by a decision boundary that is not linear.
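To see why this matters, here is a minimal NumPy sketch (not from the original article) showing that stacking linear layers without an activation in between collapses into a single linear layer, while inserting a ReLU breaks that equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights for two "layers" with no activation in between: y = W2 @ (W1 @ x)
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
x = rng.standard_normal(3)

# Stacking linear layers without an activation...
two_layer = W2 @ (W1 @ x)

# ...is equivalent to one linear layer whose weights are W2 @ W1.
single_layer = (W2 @ W1) @ x
print(np.allclose(two_layer, single_layer))  # True: no extra expressiveness

# A non-linear activation such as ReLU breaks this collapse:
def relu(z):
    return np.maximum(z, 0.0)

nonlinear = W2 @ relu(W1 @ x)  # no longer expressible as a single matrix
```

However many linear layers you stack, the composition is still a single linear map, so the network could only ever learn linear decision boundaries; the non-linearity between layers is what gives depth its power.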

Why non-linearity?
