Shruti Garg
Nov 1 · 4 min read

“CONCEPT OF RELU ACTIVATION FUNCTION IN A NEURAL NETWORK”

NEURAL NETWORKS

Let’s talk about the term Neural Network first…

NEURAL NETWORKS were originally motivated by machines that replicate the brain’s functionality. They are non-linear machine learning models which can be used for both Supervised and Unsupervised Learning.

The term most often used in Neural Networks is NEURON. Let’s understand what a neuron basically is.

NEURON

A Neuron is a computational unit which takes one or more inputs, does some calculations and produces an output. The image below will help in understanding what a Neuron actually is.

Understanding of a Neuron

NOW, THE QUESTION ARISES WHAT DOES A NEURON DO IN NEURAL NETWORKS?

Basically, a Neuron multiplies each input feature by a weight and then adds these values together.

Therefore, the equation which we get is :

Z = (X1 * W1) + (X2 * W2) + (X3 * W3)

X1, X2, X3 represent the input values and W1, W2, W3 represent the weights.
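The weighted sum above can be sketched in plain Python. The inputs and weights here are made-up illustrative values, not from the article:

```python
# Hypothetical input features and weights for a single neuron
x = [1.0, 2.0, 3.0]   # X1, X2, X3
w = [0.5, -0.2, 0.1]  # W1, W2, W3

# Z = (X1*W1) + (X2*W2) + (X3*W3)
z = sum(xi * wi for xi, wi in zip(x, w))
print(z)  # approximately 0.4
```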

This value is then passed through a function called the ACTIVATION FUNCTION.

ACTIVATION FUNCTION

Also known as a SQUASHING FUNCTION, it limits the amplitude of the output of the neuron.

Some of the ACTIVATION FUNCTIONS are Sigmoid, Tanh and RELU.

Therefore, the output of the Neuron is the output of the ACTIVATION FUNCTION. The image given below helps in better understanding.

Let’s get familiar with the RELU Activation Function.

ACTIVATION FUNCTION- RELU

RELU stands for Rectified Linear Unit. This activation simply thresholds at zero. RELU is now popularly used in place of Sigmoid or Tanh due to its better convergence properties.

The function is defined as :

f(z) = 0 when z < 0

f(z) = z when z ≥ 0

Equivalently, R(z) = max(0, z)

Range: [0, ∞)

The RELU Activation Function removes the negative values. Therefore, the output turns out to be either 0 or z.
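A minimal sketch of this definition in plain Python:

```python
def relu(z):
    """Rectified Linear Unit: returns z for positive inputs, 0 otherwise."""
    return max(0.0, z)

print(relu(-3.0))  # 0.0  (negative values are removed)
print(relu(2.5))   # 2.5  (positive values pass through unchanged)
```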

Now, if we differentiate the RELU function w.r.t. ‘z’ we get:

f′(z) = 0 when z < 0

f′(z) = 1 when z > 0

Here we get two values, 0 and 1. (At z = 0 the derivative is undefined; in practice it is usually taken to be 0.)
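The derivative can be sketched as a small helper function, using the common convention of 0 at z = 0:

```python
def relu_derivative(z):
    """Derivative of RELU: 1 for z > 0, 0 for z < 0.
    At z == 0 the derivative is undefined; 0 is a common convention."""
    return 1.0 if z > 0 else 0.0

print(relu_derivative(4.2))   # 1.0
print(relu_derivative(-1.5))  # 0.0
```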

Now a problem arises because the derivative of the RELU function is zero for z < 0. When the gradient is zero, the weights feeding that neuron receive no update during training, so there will be no difference between the old value of the weights and the new value, which is of no use. The neuron keeps outputting 0 and effectively stops learning. This creates a “DEAD NEURON”, which implies no learning is happening in it. Therefore, to fix this problem we have the concept of “LEAKY RELU”.

LEAKY RELU

This function is defined as:

f(z)= z when z>0

f(z) = 0.01z when z < 0

If we take the derivative of the above function w.r.t. ‘z’, it is not zero for z < 0; it is 0.01. Therefore a small gradient always flows, the weights keep updating, and the neuron cannot die.

WHAT HAPPENS IF WE DO NOT APPLY THE RELU ACTIVATION FUNCTION?

1) A Neural Network would simply be a Linear Regression model, which does not perform well most of the time.

2) Also, without an activation function the Neural Network would not be able to learn and model complicated kinds of data such as audio, images, etc.
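Point 1 can be verified with a tiny sketch: stacking two layers without an activation function collapses into a single linear map (the weights here are arbitrary illustrative values):

```python
# Two "layers" with no activation function in between...
w1, w2 = 3.0, -2.0

def two_linear_layers(x):
    return w2 * (w1 * x)    # layer 2 applied to layer 1, no activation

def one_linear_layer(x):
    return (w2 * w1) * x    # ...are equivalent to one layer with weight w2*w1

for x in [-1.0, 0.0, 2.5]:
    assert two_linear_layers(x) == one_linear_layer(x)
print("No activation: the network is just one linear model.")
```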

Thanks for reading:)

Written by Shruti Garg, a Certified Data Scientist
