Neural Networks
Neural Networks (NN) are one of the most widely used concepts in the world today. Neural Networks are also known as Artificial Neural Networks (ANN).
So what are these concepts and how do they work? Continue Reading!
Artificial Neural Networks, better known as Neural Networks, are defined as follows:
Neural networks are a series of algorithms that mimic the operations of a human brain to recognize relationships between vast amounts of data.
1. Neuron
Before discussing neural networks, we have to discuss the basic building block of a NN: the neuron. Every NN consists of neurons.
Every neuron takes weights and features as inputs. If the neuron is in the first layer of the network, its features are the “actual features” of the dataset; if it is in any later layer, its features are the outputs of the neurons in the previous layer. Each layer of the NN has particular weights associated with it.
Each neuron performs two calculations:
- Linear calculation: The neuron multiplies each input feature value by its corresponding weight and sums the results. Since this is a straightforward linear calculation, it cannot compute any complex function on its own, so it is not useful if left as it is.
- Non-linear calculation: To compute complex functions, some kind of non-linear calculation must take place. This is where activation functions come into action. After computing the linear part, the neuron feeds that value to an activation function, which performs a further calculation on it and outputs the result (see the sketch after this list).
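As a rough sketch of these two steps, here is a single neuron implemented in Python with NumPy. The feature values and weights below are made-up numbers for illustration, the bias term is an assumption on my part (standard practice, though not mentioned above), and ReLU stands in for the activation function:

```python
import numpy as np

def relu(z):
    # Activation function: zero for negative inputs, identity for positive
    return np.maximum(0.0, z)

# Made-up inputs for illustration
features = np.array([0.5, -1.2, 3.0])  # inputs to the neuron
weights = np.array([0.8, 0.1, -0.4])   # one weight per input feature
bias = 0.2                             # extra term, assumed here

# Linear calculation: weighted sum of the inputs plus the bias
z = np.dot(weights, features) + bias

# Non-linear calculation: feed the linear result to the activation function
output = relu(z)
print(output)  # 0.0, since z = -0.72 is negative
```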
2. Activation Functions
To put it simply, activation functions are the functions a NN uses to learn complex patterns in data. Is it mandatory to use activation functions? The answer is YES! Activation functions are among the most vital components of an ANN. Without them, the network can only compute linear functions of its input, which is of little use.
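To see why, here is a small NumPy sketch (with arbitrary matrix shapes) showing that two layers with no activation function in between collapse into a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))  # weights of a first "layer"
W2 = rng.standard_normal((2, 4))  # weights of a second "layer"
x = rng.standard_normal(3)        # an arbitrary input

# Stacking two linear layers is the same as one linear layer W2 @ W1
two_layers = W2 @ (W1 @ x)
one_layer = (W2 @ W1) @ x
print(np.allclose(two_layers, one_layer))  # True
```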
What are the most commonly used activation functions? Let’s see
ReLU
ReLU stands for “Rectified Linear Unit”. It is the most commonly used activation function.
- When the input is negative, it returns zero
- When the input is positive, it returns the input unchanged (see the sketch after this list)
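In code, ReLU is a one-liner; here is a minimal NumPy version covering both cases:

```python
import numpy as np

def relu(x):
    # Negative inputs become zero; positive inputs pass through unchanged
    return np.maximum(0.0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0. 0. 0. 1.5 3.]
```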
Sigmoid
The sigmoid function takes any real number as input and returns a value between 0 and 1, following the formula σ(x) = 1 / (1 + e^(-x)). The sigmoid curve is symmetric about the point (0, 0.5), so it returns a value of 0.5 when the input is 0.
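A direct NumPy implementation of this formula:

```python
import numpy as np

def sigmoid(x):
    # Maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # ≈ [0.0067 0.5 0.9933]
```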
3. Structure of a Neural Network
Every NN follows a standard structure. It consists of 3 parts
- Input Layer
- Hidden Layers
- Output Layer
Input Layer
The input layer is the layer where the “actual features” of the dataset/problem are fed to the network. The number of nodes in the input layer equals the number of features of the dataset/problem. A question that trips up almost everyone is whether to count the input layer as one of the layers.
The input layer should NOT be counted as one of the layers.
The layer count always starts from the first hidden layer.
Hidden Layers
The hidden layers are where all the actual, complex calculations take place; this is where the real learning of a NN happens. There can be any number of hidden layers in a NN, but having too many layers can introduce complications like overfitting.
Each hidden layer has its own associated weight matrix (and bias vector) unique to that particular layer. These weights are tuned during training, via backpropagation and gradient descent, to minimize the cost and increase accuracy; a single-neuron sketch of this update follows below.
Preferred activation functions in the hidden layers include ReLU and Tanh.
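As a rough illustration of that tuning, here is a hundred gradient-descent updates for a single sigmoid neuron on a made-up training example, using a squared-error cost; real networks apply the same idea layer by layer via backpropagation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])  # made-up training example
y = 1.0                         # its binary label

w = np.zeros(3)  # weights to be tuned
b = 0.0          # bias to be tuned
lr = 0.1         # learning rate

for step in range(100):
    p = sigmoid(w @ x + b)            # forward pass: current prediction
    # Gradient of the cost 0.5 * (p - y)**2, using sigmoid'(z) = p * (1 - p)
    dz = (p - y) * p * (1.0 - p)
    w -= lr * dz * x                  # step the weights down the gradient
    b -= lr * dz

print(sigmoid(w @ x + b))  # the prediction has moved toward the label 1.0
```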
Output Layer
The output layer is the final layer of any neural network. This layer gives the user the final result after all the learning the model has done. A commonly used activation function in the output layer is the Sigmoid function. It is preferred for binary classification problems since it outputs a value between 0 and 1, which can be read as a probability and thresholded to give a 0-or-1 class label.
The output layer is included and counted as one of the layers of the NN.
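Putting the three parts together, here is a hedged sketch of a forward pass through a tiny network: 3 input features, one hidden layer of 4 neurons with ReLU, and a single sigmoid output neuron. The weights below are random placeholders, not trained values:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)

# Each layer has its own weight matrix and bias vector (random placeholders)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)  # hidden layer (layer 1)
W2, b2 = rng.standard_normal((1, 4)), np.zeros(1)  # output layer (layer 2)

x = np.array([0.5, -1.2, 3.0])      # "actual features" fed to the input layer

hidden = relu(W1 @ x + b1)          # hidden layer: linear step + ReLU
output = sigmoid(W2 @ hidden + b2)  # output layer: linear step + sigmoid

print(output)  # a value between 0 and 1; threshold at 0.5 for a class label
```

Note that by the counting convention above, this is a 2-layer network: the input layer is not counted, while the hidden and output layers are.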
To sum up, Neural Networks are one of the most powerful techniques out there, and the layers, weights, and activation functions described above can all be tweaked to produce better results for various applications.