Getting started with Neural Networks

Samridh Agarwal
ACM VIT
4 min read · Aug 23, 2022

Header image generated by DALL·E 2

What is a Neural Network?

Simply put, neural networks are mathematical functions that map a given set of inputs to a desired output. Their structure is loosely inspired by that of biological neurons.

A neural network computes its output using the weights, biases, and activation functions of each layer. Since the result depends largely on the weights and biases, training the network means finding the weights and biases that minimize the loss.

Each iteration has two parts:

A feedforward pass, where we compute the network's output for the training examples and then evaluate the loss function, which measures how far the predictions are from the targets.

Second, a backpropagation pass, where we propagate the error backward to update the weights and biases.

This is achieved with gradient descent, i.e., using the derivative of the loss function with respect to the weights and biases.

To work with images, we use a convolutional neural network (CNN), as it is the state of the art in object recognition and detection. Moreover, CNNs have been shown to generalize to related tasks, such as scene classification, through transfer learning, and their features can be combined with words represented by an embedding model.

A neural network has the following components:

  1. Input layer: x
  2. Some number n of hidden layers
  3. Output layer: y
  4. Weights and biases for all layers
  5. Activation functions for each layer; can vary from layer to layer

Considering a two-layer Neural Network with one hidden layer, the output will be:

y = a( w2 · a( w1x + b1 ) + b2 )

Where a is the activation function, w1 and b1 are the weights and biases of layer 1, and w2 and b2 are the weights and biases of layer 2.

The output y depends largely on the weights and biases. Hence, we need to find the values of the weights W and biases b that minimize the loss. This process of finding the best weights and biases is called training the neural network.
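The two-layer network above can be sketched directly in NumPy. The layer sizes, the random input, and the choice of sigmoid as the activation a are illustrative assumptions, not something fixed by the text:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative sizes: 3 input features, 4 hidden units, 1 output.
x = rng.normal(size=(3, 1))    # input column vector
w1 = rng.normal(size=(4, 3))   # layer-1 weights
b1 = np.zeros((4, 1))          # layer-1 biases
w2 = rng.normal(size=(1, 4))   # layer-2 weights
b2 = np.zeros((1, 1))          # layer-2 biases

# y = a( w2 · a( w1x + b1 ) + b2 )
hidden = sigmoid(w1 @ x + b1)
y = sigmoid(w2 @ hidden + b2)
print(y.shape)  # a single prediction between 0 and 1
```

Each `@` is a matrix multiply, so the expression mirrors the formula term by term: inner activation first, then the outer layer.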

Therefore, each iteration has two steps:

A feedforward pass, which calculates the predicted output y.

A backpropagation pass, which updates the weights and biases.

What happens in feedforward?

Taking the above 2-layer network, the output is:

y = a( w2 · a( w1x + b1 ) + b2 )

Now, we need to measure how well this prediction matches the training examples. This is where the loss function comes into play.

Loss Function:

Many loss functions are available, such as the sum of squared errors and the logarithmic (cross-entropy) loss.
For the cross-entropy loss over m training examples:

J( w, b ) = (1/m) Σ L( y, yk )

J( w, b ) = -(1/m) Σ [ yk log( y ) + ( 1 - yk ) log( 1 - y ) ]

Here, J is the loss function, y is the prediction, and yk is the true label.
Our goal in training is to find the weights and biases that minimize J.
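The cross-entropy formula above can be written as a small function. The helper name and the sample labels below are illustrative; the small epsilon is a standard numerical safeguard against taking log(0):

```python
import numpy as np

def binary_cross_entropy(y_pred, y_true):
    """J(w, b) = -(1/m) Σ [ yk·log(y) + (1 - yk)·log(1 - y) ]."""
    m = y_true.shape[0]
    eps = 1e-12  # avoid log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.sum(y_true * np.log(y_pred)
                   + (1 - y_true) * np.log(1 - y_pred)) / m

# Illustrative predictions vs. true labels.
y_true = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.8])
loss = binary_cross_entropy(y_pred, y_true)
print(round(loss, 4))
```

Notice the loss is small when each prediction is close to its label and grows sharply as a prediction moves toward the wrong extreme.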

Backpropagation:

We need to update our weights and biases to reduce our calculated error. To achieve this, we need to know the derivative of the loss function with respect to the weights and biases.

Once we calculate the derivative, we perform gradient descent, i.e., we adjust each weight and bias by a small step in the direction that decreases the loss.

The above two steps are repeated until the error is minimized or we reach the maximum number of iterations.
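Putting feedforward, backpropagation, and gradient descent together gives a complete training loop. This is a minimal sketch: the XOR dataset, the 2-4-1 layer sizes, the learning rate, and the iteration count are all assumptions chosen for illustration, and sigmoid is used for every activation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(y_hat, y):
    """Binary cross-entropy loss, averaged over the examples."""
    eps = 1e-12
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

rng = np.random.default_rng(1)

# Toy dataset (an illustrative choice): the XOR problem.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)
m = X.shape[0]

# 2 inputs -> 4 hidden units -> 1 output (sizes are arbitrary).
w1 = rng.normal(size=(2, 4)); b1 = np.zeros((1, 4))
w2 = rng.normal(size=(4, 1)); b2 = np.zeros((1, 1))
lr = 1.0  # learning rate

losses = []
for _ in range(5000):
    # Step 1: feedforward.
    h = sigmoid(X @ w1 + b1)
    y_hat = sigmoid(h @ w2 + b2)
    losses.append(bce(y_hat, Y))

    # Step 2: backpropagation (derivatives of the loss
    # with respect to each weight and bias).
    dz2 = (y_hat - Y) / m
    dw2 = h.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dz1 = (dz2 @ w2.T) * h * (1 - h)  # chain rule through the sigmoid
    dw1 = X.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    # Gradient descent: step each parameter against its gradient.
    w1 -= lr * dw1; b1 -= lr * db1
    w2 -= lr * dw2; b2 -= lr * db2

print(f"loss went from {losses[0]:.3f} to {losses[-1]:.3f}")
```

The loss printed at the end should be lower than at the start, which is exactly the stopping criterion the text describes: repeat until the error is minimized or the iteration budget runs out.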

There are different types of neural networks, like fully connected neural networks, convolutional neural networks, recurrent neural networks, etc.

Fully connected neural networks are the simplest form of neural network. In this type of network, every node in one layer is connected to every node in the next layer.

Convolutional neural networks add a few features on top of fully connected networks. In a convolutional layer, each node is connected only to a small local region of the previous layer, and the same set of weights (a filter) is shared across the whole input.

This weight sharing gives convolutional networks far fewer parameters, making them more efficient and faster to train, particularly on images.

Recurrent neural networks also build on fully connected networks, but their connections form cycles: the output of a layer at one time step is fed back as an input at the next. This gives the network a form of memory, making it well suited to sequential data such as text or time series.

There are different types of activation functions, like sigmoid, tanh, ReLU, etc.

Sigmoid: The sigmoid function is a non-linear function that squashes any real input into the range (0, 1).

Tanh: The tanh function is a non-linear function that squashes any real input into the range (-1, 1).

ReLU: The ReLU function is a non-linear function that outputs max(0, x): it returns 0 for negative inputs and the input itself for positive inputs, so its output ranges from 0 upward without bound.
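All three activations are one-liners in NumPy; the sample input vector below is an arbitrary choice to show the ranges:

```python
import numpy as np

def sigmoid(z):
    """Maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Maps any real input into (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Outputs max(0, z): zero for negatives, identity for positives."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z))  # all values in (0, 1)
print(tanh(z))     # all values in (-1, 1)
print(relu(z))     # [0. 0. 3.]
```

Note that ReLU, unlike sigmoid and tanh, is unbounded above, which is one reason it became the default choice in deep networks: its gradient does not vanish for large positive inputs.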

In Conclusion

Neural networks are a type of machine learning algorithm that is used to model complex patterns in data. Neural networks are similar to other machine learning algorithms, but they are composed of a large number of interconnected processing nodes, or neurons, that can learn to recognize patterns of input data.

Neural networks are trained using a variety of different techniques, including backpropagation, which adjusts the weights of the connections between the nodes based on the error in the output of the network.

Neural networks have been used to model a wide variety of complex patterns in data, including facial recognition, speech recognition, and machine translation. Neural networks are a powerful tool for machine learning, but they are also complex algorithms.

If you are just getting started with machine learning, you may want to start with a simpler algorithm such as logistic regression or support vector machines.
