What are Neural Networks?

Jagajith · Published in CodeX · Sep 24, 2021

“The neural network is this kind of technology that is not an algorithm, it is a network that has weights on it, and you can adjust the weights so that it learns. You teach it through trials.” — Howard Rheingold

As a continuation of the series, we will look at Neural Networks in this post. This article is meant to guide a complete novice through everything there is to know about neural networks. If you already have some expertise, much of this will be review, but you will probably still gain something out of it.

In this article, we’ll understand what neural networks are, what they’re used for, and how they operate, which you’ve all been waiting for!

You don’t need to be technical or have a degree in math or computer science to follow along; I’ll focus on the basics, and I may publish a more in-depth essay about the underlying math in the future. Let’s get this party started!

What are Neural Networks?

Artificial neural networks, also known as neural networks (NNs) or simulated neural networks (SNNs), are a subset of machine learning that provide the core of deep learning techniques. Neural Networks are the algorithms that try to mimic the brain. These are inspired by, but not the same as, the biological neural networks that make up animal brains. Such systems “learn” to execute tasks by examining examples, and are typically not coded with task-specific rules.

They learn by looking at examples of an object, such as a cat or a painting, and identifying important characteristics that will allow them to identify this object in future photographs. These networks are not required to know anything about the object being analyzed. They are intelligent enough to look at a few samples and quickly classify things, predict things, and so on.

Now that you know what neural networks are, let’s look at how they work.

Why use Neural Networks?

With their remarkable capacity to draw meaning from complex or imprecise data, neural networks can identify patterns and discover trends that are too complex for people or other computer algorithms to detect. A trained neural network may be regarded as an “expert” in the category of data it has been asked to analyze. This expert can then be used to make predictions in new scenarios of interest and to answer “what if” queries.

How do Neural Networks work?

This explanation is split into four parts: the perceptron, forward propagation, activation functions, and the backpropagation algorithm.

The Perceptron

The perceptron is the most basic type of neural network. It was initially made to understand the brain better, and is modelled on the biological neuron. Developed by Frank Rosenblatt, it is the foundation of nearly all neural networks today.

The perceptron is a supervised learning algorithm for binary classifiers in machine learning. A binary classifier is a function that decides whether an input, represented by a vector of numbers, belongs to a particular class. The perceptron is a linear classifier: a classification method that makes its predictions using a linear predictor function combining a set of weights with the feature vector.

A perceptron may be viewed as a building block of a single layer in a neural network, and it consists of four distinct parts (sketched in code after the list below):

  1. Input Values or One Input Layer
  2. Weights and Bias
  3. Net sum
  4. Activation function
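
To make the four parts concrete, here is a minimal sketch of a single perceptron in Python with NumPy. The code is my own illustration, not from the original article:

```python
import numpy as np

def perceptron(x, w, b):
    # 1. Input values: x is a vector of numerical features.
    # 2. Weights and bias: w holds one weight per input, b is the bias.
    # 3. Net sum: the weighted inputs plus the bias.
    z = np.dot(w, x) + b
    # 4. Activation function: a step function that returns 0 or 1.
    return 1 if z > 0 else 0

# Example: weights chosen by hand so the perceptron computes logical AND.
w, b = np.array([1.0, 1.0]), -1.5
print([perceptron(np.array(x), w, b) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# -> [0, 0, 0, 1]
```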

Forward Propagation

Consider a simple perceptron that takes numerical inputs X1 and X2, with weights w1 and w2 associated with those inputs. Additionally, there is another fixed input of 1 with weight b (called the Bias) associated with it.

The map of how a perceptron operates is simple: compute the net sum of the weighted inputs (each input multiplied by its weight, wᵢxᵢ) and add the bias b. Inputs can originate either from the input layer or from perceptrons in a preceding layer. The net sum is then sent through an activation function, which standardizes the value and returns 0 or 1. The perceptron’s decision is then passed on to the next layer for the next perceptrons to use in their decisions. Together, these pieces make up a single perceptron in a layer of a neural network.
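
Vectorized, the same computation for a whole layer of perceptrons is a single matrix product. Again a sketch of my own in NumPy, not the article’s code:

```python
import numpy as np

def step(z):
    # Standardizes each net sum to 0 or 1.
    return (z > 0).astype(float)

def forward(x, W, b):
    # Net sum for every perceptron in the layer at once, then the activation.
    return step(W @ x + b)

x = np.array([0.5, -1.0])   # inputs x1, x2
W = np.random.randn(3, 2)   # one row of weights per perceptron in the layer
b = np.random.randn(3)      # one bias per perceptron
print(forward(x, W, b))     # e.g. [1. 0. 1.]
```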

Activation functions

Figure: how a neuron computes its output Y from its inputs, weights, bias, and activation function. Source: Antonio Rafael Sabino Parmezan, ResearchGate

The neuron’s output Y is calculated as shown in the figure above. The non-linear function f is known as the Activation Function. The activation function’s goal is to introduce non-linearity into the neuron’s output. This is essential since most real-world data is non-linear, and we want neurons to learn non-linear representations. Every activation function (or non-linearity) takes a single number and applies a fixed mathematical operation to it.

In practice, you may encounter the following activation functions (all four are implemented in the short sketch after this list):

  • Sigmoid: takes a real-valued input and squashes it to the range (0, 1). The formula for sigmoid is as follows: σ(x) = 1 / (1 + e⁻ˣ)
  • Softmax: in classification tasks, we often employ the Softmax function as the activation function in the Multi Layer Perceptron’s output layer to ensure that the outputs are probabilities that add up to 1. The Softmax function takes a vector of unbounded real-valued scores and squashes it to a vector of values between zero and one that sum to one: softmax(x)ᵢ = exp(xᵢ) / Σⱼ exp(xⱼ). As a result, in a pass/fail classifier, Probability(Pass) + Probability(Fail) = 1.
  • tanh: takes a real-valued input and squashes it to the range [-1, 1]. The formula is as follows: tanh(x) = (2 / (1 + e⁻²ˣ)) - 1
  • ReLU: ReLU stands for Rectified Linear Unit. It takes a real-valued input and thresholds it at zero (replaces negative values with zero). The formula for ReLU is as follows: f(x) = max(0, x)
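
Here is how these four functions might look in NumPy. This is a sketch of the standard formulas above, not code from the article:

```python
import numpy as np

def sigmoid(x):
    # Squashes any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes any real number into (-1, 1); equivalent to np.tanh(x).
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def relu(x):
    # Thresholds at zero: negative values become 0.
    return np.maximum(0.0, x)

def softmax(scores):
    # Turns a vector of scores into probabilities that sum to 1.
    # Subtracting the max first is a standard numerical-stability trick.
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

z = np.array([2.0, -1.0, 0.5])
print(sigmoid(z), relu(z), softmax(z).sum())  # the softmax output sums to 1.0
```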


Importance of Bias: The main function of the bias is to provide every node with a trainable constant value (in addition to the normal inputs that the node receives).

Backpropagation

Backpropagation is the technique used to change the weights and biases so that the network’s output becomes more accurate. To compute the gradients, we calculate the total error at the output nodes and propagate these errors back through the network using backpropagation. Then, using an optimization algorithm such as Gradient Descent, we ‘adjust’ all the weights in the network in order to reduce the error at the output layer.

The graph below shows an example of what a neural network’s error graph may look like. The measure of error is how far off a neural network’s prediction is from the actual value.

An example of the error graph that a neural network may have while training.

Backpropagation uses calculus techniques to determine the gradient of the error curve at any point in time. As seen in the figure above, a steep gradient indicates that there is still a large error, while a flat line indicates that the neural network is quite accurate. Knowing how accurate the neural network is means that the algorithm can decide how much to change the weights and biases (e.g., change them a lot when inaccurate, make only slight adjustments when very accurate).
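
Putting forward propagation, the error, and the gradient update together, here is a toy training loop in NumPy. It is my own illustration under stated assumptions, not the article’s implementation: a single sigmoid neuron stands in for the perceptron (the step function is not differentiable), trained with gradient descent on squared error:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)   # target: logical AND
w, b, lr = np.zeros(2), 0.0, 0.5          # weights, bias, learning rate

for epoch in range(2000):
    pred = sigmoid(X @ w + b)             # forward propagation
    error = pred - y                      # how far off each prediction is
    # Backpropagation: the chain rule gives the gradient of the squared
    # error with respect to the net sum, then the weights and bias.
    grad = error * pred * (1 - pred)
    w -= lr * (X.T @ grad)                # step against the gradient
    b -= lr * grad.sum()

print(np.round(sigmoid(X @ w + b)))       # -> [0. 0. 0. 1.]
```

Each pass shrinks the error a little; when the gradient flattens out, the updates become tiny and training settles.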

Together, these pieces make up a Neural Network.

Conclusion

Today, we saw the core concepts of Neural Networks: the perceptron, forward propagation, activation functions, and backpropagation.

If you like this post, then check out my other posts in this series:

1. What is Machine Learning?

2. What are the Types of Machine Learning?

3. Uni-Variate Linear Regression

4. Multi-Variate Linear Regression

5. Logistic Regression

6. Digit Classifier using Neural Networks

7. Image Compressing with K-means Clustering

8. Dimensionality Reduction on Face using PCA

9. Detect Failing Servers on a Network using Anomaly Detection

Last Thing

If you enjoyed my article, a clap 👏 and a follow would be ✨perceptionatic✨, and it helps Medium promote this article so that others may read it. I am Jagajith and I will catch you in the next one.
