Artificial Neural Network (ANN) for Dummies: A Simple and Fun Introduction to Artificial Neural Networks

Rahul Kumar
7 min read · May 29, 2023


Graphical representation of an Artificial Neural Network (ANN)

An Artificial Neural Network (ANN) is a set of algorithms that works like the neural network of the human brain: neurons are connected to each other and work together to process information. The main purpose of an ANN is to give computers cognitive abilities like the human brain's, including the ability to solve problems and to carry out a learning process.

What does a Neural Network actually do?

Figure 1. Neural Network Representation — cs231n stanford (Lecture 4–92) [link]

In Figure 1, there are three inputs into the neuron (x0, x1, x2). Each input is first multiplied by a variable called a 'weight' (w0, w1, w2), after which the three products are added together. Each neuron connection has its own 'weight', and its value changes during the learning process until the model produced by the ANN approaches the desired output target.

After that, a bias b can be added to the sum above. This bias value does not come from the input layer. A bias is like the intercept in a linear equation: it shifts the result so the model can fit the data more accurately.

After the sum is complete, the result is fed into a function called the Activation Function. The Activation Function governs whether the neuron should be active or not.
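To make this concrete, here is a minimal sketch of a single neuron in Python. The input, weight, and bias values are made up for illustration, and ReLU stands in as one possible choice of activation function:

```python
import numpy as np

# A single neuron, as described above: weighted sum of the inputs,
# plus a bias, passed through an activation function (ReLU here).
def neuron(x, w, b):
    z = np.dot(w, x) + b   # weighted sum: w0*x0 + w1*x1 + w2*x2 + b
    return max(0.0, z)     # activation function (ReLU)

x = np.array([1.0, 2.0, 3.0])    # inputs x0, x1, x2 (example values)
w = np.array([0.5, -0.2, 0.1])   # weights w0, w1, w2 (learned in training)
b = 0.3                          # bias

print(neuron(x, w, b))  # 0.5*1 - 0.2*2 + 0.1*3 + 0.3 = 0.7
```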

Figure 2. Identity Activation Function f(x) = x [link]

Well, that's how an ANN works: all neurons are connected to each other to produce output from a given input.

Artificial Neural Network (ANN) Architecture

An ANN is a group of neurons organized in layers:

- input layer: the layer that brings data into the system to be processed by the next layer.

- hidden layer: the layer between the input layer and the output layer, where each artificial neuron has a set of 'weighted' inputs and a procedure for generating its output through an activation function.

- output layer: the last layer of neurons, which produces the system's output.

Figure 3. Neural Network Architecture — cs231n stanford (Lecture 4–97) [link]
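To illustrate this architecture, here is a minimal sketch of one forward pass through the three layers. The layer sizes, random weights, and choice of ReLU as the hidden activation are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Forward pass through the three layers: input -> hidden -> output.
x = rng.normal(size=3)               # input layer: 3 input values
W_hidden = rng.normal(size=(3, 4))   # weights into 4 hidden neurons
b_hidden = np.zeros(4)
W_output = rng.normal(size=(4, 2))   # weights into 2 output neurons
b_output = np.zeros(2)

hidden = np.maximum(0.0, x @ W_hidden + b_hidden)  # hidden layer (ReLU)
output = hidden @ W_output + b_output              # output layer
print(output)
```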

The following are the types of Neural Networks and their Architectures:

  1. Feed Forward Neural Network

The Feed Forward Neural Network is the simplest Neural Network. Information flows from the input layer through the hidden layer to the output layer in one 'forward' direction, without the cycles/loops found in a Recurrent Neural Network.

  1.a. Single Layer Perceptron

The simplest Neural Network is the single layer perceptron, which has only a single layer of output nodes. Input data goes directly to the output neurons through a series of 'weights': each output is the sum of the products of the input variables and the 'weight' on each neuron connection. Perceptrons can be trained in a simple way using the delta rule, which calculates the error of the output layer against the actual value and then uses it to correct the 'weights'. A single layer perceptron can only be used for problems that are linearly separable, meaning the output boundary is clearly defined and there are only two possible classes.
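As a small illustration, here is a sketch of a single layer perceptron trained with the delta rule on the logical AND problem, which is linearly separable. The learning rate and epoch count are arbitrary choices:

```python
import numpy as np

# Single layer perceptron trained with the delta rule on logical AND.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)   # AND targets

w = np.zeros(2)   # one weight per input
b = 0.0           # bias
lr = 0.1          # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        output = 1.0 if np.dot(w, xi) + b >= 0 else 0.0  # step activation
        error = target - output       # delta rule: error of the output
        w += lr * error * xi          # correct the weights
        b += lr * error               # correct the bias

print([1.0 if np.dot(w, xi) + b >= 0 else 0.0 for xi in X])  # [0, 0, 0, 1]
```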

  1.b. Multi Layer Perceptron

A Multi Layer Perceptron involves many layers interconnected in a feed forward manner, where each neuron in a layer is connected to all neurons in the next layer. Many Multi Layer Perceptron implementations use the Sigmoid function as their Activation Function. A Multi Layer Perceptron uses Backpropagation in the training process: it calculates the gradient of the loss function with respect to the 'weight' of each neuron connection, then changes the 'weights' so as to minimize the loss.

A two layer neural network can be used to compute the XOR function, as the sketch below shows.
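Here is a minimal sketch of that claim: a two layer network with sigmoid activations, trained with backpropagation on the XOR truth table. The hidden size, learning rate, and epoch count are arbitrary, and with this seed it typically converges:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR truth table: inputs and targets
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Two layer network: 2 inputs -> 4 hidden neurons -> 1 output
W1 = rng.normal(size=(2, 4))
b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1))
b2 = np.zeros(1)
lr = 1.0

for epoch in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # backpropagation: gradient of the squared error through each layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # gradient descent: change the weights to minimize the loss
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

pred = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print(pred.round(2))  # typically close to [[0], [1], [1], [0]]
```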

2. Radial Basis Function (RBF)

A Radial Basis Function Network uses a Radial Basis Function as its Activation Function. RBF networks are able to solve problems of function approximation, time series prediction, classification, and system control.

The input value x is fed to all of the Radial Basis Functions in the hidden layer, and the network output is a linear combination of all the RBF outputs in the hidden layer.

Radial Basis Function network architecture

The RBF network output is formulated as follows,

φ(x) = Σᵢ₌₁ᴺ aᵢ ρ(‖x − cᵢ‖)

where x is the input value, aᵢ is the weight of neuron i, N is the number of neurons in the hidden layer, cᵢ is the center vector of neuron i, and ρ is the radial basis function.
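Here is a minimal sketch of this formula, assuming the common Gaussian choice ρ(r) = exp(−βr²); the centers, weights, and β are made-up example values:

```python
import numpy as np

# RBF network output: a linear combination of radial basis functions,
# phi(x) = sum_i a_i * rho(||x - c_i||), with a Gaussian rho.
def rbf_network(x, centers, weights, beta=1.0):
    dists = np.linalg.norm(centers - x, axis=1)  # ||x - c_i|| per neuron
    phi = np.exp(-beta * dists ** 2)             # rho applied to each distance
    return weights @ phi                         # linear combination

centers = np.array([[0.0, 0.0], [1.0, 1.0]])  # c_i: N = 2 hidden neurons
weights = np.array([0.7, -0.3])               # a_i
print(rbf_network(np.array([0.5, 0.5]), centers, weights))
```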

3. Convolutional Neural Network (CNN)

The Convolutional Neural Network (CNN) in Deep Learning is classified as a deep neural network that is widely used in image analysis (visual imagery). A CNN is a regularized version of the multi layer perceptron and is built from many convolution layers, with ReLU as the most common activation function.

CNN Architecture (MLP — Fully connected layer)

Convolutional Layer

The Convolutional Layer is the main key to a CNN. A convolutional layer's parameters are a set of filters/kernels, each a small two-dimensional matrix. During the forward pass, each kernel slides across the input matrix (image) to produce a 2-dimensional activation map for that kernel. As a result, the network learns which filter/kernel becomes active when it detects a certain type of feature at some spatial position in the input (image).

Convolution layer demo [CS231n Convolutional Neural Networks for Visual Recognition]
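To make the sliding-window idea concrete, here is a minimal sketch of one convolution (no padding, stride 1); the toy image and kernel values are made up:

```python
import numpy as np

# One convolution: slide a small kernel over a 2-D input and take the
# dot product at each position, producing an activation map.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # dot product between the kernel and the patch under it
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
kernel = np.array([[1.0, -1.0]])                  # responds to horizontal change
print(conv2d(image, kernel))                      # 4x3 activation map
```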

In addition to the types of neural networks above, there are many others such as Recurrent Neural Network (RNN), Generative Adversarial Network (GAN), etc.

Types of Activation Functions

  1. Identity

The Identity Function is a function that returns the same value as the input provided. Where f(x) is the Identity Function, for all values x ∈ (−∞, ∞) the following applies,

f(x) = x

The Identity function curve is described as follows,

Identity Function curve, f(x) = x, for all x values

2. Binary Step

The Binary Step Function is a function that returns the value 1 if the input x ≥ 0 and the value 0 if the input x < 0, so f(x) only has an output value of 0 or 1.

f(x) = 0 , if x < 0

f(x) = 1, if x ≥ 0

The Binary Step function curve is described as follows,

Binary Step Function Curve

3. Sigmoid (logistic function)

A sigmoid function, also known as a logistic function, produces values in the range (0, 1), formulated as,

f(x) = 1 / (1 + e^(−x))

At first the sigmoid function grows roughly exponentially; around x = 0 the growth becomes nearly linear, after which saturation sets in and growth slows until it stops near 1. The Sigmoid function curve is described as follows,

Sigmoid Curve Function

4. Rectified Linear Unit (ReLU)

The Rectified Linear Unit (ReLU) function has the advantage that in a randomly initialized network, only about 50% of the hidden units produce a nonzero activation. The ReLU Function is formulated as follows,

f(x) = max(0, x)

or in piecewise form,

f(x) = 0, if x < 0

f(x) = x, if x ≥ 0

The curve of the ReLU function is described as follows,

ReLU Function curve

5. Leaky Rectified Linear Unit (Leaky ReLU)

The Leaky ReLU Function is a ReLU that allows a small positive gradient when the unit is inactive. Leaky ReLU is formulated as follows,

f(x) = x, if x > 0

f(x) = 0.01x, if x ≤ 0

For values x ≤ 0, Leaky ReLU produces 1% of x (0.01x).

The Leaky ReLU curve is described as follows,

Leaky ReLU Function curve

6. Softmax

The Softmax Function, also called softmax regression, is a form of logistic regression that normalizes an input vector into a probability distribution, formulated as,

σ(z)ᵢ = e^(zᵢ) / Σⱼ₌₁ᴶ e^(zⱼ)

for i = 1, …, J
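To close the list, here is a minimal sketch implementing the six activation functions above in Python; the sample input values are arbitrary:

```python
import numpy as np

def identity(x):
    return x                              # f(x) = x

def binary_step(x):
    return np.where(x >= 0, 1.0, 0.0)     # 1 if x >= 0, else 0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))       # values in (0, 1)

def relu(x):
    return np.maximum(0.0, x)             # f(x) = max(0, x)

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)  # 0.01x for x <= 0

def softmax(z):
    e = np.exp(z - np.max(z))             # subtract max for numerical stability
    return e / e.sum()                    # probabilities summing to 1

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))      # [0.  0.  0.  1.5]
print(softmax(x))   # a probability distribution over the four values
```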

Well, that's it for this article. Next, I will try a hands-on single perceptron Neural Network for a binary-class problem!

Thanks for reading! Hope you like it. Feel free to share your thoughts and comment below. You can connect with me on LinkedIn: linkedin.com/in/rahul-kumar-64270b210. I am more than happy to talk to you :)
