Building Blocks for AI Part 3: Neural Networks

Kamalmeet Singh
6 min read · Mar 26, 2024


We’ve witnessed significant advancements in Generative AI recently, starting with ChatGPT and followed by Bard, Microsoft Bing, and other tools that are redefining our interaction with the internet. Generative AI, widely seen as a game-changer, and the related technologies around it are all enabled by neural networks.

Neural networks simulate human decision-making by emulating how biological neurons interact to evaluate options and recognize patterns. In this post, we will explore how artificial neural networks function and make decisions. We will also examine common neural network types and their applications. By the end of this post, you should have a solid understanding of how to work with neural networks.

What is an Artificial Neural Network (ANN)?

Artificial Neural Networks (ANNs) are computational models that mimic the structure and function of the human brain. Like the brain, which processes information through a network of interconnected neurons, ANNs use a similar approach to process data and make decisions.

(Image source: https://en.wikipedia.org/wiki/Neural_network_(machine_learning)#/media/File:Neuron3.png)

What is a Neuron in ANN?

A neuron is a computational unit that mimics the behavior of biological neurons.

(Image source: https://towardsdatascience.com/whats-the-role-of-weights-and-bias-in-a-neural-network-4cf7e9888a0f)

Inputs: A neuron can receive multiple inputs, which can be raw data or output from other neurons.

Weighted Sum: Each input is assigned a weight. The weighted sum is the sum of each input multiplied by its corresponding weight.

Bias: Bias is a value that is added to the weighted sum of the inputs before the result is passed through the activation function. It serves as an adjustable constant that helps the neuron make better decisions by shifting the activation function either to the left or right. This allows the neuron to fit the data more accurately, even if the weighted sum of the inputs is zero or very small.

Activation Function: It serves as a mathematical “gate” that determines whether a neuron should be activated or not, based on the weighted sum of its inputs plus the bias. Examples of activation functions include the sigmoid function, ReLU (Rectified Linear Unit), and tanh (hyperbolic tangent).

Output: The value the neuron emits after applying the activation function to the weighted sum plus bias.
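Putting these pieces together, a single neuron can be sketched in plain Python. This is a minimal illustration, not a production implementation; the input, weight, and bias values below are arbitrary:

```python
import math

def sigmoid(x):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    # Weighted sum: each input multiplied by its weight, plus the bias
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    # The activation function turns the weighted sum into the output
    return sigmoid(z)

output = neuron(inputs=[0.5, 0.3], weights=[0.4, 0.7], bias=-0.1)
print(round(output, 3))  # a value between 0 and 1
```

Swapping `sigmoid` for ReLU (`max(0.0, z)`) or `math.tanh` changes only the last step; the weighted-sum-plus-bias structure stays the same.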

How Does an ANN Work?

Now that we understand what a neuron is, let's take the next step and see how neurons are put together to form a neural network.

(Image source: https://www.geeksforgeeks.org/artificial-neural-networks-and-its-applications/)

Neurons in an ANN are organized into layers:

  • Input Layer: This is where the network receives its input data. Each neuron in this layer represents a feature of the input data.
  • Hidden Layers: These layers, situated between the input and output layers, perform the bulk of the computation. The complexity and capacity of the network are largely determined by the number and size of these layers.
  • Output Layer: This layer produces the final output of the network. The number of neurons here typically corresponds to the number of classes in a classification task or the dimensionality of the output in a regression task.
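A forward pass through these layers can be sketched as repeated application of the neuron computation, one layer feeding the next. This is a toy example with made-up weights, two input features, one hidden layer of three neurons, and a single output neuron:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # One output per neuron in the layer: weighted sum of all
    # inputs plus that neuron's bias, passed through the activation
    return [sigmoid(sum(i * w for i, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

# Input layer: two features
x = [1.0, 0.5]

# Hidden layer: three neurons, each with two weights and a bias
hidden = layer(x,
               weights=[[0.2, -0.4], [0.7, 0.1], [-0.3, 0.5]],
               biases=[0.1, 0.0, -0.2])

# Output layer: one neuron taking the three hidden activations
y = layer(hidden, weights=[[0.6, -0.1, 0.3]], biases=[0.05])
print(y)  # a single value in (0, 1)
```

Adding capacity is just a matter of more neurons per layer (longer weight rows) or more `layer` calls chained together.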

How Do Neural Networks Learn?

The true power of ANNs lies in their ability to learn from data. This learning process involves adjusting the weights of the connections between neurons to minimize the difference between the network’s predictions and the actual target values. This is achieved through a process called backpropagation and an optimization algorithm, typically gradient descent.

Backpropagation

Backpropagation is a method for computing gradients of the loss function with respect to the weights of the network. It involves two main steps:

  • Forward Pass: The input data is passed through the network, layer by layer, to compute the output.
  • Backward Pass: The gradients of the loss function are computed and propagated back through the network to update the weights.

Gradient descent is an optimization algorithm used to minimize the loss function. It updates the weights in the direction that reduces the loss, with the size of the update determined by a parameter called the learning rate.

For example, imagine a neural network is trying to learn to identify cats in pictures. In the forward pass, it takes a picture and guesses whether it's a cat or not. In the backward pass, it looks at whether its guess was right or wrong and adjusts its connections accordingly. Over time, it gets better at identifying cats by fine-tuning its connections using gradient descent.
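The whole loop, forward pass, backward pass, and gradient-descent update, can be sketched for the simplest possible case: a single sigmoid neuron learning a toy rule ("output 1 when the input is positive"). The data, learning rate, and epoch count below are made up for illustration; the gradient `pred - target` is the standard result for a sigmoid output with logistic loss:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy labeled data: label is 1 when the input is positive
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w, b = 0.0, 0.0        # start with arbitrary (zero) weight and bias
learning_rate = 0.5

for epoch in range(200):
    for x, target in data:
        # Forward pass: compute the network's prediction
        pred = sigmoid(w * x + b)
        # Backward pass: for a sigmoid output with logistic loss,
        # the gradient w.r.t. the pre-activation is (pred - target)
        grad = pred - target
        # Gradient descent: step each parameter against its gradient,
        # scaled by the learning rate
        w -= learning_rate * grad * x
        b -= learning_rate * grad

print(sigmoid(w * 2.0 + b))   # close to 1 after training
print(sigmoid(w * -2.0 + b))  # close to 0 after training
```

A real network does exactly this, just with many weights at once, using the chain rule to propagate `grad` back through every layer.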

Types of Artificial Neural Networks

We see many types of neural networks in use today, and it would be difficult to sort them all into a few neat categories. Instead, let’s look at neural networks along several different criteria.

Architecture

  • Feedforward Neural Networks (FNNs): Information flows in one direction from input to output. Examples include Multi-Layer Perceptrons (MLPs). Use cases involve regression and classification problems.
  • Recurrent Neural Networks (RNNs): Networks with loops, allowing information to persist. Examples include Long Short-Term Memory (LSTM) networks. Use cases involve speech recognition.
  • Convolutional Neural Networks (CNNs): Networks that use convolutional layers, particularly effective for spatial data like images.
  • Deep Belief Networks (DBNs): Composed of multiple layers of stochastic units, often pre-trained as Restricted Boltzmann Machines (RBMs). Use cases involve Acoustic Modeling.

Learning Method

  • Supervised Learning: Networks trained with labeled data, where the correct output is provided during training. Examples include classification and regression tasks.
  • Unsupervised Learning: Networks trained with unlabeled data, where the network learns patterns without explicit output labels. Examples include clustering and dimensionality reduction tasks.
  • Reinforcement Learning: Networks trained to make a sequence of decisions by receiving feedback in the form of rewards or penalties.

Application

  • Natural Language Processing (NLP): Networks designed for tasks involving text, such as sentiment analysis, translation, and language modeling. Examples include Transformer networks and RNNs.
  • Computer Vision: Networks designed for tasks involving images or videos, such as object detection, image segmentation, and facial recognition. Examples include CNNs.
  • Time Series Analysis: Networks designed for sequential data, such as stock prices or sensor data. Examples include RNNs and LSTMs.

By Depth

  • Shallow Neural Networks: Networks with only one or two hidden layers.
  • Deep Neural Networks (DNNs): Networks with multiple hidden layers, capable of learning complex representations.

Summary

A neural network consists of layers, each containing multiple neurons. A neuron is the fundamental unit of a neural network, and it operates based on several key concepts: inputs, weights, bias, activation function, and output. In a neuron, input variables are assigned weights, which are then multiplied by the inputs and summed together. This sum is added to a bias value and passed through an activation function to produce the output.

Neural networks are structured with an input layer, one or more hidden layers, and an output layer. Typically, there is a single input and output layer, but the number of hidden layers can vary depending on the specific application. Neural networks learn by comparing the generated output to the actual output and adjusting the weights through a process called backpropagation. This adjustment is aimed at minimizing the difference between the predicted and actual outputs. The learning process continues until the network produces acceptable outputs for the given inputs.


Kamalmeet Singh

Tech Leader - Building scalable, secured, cloud-native, state of the art software products | Mentor | Author of 3 tech books |