Insight into a few basic deep learning algorithms

Aditi Mittal · Published in Nerd For Tech · Mar 11, 2021 · 5 min read

Introduction

Learning can be defined as acquiring knowledge or skills through experience, study, or by being taught. So, machine learning can be defined as the phenomenon where a machine is taught or learns on its own without being explicitly programmed.


Definition

Wikipedia has defined deep learning as:

Deep learning is a class of machine learning algorithms that uses multiple layers to progressively extract higher level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify the concepts relevant to a human such as digits or letters or faces.

In this article, we are going to discuss a few deep learning algorithms: deep belief networks, generative adversarial networks, transformers, and graph neural networks.

Deep Belief Networks

Before diving deeper into the deep belief networks, let’s first discuss Restricted Boltzmann Machines.

Restricted Boltzmann Machines

Restricted Boltzmann Machines can be considered a binary version of factor analysis, i.e. the outputs are binary variables (taking the value 0 or 1).

Wikipedia defines RBM as:

A restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs. They can be trained in either supervised or unsupervised ways, depending on the task.

For example: suppose you have visited a restaurant and judge it on a two-point scale: either you like the restaurant or you do not. RBMs can be used for these kinds of cases.

RBMs have three major components:

  1. Input Layer
  2. Hidden Layer
  3. Bias

In the above example, the visible units represent whether you like the restaurant or not. The hidden units help to find what makes you like that particular restaurant. The bias is an extra component added to incorporate the different kinds of properties that different restaurants have.

In this kind of network, each visible unit is connected to all the hidden units, and the bias unit is connected to all the visible and hidden units. There are no connections within a layer, which is what makes the machine "restricted".

Let us look at the steps in the decision-making process (a code sketch follows the list):

  1. Computing the activation energy of each hidden unit.
  2. Passing the activation energy through the sigmoid activation function, which gives us a probability.
  3. Using this probability, the hidden units can turn the nodes in the visible layer on or off.
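
To make these steps concrete, here is a minimal NumPy sketch of a single up-down pass through a toy RBM. The weights and biases here are random stand-ins for parameters that would normally be learned (for example, with contrastive divergence), and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy RBM: 6 visible units, 3 hidden units.
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
visible_bias = np.zeros(n_visible)
hidden_bias = np.zeros(n_hidden)

v = rng.integers(0, 2, size=n_visible).astype(float)  # a binary visible vector

# Step 1: activation energy of each hidden unit.
hidden_energy = v @ W + hidden_bias
# Step 2: sigmoid of the activation energy gives a probability.
p_hidden = sigmoid(hidden_energy)
# Step 3: sample binary hidden states, then let them turn visible units
# on or off by sampling a reconstruction of the visible layer.
h = (rng.random(n_hidden) < p_hidden).astype(float)
p_visible = sigmoid(h @ W.T + visible_bias)
v_reconstructed = (rng.random(n_visible) < p_visible).astype(float)

print(p_hidden, v_reconstructed)
```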

Deep Belief Network

Deep belief networks are generative graphical models composed of multiple hidden layers, containing both directed and undirected connections. Each layer in a deep belief network learns from the entire input; in convolutional neural networks, by contrast, the first layers only filter the input for basic features and the later layers recombine the simple patterns found by the previous layers. Training has two phases: a pre-training phase and a fine-tuning phase. The pre-training phase stacks multiple layers of RBMs, while the fine-tuning phase is a feed-forward neural network. Each sub-network's hidden layer serves as the visible layer for the next network. Training is carried out in a greedy layer-wise manner, followed by weight fine-tuning, to abstract hierarchical features from the raw input data.
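
As a minimal sketch of the greedy layer-wise idea, scikit-learn's BernoulliRBM can be chained so that each RBM is trained on the hidden activations of the one before it. The layer sizes here are illustrative, and note that a full DBN would also fine-tune the RBM weights with backpropagation rather than only fitting the classifier head:

```python
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Greedy layer-wise pre-training: each RBM's hidden activations become
# the visible layer of the next RBM. A logistic regression head stands in
# for the supervised fine-tuning stage.
dbn = Pipeline([
    ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=10)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# X would be binary (or [0, 1]-scaled) training data, y the labels:
# dbn.fit(X, y)
```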

They are widely used in image and video recognition. Along with this, they are used in tracking the movement of objects or people.

Generative Adversarial Networks (GAN)

Unsupervised models that summarize the distribution of input variables may be used to create or generate new examples from the input distribution. These types of models are known as generative models.

Generative Adversarial Networks are becoming popular because of their ability to understand and recreate visual data with remarkable accuracy. They can be used to fill in images from an outline, generate a realistic image from text, produce photorealistic depictions of product prototypes, or convert black-and-white imagery into colour. GANs have two parts: a generator and a discriminator.

Generator

The generator learns to generate data. The generated instances become negative training examples for the discriminator. The generator model takes a fixed-length random vector as input and generates a sample in the domain. The vector is drawn randomly from a Gaussian distribution and is used to seed the generative process. After training, points in this multidimensional vector space will correspond to points in the problem domain, forming a compressed representation of the data distribution.
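
A minimal PyTorch sketch of such a generator is shown below. The layer sizes and the 28x28 grayscale output shape are assumptions chosen for illustration, not a fixed recipe:

```python
import torch
from torch import nn

latent_dim = 100  # length of the fixed-size random input vector

# An illustrative fully connected generator for 28x28 grayscale images.
generator = nn.Sequential(
    nn.Linear(latent_dim, 256),
    nn.ReLU(),
    nn.Linear(256, 28 * 28),
    nn.Tanh(),  # outputs in [-1, 1], matching normalized image pixels
)

z = torch.randn(16, latent_dim)  # seed vectors drawn from a Gaussian
fake_images = generator(z).view(16, 1, 28, 28)
```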

Discriminator

The discriminator learns to differentiate the generator's fake data from real data and works as a basic classification model. Its main role is to penalize the generator for producing fake data.

The desired output is first identified, and the training dataset is gathered based on those parameters. This data is then randomized and fed as input to the generator until it reaches reasonable accuracy in producing realistic outputs.

After this, the generated output is fed into the discriminator along with the actual data. The discriminator filters the information and returns a probability between 0 and 1 representing the authenticity of each image. These probabilities are then verified against the real labels, and the process is repeated until the desired output quality is reached.
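
The adversarial loop below is a minimal, self-contained PyTorch sketch of this process. The random "real" batches merely stand in for an actual image dataset, and the architectures match the illustrative generator above:

```python
import torch
from torch import nn

latent_dim = 100
generator = nn.Sequential(  # same shape as the sketch above
    nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 28 * 28), nn.Tanh())
discriminator = nn.Sequential(
    nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

# Stand-in batches of "real" data; in practice this would be a DataLoader
# over an actual image dataset, with pixels normalized to [-1, 1].
dataloader = [torch.rand(32, 28 * 28) * 2 - 1 for _ in range(10)]

for real in dataloader:
    ones = torch.ones(real.size(0), 1)
    zeros = torch.zeros(real.size(0), 1)

    # Discriminator step: real images should score 1, generated ones 0.
    fake = generator(torch.randn(real.size(0), latent_dim)).detach()
    loss_d = bce(discriminator(real), ones) + bce(discriminator(fake), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: it is penalized when the discriminator calls it fake.
    fake = generator(torch.randn(real.size(0), latent_dim))
    loss_g = bce(discriminator(fake), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```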

Transformers

The transformer is a widely known deep learning model used primarily in natural language processing (NLP). It is designed to handle sequential data but does not require the data to be processed in order. The model therefore allows more parallelization than recurrent neural networks and thus reduced training times, making it possible to train on larger datasets than was feasible before the transformer was introduced.

Architecture

The transformer uses an encoder-decoder architecture.

The encoder consists of a stack of encoding layers that process the input iteratively, layer by layer, each encoding information about which parts of the input are relevant to one another.

The decoder consists of a stack of decoding layers that operate on the encoder's output, using the incorporated contextual information to generate an output sequence.
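
As a quick illustration of this encoder-decoder structure, PyTorch ships a reference implementation. The tensor shapes below are illustrative (sequence length, batch, model dimension), and the random tensors merely stand in for embedded source and target sequences:

```python
import torch
from torch import nn

# 6 encoder layers and 6 decoder layers, as in the original transformer.
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)  # embedded source sequence
tgt = torch.rand(9, 32, 512)   # embedded target sequence
out = model(src, tgt)          # shape (9, 32, 512), one vector per target position
```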

For every input position, each layer weighs the relevance of every other position and draws information from it accordingly to produce the output. Each layer also has a feed-forward neural network for additional processing of the outputs.
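
This relevance weighing is scaled dot-product self-attention. Below is a minimal single-head NumPy sketch; the projection matrices would normally be learned, and all sizes are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for a single head."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # relevance of every token pair
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # weighted mix of the values

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))            # token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)                # same shape as X: (4, 8)
```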

BERT and XLNet are among the most prominent pre-trained natural language systems; they are used in a variety of NLP tasks and are based on transformers.

Graph Neural Networks

Graph neural networks (GNNs) are another type of neural network, one that operates on an unstructured data type: the graph. Every node in the graph is assigned a tag, and the goal is to predict the tags of nodes without ground truth. GNNs are extensively used on real-world problems that can be represented as graphs, such as social networks, chemical compounds, maps, and transportation systems.

GNNs identify the relationships between the nodes in a graph and produce a representation of it that can later be used by other ML models for tasks like clustering, classification, etc.

Neighbouring nodes pass their messages through edge neural networks into the recurrent unit on the reference node. The new embedding of the reference node is computed by applying a recurrent function to its current embedding and the sum of the edge-network outputs for the neighbouring node embeddings.
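
A minimal NumPy sketch of one such message-passing round is given below. A single linear map stands in for the edge neural network, a tanh of a linear map stands in for the recurrent function, and the graph and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy undirected graph over 5 nodes (1 = edge), with 8-dim embeddings.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(5, 8))                      # current node embeddings

W_edge = rng.normal(scale=0.1, size=(8, 8))      # stand-in edge network
W_rec = rng.normal(scale=0.1, size=(16, 8))      # stand-in recurrent function

# Each node sums its neighbours' messages (their embeddings pushed through
# the edge network), then updates its own embedding from the pair
# (current embedding, summed messages).
messages = A @ (H @ W_edge)
H_new = np.tanh(np.concatenate([H, messages], axis=1) @ W_rec)
print(H_new.shape)  # (5, 8): one updated embedding per node
```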

Thank you for reading this article! Please feel free to leave your feedback.
