Types of neural networks: Recurrent Neural Networks

Samvardhan Singh
6 min read · Jan 2, 2024


Building on my previous blog series, where I demystified convolutional neural networks, it's time to explore recurrent neural network architectures and their real-world applications.

What are recurrent neural networks?

RNNs, or Recurrent Neural Networks, are deep learning models specialized for handling sequences. We can picture them as neural networks equipped with an inherent memory, enabling them to carry information across different time steps. This is achieved through loops built into their structure, which may look peculiar at first but follow a simple, well-defined rule. In simple words, a Recurrent Neural Network (RNN) is a type of neural network where the output from the previous step is fed as input to the current step.
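To make that loop concrete, here is a minimal sketch of the recurrence in NumPy. Everything in it, the dimensions, the random weights, and the tanh nonlinearity, is an illustrative assumption rather than anything prescribed by RNNs in general; the point is simply that the state computed at one step is fed back in at the next.

```python
import numpy as np

# A minimal sketch of the recurrent idea: the hidden state produced at the
# previous step is fed back in alongside the current input. Sizes and the
# tanh nonlinearity are illustrative choices.
rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 3))   # input -> hidden weights
W_h = rng.normal(size=(4, 4))   # hidden -> hidden weights (the "loop")
b = np.zeros(4)

h = np.zeros(4)                 # memory starts empty
for x_t in rng.normal(size=(5, 3)):       # a toy sequence of 5 inputs
    h = np.tanh(W_x @ x_t + W_h @ h + b)  # previous step's state feeds the current step
print(h)                        # the final state summarizes the whole sequence
```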

RNNs vs feed-forward neural networks

As I discussed in my previous articles, a feed-forward neural network processes information in one direction, moving from the input layer through the hidden layers to the output layer without any loops or feedback. Unlike recurrent neural networks, feed-forward networks lack memory: they consider only the current input and retain no information about past inputs, beyond what was baked into their weights during training. This makes them ill-suited to predicting the next element of a sequence.

RNN vs feed-forward neural networks.

RNNs, however, work differently. In a Recurrent Neural Network (RNN), information cycles through a loop: when the network makes a decision, it takes into account not only the current input but also the knowledge accumulated from past inputs. This ability to carry information forward from previous time steps gives RNNs a form of memory, enabling them to better understand and analyze sequences of data.
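A toy comparison makes the contrast visible. In the sketch below (random weights and sizes chosen arbitrarily for illustration), the same input is presented twice in a row: the feed-forward layer answers identically both times, while the recurrent layer answers differently because its hidden state remembers the first presentation.

```python
import numpy as np

# Feed-forward vs recurrent on a repeated input. All weights are random toys.
rng = np.random.default_rng(1)
W_ff = rng.normal(size=(4, 3))
W_x, W_h = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))

x = rng.normal(size=3)          # the same input vector, presented twice
seq = [x, x]

# Feed-forward: identical input -> identical output; history is invisible.
ff_outs = [np.tanh(W_ff @ x_t) for x_t in seq]
print(np.allclose(ff_outs[0], ff_outs[1]))   # True

# Recurrent: the hidden state carries the past, so the outputs differ.
h = np.zeros(4)
rnn_outs = []
for x_t in seq:
    h = np.tanh(W_x @ x_t + W_h @ h)
    rnn_outs.append(h)
print(np.allclose(rnn_outs[0], rnn_outs[1])) # False
```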

Architecture

Recurrent Neural Networks (RNNs) are a specific kind of neural network with hidden states, enabling them to use past outputs as inputs. The typical flow of RNNs involves considering the current input along with information from previous steps. This structure allows RNNs to capture dependencies and patterns in sequential data.

source: stanford.edu

For each timestep t, the activation is expressed as:

a^{<t>} = g_1(W_{aa} a^{<t-1>} + W_{ax} x^{<t>} + b_a)

while the output is written as:

y^{<t>} = g_2(W_{ya} a^{<t>} + b_y)

where W_{ax}, W_{aa}, W_{ya}, b_a, b_y are coefficients shared temporally and g_1, g_2 are activation functions.

Magnified architecture of an RNN.

The image above shows what happens inside a recurrent neural network at each step and how the activation is computed.
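Translated into code, those two equations are only a few lines. The sketch below is a minimal NumPy version in which g_1 is taken to be tanh and g_2 softmax, a common pairing, although the equations themselves leave both generic; the dimensions are invented for illustration.

```python
import numpy as np

# A direct translation of the two equations above. Assumed sizes:
# inputs of size 3, hidden state of size 4, outputs of size 2.
rng = np.random.default_rng(2)
W_ax, W_aa = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))
W_ya = rng.normal(size=(2, 4))
b_a, b_y = np.zeros(4), np.zeros(2)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def rnn_step(x_t, a_prev):
    """One timestep, with g1 = tanh and g2 = softmax."""
    a_t = np.tanh(W_aa @ a_prev + W_ax @ x_t + b_a)  # a<t> = g1(Waa a<t-1> + Wax x<t> + ba)
    y_t = softmax(W_ya @ a_t + b_y)                  # y<t> = g2(Wya a<t> + by)
    return a_t, y_t

# The same weights are reused at every timestep ("shared temporally").
a = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):
    a, y = rnn_step(x_t, a)
```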

Types of RNNs

Different types of architectures are used in recurrent neural networks:

1. One To One: This is the traditional setup where one input corresponds to one output. It's commonly found in standard neural networks.

source: machinelearningmastery.com

2. One To Many: With one input, this architecture generates multiple outputs. Applications include music production, where a single musical input can lead to a variety of musical outputs.

source: machinelearningmastery.com

3. Many To One: In this setup, multiple inputs contribute to a single output. Examples include sentiment analysis and emotion identification, where the overall sentiment or emotion is determined by considering a sequence of inputs over time.

source: machinelearningmastery.com

4. Many To Many: This configuration allows for various possibilities, such as two inputs leading to three outputs. Many-to-many networks are employed in machine translation systems, like English-to-French translation, where both the inputs and the outputs form sequences (see the sketch after this list).

source: machinelearningmastery.com
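As promised above, here is a rough NumPy sketch of these patterns (toy weights and dimensions, not taken from any particular system). It shows that many-to-many and many-to-one differ only in which outputs of the same loop you keep.

```python
import numpy as np

# One recurrent cell, several input/output patterns.
rng = np.random.default_rng(3)
W_ax, W_aa = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))
W_ya = rng.normal(size=(2, 4))

def run(xs):
    """Run the cell over a sequence, producing an output at every step."""
    a, ys = np.zeros(4), []
    for x_t in xs:
        a = np.tanh(W_aa @ a + W_ax @ x_t)
        ys.append(W_ya @ a)
    return ys

xs = rng.normal(size=(6, 3))   # a sequence of 6 inputs
ys = run(xs)

many_to_many = ys              # e.g. translation: one output per input step
many_to_one = ys[-1]           # e.g. sentiment analysis: keep only the final output
# One-to-many would instead feed a single input at the first step (e.g. a seed
# note for music generation) and feed each output back in as the next input.
```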

Working of an RNN

In a recurrent neural network, the input layer (x) receives the input and passes it to the middle layer (h). The middle layer can contain multiple hidden layers, each with its own activation functions, weights, and biases. In an ordinary feed-forward network these hidden layers are independent of one another, so the network keeps no memory of earlier inputs; a recurrent neural network (RNN) removes this independence by feeding the hidden layer back into itself across time steps.

source:Simplilearn.com

An RNN shares the same activation functions, weights, and biases across all of these hidden layers. Instead of creating numerous distinct hidden layers, it forms a single hidden layer that loops over itself as many times as needed. This looping mechanism is what allows the network to process sequential information effectively.

The mathematics behind a recurrent neural network is almost exactly the same as that of a basic neural network, as discussed in my previous articles. Backpropagation, however, is slightly different: if you wish to learn how backpropagation in an RNN works, first go through the maths behind backpropagation in a simple artificial neural network, covered in one of my previous articles. When the backpropagation algorithm is applied to a Recurrent Neural Network with time series data, it is termed "backpropagation through time" (BPTT).

Backpropagation through time.

In a standard RNN, one input is processed at a time, resulting in a single output. During backpropagation through time, however, the current input and the previous inputs are all taken into account: the network is unrolled over a timestep, with multiple consecutive data points from the time series entering the computation together, so the error at the output can be traced back through all of them.

After the network has been trained on a dataset and produces an output, the error in that output is calculated. The network is then essentially rolled back up: the error is propagated backwards through each timestep of the unrolled loop, and the shared weights are re-evaluated and adjusted to correct for the inaccuracies identified. This iterative cycle of training, error calculation, and weight adjustment is what improves the network's performance over time.
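Here is a minimal sketch of that cycle for a tiny RNN in NumPy, assuming a squared-error loss on the final state and toy dimensions throughout: the forward pass unrolls the loop and stores every state, and the backward pass walks back through the timesteps, accumulating gradients for the shared weights before a single gradient-descent update.

```python
import numpy as np

# A minimal backpropagation-through-time sketch (illustrative, not optimized).
rng = np.random.default_rng(4)
W_ax, W_aa = 0.1 * rng.normal(size=(4, 3)), 0.1 * rng.normal(size=(4, 4))
xs = rng.normal(size=(5, 3))        # a toy sequence of 5 inputs
target = rng.normal(size=4)         # an assumed training target for the final state

# Forward pass: unroll the loop, remembering every state for the backward pass.
states = [np.zeros(4)]
for x_t in xs:
    states.append(np.tanh(W_aa @ states[-1] + W_ax @ x_t))

# Backward pass ("rolling the network back up"): the error at the end is
# propagated back through each timestep, and because the weights are shared,
# every timestep adds into the same two gradient matrices.
dW_ax, dW_aa = np.zeros_like(W_ax), np.zeros_like(W_aa)
da = states[-1] - target                 # gradient of 0.5 * ||a_T - target||^2
for t in reversed(range(len(xs))):
    dz = da * (1.0 - states[t + 1] ** 2) # back through the tanh
    dW_ax += np.outer(dz, xs[t])
    dW_aa += np.outer(dz, states[t])
    da = W_aa.T @ dz                     # pass the error one step further back

lr = 0.1                                 # one gradient-descent update
W_ax -= lr * dW_ax
W_aa -= lr * dW_aa
```

Note how every timestep contributes to the same two gradient matrices: that accumulation across the unrolled steps is what distinguishes backpropagation through time from ordinary backpropagation.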

Applications

Recurrent Neural Networks (RNNs) play a crucial role in solving diverse challenges associated with sequence data. Sequence data encompasses various types, including audio, text, video, and biological sequences.

Table showing applications of different types of RNNs. Source: stanford.edu

With RNN models and sequence datasets, one can effectively address problems such as speech recognition, music generation, automated translation, video action analysis, and the study of genomic and DNA sequences. Essentially, RNNs offer a versatile approach to a broad spectrum of problems involving sequential information.

To sum up, Recurrent Neural Networks (RNNs) offer a robust way to process sequential data, and their ability to capture temporal dependencies and patterns makes them invaluable for tasks ranging from language processing to genomics. A solid grasp of their architecture and operating principles is essential to harness them effectively in practical, real-world scenarios. I'll be discussing more AI and neural network models in my upcoming articles.
