Character-level Deep Language Model with GRU/LSTM units using TensorFlow

Published in

Nabla Squared

13 min read · Jul 20, 2021

In this article, I’m going to show how to implement GRU and LSTM units and how to build deeper RNNs using TensorFlow. I will start with a little theory about GRUs, LSTMs and deep RNNs, and then explain the code snippet by snippet. This article is a continuation of my previous article about RNNs:

Creating a simple RNN from scratch with TensorFlow

And using it to build a language model for news headlines

medium.com

I suggest reading that article first. But if you already know about simple RNNs and just want to learn about the content of this article, that’s fine too.

A little theory

The simple type of RNN that was used in my previous article has some issues. To see that, let’s recall the equations of that model:

Here, a_t is the vector that gets passed between consecutive time steps.
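As a quick refresher, one common way to write a single step of a simple RNN is sketched below in NumPy. This is only an illustration under my own assumptions: the parameter names (`Waa`, `Wax`, `Wya`, `ba`, `by`), the tanh activation, the softmax output and the toy sizes are hypothetical choices and may not match the notation of the previous article exactly. The point to notice is that the vector passed between time steps, called `a_t` here, is recomputed from scratch at every step.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def rnn_step(a_prev, x_t, Waa, Wax, Wya, ba, by):
    # New hidden vector: depends on the previous one and the current input.
    a_t = np.tanh(Waa @ a_prev + Wax @ x_t + ba)
    # Output distribution over the vocabulary at this time step.
    y_t = softmax(Wya @ a_t + by)
    return a_t, y_t

# Toy sizes (assumptions): 4 hidden units, vocabulary of 3 characters.
n_a, n_x = 4, 3
rng = np.random.default_rng(0)
Waa = rng.normal(size=(n_a, n_a)) * 0.1
Wax = rng.normal(size=(n_a, n_x)) * 0.1
Wya = rng.normal(size=(n_x, n_a)) * 0.1
ba = np.zeros(n_a)
by = np.zeros(n_x)

# Run a short sequence of one-hot encoded characters through the cell.
a = np.zeros(n_a)
for char_index in [0, 2, 1]:
    x = np.zeros(n_x)
    x[char_index] = 1.0
    a, y = rnn_step(a, x, Waa, Wax, Wya, ba, by)
```

After the loop, `a` holds the hidden vector for the last character and `y` is a probability distribution over the three characters in the toy vocabulary.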

The problem is that, at each time step, when a new version of this vector is computed, it is…


Written by Dorian Lazar