Game of Thrones Episode Script Generation Using LSTM and Recurrent Cells in TensorFlow

Ujwal Tewari
Published in Analytics Vidhya
5 min read · Aug 7, 2019

Game of Thrones season 8 was indeed disappointing, and it seemed as though the directors were unable to learn the patterns from the previous seasons. But worry not: LSTMs never miss a pattern, and they will help you produce a better script, an AI-generated script.

Before diving into the code, the model and its training, we will briefly go over what LSTM (Long Short-Term Memory) cells are and how they are useful.

LSTM Conceptual

An LSTM network is a recurrent neural network that has LSTM cell blocks in place of our regular neural network layers. These cells have different segments called the input gate, the forget gate and the output gate as shown in the image below-

LSTM gates

The image below shows how the gates operate and the mathematical equations involved at each gate, which define what each gate computes.

LSTM gate learning process
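For reference, the standard LSTM update equations behind these gates are given below, with \sigma denoting the sigmoid function and \odot element-wise multiplication:

\begin{align*}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{(forget gate)}\\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{(input gate)}\\
\tilde{c}_t &= \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) && \text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(updated cell state)}\\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(new hidden state)}
\end{align*}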

Implementation and Code

Here we will build a character-wise RNN trained first on Anna Karenina; once training is complete and testing on that text is done, it will be replaced by the combined scripts of Game of Thrones Seasons 3 and 4. After that, it will be able to generate new text based on the text from the season scripts.

import time
from collections import namedtuple
import numpy as np
import tensorflow as tf

To begin, we load the text file and convert it into integers for our network to work with. Here we build a few dictionaries to convert the characters to and from integers. Encoding the characters as integers makes them easier to use as input for training the network.

with open('anna.txt', 'r') as f:
    text = f.read()

# Build the character vocabulary and lookup tables in both directions
vocab = sorted(set(text))
vocab_to_int = {c: i for i, c in enumerate(vocab)}
int_to_vocab = dict(enumerate(vocab))

# Encode the whole text as an array of integers
encoded = np.array([vocab_to_int[c] for c in text], dtype=np.int32)

After we are done with this step, we move on to generating mini-batches for the training process, which we will code as follows-

Making batches for training-

get_batches function for generating mini-batches
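The full function lives in the repository notebook; as a minimal sketch of the idea, a generator along these lines splits the encoded array into batch_size rows and then slices off n_steps columns at a time, with the targets being the inputs shifted one character ahead (the exact repository version may differ slightly):

def get_batches(arr, batch_size, n_steps):
    '''Yield (inputs, targets) mini-batches of shape (batch_size, n_steps).'''
    chars_per_batch = batch_size * n_steps
    n_batches = len(arr) // chars_per_batch

    # Keep only enough characters to make full batches, then reshape into rows
    arr = arr[:n_batches * chars_per_batch]
    arr = arr.reshape((batch_size, -1))

    for n in range(0, arr.shape[1], n_steps):
        x = arr[:, n:n + n_steps]
        # Targets are the inputs shifted by one, wrapping the last column around
        y = np.zeros_like(x)
        y[:, :-1], y[:, -1] = x[:, 1:], x[:, 0]
        yield x, y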

Note —

We want our batches to be multiple sequences of some desired number of sequence steps.

After defining the function that generates our mini-batches, we now create our batches with a batch size of 10 and 50 sequence steps.

batches = get_batches(encoded, 10, 50)
x, y = next(batches)

Now we start building the network. To simplify the process, we will break it up into parts so that each piece is easier to design and understand. Later on, we can combine them into the full network.

Creating Inputs

We begin by creating input placeholders for the training data and the targets, along with a placeholder for the dropout keep probability.
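A minimal sketch of this step in TF1 placeholder style (the function and placeholder names here are illustrative, not necessarily the exact ones in the repository):

def build_inputs(batch_size, num_steps):
    '''Define placeholders for inputs, targets and the dropout keep probability.'''
    inputs = tf.placeholder(tf.int32, [batch_size, num_steps], name='inputs')
    targets = tf.placeholder(tf.int32, [batch_size, num_steps], name='targets')
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    return inputs, targets, keep_prob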

LSTM cell

Now we create the LSTM cell used in the hidden layer; this recurrent cell is the building block of the network.
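Roughly, the cell can be built by stacking BasicLSTMCells wrapped in dropout, as in the sketch below (the layer count and names are illustrative):

def build_lstm(lstm_size, num_layers, batch_size, keep_prob):
    '''Stack LSTM cells with dropout on their outputs and return the cell
    together with its zero initial state.'''
    def build_cell():
        cell = tf.nn.rnn_cell.BasicLSTMCell(lstm_size)
        return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)

    cell = tf.nn.rnn_cell.MultiRNNCell([build_cell() for _ in range(num_layers)])
    initial_state = cell.zero_state(batch_size, tf.float32)
    return cell, initial_state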

Output

We are almost done; we just need to connect the output of the RNN cells to a fully connected layer with a softmax output. The softmax output gives us a probability distribution that we can use to predict the next character.
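As a sketch of this step, the (batch, steps, lstm_size) output is flattened and pushed through a single fully connected softmax layer (variable names are illustrative):

def build_output(lstm_output, in_size, out_size):
    '''Map the LSTM outputs to per-character logits and softmax probabilities.'''
    # Flatten (batch, steps, lstm_size) into rows of size lstm_size
    x = tf.reshape(lstm_output, [-1, in_size])

    # Fully connected layer from LSTM size to vocabulary size
    with tf.variable_scope('softmax'):
        softmax_w = tf.Variable(tf.truncated_normal((in_size, out_size), stddev=0.1))
        softmax_b = tf.Variable(tf.zeros(out_size))

    logits = tf.matmul(x, softmax_w) + softmax_b
    out = tf.nn.softmax(logits, name='predictions')
    return out, logits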

Loss and Optimizer

Using Cross-Entropy Loss and Adam Optimizer for the learning step.

def build_loss(logits, targets, lstm_size, num_classes):
    # One-hot encode the targets and reshape them to match the logits
    y_one_hot = tf.one_hot(targets, num_classes)
    y_reshaped = tf.reshape(y_one_hot, logits.get_shape())

    # Softmax cross-entropy, averaged over the batch
    loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_reshaped)
    loss = tf.reduce_mean(loss)
    return loss


def build_optimizer(loss, learning_rate, grad_clip):
    # Clip gradients by global norm to avoid exploding gradients
    tvars = tf.trainable_variables()
    grads, _ = tf.clip_by_global_norm(tf.gradients(loss, tvars), grad_clip)
    train_op = tf.train.AdamOptimizer(learning_rate)
    optimizer = train_op.apply_gradients(zip(grads, tvars))

    return optimizer

Now that we have coded all of the above pieces, it is time to combine them into the final network-

Training the Network

Here is a regular training loop: we feed inputs and targets into the network and run the optimizer. We also fetch the final LSTM state for the mini-batch, then pass that state back into the network so the next batch can continue from the state of the previous batch.
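As a rough sketch, the helper functions from the sections above can be wired together and trained like this (the hyperparameter values are illustrative; the exact training code in the repository may differ):

batch_size, num_steps = 10, 50
lstm_size, num_layers = 512, 2
learning_rate, grad_clip = 0.001, 5
epochs, keep_probability = 20, 0.5

tf.reset_default_graph()

# Wire the pieces from the previous sections into one graph
inputs, targets, keep_prob = build_inputs(batch_size, num_steps)
cell, initial_state = build_lstm(lstm_size, num_layers, batch_size, keep_prob)
x_one_hot = tf.one_hot(inputs, len(vocab))
outputs, final_state = tf.nn.dynamic_rnn(cell, x_one_hot, initial_state=initial_state)
predictions, logits = build_output(outputs, lstm_size, len(vocab))
loss = build_loss(logits, targets, lstm_size, len(vocab))
optimizer = build_optimizer(loss, learning_rate, grad_clip)

saver = tf.train.Saver(max_to_keep=100)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for e in range(epochs):
        # Start every epoch from a fresh (zero) LSTM state
        new_state = sess.run(initial_state)
        for x, y in get_batches(encoded, batch_size, num_steps):
            feed = {inputs: x,
                    targets: y,
                    keep_prob: keep_probability,
                    initial_state: new_state}
            # Fetch the final state so the next batch continues where this one ended
            batch_loss, new_state, _ = sess.run([loss, final_state, optimizer],
                                                feed_dict=feed)
        print('Epoch {}/{}  training loss: {:.4f}'.format(e + 1, epochs, batch_loss))
    saver.save(sess, 'checkpoints/char_rnn.ckpt')  # the 'checkpoints' folder must exist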

Result-

checkpoint = tf.train.latest_checkpoint('checkpoints')
samp = sample(checkpoint, 10000, lstm_size, len(vocab), prime="Far")
print(samp)
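The sampling routine restores the checkpoint and feeds each generated character back into the network one step at a time; the piece worth sketching here is how the next character is picked from the softmax output. A minimal illustration, assuming preds is the probability distribution returned by the network:

def pick_top_n(preds, vocab_size, top_n=5):
    '''Sample the next character from the top-n most probable characters,
    which keeps the output varied without letting it wander into nonsense.'''
    p = np.squeeze(preds).copy()    # copy so we do not modify preds in place
    p[np.argsort(p)[:-top_n]] = 0   # zero out everything except the top-n probabilities
    p = p / np.sum(p)               # renormalise so the remaining probabilities sum to 1
    return np.random.choice(vocab_size, 1, p=p)[0]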

The sample function used here is part of the full code, which you can find in my repository along with the entire notebook and the code snippets needed to run your own version. Repository-

Game of Thrones NewScript Time

Now that you have learned how to use an LSTM to generate new text from a book, just download the subtitles of each episode and stitch them together to form a season.

Once you have done that, replace anna.txt in my repository with your own text file of episodes or seasons.

Train on a single season first and look at the results; then keep adding more seasons to enlarge the dataset and refine the learning process even further.

Post your results by raising an issue on my repository so that we can all see what innovative scripts everyone comes up with, and maybe arrive at an even better season 8, developed exclusively by AI.


Ujwal Tewari
Senior Research Scientist @Games24x7 | Intel AI innovator | Udacity DRL mentor | ML & AI blogger