Making Your Own Simpsons TV Script

Harshit Awasthi
Apr 3, 2018


Have you ever thought about creating your own TV script, but felt you lacked the right skills? No worries, I've got you covered. In this blog post I'll show you how to generate your own TV script! Exciting, isn't it? Grab a coke and some popcorn and let's jump right in!

We are going to use an LSTM to generate our own Simpsons TV script. You just need a laptop, an internet connection, your own GPU or access to one through AWS or FloydHub, and some knowledge of Python and deep learning. I'll be using TensorFlow as the deep learning framework.

Dataset Preparation

We are going to use the Simpsons TV script dataset to generate our own script of a scene at Moe's Tavern. We will use a sample of the dataset drawn from 27 seasons. You can obtain the dataset below.

Preprocessing

Before doing anything with your data, you first need to clean it. And to clean your dataset, you first need to see what it contains: how many scenes there are, how many sentences in each scene, the total number of unique words, and so on.

You first create a vocabulary containing all the unique words in your dataset. Since you can't feed words directly into your neural network, you need a dictionary to convert words into IDs (numbers) and another dictionary to get words back from IDs.

def create_lookup_tables(text):
    vocab = set(text)
    vocab_to_int = {word: i for i, word in enumerate(vocab)}
    int_to_vocab = dict(enumerate(vocab))
    return vocab_to_int, int_to_vocab
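
For example, a quick hypothetical usage, where the script has already been split into a list of words:

words = ['moe_szyslak:', 'hey', 'homer', '||period||']
vocab_to_int, int_to_vocab = create_lookup_tables(words)

int_text = [vocab_to_int[word] for word in words]   # list of word IDs
restored = [int_to_vocab[i] for i in int_text]      # back to the original words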

Now comes the cleaning part. Punctuation characters are not exactly words, and they can make it hard for our neural network to distinguish words like Harry and Harry's (a name with an apostrophe), so we replace them with unique tokens using a small lookup function (called token_lookup below). For instance:

def token_lookup():
    # Map each punctuation symbol to a unique token
    process = {'.': '||period||',
               ',': '||comma||',
               '"': '||quotation_mark||',
               ';': '||semicolon||',
               '!': '||exclamation_mark||',
               '?': '||question_mark||',
               '(': '||left_parentheses||',
               ')': '||right_parentheses||',
               '--': '||dash||',
               '\n': '||return||'}   # '\n' token assumed here, following the same pattern
    return process
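
With those tokens in place, the raw script can be preprocessed roughly like this (a sketch, assuming `text` holds the raw script as a single string):

token_dict = token_lookup()
for symbol, token in token_dict.items():
    # Pad each token with spaces so it splits cleanly into its own "word"
    text = text.replace(symbol, ' {} '.format(token))

text = text.lower()
words = text.split()

vocab_to_int, int_to_vocab = create_lookup_tables(words)
int_text = [vocab_to_int[word] for word in words]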

Creating placeholders

You need to create placeholders for the input, targets, and learning rate, because you will feed these into the training process. You can do so using TensorFlow's tf.placeholder function.

def get_inputs():
    input_ = tf.placeholder(tf.int32, [None, None], name='input')
    targets_ = tf.placeholder(tf.int32, [None, None], name='targets')
    learning_rate_ = tf.placeholder(tf.float32, shape=(), name='learning_rate')
    return input_, targets_, learning_rate_

Creating RNN cell

Now we stack one or more BasicLSTMCell instances in a MultiRNNCell, depending on your choice. It is generally recommended to stack two to four LSTM cells; more layers usually give the network a better chance of modelling complex data. (The helper below is named get_init_cell for convenience.)

def get_init_cell(batch_size, rnn_size):
    layers = 2
    cells = []
    for i in range(layers):
        lstm = tf.contrib.rnn.BasicLSTMCell(rnn_size)
        drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=0.7)
        cells.append(drop)

    cell = tf.contrib.rnn.MultiRNNCell(cells)
    initial_state = tf.identity(cell.zero_state(batch_size, tf.float32), name="initial_state")

    return cell, initial_state

I'll explain briefly what the code above does. We set the number of LSTM layers to 2, create each layer with tf.contrib.rnn.BasicLSTMCell, and stack the cells with MultiRNNCell. Each cell is wrapped in a dropout layer, which helps reduce overfitting when the weights are updated during backpropagation.

Creating Embedding

Embedding maps words to vectors (tensors) of real numbers with far fewer dimensions than the vocabulary size, i.e. the total number of unique words in your dataset. This makes the network much more computationally efficient than feeding in one-hot vectors. A simple way to apply embedding to the input data is given below.

tf.contrib.layers.embed_sequence(input_data, vocab_size, embed_dim)
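
The build_nn function further down calls a small helper, get_embed; a minimal sketch of it, simply wrapping the call above, could be:

def get_embed(input_data, vocab_size, embed_dim):
    # Map each word ID in input_data to a dense embed_dim-dimensional vector
    return tf.contrib.layers.embed_sequence(input_data, vocab_size, embed_dim)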

Building RNN cell and Neural Network

Now we build our RNN using the tf.nn.dynamic_rnn function. We could also use tf.nn.rnn, but it is quite slow and only accepts a fixed number of steps.

def build_rnn(cell, inputs):
    outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
    final_state = tf.identity(state, name='final_state')
    return outputs, final_state

Finally we build our neural network using the functions created above. We apply the embedding to get embedded inputs, then use the build_rnn function to get our outputs and state. We then define our weights and biases, initializing them with a truncated normal distribution and with zeros respectively. Finally we add a dense (fully connected) layer with no activation, whose output size equals the vocabulary size.

def build_nn(cell, rnn_size, input_data, vocab_size, embed_dim):
    embed_data = get_embed(input_data, vocab_size, embed_dim)
    outputs, state = build_rnn(cell=cell, inputs=embed_data)
    weights_init = tf.truncated_normal_initializer(stddev=0.1)
    bias_init = tf.zeros_initializer()
    logits = tf.contrib.layers.fully_connected(outputs,
                                               num_outputs=vocab_size,
                                               activation_fn=None,
                                               weights_initializer=weights_init,
                                               biases_initializer=bias_init)
    return logits, state

Creating Batches

Here comes the slightly tricky part. We have our inputs as sequences of word IDs, but what about the labels, or targets? To generate a new TV script, we need targets to train the model on, by minimizing the loss between the actual output (targets) and the predicted output.

So we assign each label to be the next word in the input sequence, with the last label wrapping around to the first input.

Suppose we have the sentence "The Starry night". Then:

Input sequence: [The, Starry, night]

Labels sequence: [Starry, night, The]

import numpy as np

def get_batches(int_text, batch_size, seq_length):
    char_per_batch = batch_size * seq_length
    num_batch = len(int_text) // char_per_batch

    input_text = np.array(int_text[:(num_batch * char_per_batch)])
    target_text = np.array(int_text[1:(num_batch * char_per_batch) + 1])

    # The last target wraps around to the first input
    target_text[-1] = input_text[0]

    x_batches = np.split(input_text.reshape(batch_size, -1), num_batch, 1)
    y_batches = np.split(target_text.reshape(batch_size, -1), num_batch, 1)

    batches = np.array(list(zip(x_batches, y_batches)))

    return batches
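
As a quick (hypothetical) sanity check, each element of the returned array holds one input batch and one target batch, each of shape (batch_size, seq_length):

int_text_demo = list(range(100))                   # pretend word IDs
batches = get_batches(int_text_demo, batch_size=4, seq_length=5)

print(batches.shape)         # (5, 2, 4, 5): 5 batches of (inputs, targets)
print(batches[0][0].shape)   # (4, 5): one input batch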

Training

Now that we have our inputs and targets, we are ready to train our model. First we define our hyperparameters.

# Number of Epochs
num_epochs = 100
# Batch Size
batch_size = 512
# RNN Size
rnn_size = 512
# Embedding Dimension Size
embed_dim = 256
# Sequence Length
seq_length = 15
# Learning Rate
learning_rate = 0.009
# Show stats for every n number of batches
show_every_n_batches = 10
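
The training itself follows the usual TensorFlow 1.x pattern. The sketch below assumes the functions defined earlier (get_inputs, get_init_cell, build_nn, get_batches), the hyperparameters above, and that int_text and int_to_vocab come from the preprocessing step; the save path './save' is just an example choice.

import numpy as np
import tensorflow as tf

train_graph = tf.Graph()
with train_graph.as_default():
    vocab_size = len(int_to_vocab)
    input_text, targets, lr = get_inputs()
    input_data_shape = tf.shape(input_text)

    cell, initial_state = get_init_cell(input_data_shape[0], rnn_size)
    logits, final_state = build_nn(cell, rnn_size, input_text, vocab_size, embed_dim)

    # Word probabilities, used later when generating the script
    probs = tf.nn.softmax(logits, name='probs')

    # Average cross-entropy over every time step in the batch
    cost = tf.contrib.seq2seq.sequence_loss(
        logits, targets, tf.ones([input_data_shape[0], input_data_shape[1]]))

    # Adam optimizer with gradient clipping to keep the LSTM stable
    optimizer = tf.train.AdamOptimizer(lr)
    gradients = optimizer.compute_gradients(cost)
    capped_gradients = [(tf.clip_by_value(grad, -1., 1.), var)
                        for grad, var in gradients if grad is not None]
    train_op = optimizer.apply_gradients(capped_gradients)

batches = get_batches(int_text, batch_size, seq_length)

with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())

    for epoch_i in range(num_epochs):
        state = sess.run(initial_state, {input_text: batches[0][0]})

        for batch_i, (x, y) in enumerate(batches):
            feed = {input_text: x, targets: y,
                    initial_state: state, lr: learning_rate}
            train_loss, state, _ = sess.run([cost, final_state, train_op], feed)

            if (epoch_i * len(batches) + batch_i) % show_every_n_batches == 0:
                print('Epoch {:>3} Batch {:>4}/{}  train_loss = {:.3f}'.format(
                    epoch_i, batch_i, len(batches), train_loss))

    # Save the trained model for the generation step
    saver = tf.train.Saver()
    saver.save(sess, './save')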

After saving our model, we define a function that picks the next word by sampling from the predicted probability distribution.

def pick_word(probabilities, int_to_vocab):
    return np.random.choice(list(int_to_vocab.values()), p=probabilities)

Now we just have to write a few lines of code to generate our completely new TV script.
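
A minimal generation loop could look like the sketch below, assuming the model was saved to './save' as in the training sketch, that probs, initial_state and final_state were given those names when the graph was built, and that the lookup tables and seq_length are still in memory; prime_word and gen_length are arbitrary example choices.

import numpy as np
import tensorflow as tf

gen_length = 200
prime_word = 'moe_szyslak'

loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Load the trained model
    loader = tf.train.import_meta_graph('./save.meta')
    loader.restore(sess, './save')

    # Grab the tensors we need by the names given when building the graph
    input_text = loaded_graph.get_tensor_by_name('input:0')
    initial_state = loaded_graph.get_tensor_by_name('initial_state:0')
    final_state = loaded_graph.get_tensor_by_name('final_state:0')
    probs = loaded_graph.get_tensor_by_name('probs:0')

    # Seed the script with a prime word and an initial state
    gen_sentences = [prime_word + ':']
    prev_state = sess.run(initial_state, {input_text: np.array([[1]])})

    for _ in range(gen_length):
        dyn_input = [[vocab_to_int[word] for word in gen_sentences[-seq_length:]]]
        dyn_seq_length = len(dyn_input[0])

        probabilities, prev_state = sess.run(
            [probs, final_state],
            {input_text: dyn_input, initial_state: prev_state})

        # Sample the next word from the probabilities of the last time step
        pred_word = pick_word(probabilities[0][dyn_seq_length - 1], int_to_vocab)
        gen_sentences.append(pred_word)

    # Turn the punctuation tokens back into real punctuation
    tv_script = ' '.join(gen_sentences)
    for symbol, token in token_lookup().items():
        tv_script = tv_script.replace(' ' + token.lower(), symbol)

print(tv_script)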

Here is an example of a generated script.

moe_szyslak: uh, hey, how ya doin'?
homer_simpson: i was just tellin' all the bad news to not fail.
homer_simpson: yeah, me, you'd treat her right.(regretful) as a little one, i think of my treasure.
moe_szyslak: oh guys, it was horrible.
moe_szyslak: ya bunch of ungrateful ingrates! ya--
carl_carlson: you got this?
moe_szyslak: no, no, no. not a little girl will be in the air.
moe_szyslak: yeah, you don't even have a beer?
homer_simpson:(sunk) i dunno.


moe_szyslak: sure.
homer_simpson: number a mean, or a deal. i didn't mean that.(to home) there's a thing i call my man. i am not an angel!
moe_szyslak: well, i guess the world's smallest violin. and you can just waltz off the
homer_simpson: to a...


moe_szyslak:(cutting him off) too late, or is it.
homer_simpson: moe, i could forget ya.

Congratulations! You've just created your own TV script!

You can check out the entire code for script generation here.

I'm really excited to see what scripts you are going to create, Director! Cheers!

