Basic Recurrent Neural Network Tutorial — 1

Ting-Hao Chen
Machine Learning Notes
3 min read · Jan 8, 2018

Let’s get our hands dirty and play with a simple RNN in TensorFlow.

If you are interested, the code for this post (Jupyter notebook and Python file) can be found here.


Simple sequence-to-sequence RNN

Let’s play with a simple sequence-to-sequence RNN without using TensorFlow’s built-in RNN API. We have a sequence of inputs, named X0, X1, X2, one per time step. Each time step’s input has 2 features, and there are 3 instances in the batch, ready to run.

Input sequence of data:

X0: [1 2], [3 4], [5 6]
X1: [7 8], [9 10], [11 12]
X2: [13 14], [15 16], [17 18]

The i-th instance of the sequence goes into the first hidden layer (there is only one hidden layer here). Each input placeholder has shape [batch_size, n_inputs]. For example, the first instance (i = 0) of the sequence is [1 2], [7 8], [13 14]. The cell contains 8 neurons, so the input weight matrix Wx has shape [n_inputs, n_neurons].

The hidden states y0, y1, y2 each have shape [batch_size, n_neurons].

In order to output a single value per time step, we add a dense layer on top of each hidden state. Each output therefore has shape [batch_size, 1].

Simple RNN flow
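
To make these shapes concrete, here is a minimal numpy sketch of the recurrence (the variable names mirror the TensorFlow code below; the random weights are only for illustration):

import numpy as np

n_inputs, n_neurons = 2, 8

X0 = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.float32)    # [batch_size, n_inputs]
X1 = np.array([[7, 8], [9, 10], [11, 12]], dtype=np.float32)

Wx = np.random.randn(n_inputs, n_neurons).astype(np.float32)   # input weights
Wy = np.random.randn(n_neurons, n_neurons).astype(np.float32)  # recurrent weights
b = np.zeros((1, n_neurons), dtype=np.float32)

y0 = np.tanh(X0 @ Wx + b)            # [batch_size, n_neurons]
y1 = np.tanh(y0 @ Wy + X1 @ Wx + b)  # the previous state feeds into the next step
print(y0.shape, y1.shape)            # (3, 8) (3, 8)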

Next, let’s build an RNN model, or, in TensorFlow terms, let’s draw a graph.

import tensorflow as tf
import numpy as np

# hyperparameters
n_neurons = 8

# parameters
n_inputs = 2

# input data: a batch of 3 instances, 2 features each, over 3 time steps
X0_data = np.array([[1, 2], [3, 4], [5, 6]])
X1_data = np.array([[7, 8], [9, 10], [11, 12]])
X2_data = np.array([[13, 14], [15, 16], [17, 18]])

# build a sequence-to-sequence rnn model
X0 = tf.placeholder(tf.float32, [None, n_inputs])  # shape = [batch_size, n_inputs]
X1 = tf.placeholder(tf.float32, [None, n_inputs])
X2 = tf.placeholder(tf.float32, [None, n_inputs])

Wx = tf.Variable(tf.random_normal([n_inputs, n_neurons]))   # input weights
Wy = tf.Variable(tf.random_normal([n_neurons, n_neurons]))  # recurrent weights
b = tf.Variable(tf.zeros([1, n_neurons]))                   # bias

# each hidden state depends on the current input and the previous hidden state
y0 = tf.tanh(tf.matmul(X0, Wx) + b)  # shape: [batch_size, n_neurons]
y1 = tf.tanh(tf.matmul(y0, Wy) + tf.matmul(X1, Wx) + b)
y2 = tf.tanh(tf.matmul(y1, Wy) + tf.matmul(X2, Wx) + b)

# a dense layer maps each hidden state to a single value
# (note: each tf.layers.dense call creates its own, separate weights)
output0 = tf.layers.dense(y0, 1)  # shape: [batch_size, 1]
output1 = tf.layers.dense(y1, 1)
output2 = tf.layers.dense(y2, 1)

Great! Now let’s run our model! (We are only doing a forward pass here; there is no training step yet.)

# initialize the variables
init = tf.global_variables_initializer()

# run a forward pass
with tf.Session() as sess:
    sess.run(init)
    output0_eval, output1_eval, output2_eval = sess.run(
        [output0, output1, output2],
        feed_dict={X0: X0_data, X1: X1_data, X2: X2_data})
    for i in range(3):
        print('input0: {} input1: {} input2: {} -> output0: {} output1: {} output2: {}'.format(
            X0_data[i], X1_data[i], X2_data[i],
            output0_eval[i], output1_eval[i], output2_eval[i]))

The output looks something like this (the weights are initialized randomly, so your exact numbers will differ).

input0: [1 2] input1: [7 8] input2: [13 14] -> 
output0: [-0.32207465] output1: [-0.51884687] output2: [-1.17626405]
input0: [3 4] input1: [ 9 10] input2: [15 16] ->
output0: [-1.07676077] output1: [-0.48756164] output2: [-1.17693329]
input0: [5 6] input1: [11 12] input2: [17 18] ->
output0: [-1.12765169] output1: [-0.48506528] output2: [-1.17749369]

By playing with this, you should get a good intuition of what a sequence-to-sequence recurrent neural network is doing. Next, we’ll use the TensorFlow API to implement the same scenario.
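
As a preview, the same graph can be written much more compactly with TensorFlow’s built-in cells. This is just a rough sketch of the idea, not the exact code of the next post:

import tensorflow as tf

n_inputs = 2
n_neurons = 8

X0 = tf.placeholder(tf.float32, [None, n_inputs])
X1 = tf.placeholder(tf.float32, [None, n_inputs])
X2 = tf.placeholder(tf.float32, [None, n_inputs])

# BasicRNNCell implements the same tanh(X·Wx + y_prev·Wy + b) step we wrote by hand
cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
ys, final_state = tf.nn.static_rnn(cell, [X0, X1, X2], dtype=tf.float32)

# one shared Dense layer projects every hidden state down to a single value
out_layer = tf.layers.Dense(1)
outputs = [out_layer(y) for y in ys]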
