Basic Recurrent Neural Network Tutorial — 2

Ting-Hao Chen
Machine Learning Notes
3 min read · Jan 8, 2018

We are going to use tf.nn.rnn_cell.BasicRNNCell + tf.nn.static_rnn to build a simple RNN.

If you are interested, the code (Jupyter notebook and Python file) for this post can be found here.

Cover image credit: https://blogs.biomedcentral.com/on-biology/2015/10/09/aging-brain-editor-discussion/

Build Basic RNN Cell with static_rnn

We feed in three steps of data per example (X0, X1, X2), so the placeholder gets a new dimension, n_steps. BasicRNNCell handles the matrix multiplications and uses tanh() as the default activation.

The input sequence of data from the previous post:
import numpy as np
import tensorflow as tf

# input data
X_data = np.array([
    # steps:  1st       2nd       3rd
    [[1, 2],  [7, 8],   [13, 14]],   # first batch
    [[3, 4],  [9, 10],  [15, 16]],   # second batch
    [[5, 6],  [11, 12], [17, 18]]    # third batch
])  # shape: [batch_size, n_steps, n_inputs]

static_rnn expects a list of n_steps tensors, each of shape [batch_size, n_inputs], so we need to unstack our X. Unstacking along axis=1 pulls the n_steps dimension out, so the i-th tensor in the resulting list has shape [batch_size, n_inputs], just like in the previous example. Below is quoted from TensorFlow's description of tf.unstack:

“Given a tensor of shape (A, B, C, D). If axis == 1, then the i’th tensor in output is the slice value[ : , i , : , : ] and each tensor in output will have shape (A, C, D).”
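To make the slicing concrete, here is a small numpy sketch of what unstacking along axis=1 produces (illustration only; it reuses the X_data defined above, and the name X_seq_np is made up for this example):

# numpy equivalent of tf.unstack(X, axis=1): take X_data[:, i, :] for every step i
X_seq_np = [X_data[:, i, :] for i in range(X_data.shape[1])]
for i, step in enumerate(X_seq_np):
    print('step', i, 'shape:', step.shape)   # each step has shape (3, 2) = [batch_size, n_inputs]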

After static_rnn, we'll have to use tf.stack to pack the list of per-step outputs back into a single tensor of shape [batch_size, n_steps, n_neurons].

# hyperparameters
n_neurons = 8

# parameters
n_inputs = X_data.shape[2]
n_steps = X_data.shape[1]

# rnn model
X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
X_seq = tf.unstack(X, axis=1) # a list of n_steps tensors, each of shape [batch_size, n_inputs]

cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
output, state = tf.nn.static_rnn(cell, X_seq, dtype=tf.float32)

output_st = tf.stack(output, axis=1)

tf.nn.static_rnn returns two results, so let's see the difference:

  • output: the outputs at every step

In this case, output is a list of three tensors (one per step), and each tensor has shape [batch_size, n_neurons].

  • state: only the final state

In this case, state is the state after the last step, with shape [batch_size, n_neurons]. For a BasicRNNCell the per-step output is also the new state, so state equals the last output (see the sketch below).
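Why does the last output equal the state? A basic RNN cell applies the same update at every step, roughly h_t = tanh(x_t·W_x + h_{t-1}·W_h + b), and returns that h_t both as the step's output and as the new state. Here is a rough numpy sketch of that recurrence (the weight names and random inputs are made up purely for illustration, not TensorFlow's actual variables):

import numpy as np

def basic_rnn_step(x_t, h_prev, W_x, W_h, b):
    # one step of a basic RNN cell: h_t = tanh(x_t @ W_x + h_prev @ W_h + b)
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

batch_size, n_steps, n_inputs, n_neurons = 3, 3, 2, 8
rng = np.random.default_rng(0)
W_x = rng.normal(size=(n_inputs, n_neurons))
W_h = rng.normal(size=(n_neurons, n_neurons))
b = np.zeros(n_neurons)
x_steps = [rng.normal(size=(batch_size, n_inputs)) for _ in range(n_steps)]

h = np.zeros((batch_size, n_neurons))    # initial state
outputs = []
for x_t in x_steps:
    h = basic_rnn_step(x_t, h, W_x, W_h, b)
    outputs.append(h)                    # the per-step output is the new state

print(np.array_equal(outputs[-1], h))    # True: the last output is the final state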

Finally, let's run the graph and check the shapes!

# initialize the variables
init = tf.global_variables_initializer()

# run the graph
with tf.Session() as sess:
    sess.run(init)
    feed_dict = {X: X_data}

    # print the shapes
    X_seq_shape = sess.run(tf.shape(X_seq), feed_dict=feed_dict)
    output_shape = sess.run(tf.shape(output), feed_dict=feed_dict)
    state_shape = sess.run(tf.shape(state), feed_dict=feed_dict)
    output_st_shape = sess.run(tf.shape(output_st), feed_dict=feed_dict)
    print('X_seq shape [n_steps, batch_size, n_inputs]: ', X_seq_shape)
    print('output shape [n_steps, batch_size, n_neurons]: ', output_shape)
    print('state shape [batch_size, n_neurons]: ', state_shape)
    print('output_st shape [batch_size, n_steps, n_neurons]: ', output_st_shape)

    # compare the output at the last step with the state
    output_eval, state_eval = sess.run([output, state], feed_dict=feed_dict)
    print('Is the output of the last step equal to the state?', np.array_equal(output_eval[2], state_eval))

Here is the output:

X_seq shape [n_steps, batch_size, n_inputs]:  [3 3 2]
output shape [n_steps, batch_size, n_neurons]:  [3 3 8]
state shape [batch_size, n_neurons]:  [3 8]
output_st shape [batch_size, n_steps, n_neurons]:  [3 3 8]
Is the output of the last step equal to the state? True

Since we have to unstack and stack the data ourselves, we could use dynamic_rnn to avoid that. I'll show you how in the next post.
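As a quick preview, here is a minimal sketch of the dynamic_rnn version, assuming the same X placeholder, n_neurons, and cell setup as above; it takes the [batch_size, n_steps, n_inputs] tensor directly, so no unstack/stack is needed:

cell = tf.nn.rnn_cell.BasicRNNCell(num_units=n_neurons)
outputs, state = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
# outputs: [batch_size, n_steps, n_neurons], state: [batch_size, n_neurons]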
