Using the Dropout API in TensorFlow (6/7)
In the previous part we built a multi-layered LSTM RNN. In this post we will make it less prone to overfitting (that is, regularize it) by adding something called dropout. It is a simple trick that randomly turns off neuron activations during training, and it was pioneered by Geoffrey Hinton among others; you can read their original article here.
Fortunately this is very simple to do in TensorFlow: between lines 41–42 you simply add a
DropoutWrapper, specifying the probability of keeping an activation, called
output_keep_prob. Change lines 41–42 to the code below.
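For reference, the changed lines might look roughly like the sketch below. It assumes the variable names from the previous parts (state_size, num_layers) and the TensorFlow 1.x cell API, and the keep probability of 0.9 is just an illustrative value.

```python
import tensorflow as tf

state_size = 4   # example values; use whatever your script already defines
num_layers = 3

cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
# Wrap the cell so its outputs are randomly dropped during training;
# output_keep_prob is the probability of *keeping* an activation.
cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=0.9)
cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers, state_is_tuple=True)
```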
Don’t drop out too much, or you will need a larger state size to be sure some of the information survives (in our toy example at least). As you can read in this article, dropout in TensorFlow is applied between stacked RNN layers, not on the recurrent connections.
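If you want to experiment with how much to drop out, one common variation (not part of the original script; the keep_prob name is just illustrative) is to feed the keep probability through a placeholder, so you can train with dropout active and evaluate with it switched off:

```python
import tensorflow as tf

state_size = 4   # example values, as above
num_layers = 3

# A scalar placeholder lets you change the keep probability per session run.
keep_prob = tf.placeholder(tf.float32, shape=[], name="keep_prob")

cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)
cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers, state_is_tuple=True)

# During training:   feed_dict={..., keep_prob: 0.9}  -> dropout active
# During evaluation: feed_dict={..., keep_prob: 1.0}  -> dropout disabled
```

The graph stays the same; you just disable dropout cleanly whenever you measure accuracy.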
This is the whole self-contained script; just copy and run it.
In the next part we will further regularize it by using something called batch normalization. Stay tuned, it will be coming soon :)