Mistakes to avoid in your TensorFlow model
I have been a complete idiot (and I continue to be one) and have made quite a few mistakes when building TensorFlow models. I thought of documenting the most common and silly ones here, so that it might help someone.
Not disabling dropout during prediction time:
Dropout is one of the most common regularization techniques used in neural networks to keep a model from overfitting. TensorFlow’s implementation of dropout has a parameter, keep_prob, which controls the probability with which each unit in the neural network is kept (i.e. not randomly dropped).
The idea behind dropout is that if units are randomly disabled, the noise that is introduced keeps the model from latching onto one particular pattern (a.k.a. overfitting) and helps it generalise.
So, it makes sense to apply dropout only during training time and disable it during test/prediction time, and TensorFlow doesn’t do that automatically for us. This can be controlled by feeding keep_prob through a placeholder and setting the placeholder’s value to 1.0 during prediction time.
If keep_prob is not set to 1.0 during prediction time, you will get different predictions each time for the same set of data, due to the randomness of the dropout.
Not setting a random seed
If your predictions are not stable for the same input across different iterations (epochs/batches), it is highly probable that your random values are varying on each iteration. The easiest way to address this is to set a global seed for all the random operations in the graph.
# should be set before any declaration of random ops in the graph
tf.set_random_seed(42)
The seed value can also be set at the operation level; TensorFlow has a seed parameter in all of its random operations.
Not naming your placeholders and TensorFlow operations
I cannot stress enough the importance of naming all your placeholders and ops. This not only improves the interpretability of your graph, it also saves a ton of time when identifying which placeholders and operations to feed values into from your frozen/saved model.
When you name your placeholders and operations, you don’t have to search the graph for the default names that TensorFlow has assigned; you can just use your own names to access them.
# Definition
placeholder = tf.placeholder(dtype=tf.float32, shape=[None, 300], name="tensor_name")
# Retrieving the tensor from a frozen model using its name
tensor = graph.get_tensor_by_name('scope_name/tensor_name:0')