Mistakes to avoid in your TensorFlow model

I have been a complete idiot (and continue to be one) and have made quite a few mistakes when building TensorFlow models. I thought of documenting the most stupid and most common of them here, so that it might help someone.

Not disabling dropout during prediction time

Dropout is one of the most common regularization techniques used in neural networks to keep a model from overfitting. TensorFlow's implementation of dropout takes a parameter, keep_prob, which controls the probability with which each unit in the neural network is kept; each unit is randomly dropped with probability 1 - keep_prob.

The idea behind dropout is that if units are randomly disabled, the chaos this introduces keeps the model from learning only a particular pattern (a.k.a. overfitting) and helps it generalise.
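To make the mechanism concrete, here is a minimal plain-Python sketch of dropout on a list of activations. It is not TensorFlow's actual implementation: the `dropout` helper and its `rng` argument are names I made up for illustration, and the scaling of kept units by 1 / keep_prob is the usual "inverted dropout" convention.

```python
import random


def dropout(units, keep_prob, rng):
    """Hypothetical helper: keep each unit with probability keep_prob.

    Kept units are scaled by 1 / keep_prob so the expected sum stays
    the same; dropped units become 0.0.
    """
    return [u / keep_prob if rng.random() < keep_prob else 0.0 for u in units]


rng = random.Random(0)  # local generator so the sketch does not touch global state
activations = [1.0] * 10

# Roughly half of the units are zeroed, the rest are doubled.
print(dropout(activations, keep_prob=0.5, rng=rng))

# With keep_prob = 1.0 nothing is dropped and the input comes back unchanged.
print(dropout(activations, keep_prob=1.0, rng=rng))
```

With keep_prob set to 1.0 the random draw can never exceed the threshold, which is why that value effectively switches dropout off.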

So it only makes sense to apply dropout during training and to avoid it at test / prediction time, and TensorFlow doesn't do that automatically for us. This can be controlled by feeding the keep_prob parameter through a placeholder and setting the value of that placeholder to 1.0 at prediction time.

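# training_time: keep half of the units (sess, train_op, x, y and the batch names are illustrative)
sess.run(train_op, feed_dict={x: batch_x, y: batch_y, keep_prob: 0.5})
# test_time: keep every unit, which disables dropout
sess.run(predictions, feed_dict={x: test_x, keep_prob: 1.0})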

If keep_prob is not set to 1.0 at prediction time, you will get different predictions each time for the same set of data, due to the randomness of dropout.
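Here is a small runnable sketch of that behaviour. It assumes the TF 1.x graph API reached through `tf.compat.v1` (so it should also run on TF 2.x), and the toy "model" is just dropout applied to the input; the names `inputs`, `outputs` and `batch` are mine, not part of any real model.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph API, also shipped with TF 2.x

graph = tf.Graph()
with graph.as_default():
    # Hypothetical toy "model": dropout applied directly to the input.
    inputs = tf1.placeholder(tf.float32, shape=[None, 100], name="inputs")
    # Scalar placeholder so the keep probability can change on every run.
    keep_prob = tf1.placeholder(tf.float32, shape=[], name="keep_prob")
    outputs = tf1.nn.dropout(inputs, keep_prob=keep_prob)

batch = np.ones((1, 100), dtype=np.float32)

with tf1.Session(graph=graph) as sess:
    # Training time: units are randomly zeroed, so two runs will almost surely differ.
    train_a = sess.run(outputs, feed_dict={inputs: batch, keep_prob: 0.5})
    train_b = sess.run(outputs, feed_dict={inputs: batch, keep_prob: 0.5})
    # Prediction time: every unit is kept, so repeated runs agree.
    test_a = sess.run(outputs, feed_dict={inputs: batch, keep_prob: 1.0})
    test_b = sess.run(outputs, feed_dict={inputs: batch, keep_prob: 1.0})

print("training runs identical:", np.array_equal(train_a, train_b))
print("prediction runs identical:", np.array_equal(test_a, test_b))
```

Feeding 1.0 at prediction time returns the input untouched, while the two training-time runs draw independent dropout masks.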

Not setting a random seed

If your predictions for the same input are not stable across different iterations (epochs/batches), it is highly probable that your random values vary from one iteration to the next. The easiest way to address this is to set a global seed value for all the random values in the graph.

# Should be set before any declaration of
# placeholders/variables/ops.
tf.set_random_seed(42)  # 42 is an arbitrary example value

The seed can also be set at the operation level: TensorFlow's random operations take a seed parameter.
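Both options can be compared with a short sketch. Again this assumes the TF 1.x graph API via `tf.compat.v1`; the `sample` helper and the seed values 42 and 7 are arbitrary choices of mine for illustration.

```python
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph API, also shipped with TF 2.x


def sample(graph_seed=None, op_seed=None):
    """Hypothetical helper: build a fresh graph and draw three random values."""
    graph = tf.Graph()
    with graph.as_default():
        if graph_seed is not None:
            # Graph-level seed: must be set before the random op is created.
            tf1.set_random_seed(graph_seed)
        # Operation-level seed: passed to the random op itself.
        noise = tf1.random_normal([3], seed=op_seed)
    with tf1.Session(graph=graph) as sess:
        return sess.run(noise).tolist()


print("graph-level seed repeatable:", sample(graph_seed=42) == sample(graph_seed=42))
print("op-level seed repeatable:", sample(op_seed=7) == sample(op_seed=7))
print("unseeded repeatable:", sample() == sample())
```

Each call rebuilds the graph from scratch, which mimics restarting your script: with either kind of seed the values repeat, and without one they should not.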

Not naming your placeholders and TensorFlow operations

I cannot stress enough the importance of naming all your placeholders and ops. It not only improves the interpretability of your graph, it also saves you a ton of time when identifying the placeholders and operations to feed values to in your frozen/saved model.

When you name your placeholders and operations, you don't have to search the graph for the default names that TensorFlow has given them; you can just use your own names to access them.

# Definition
placeholder = tf.placeholder(dtype=tf.float32, shape=[None, 300], name="tensor_name")
# Retrieving tensor from a frozen model using name.
tensor = graph.get_tensor_by_name('scope_name/tensor_name:0')
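For a version you can run end to end, here is a rough sketch under a few assumptions: it uses the TF 1.x graph API via `tf.compat.v1`, it re-imports the graph definition into a fresh graph as a stand-in for loading a real frozen model, and the `doubled` op is a made-up example operation.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style graph API, also shipped with TF 2.x

# Build a tiny graph with named nodes (scope and op names are illustrative).
graph = tf.Graph()
with graph.as_default():
    with tf1.name_scope("scope_name"):
        placeholder = tf1.placeholder(tf.float32, shape=[None, 300], name="tensor_name")
        tf.multiply(placeholder, 2.0, name="doubled")

# Stand-in for loading a frozen model: re-import the graph definition
# into a fresh graph, where the Python variables above are not available.
restored = tf.Graph()
with restored.as_default():
    tf1.import_graph_def(graph.as_graph_def(), name="")

# The names are all that is needed to find the input and the output.
tensor = restored.get_tensor_by_name("scope_name/tensor_name:0")
doubled = restored.get_tensor_by_name("scope_name/doubled:0")

with tf1.Session(graph=restored) as sess:
    values = sess.run(doubled, feed_dict={tensor: np.ones((1, 300), dtype=np.float32)})

print(tensor.name, values.shape)
```

The ":0" suffix refers to the first output of the named operation, which is why it is appended to the name you chose.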




Logesh Kumar Umapathi


I am a Sr. Consultant (ML, NLP) @ Saama. Passionate about natural language processing and deep learning. https://www.linkedin.com/in/logeshkumaru
