Teasing out the TensorFlow Graph Mess
How can mathematicians think in 21 dimensions? Easy: they think in N dimensions and set N = 21. For those of us who can’t do that, there’s a way to visualize TensorFlow graphs without getting lost in the crap TensorBoard generates automatically.
First of all, I assume you’re familiar with TensorFlow, Python and Jupyter. Maybe you’re taking a MOOC such as Udacity’s Deep Learning course and you can glimpse the extreme power of TensorFlow, but you cannot actually see it. I’ll show you how.
There’s a way to insert a Jupyter cell with an iframe containing a TensorBoard view that draws the graph you are playing with. It was published by the Google DeepDream team here. Pretty cool indeed, but... (there’s always a but) TensorBoard yields crap (internal TensorFlow crap, to be fair).
Let’s use it with code from the Udacity course (Assignment 2: SGD). No worries, there are no spoilers of the solutions. We define the following graph:
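A minimal sketch of such a graph, assuming TensorFlow's 1.x-style API through tensorflow.compat.v1 and notMNIST-like shapes (28x28 images, 10 labels). It follows the general shape of a softmax-regression SGD example rather than reproducing the course listing, and the variable names are illustrative:

```python
import tensorflow.compat.v1 as tf

# Assumed shapes: flattened 28x28 images, 10 classes, minibatches of 128.
image_size, num_labels, batch_size = 28, 10, 128

graph = tf.Graph()
with graph.as_default():
    # Inputs: one minibatch is fed at every training step.
    tf_train_dataset = tf.placeholder(
        tf.float32, shape=(batch_size, image_size * image_size))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))

    # Model parameters.
    weights = tf.Variable(
        tf.truncated_normal([image_size * image_size, num_labels]))
    biases = tf.Variable(tf.zeros([num_labels]))

    # Loss, optimizer and prediction. Nothing is named explicitly.
    logits = tf.matmul(tf_train_dataset, weights) + biases
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(
            labels=tf_train_labels, logits=logits))
    optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)
    train_prediction = tf.nn.softmax(logits)

# Every node gets a default name, all at the top level of the graph.
op_names = [op.name for op in graph.get_operations()]
print(len(op_names), "nodes, for example:", op_names[:5])
```

Printing the op names already hints at the problem: a long flat list of default names mixed with the gradient internals.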
Then you can run the embedded TensorBoard visualizer code and get:
But there’s a way to make it useful: you just have to name quite a few things in the code to get a better graph. To do so, pass the name='name_string' parameter in some invocations and group related lines of code under a kind of “namespace” declaration, which in TensorFlow is written with tf.variable_scope("namespace_string"):
This last trick tells the graph what to draw in nested groups, which is just what we were waiting for. You can group nodes however you want, but to improve encapsulation I refactored the code a bit (no functional change at all). Let’s see what the named code looks like:
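A sketch of what such named code might look like, again assuming tensorflow.compat.v1 and notMNIST-like shapes. The scope layout (an outer SGD_model scope holding variables and training) and the helper linear_model are my assumptions for illustration; only the idea of name= plus tf.variable_scope comes from the text:

```python
import tensorflow.compat.v1 as tf

# Assumed shapes: flattened 28x28 images, 10 classes, minibatches of 128.
image_size, num_labels, batch_size = 28, 10, 128


def linear_model(inputs, weights, biases):
    """Hypothetical helper: inputs * W + b, with the result node named 'logits'."""
    return tf.add(tf.matmul(inputs, weights), biases, name="logits")


graph = tf.Graph()
with graph.as_default():
    with tf.variable_scope("SGD_model"):
        # Parameters live in their own group so other blocks can share them.
        with tf.variable_scope("variables"):
            weights = tf.Variable(
                tf.truncated_normal([image_size * image_size, num_labels]),
                name="weights")
            biases = tf.Variable(tf.zeros([num_labels]), name="biases")

        # Inputs, loss and optimizer are grouped under 'training'.
        with tf.variable_scope("training"):
            tf_train_dataset = tf.placeholder(
                tf.float32,
                shape=(batch_size, image_size * image_size),
                name="tf_train_dataset")
            tf_train_labels = tf.placeholder(
                tf.float32, shape=(batch_size, num_labels),
                name="tf_train_labels")
            logits = linear_model(tf_train_dataset, weights, biases)
            loss = tf.reduce_mean(
                tf.nn.softmax_cross_entropy_with_logits(
                    labels=tf_train_labels, logits=logits),
                name="loss")
            optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

        # The output op sits outside 'training' and carries an explicit name.
        train_prediction = tf.nn.softmax(logits, name="train_prediction_output")

scoped_names = {op.name for op in graph.get_operations()}
print(sorted(n for n in scoped_names if n.count("/") <= 2))
```

The node names now carry their scope as a slash-separated prefix, which is what the visualizer uses to draw the nested boxes.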
Run the embedded TensorBoard visualizer code again and you get:
If you drill down by double-clicking the rounded boxes, the magic we were hoping for appears:
And so on...
Variables have been defined in a separate namespace so they can be reused in training, validation and testing. They could be considered part of the training process, but they really belong to the bounded context of the model itself.
The training block includes the loss calculation and the optimizer because they belong to the bounded context of the training process. Putting them in a separate namespace makes the graph messier, since logits is part of the training internals.
The softmax normalization in the training, validation and testing blocks has been intentionally left outside the scope and named train_prediction_output, valid_prediction_output and test_prediction_output respectively, for legibility’s sake. Had they been left inside the scope, each block would simply end in a softmax normalization, with no clue that this is the output op node that has to be run in the Session to get the actual training result and measure the accuracy.
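A small sketch of why the explicit output name helps, assuming tensorflow.compat.v1 and a deliberately tiny toy graph (4 features, 3 classes, zero-initialized weights) rather than the real model. With a named output at the top level, the op to run can be looked up by name:

```python
import numpy as np
import tensorflow.compat.v1 as tf

graph = tf.Graph()
with graph.as_default():
    inputs = tf.placeholder(tf.float32, shape=(None, 4), name="inputs")
    with tf.variable_scope("training"):
        # Toy parameters: zero weights give identical logits for every class.
        weights = tf.Variable(tf.zeros([4, 3]), name="weights")
        logits = tf.matmul(inputs, weights, name="logits")
    # Outside the scope and explicitly named, so it is easy to find.
    tf.nn.softmax(logits, name="train_prediction_output")
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as session:
    session.run(init)
    # ':0' selects the first (and only) output tensor of the named op.
    output = graph.get_tensor_by_name("train_prediction_output:0")
    predictions = session.run(
        output, feed_dict={inputs: np.ones((2, 4), dtype=np.float32)})

print(predictions)
```

Each row of predictions is a probability distribution over the three toy classes, so it sums to one.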
Notice in the Training block detail image that the block treats SGD_model/variables as a single input instead of detailing weights and biases. In the next image, MatMul input detail, you can see that those details appear when needed.
And what happened to all those crap extra nodes that were drawn in the first attempt? Well, they are still there, inside the internals TensorFlow creates to handle the loss and the optimization. They look even prettier isolated in their own scope.
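To see why the extra nodes disappear from view, here is a rough illustration in plain Python of grouping nodes by their slash-separated name prefix, which is the hierarchy the graph view collapses into boxes. The node names below are made up for the example, and count_by_scope is a hypothetical helper, not part of TensorFlow or TensorBoard:

```python
from collections import Counter


def count_by_scope(node_names, depth=2):
    """Count nodes under each name prefix of at most `depth` components."""
    counts = Counter()
    for name in node_names:
        # 'a/b/c/d' with depth=2 is counted under 'a/b'.
        scope = "/".join(name.split("/")[:depth])
        counts[scope] += 1
    return counts


# Illustrative node names only; a real graph has many more gradient nodes.
node_names = [
    "SGD_model/variables/weights",
    "SGD_model/variables/biases",
    "SGD_model/training/MatMul",
    "SGD_model/training/logits",
    "SGD_model/training/gradients/logits_grad/Sum",
    "SGD_model/training/gradients/MatMul_grad/MatMul",
    "SGD_model/training/GradientDescent/learning_rate",
    "SGD_model/train_prediction_output",
]

counts = count_by_scope(node_names)
for scope, n in sorted(counts.items()):
    print(scope, n)
```

The gradient and optimizer nodes all fold into the training group, while the named output stays visible as a node of its own.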
Well, it wasn’t that bad; it was just a naming problem. I hope this helps you write better TensorFlow code, or at least find your way through it when you get lost.