TensorBoard for Beginners!

Kevin Koehncke
7 min read · May 15, 2018


Introduction

Like many people just starting to get a grasp on this concept called Machine Learning, I have been lost in the weeds more times than I can count while trying to learn the technologies that surround it. Wrapping your head around concepts like gradient descent or LSTMs can be a headache in and of itself. Add in trying to remember how to split your training and test data into batches, or what that one command in Scikit-Learn is, and you can end up huddled in a corner repeating, “Everything will be alright, Kevin”…

While you may be familiar with TensorFlow, you may not know about TensorBoard, its built-in model visualizer and monitor that lets you home in on issues with your model.

After becoming interested in TensorBoard’s capabilities, I went looking for a simple but useful demo of how to use it. I couldn’t find one, so I decided to build one.

In this article, I will set up a sample problem, implement a simple model in TensorFlow with TensorBoard, and explain the details that YOU need to know to start using TensorBoard.

Note that I will assume familiarity with machine learning and Python. (If you are looking for a light, non-technical introduction to machine learning, I highly recommend Machine Learning is Fun.)

To skip to the full notebook: Click Here

Background

TensorFlow is Google’s programming framework that uses graphs (think G = (V, E), not xy-planes) to perform efficient computations.

In 2017, Google showcased TensorBoard at the TensorFlow Dev Summit, demonstrating its monitoring capabilities during training and how much it can help when iterating on model selection and debugging convergence issues.

TensorFlow works in the following fashion:

  1. You build your computational graph by adding ‘nodes’
  2. You execute your graph by feeding it input data

Let’s look at a quick example to showcase the basics before we get into the task at hand.

Suppose you want to write a TensorFlow program to compute the addition of two variables.

First, we define two variables, give them names (we could name them Bob and Bill if we wanted to, it doesn’t matter), and compute the sum:
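A minimal sketch of that step, assuming TensorFlow 1.x (current at the time of writing); the values 3 and 4 are illustrative:

```python
import tensorflow as tf  # TensorFlow 1.x

# Build the graph: two named variables and a node for their sum
a = tf.Variable(3, name="a")
b = tf.Variable(4, name="b")
add = tf.add(a, b, name="add")

print(add)
```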

So now you can just press Enter on the keyboard and the sum should equal 7. Right?

No, instead we get this:
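Printing add yields a symbolic Tensor rather than the number 7; in TensorFlow 1.x the output looks something like:

```
Tensor("add:0", shape=(), dtype=int32)
```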

Remember how I said you build your graph then execute it? All we have done thus far is build our graph!

TensorFlow is smart enough to know how to connect our nodes based on what we want to compute. Let’s see what our graph looks like:

You might notice the small dotted circles labeled init. These indicate that the variables are initializable. In order for our add node to have a value, we first need to initialize our variables. We do this via a TensorFlow session:
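A sketch of that session step, continuing the snippet above:

```python
# Initialize the variables, then evaluate the `add` node
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)        # runs the init nodes in the graph
    print(sess.run(add))  # now prints 7
```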

So where does the power lie in TensorFlow? Since we have a graph, TensorFlow automatically determines the set of nodes needed for a given computation and evaluates only those nodes. This makes computing quantities like the gradient via backpropagation in a neural network computationally efficient.

Machine Learning with TensorFlow

With the basics of TensorFlow understood, let’s apply what we have learned to a simple machine learning problem.

Suppose you are a real estate agent in King County, Washington, and want to know what housing prices will look like next month. You have access to a year’s worth of housing price data.

We will train a regression model on our dataset to predict housing prices.

We will build a neural network with the following architecture:

  • Input Layer
  • Two Hidden Layers, with a Dropout Layer sandwiched between them
  • Hidden Layer 1 will have 100 units using a tanh activation function
  • Hidden Layer 2 will have 50 units using a ReLU activation function
  • Output Layer using an MSE loss function

Before we do anything else, we need to construct our neural network:

Input Layer

Our input layer is the beginning of our neural network and takes in our input (X) and our desired output (y).

In our case, X has n=18 features, and we want n_output=1 output, since we are performing regression and just want a single number back:
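A sketch of those definitions, assuming TensorFlow 1.x (the names X, y, and keep_prob are illustrative):

```python
n = 18        # number of input features
n_output = 1  # a single regression target

with tf.name_scope("input"):
    # First dimension is left as None so the batch size can vary
    X = tf.placeholder(tf.float32, shape=[None, n], name="X")
    y = tf.placeholder(tf.float32, shape=[None, n_output], name="y")
    # Dropout keep-probability, fed in at training time
    keep_prob = tf.placeholder(tf.float32, name="keep_prob")
```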

Instead of defining TensorFlow variables, we define TensorFlow placeholders with the shape of our input data, so that during training we can feed our batched training examples into those placeholders (hence the name). Note that we do not define the first dimension’s size: it corresponds to our batch size, which we may want to vary, and it is conventional not to hard-code the batch size.

(As an aside, we define our dropout probability as a placeholder so we can change the probability at the time of training.)

We define our input layer under this thing called a name scope, which tells TensorBoard to group our three child nodes together under one parent node called input when writing your graph’s data to TensorBoard. We then see the following in TensorBoard:

Hidden & Output Layers

Our two hidden layers and output layer can be defined quite easily using TensorFlow’s layers API, which creates a hidden layer in just one line:
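A sketch of those layer definitions, reusing the placeholders above (the layer names are illustrative):

```python
# Hidden layer 1: 100 units with a tanh activation
hidden1 = tf.layers.dense(X, units=100,
                          activation=tf.nn.tanh, name="hidden1")

# Dropout layer sandwiched between the two hidden layers
dropped = tf.nn.dropout(hidden1, keep_prob=keep_prob, name="dropout")

# Hidden layer 2: 50 units with a ReLU activation
hidden2 = tf.layers.dense(dropped, units=50,
                          activation=tf.nn.relu, name="hidden2")

# Output layer: a single linear unit for the predicted price
output = tf.layers.dense(hidden2, units=n_output, name="output")
```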

After experimenting with and without a name scope, I found it visually clearer not to define one when using the layers API and to simply give each layer a name, since each layer already encompasses its various components (the bias unit, the weight matrix, and the activation function), as shown here for our first hidden layer:

As you would expect, we now see our different hidden layers in TensorBoard:

Computing Our Loss & Training Our Model

We now feed our output layer into our MSE loss function in order to compare our predicted housing prices (output) to the actual housing prices (y):
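A minimal version of that loss (the name scope is an assumption on my part):

```python
with tf.name_scope("loss"):
    # Mean squared error between predicted and actual prices
    loss = tf.reduce_mean(tf.square(output - y), name="mse")
```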

Now comes the power of TensorBoard. During training, we are interested in monitoring how our loss function changes over each iteration to make sure that our model is converging to a minimum. Instead of printing the loss at each iteration, we can attach a tf.summary.scalar op to our MSE node and then monitor how our loss changes via TensorBoard under the Scalars tab, like so:
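Attaching the scalar summary is one line:

```python
# Record the loss at each step so TensorBoard can plot it over time
tf.summary.scalar("MSE", loss)
```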

In our case, our loss function converges, which is great! But if our loss were constant or diverging, we would see that and be able to adjust our model accordingly.

With our calculated MSE value, we proceed to train our model via an Adam optimizer:
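A sketch of the training op (the learning rate is illustrative):

```python
with tf.name_scope("train"):
    train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
```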

Say that during the training of our model, we want to monitor the distributions of our weights or bias units. We can attach a tf.summary.histogram op to our specified output(s) for monitoring. For our example, we want to monitor all trainable variables in our neural network. We can do this easily by calling tf.trainable_variables:
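One way to do that (the colon in a variable’s name is not a valid summary-name character, hence the substitution):

```python
# Attach a histogram summary to every trainable variable
for var in tf.trainable_variables():
    tf.summary.histogram(var.name.replace(":", "_"), var)
```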

Now we can monitor our trainable variables during training via the Histograms or Distributions tab:

Putting it All Together

Up to this point, I have been explaining how we can use TensorBoard to monitor different quantities but how do we actually write stuff to TensorBoard?

To do this, we need to utilize TensorFlow’s FileWriter class, which allows us to write our summaries and events to a specified directory. We instantiate a FileWriter by specifying the file path to our log directory and the graph we want to visualize:
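For example (the ./logs path is an arbitrary choice):

```python
# Write summaries and the graph definition to ./logs
writer = tf.summary.FileWriter("./logs", tf.get_default_graph())
```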

Next, the summary ops we have attached to each of our nodes of interest have to be run in order to generate summary data. We could run each summary op manually, but luckily TensorFlow allows us to merge all our summary ops into one op that we can run while training:
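A sketch of how this fits into a training loop; next_batch() and n_steps are hypothetical stand-ins for your own batching code, and writer is the FileWriter from above:

```python
# Merge every summary op in the graph into a single op
merged = tf.summary.merge_all()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(n_steps):
        X_batch, y_batch = next_batch()  # hypothetical batching helper
        summary, _ = sess.run(
            [merged, train_op],
            feed_dict={X: X_batch, y: y_batch, keep_prob: 0.5})
        writer.add_summary(summary, global_step=step)  # tag with the step
    writer.close()
```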

Once you have trained your model, you can launch TensorBoard by running the following command in your terminal and then visiting localhost:6006:
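Assuming the ./logs directory from the FileWriter sketch above:

```
tensorboard --logdir=./logs
```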

Conclusion

With no feature engineering and some hyperparameter tinkering, we were able to get an R² value of 0.86 with our model. Awesome! If you were a real estate agent in King County, you would pay big bucks for access to this model.

So what have we accomplished? We built a simple neural network to predict housing prices in TensorFlow and utilized TensorBoard to visually monitor our loss function, trainable variables, and graph architecture. Instead of writing print statements to debug issues, we can use TensorBoard to quickly see what the issue is and make adjustments to our model.

With this knowledge, you can apply TensorBoard to other types of models to make sure that YOU are making the most of your time when iterating on models.
