Using T-SNE to Visualise how your Model thinks

Harshvardhan Gupta
Published in BuzzRobot
4 min read · Dec 4, 2017

Deep Learning has given us a new way to think about things. Part of the reason is that these models can be made arbitrarily large, giving them immense capacity, while regularisation keeps them from overfitting. However, it is not easy to understand how they work. In this post, I will show a simple technique that lets you see, at least partially, what a model is doing.

How this article is structured

I will go over the individual components, how to use them together, and at the end of the post, I will link to the source code to get started.

This article does not require an in-depth understanding of neural networks, but some basic familiarity with them will help.

The Neural Black Box

Neural networks operate on vectors, i.e. lists of real numbers. For example, a convolutional neural network passes the input through a series of convolutions, whose outputs are eventually fed into fully connected layers. At this point, the model has an idea of the high-level features of the input, e.g. 'eyes', 'dogs', 'faces', etc.

Unfortunately, these fully connected layers are not directly interpretable. It is possible to visualise the convolution activations, but it's harder to understand what's going on in the dense layers.

Dimensionality reduction & T-SNE

There are usually many hundreds of neurons in the fully connected layers. For example, the VGG16 architecture contains 4096 neurons after the convolutions. These 4096 values can be treated as the features mentioned above. If only there were a way to visualise these numbers!

Dimensionality reduction is a way to reduce high-dimensional features to lower dimensions while trying to preserve the characteristics of the data. For example, similar images should end up close together, and dissimilar images far apart.
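As a concrete illustration (my own sketch, not code from the article), here is the simplest such technique, PCA, applied with scikit-learn. T-SNE, introduced next, exposes the same fit-and-transform interface.

```python
# A minimal sketch of dimensionality reduction, assuming scikit-learn
# is installed. PCA squeezes 64-dimensional digit images down to 3
# dimensions, one 3-D point per image.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X = load_digits().data                      # shape (1797, 64): raw pixels
X3 = PCA(n_components=3).fit_transform(X)   # shape (1797, 3)
print(X3.shape)
```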

Enter T-SNE

I won't go into the mathematical details, but T-SNE is one such dimensionality reduction technique: it reduces the number of dimensions while trying to keep the structure of the high-dimensional data. For visualisation, we will reduce the data to 3 dimensions, so that it can be shown in a 3D plot. You can read about T-SNE here.

Here’s an example of T-SNE. Note that in this case, the raw pixels were used directly as features, i.e. no neural network was involved.

Figure 1.0 (GIF) Mnist Visualisation using T-SNE. Source

Notice how well it separates the data: there is a clear separation between the digits, and similar digits are clustered together.
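A sketch of the same idea, using scikit-learn's T-SNE implementation (an assumption on my part; the original visualisation was not necessarily produced this way). The small built-in digits dataset stands in for full MNIST so the example finishes in seconds.

```python
# Run t-SNE directly on raw pixel values, with no neural network,
# reducing each image to a 3-D point that can be plotted.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X = load_digits().data    # shape (1797, 64): pixels as features
embedded = TSNE(n_components=3, init="pca", random_state=0).fit_transform(X)
print(embedded.shape)     # one 3-D coordinate per image
```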

Visualising Natural images using a Convolutional Neural Network

MNIST is a trivial task in 2017. How can we do the same thing for more ‘real’ images? The answer is Convolutional Neural Networks (CNNs). We will take a few thousand images, pass them through the InceptionV3 convolutional network, extract the outputs from just before the dense layers, and use those as the dimensions to visualise.
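A minimal sketch of that extraction step with Keras (my assumption; the article's own code is linked at the end). Passing include_top=False drops the dense classifier, and pooling="avg" collapses the final convolutional feature map into one 2048-length vector per image. In practice you would load weights="imagenet"; weights=None is used here only to keep the sketch download-free.

```python
import numpy as np
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications.inception_v3 import preprocess_input

# include_top=False removes the dense layers; pooling="avg" turns the
# last feature map into a single 2048-dim vector per image.
# Swap weights=None for weights="imagenet" to get meaningful features.
model = InceptionV3(include_top=False, pooling="avg", weights=None)

images = np.random.rand(4, 299, 299, 3) * 255   # stand-in image batch
features = model.predict(preprocess_input(images))
print(features.shape)                           # (4, 2048)
```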

The Visualisation Tools

Fortunately for us, there is an excellent tool that is part of Tensorboard. You can play with it at projector.tensorflow.org, and find the source code here.

It is meant to be used as part of Tensorboard, but in my opinion Tensorboard is too cumbersome, and unless you are already using Tensorflow, it is too much of a hassle.

That is why we will use just the standalone version. The authors don’t seem to have written any docs for it, but it is fairly straightforward, and I will guide you through the steps in this article.

I have written a few wrappers to export data from the model. The project details will be included at the end of the article.

Case Study — Visualising Archival Photos

I wanted to see how I could use deep learning to help understand photos from my university’s archives.

Thus I chose to use InceptionV3 as a feature extractor, producing 2048-length vectors that represent each image.

Then I plotted those features using T-SNE.
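The standalone projector at projector.tensorflow.org can load plain tab-separated files through its Load button, so no Tensorboard setup is needed. A sketch of the export step follows; the file names are my choice, not a fixed convention, and random vectors stand in for the real InceptionV3 features.

```python
import numpy as np

# Stand-in data: in the real case these would be the 2048-dim
# InceptionV3 vectors and the corresponding image names.
features = np.random.rand(5, 2048)
labels = ["image_%d.jpg" % i for i in range(5)]

# Vectors: one row of tab-separated floats per image.
np.savetxt("tensors.tsv", features, delimiter="\t")
# Metadata: one label per line, same order as the vectors.
with open("metadata.tsv", "w") as f:
    f.write("\n".join(labels) + "\n")
```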

Figure 2.0 T-SNE on InceptionV3
Figure 2.1 Highlighting all images that (Left): look like text documents and (Right): have multiple people in them

You can play around with the visualisation yourself here.

Conclusion

This was a short article on how dimensionality reduction techniques can be used alongside deep learning to get the best of both worlds: complex representations and interpretable results.

If requested, I may go into the technical details of T-SNE in a future blog post.

Resources

  • My project link, if you want to perform these visualisations on your own data.
  • The Demo using my own data, if you just want to play around.
