MXBoard — MXNet Data Visualization
Author: Wu Jun, Amazon AI Software Engineer
Translated from: https://zh.mxnet.io/blog/mxboard
Preface
Deep neural networks are notoriously difficult to design and train. Doing so usually involves a large amount of tweaking and adjustment: modifying the network structure and trying various optimization algorithms and hyper-parameters. From a theoretical perspective, the mathematical foundations of deep neural network architectures remain largely incomplete, and techniques are often based on generalizing empirical results.
Data visualizations, thanks to their intrinsic visual nature, can partially compensate for these deficiencies and paint a higher-level picture to guide researchers during the training of deep neural networks. For example, if the distribution of gradients can be drawn in real time during model training, vanishing or exploding gradients can be quickly detected and corrected.
As another example, visualizing word embeddings helps to clearly see that words aggregate into different manifolds in a lower-dimensional space that preserves contextual proximity. Yet another useful visualization is data clustering: projecting high-dimensional data into a lower-dimensional space using, for example, the t-SNE algorithm. There is a wide range of data visualizations that can help us better understand the training process and the data itself in the context of deep learning.
The emergence of TensorBoard brought powerful visualizations to TensorFlow's users. We have had feedback from many different users, including corporate ones, that they started using TensorFlow because of the rich feature set offered by TensorBoard. Can this powerful tool be made available to other deep learning frameworks? Thanks to the TeamHG-Memex efforts and their tensorboard_logger, we now have a transparent interface for writing custom data to the event file format that is then consumed by TensorBoard.
It is on this foundation that we have developed MXBoard, a Python package for recording MXNet data and displaying it in TensorBoard. To install MXBoard, follow these simple instructions.
Note: MXNet 1.2.0 is required to use all the features of MXBoard. Before the official release of MXNet 1.2.0, please install the MXNet nightly version: pip install --pre mxnet
MXBoard Quick Start Guide
MXBoard supports most of the data types found in TensorBoard, including scalar, histogram, image, text, audio, and embedding.
The MXBoard API is designed to follow the tensorboard-pytorch API. All record APIs are defined in a class called SummaryWriter. This class contains information such as the file path of the record files, the frequency of writing, the queue size, etc. To record a new data point of a specific data type, be it a scalar or an image for example, you only need to call the corresponding API on the SummaryWriter object.
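As a minimal sketch of that pattern (the constructor arguments below reflect the MXBoard API as we recall it, and the values are illustrative, not requirements):

from mxboard import SummaryWriter

# where to write events (logdir), how many events to queue before forcing a
# write (max_queue), and how often to flush to disk (flush_secs)
sw = SummaryWriter(logdir='./logs', max_queue=10, flush_secs=120)
sw.close()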
For example, suppose we want to plot histograms of data drawn from normal distributions with gradually decreasing standard deviation. First we define a SummaryWriter object as follows:
from mxboard import *
sw = SummaryWriter(logdir='./logs')
Then in each loop iteration, we create an NDArray with values drawn from a normal distribution, and pass it to the summary writer's add_histogram() function, specifying the number of bins and the loop index i, which will be the index of our data point. Finally, as with any file descriptor used in Python, it is good practice to close the file handle of the SummaryWriter using .close().
import mxnet as mx
for i in range(10):
    # create a normal distribution with fixed mean and decreasing std
    data = mx.nd.random.normal(loc=0, scale=10.0/(i+1), shape=(10, 3, 8, 8))
    sw.add_histogram(tag='normal_dist', values=data, bins=200, global_step=i)
sw.close()
To visualize the plotted histograms, open a terminal in the working directory and type the following command to start TensorBoard:
tensorboard --logdir=./logs --host=127.0.0.1 --port=8888
Then enter 127.0.0.1:8888 in the browser's address bar, click the HISTOGRAMS tab, and you will see the following rendering:
Real-world MXBoard
Using what we learnt in the section above, let's try to accomplish the following two tasks:
- Monitoring supervised learning training
- Gain insight into the inner workings of convolutional neural networks
Training an MNIST model
Let's train a model on the MNIST dataset from the Gluon vision API and use MXBoard to record, in real time:
- The cross-entropy loss
- The validation and training accuracy
- Gradient data distribution
All of them are good indicators of the progress of the training.
First, we define a SummaryWriter object:
sw = SummaryWriter(logdir='./logs', flush_secs=5)
The flush_secs=5 argument is added here to specify that we want to write the records to the log file every five seconds, so that we can track the real-time progress of the training in the browser.
Then we record the cross-entropy loss at the end of each batch:
sw.add_scalar(
    tag='cross_entropy',
    value=L.mean().asscalar(),
    global_step=global_step
)
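For context, here is a minimal sketch of the batch loop around that call; the names net, trainer, loss_fn, train_data, and epochs are assumptions standing in for the full script, not part of the original snippet:

import mxnet as mx

global_step = 0
for epoch in range(epochs):
    for data, label in train_data:
        # forward and backward pass under autograd
        with mx.autograd.record():
            output = net(data)
            L = loss_fn(output, label)
        L.backward()
        trainer.step(data.shape[0])
        # record the mean batch loss against a monotonically increasing step
        sw.add_scalar(tag='cross_entropy', value=L.mean().asscalar(),
                      global_step=global_step)
        global_step += 1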
At the end of each epoch, we record the gradients as HISTOGRAM data, and record the training and validation accuracy as SCALAR values.
grads = [i.grad() for i in net.collect_params().values()]
assert len(grads) == len(param_names)
# logging the gradients of parameters for checking convergence
for i, name in enumerate(param_names):
    sw.add_histogram(tag=name, values=grads[i], global_step=epoch, bins=1000)

name, acc = metric.get()
# logging training accuracy
sw.add_scalar(tag='train_acc', value=acc, global_step=epoch)

name, val_acc = test(ctx)
# logging the validation accuracy
sw.add_scalar(tag='valid_acc', value=val_acc, global_step=epoch)
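The test() helper called above is not shown in this excerpt; a minimal sketch consistent with its call site might look like the following, where net and val_data are assumed to be defined in the enclosing scope:

import mxnet as mx

def test(ctx):
    # evaluate the current network on the validation set and return the
    # (metric name, accuracy) pair that the caller unpacks
    metric = mx.metric.Accuracy()
    for data, label in val_data:
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        metric.update([label], [net(data)])
    return metric.get()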
We then run the Python training script and TensorBoard simultaneously, and visualize the training progress in the browser in real time.
To reproduce this experiment, you can find the fully worked-out solution code here on GitHub.
Visualization of convolutional filters and feature maps
Visualizing the convolutional filters and feature maps as images is useful for two reasons:
- When training has converged, convolutional filters exhibit clear pattern-detection features: lines and distinctive colors. Filters from a model that has not converged, or that has overfit, display a lot of noise.
- Observing the RGB rendition of filters and feature maps can help us understand which features the network has learnt and considers meaningful, typically edge and color detection.
Here we use three pre-trained CNN models from the MXNet Model Zoo: Inception-BN, ResNet-152, and VGG16. The filters of the first convolutional layer are visualized directly in TensorBoard, alongside the feature maps that result from applying them to a black swan image. Notice how the networks have different convolutional kernel sizes.
- Inception-BN
- ResNet-152
- VGG16
You can see that the filters of all three models exhibit good smoothness and regularity, the usual signs of a converged model. The colored filters are mainly responsible for extracting color-based features, while the gray-scale filters extract general patterns and outlines of the objects in the image.
For the full implementation and further analysis, check the code here.
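As a rough, hedged sketch of the idea (the model choice, layer indexing, and normalization below are assumptions; the linked script is the authoritative version):

import mxnet as mx
from mxboard import SummaryWriter

# load a pre-trained model from the Gluon model zoo
net = mx.gluon.model_zoo.vision.resnet152_v1(pretrained=True)

# weights of the first convolutional layer: shape (num_filters, C, H, W);
# treating features[0] as the first conv layer is an assumption about the
# model zoo's layer layout
weights = net.features[0].weight.data()

# normalize to [0, 1] so the filters render as sensible RGB images
w_min = weights.min().asscalar()
w_max = weights.max().asscalar()
weights = (weights - w_min) / (w_max - w_min)

sw = SummaryWriter(logdir='./logs')
# assumes MXBoard lays a 4D input out as a grid of images, as its demos do
sw.add_image(tag='resnet152_conv0_filters', image=weights)
sw.close()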
Visual image embedding
The last example is equally interesting. Embedding is a key concept in several machine learning domains, including computer vision and natural language processing (NLP): it is the representation of high-dimensional data in a lower-dimensional space. In a traditional image classification setting, the output of the penultimate layer of a convolutional neural network is usually connected to a fully connected layer with a softmax activation that predicts the class the image belongs to. If we strip the network of this classification layer, we are left with a network that outputs a feature vector for each example, typically of 512 or 1024 features. This vector is called the embedding of our image. We can call MXBoard's add_embedding() API to observe the distribution of the dataset's embeddings projected down into 2D or 3D space. Pictures with similar visual features are clustered together.
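A minimal sketch of the call, with stand-in data rather than real ResNet-152 features (the shapes and parameter names below are assumptions made for illustration):

import mxnet as mx
from mxboard import SummaryWriter

# stand-in data: 2304 feature vectors, their labels, and image thumbnails;
# in the real experiment the features come from ResNet-152's penultimate layer
features = mx.nd.random.normal(shape=(2304, 512))
labels = ['class_%d' % (i % 10) for i in range(2304)]
thumbnails = mx.nd.random.uniform(shape=(2304, 3, 32, 32))

sw = SummaryWriter(logdir='./logs')
sw.add_embedding(tag='resnet152_embedding', embedding=features,
                 labels=labels, images=thumbnails)
sw.close()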
Here we randomly select 2304 images from the validation set, compute their embeddings using ResNet-152, add the embeddings to the MXBoard log file, and visualize them:
The embeddings of the 2304 images are projected into a 3D space, by default using the PCA algorithm. However, the clustering effect is not obvious, because the PCA algorithm does not preserve the spatial relationships between the original data points. We therefore use the t-SNE algorithm provided by the TensorBoard interface to get a better visualization of the embeddings. Constructing the optimal projection is a dynamic process:
After convergence of the t-SNE algorithm, it can be clearly seen that the dataset is divided into several clusters.
Finally, we can use the TensorBoard UI to verify the correctness of the image classification. We enter “dog” in the upper-right corner of the TensorBoard GUI: all pictures in the validation dataset labeled “dog” will be highlighted. We can also see that the clusters derived from the t-SNE projection follow the class boundaries closely.
All code and instructions are available here.
Conclusion
After this MXBoard tutorial, we can see that visualizations are a powerful tool for supervising the training of models and gaining insight into the principles of deep learning. MXBoard provides MXNet with a simple, minimally intrusive, easy-to-use, centralized visualization solution for scientific and production environments. Best of all, all you need is a browser.
Special thanks to Zheng Zihao for providing technical support during the development of the project!