Visual, More Intuitive Means to Debug Datasets and Machine Learning Models

Practical features to identify and correct problems early on in your AI workflow.

Jason Behrmann
Zetane
2 min read · Jun 8, 2020

Zetane displaying 25 million tensor components at 10 frames per second. You can project tensors and their operators in two or three dimensions, which makes your models easier to inspect, understand and debug. Within the Zetane environment, you can navigate these structures with ease and access the statistics associated with specific tensor parameters with one click.

Identifying and isolating the source of weaknesses in your datasets and complex machine learning models is arduous and time-consuming.

Rather than peruse lines of Python code, you can use Zetane to sift through your data with several features, many of which transform your datasets into easy-to-scan visuals. You can also run simulations to assess whether your models provide reasonable and expected outputs, and to gain insight into how a model may behave in the real world. Here we provide an example of each of these features, both of which streamline debugging tasks throughout your machine learning workflow.

Is there too much emphasis on the background colour?

You can use t-SNE and other unsupervised learning techniques to see how your images organize by the prominent features in the dataset. Here is an example using the CIFAR-10 dataset, where the t-SNE algorithm grouped images primarily by background colour rather than by the content of the photos. Conducting this assessment early on in your data processing will save you a lot of frustration further along in your workflow. If you do not want your image-recognition models to focus on this frivolous feature, you could try converting the data into greyscale images, for instance.
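Outside of Zetane, the same check can be approximated in a few lines. The sketch below is an illustration, not Zetane's API: it assumes NumPy and scikit-learn are installed, that images arrive as an array of shape (N, H, W, 3), and the function names `to_greyscale` and `embed_2d` are made up for this example. It converts images to a single brightness channel, as suggested above, and projects a batch to two dimensions with t-SNE.

```python
import numpy as np

# ITU-R BT.601 luma weights, a common choice for RGB-to-greyscale conversion.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def to_greyscale(images):
    """Convert a batch of RGB images, shape (N, H, W, 3), to shape (N, H, W)."""
    images = np.asarray(images, dtype=np.float64)
    if images.ndim != 4 or images.shape[-1] != 3:
        raise ValueError("expected an array of shape (N, H, W, 3)")
    # Weighted sum over the colour axis removes hue but keeps brightness.
    return images @ LUMA_WEIGHTS


def embed_2d(images, perplexity=30.0, seed=0):
    """Project flattened images to 2-D with t-SNE (requires scikit-learn)."""
    # Imported here so that to_greyscale still works without scikit-learn.
    from sklearn.manifold import TSNE

    flat = np.asarray(images, dtype=np.float64).reshape(len(images), -1)
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(flat)
```

Comparing a scatter plot of `embed_2d(images)` with one of `embed_2d(to_greyscale(images))` gives a rough sense of whether the clusters still follow background colour once hue is removed. Note that `perplexity` must be smaller than the number of images.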

Simulations to see if your AI models are working as they should

This example pertains to developing an image-recognition model capable of identifying obstructions on train tracks. When the model identifies an obstruction, it signals the train to stop. At the 1:58 mark of the video below, we show a simulation of a train with the model in operation. As the simulated train approaches the obstruction, it stops. Closer inspection, however, raises the question of whether the model recognizes the boulder as an obstruction or whether it has also learned to see the black opening of the tunnel as one. This indicates the need for further testing and for adding images of tunnels, labelled as non-obstacles, to the training dataset.
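One simple way to follow up on a suspicion like this is to measure how often the model raises a false alarm on images known to be free of obstructions, such as empty tunnel entrances. The sketch below is a minimal illustration under stated assumptions: `false_alarm_rate` and `dark_region_detector` are hypothetical names, the "images" are short lists of brightness values, and the toy detector merely imitates a model that has learned to treat dark regions as obstacles. It is not the model from the video.

```python
def false_alarm_rate(predict, negatives):
    """Fraction of known obstruction-free inputs that the model flags."""
    negatives = list(negatives)
    if not negatives:
        raise ValueError("need at least one obstruction-free example")
    flagged = sum(1 for image in negatives if predict(image))
    return flagged / len(negatives)


def dark_region_detector(image, threshold=0.3):
    """Toy stand-in for a flawed model: flags any image that is mostly dark."""
    return sum(image) / len(image) < threshold


# Hypothetical brightness values in [0, 1]; neither set contains an obstruction.
clear_tunnels = [[0.1, 0.2, 0.1], [0.05, 0.1, 0.2]]
clear_open_track = [[0.8, 0.9, 0.7], [0.6, 0.7, 0.9]]

tunnel_rate = false_alarm_rate(dark_region_detector, clear_tunnels)
open_rate = false_alarm_rate(dark_region_detector, clear_open_track)
```

A false-alarm rate that is much higher on clear tunnels than on clear open track would support the idea that the model reacts to the tunnel opening itself, which is the case that labelled tunnel images in the training set are meant to address.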

Jason Behrmann
Zetane

Director of Marketing and Communications at Zetane Systems Inc.