Visual, More Intuitive Means to Debug Datasets and Machine Learning Models

Practical features to identify and correct problems early on in your AI workflow.

Published in Zetane · 2 min read · Jun 8, 2020

Identifying and isolating the source of weaknesses in your datasets and complex machine learning models is arduous and time-consuming.

Rather than peruse lines of Python code, you can use Zetane to sift through your data with several features, many of which transform your datasets into easy-to-scan visuals. You can also run simulations to assess whether your models produce reasonable, expected outputs and to gain insight into how a model may behave in the real world. Here we give one example of each capability; both streamline debugging tasks throughout your machine learning workflow.

Is there too much emphasis on the background colour?

You can use t-SNE and other unsupervised learning techniques to see how your images organize by the prominent features in the dataset. Here is an example using the CIFAR-10 dataset, where t-SNE grouped images primarily by background colour rather than by the content of the photos. Conducting this assessment early on in your data processing will save you a lot of frustration further along in your workflow. If you do not want your image-recognition models to focus on this incidental feature, you could try converting the images to grayscale, for instance.
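For readers who want to try this kind of check outside Zetane, here is a minimal sketch using scikit-learn's `TSNE`. It is not Zetane's implementation. The toy images, the helper names (`make_toy_images`, `to_grayscale`, `embed`) and the perplexity value are all assumptions made for illustration; with real data you would load CIFAR-10 in place of the synthetic images.

```python
import numpy as np
from sklearn.manifold import TSNE


def make_toy_images(n_per_group=30, size=8, seed=0):
    """Synthetic stand-in for a real dataset: two background colours, same content."""
    rng = np.random.default_rng(seed)
    backgrounds = np.array([[0.1, 0.2, 0.9],   # bluish background
                            [0.1, 0.8, 0.2]])  # greenish background
    images, groups = [], []
    for group, colour in enumerate(backgrounds):
        for _ in range(n_per_group):
            noise = rng.normal(0.0, 0.05, (size, size, 3))
            img = np.tile(colour, (size, size, 1)) + noise
            img[3:5, 3:5, :] = 1.0  # identical white "object" in every image
            images.append(np.clip(img, 0.0, 1.0))
            groups.append(group)
    return np.stack(images), np.array(groups)


def to_grayscale(images):
    """Collapse RGB to one luminance channel (ITU-R BT.601 weights)."""
    return images @ np.array([0.299, 0.587, 0.114])


def embed(images, seed=0):
    """Flatten each image to a vector and project the set to 2-D with t-SNE."""
    flat = images.reshape(len(images), -1)
    tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=seed)
    return tsne.fit_transform(flat)


images, background = make_toy_images()
points_colour = embed(images)              # 2-D coordinates, one row per image
points_gray = embed(to_grayscale(images))  # same images without colour information
```

Plotting `points_colour` coloured by `background` shows whether the embedding clusters by background rather than by content. Note that grayscale conversion removes hue but keeps brightness, so backgrounds that differ in brightness can still separate; re-running the embedding on the converted images is a quick way to check.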

Simulations to see if your AI models are working as they should

This example concerns an image-recognition model that identifies obstructions on train tracks; when the model detects an obstruction, it signals the train to stop. At this time point (t = 1:58) in the video below, we show a simulation of a train with the model in operation. As the simulated train approaches the obstruction, it stops. Closer inspection, however, raises the question of whether the model recognizes the boulder as an obstruction or has also learned to treat the black opening of the tunnel as one. This indicates the need for further testing and for adding images of tunnels, labelled as non-obstacles, to the training dataset.
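One way to follow up on a suspicion like this is to measure false alarms separately for each kind of scene. The sketch below is a hypothetical illustration, not part of Zetane: the function `false_alarm_rate_by_scene`, the scene tags, the `dark_fraction` field and the toy `dark_region_model` are all assumptions standing in for a real model and real labelled frames.

```python
from collections import defaultdict


def false_alarm_rate_by_scene(samples, predict):
    """Share of obstacle-free frames flagged as obstructed, grouped by scene tag.

    samples: iterable of (frame, scene_tag, has_obstacle) tuples.
    predict: callable returning True when the model reports an obstruction.
    """
    flagged = defaultdict(int)
    total = defaultdict(int)
    for frame, scene, has_obstacle in samples:
        if has_obstacle:
            continue  # only obstacle-free frames can produce false alarms
        total[scene] += 1
        if predict(frame):
            flagged[scene] += 1
    return {scene: flagged[scene] / total[scene] for scene in total}


def dark_region_model(frame):
    # Hypothetical stand-in for a model that learned "large dark area = obstruction".
    return frame["dark_fraction"] > 0.3


samples = [
    ({"dark_fraction": 0.05}, "open_track", False),
    ({"dark_fraction": 0.10}, "open_track", False),
    ({"dark_fraction": 0.45}, "tunnel", False),
    ({"dark_fraction": 0.50}, "tunnel", False),
    ({"dark_fraction": 0.40}, "open_track", True),  # boulder on the track
]

rates = false_alarm_rate_by_scene(samples, dark_region_model)
print(rates)  # {'open_track': 0.0, 'tunnel': 1.0}
```

A much higher false-alarm rate on tunnel frames than on open track would support the hypothesis that the model reacts to the tunnel opening, and would point to where additional labelled examples are needed.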

Try out these features yourself and sign up as our next beta tester here.

Learn more about our other features for more explainable AI models

Explainable AI: look inside and demystify black box algorithms

Check out Zetane’s features to see the inner workings of your convolutional neural networks
