Using PixPlotML, a visualization tool for object detection and image classification in Machine Learning projects

Alexander Hocking
Data Science at Microsoft
11 min read · Jan 31, 2023

Visualization tools are essential for data scientists working on object detection and image classification projects. By visualizing the entire dataset, including the images themselves and their labels, one can more easily identify patterns and relationships and help confirm or refute hypotheses about how the data behaves.

However, checking large numbers of images to get a sense of the overall problem is a challenge. Additionally, I’ve seen technical teams, customers, and non-experts alike overly focus on individual cherry-picked images that may not represent the most critical problems. So, the ability to visualize many thousands of images in a single, interactive view that is accessible to technical and non-technical users is crucial for keeping the whole team focused on the fundamental problems affecting model performance.

To help solve this problem, we’ve enhanced Yale DH Lab’s PixPlot application. We’ve augmented this software to include features necessary for analyzing images for object detection and classification problems.

We call this software PixPlotML. The enhancements include the following:

  1. Added features to identify images by their label, including a legend and adjustable image borders indicating image class.
  2. Added features to change and modify image labels and download the updates.
  3. Added features to identify and flag images for removal.
  4. Overhauled the installation and deployment to be more straightforward — including a dockerized option. We’ve also removed many dependencies that are unnecessary for use in Machine Learning projects.
  5. Added code to train an image classifier on your own data to fine-tune the visualization.

Possibly the most significant enhancement is the ability to fine-tune the visualization to your own data. As you will see in the following sections, these enhancements combine to provide a powerful model debugging tool showing technical and non-technical users the root causes of model performance problems.

The next sections show how to use PixPlotML for object detection projects. This includes fine-tuning it to your own data and then performing the following activities to identify and fix issues affecting model performance:

  • Reviewing the visualization of images and labels to get a general sense of the dataset and effectiveness of the labeling strategy.
  • Showing how to quickly find and fix issues such as misclassifications and confusing images.
  • Identifying the cause of false positives by visualizing them within the overall dataset.
  • Debugging model performance problems such as class confusion.

What is PixPlotML?

PixPlotML is a web-based tool that provides an interactive and zoomable visualization of your entire dataset. It’s optimized for viewing thousands of images in one interactive view and is easy to use for technical and non-technical users alike.

Figure 1: A UMAP visualization of 1600 bounding boxes from the COCO validation dataset; the center image shows the same images but ordered by label name, and the image at right shows that we can navigate the view and take a close-up of different sections of the visualization.

The fine-tuning process and Uniform Manifold Approximation and Projection (UMAP) visualization ensure that images that look similar are located near each other, so areas of the visualization form recognizable clusters. In addition, adding image labels makes it easy to spot errors, such as images that are misclassified, or regions where the model is likely to perform poorly due to confusing images. We show how to find these issues in the Using PixPlotML for object detection projects section below.
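PixPlotML computes this layout for you, but the underlying idea is simple: project each image's feature vector down to two dimensions with UMAP so that similar images land near each other. A minimal sketch of that idea using the umap-learn package (the image_vectors.npy path refers to the quick start data described later; parameter values here are illustrative):

import numpy as np
import umap  # pip install umap-learn

# One feature vector per image, produced by the fine-tuning step
vectors = np.load("data/image_vectors.npy")

# Reduce to 2-D so that visually similar images land near each other
reducer = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1, random_state=42)
coords = reducer.fit_transform(vectors)
print(coords.shape)  # [num_images, 2]: roughly how the scatter positions arise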

In addition, labels can be updated and downloaded, images potentially affecting model performance can be selected, and details downloaded for further processing.

Object detection evaluation

You may evaluate your object detection model’s performance on your dataset using metrics such as mean Average Precision (mAP). Such evaluation metrics are essential, but they can be challenging to use for interpreting results or forming hypotheses about what is happening.
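As a purely illustrative example (not part of PixPlotML), mAP can be computed with the torchmetrics package. The single number it produces is exactly the kind of result that is hard to interpret without looking at the images themselves:

import torch
from torchmetrics.detection.mean_ap import MeanAveragePrecision

metric = MeanAveragePrecision()

# One predicted box and one ground-truth box for a single image,
# in [xmin, ymin, xmax, ymax] format
preds = [{
    "boxes": torch.tensor([[10.0, 10.0, 100.0, 100.0]]),
    "scores": torch.tensor([0.9]),
    "labels": torch.tensor([1]),
}]
targets = [{
    "boxes": torch.tensor([[12.0, 11.0, 98.0, 102.0]]),
    "labels": torch.tensor([1]),
}]

metric.update(preds, targets)
print(metric.compute()["map"])  # a single score that says little about *why* the model fails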

In this article, I explain how to use PixPlotML to gain more intuition about data problems related to image datasets and how to enable users and customers to fix them. PixPlotML should be combined with a robust evaluation and error-analysis process to thoroughly debug model performance. To understand more about deploying robust evaluation and error-analysis pipeline code, I recommend the following article:

Error analysis for object detection models | by Bernat Puig Camps | Data Science at Microsoft | Medium

The code

You can find the code, quick start guides, and related deployment instructions in the GitHub repo here.

The repo contains two primary features:

  1. The PixPlotML server is based on the original PixPlot from the Yale DH Lab. We have added tools helpful for labeling, such as a legend, border colors representing the label, and functionality to update labels or flag images for removal.
  2. A preparation step to customize the visualization to your image data. This step consists of using Pytorch-Accelerated to train a PyTorch classification model on your images and then output an image vectors file for clustering by PixPlotML. The code is self-contained, and the defaults work well for a variety of datasets. Pointing the code at your images folder and metadata.csv file is all that’s required to run the process and see great visualizations!

Let’s get started using PixPlotML to analyze and fix problems with the Common Objects in Context (COCO) dataset.

Getting started

The PixPlotML repository contains a quick start example. It consists of a pre-packaged zip file containing six classes from the COCO validation dataset. Each image is the bounding box extracted from the primary COCO image. The dataset was chosen merely as an example as it is familiar to many people. However, COCO is a very clean dataset, while real-world datasets tend to be much messier (for example, images are less clear, there are haphazard bounding boxes and labeling strategies, and the objects in the image are small, among other issues). Because COCO is such a clean dataset, it is difficult to find errors. So, we know if we can find errors using PixPlotML, the tool is doing a good job.
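For context, crops like the ones in the quick start zip can be produced from the COCO annotations roughly as follows. This is only a sketch, not the exact script used to build the example data; the class names and paths are illustrative:

from pathlib import Path

from PIL import Image
from pycocotools.coco import COCO

coco = COCO("annotations/instances_val2017.json")
cat_ids = coco.getCatIds(catNms=["dog", "cat", "teddy bear"])  # example classes
ann_ids = coco.getAnnIds(catIds=cat_ids, iscrowd=False)

out_dir = Path("data/images")
out_dir.mkdir(parents=True, exist_ok=True)

for ann in coco.loadAnns(ann_ids):
    img_info = coco.loadImgs(ann["image_id"])[0]
    x, y, w, h = ann["bbox"]
    img = Image.open(Path("val2017") / img_info["file_name"]).convert("RGB")
    crop = img.crop((int(x), int(y), int(x + w), int(y + h)))
    crop.save(out_dir / f"{ann['id']}.jpg")  # one image per bounding box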

The zip file is in the data folder and contains the following:

  1. A metadata.csv file, which is a simple file containing a row for each image in the image folder and its label.
  2. An images folder with images, with one for each line in the metadata.csv file.
  3. An image vectors file called ‘image_vectors.npy’. This file was created using the fine-tuning process. It consists of a NumPy array of size [num_images, 2048], which is the output of the classification model when applied to all the images. (Note that we use Pytorch-Accelerated to simplify model training.) A quick sanity check of these three pieces is sketched after this list.
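A quick sanity check that the three pieces line up might look like this. Note that the exact metadata.csv column names are defined by the repo, so "filename" and "label" below are assumptions:

from pathlib import Path

import numpy as np
import pandas as pd

data_dir = Path("data")
meta = pd.read_csv(data_dir / "metadata.csv")
vectors = np.load(data_dir / "image_vectors.npy")

# One vector per metadata row, 2048 values per vector
assert len(meta) == vectors.shape[0]
print(vectors.shape)
print(meta["label"].value_counts())

# Every row should point at an image that exists on disk
missing = [f for f in meta["filename"] if not (data_dir / "images" / f).exists()]
print(f"{len(missing)} missing images")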

The folder structure looks like this (sketched from the file list above; actual image file names will vary):
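data/
  metadata.csv          (one row per image and its label)
  images/
    ...                 (one image file per row in metadata.csv)
  image_vectors.npy     ([num_images, 2048] NumPy array from the fine-tuning step)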

To run PixPlotML to see the same views as in this article, follow the instructions in the README.md of the GitHub repository.

Fine-tuning the visualization to your data

Fine-tuning on your own data is a vital step that hugely improves the visualization and ensures that similar images appear close together, revealing clearer clusters.

The simple and quick process is described in the prep_pixplot_files folder in the PixPlotML GitHub repository here. Simply create a Python environment with the provided requirements.txt, and then run the Python script providing the location of the folder with the metadata.csv and the images folder. Here is the code for this step:

# Create the python environment
cd prep_pixplot_files
conda create -n prep_pixplot python=3.9
conda activate prep_pixplot
pip install -r ./environment/requirements.txt

# Train the image classifier and output the image vectors file
python main.py --data_fldr ../data/outputs --img_fldr ../data/outputs/images

Running the command looks like this:

The output to standard out from running the fine-tuning process.

After running this process, the output should look like the above image. The fine-tuning process runs through the images listed in metadata.csv 20 times (20 epochs) and then outputs the image_vectors.npy file.
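Under the hood, the prep step is a standard image-classification fine-tune followed by feature extraction, and the repo's main.py handles all of it for you. The sketch below only illustrates the shape of the process with pytorch-accelerated and timm; the dummy dataset and the ResNet-50 backbone (whose 2048-dimensional pooled features match the vector size above) are assumptions, not the repo's actual defaults:

import numpy as np
import timm
import torch
from pytorch_accelerated import Trainer
from torch.utils.data import TensorDataset

num_classes = 6  # number of distinct labels in metadata.csv

# Dummy stand-in for your real labeled image dataset
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
train_dataset = TensorDataset(images, labels)

model = timm.create_model("resnet50", pretrained=True, num_classes=num_classes)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_func = torch.nn.CrossEntropyLoss()

trainer = Trainer(model=model, loss_func=loss_func, optimizer=optimizer)
trainer.train(train_dataset=train_dataset, num_epochs=20, per_device_batch_size=8)

# Drop the classification head and keep the 2048-dimensional pooled features
model = model.cpu().eval()
model.reset_classifier(0)
with torch.no_grad():
    vectors = model(images).numpy()
np.save("image_vectors.npy", vectors)  # shape: [num_images, 2048]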

Using PixPlotML for object detection projects

The following are typical activities in object detection projects for forming intuition and hypotheses about data and model performance, and they can be carried out at different stages of a project. Using the COCO bounding box images, we highlight below a few methods and investigations that can help improve your results.

1. Understanding the problem by visualizing the entire dataset

At the beginning of a project, we can quickly identify how a model might perform by visualizing all the images based on how similar they look. For example, in Figure 2, we can see that the clusters represent the labels very well with very few exceptions. All the images that appear similar are in the same clusters and match the given label. If there were areas of confusion, we would see images with different labels mixed together. This can occur with poor labeling practices or where the labels could be better chosen. For example, this could happen if two different label types have images that appear very similar.

But in this case, everything looks good. I would be thrilled to see the visualization in Figure 2 in a real-world project!

Figure 2: The UMAP visualization showing the benefit of fine-tuning the visualization to the COCO data. The clusters are well separated and represent each of the classes.

2. Looking for misclassifications

The UMAP visualization shows all the images organized by how similar they appear. The border color of each image represents the label. As a result, we can quickly identify mislabeled images by seeing which images have an incorrect label.

For example, see Figure 3 below. The teddy images are all clustered together, so we can easily see when an image is mislabeled by checking that all the border colors match the images. As you can see, most images are correctly labeled with a purple border, but the middle image has a yellow border. Looking more closely, we see that the image is of a stuffed dog, which may lead someone to believe that the label is correct; however, because it is a stuffed animal, PixPlotML is correct, and we should modify this label to be “teddy”.

Figure 3: A classic mislabeling error. The teddy in the center is labeled as a real dog but should be labeled as a teddy. We can easily fix this in PixPlotML by selecting the image and changing the label. The updates can be downloaded using the download button at the top middle of the window.

3. Looking for confusing images

Figure 4 below shows two screenshots. On the left is a correctly labeled dog image: its colored hair, white fur, and face looking directly at the camera are very similar to the nearby teddy images. This is a genuine source of model error unrelated to the labeling strategy or image quality.

The screenshot on the right shows a confusing image of a dog just off the edge of the frame, so only half the dog appears, with black padding. There are also two long, thin vertical bounding boxes where it’s challenging to identify the object.

If we expect these cases in production data, we need to add more similar examples. If they are not expected in production data, these examples are unlikely to be helpful and should be removed. We also must consider whether we can identify the content of the images ourselves: if we cannot, the images need to be improved or removed, keeping only examples where visual inspection can easily determine the content.

Figure 4: A dog appears in the teddy cluster — as a result, the model is likely to classify this dog as a teddy. On the right we have a misclassified cat, a confusing dog image, and a couple of narrow bounding boxes where it’s difficult to see what’s in the image.

4. Improving the labeling strategy

By inspecting the teddy cluster of images, we’ve identified a group of images at the bottom right of the cluster with children holding a teddy bear, which we can see in Figure 5 below.

Unfortunately, some of the teddy bears are tiny and hard to see. So, are these images intended for detecting teddy bears? Or would these images be better for training a model to detect children?

This is a case where a good labeling guide helps by showing both edge cases and easy-to-label images. The edge cases inform the labelers, enabling consistency and ensuring that only images relevant to the task are included.

Figure 5: In this area of the Teddy cluster, there are several images of children holding teddies. But the size of a teddy can be quite small. Should these images be in a training set, or should they have a different label?

5. Investigating false positives

Sometimes a model predicts many more detections than we expect, resulting in low precision. We need to determine whether these are genuine detections missing from the validation set or whether the model is truly in error.

To investigate, we can create a new visualization by combining false-positive predictions with the ground truth. This is done by adding the false positives to the metadata.csv and images folder with a new label name. By doing this, we can see where the false positives appear within the visualization and the ground truth images near to them, so that we can determine whether they belong to that label or whether they form their own clusters. See Figure 6, below, for an example of how this appears in PixPlotML.
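As a rough sketch of the mechanics, assuming metadata.csv has "filename" and "label" columns (the exact column names are defined by the repo) and that your false-positive detections are available as boxes on the source images:

from pathlib import Path

import pandas as pd
from PIL import Image

meta = pd.read_csv("data/metadata.csv")
img_dir = Path("data/images")

# Detections with no matching ground-truth box (values here are illustrative)
false_positives = [
    {"image": "val2017/000000001234.jpg", "box": (34, 50, 210, 280), "label": "dog"},
]

new_rows = []
for i, fp in enumerate(false_positives):
    crop = Image.open(fp["image"]).convert("RGB").crop(fp["box"])
    crop_name = f"fp_{i}.jpg"
    crop.save(img_dir / crop_name)
    new_rows.append({"filename": crop_name, "label": fp["label"] + "_fp"})

pd.concat([meta, pd.DataFrame(new_rows)]).to_csv("data/metadata.csv", index=False)
# Re-run the fine-tuning step so the new images get vectors and appear in the view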

Common issues include not labeling all examples and a general lack of consistency in applying labeling rules. Another issue concerns label types whose images appear visually too similar. Sometimes, false-positive predictions cluster together, suggesting that a new label type might be needed.

Figure 6: The object detection model has identified a dog that is not in the labels.

In Figure 6, the legend now has the dog_fp label. All dog false positives appear with this label, and we can see a dog in the visualization within the cluster that has the dog_fp label. As the image is actually that of a dog, we can say this false positive is a missing label from the evaluation set.

6. Bounding box shape and size

As expected, the COCO bounding boxes are reasonably consistent. On a real-world project, inconsistent box sizes often result from multiple labelers working without explicit instructions. It’s essential to define, with examples, how to label each object. A document containing a list of visual examples with additional instructions on labeling each edge case helps ensure consistency.

Figure 7: A couple of narrow bounding boxes where it’s difficult to see what’s in the image, along with some bounding boxes that focus on the cat’s face or cut off parts of the cat.

7. Interpreting confusion matrix results

A confusion matrix shows the performance of each class. If the model appears to confuse some classes, then reviewing the UMAP visualization for those classes usually reveals the issue. This can be due to images that appear similar across the classes, images for which it’s difficult to discern the correct label, or an invalid choice of labels and an invalid labeling strategy.
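For reference, a per-class confusion matrix can be produced with scikit-learn; this snippet is illustrative and not part of PixPlotML, and the labels and predictions are made-up placeholders. When two classes confuse each other, those are the clusters to inspect in the UMAP view:

import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

class_names = ["cat", "dog", "teddy"]
y_true = ["cat", "dog", "teddy", "dog", "teddy", "cat"]  # ground-truth labels
y_pred = ["cat", "teddy", "teddy", "dog", "dog", "cat"]  # model predictions

cm = confusion_matrix(y_true, y_pred, labels=class_names)
ConfusionMatrixDisplay(cm, display_labels=class_names).plot()
plt.show()
# Here "dog" and "teddy" confuse each other, so inspect those two clusters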

8. Correcting labeling errors and/or flagging images for removal

To fix image labels in PixPlotML, we simply click on an image. A view then appears, offering options. In addition, there is a default ‘Remove’ label, which flags an image for removal when selected. The updates can then be downloaded using the download button at the top of the window.

In addition, if you find areas or groups of images of interest, you can select them using the lasso tool at the bottom right of the screen. This window also provides an option to download the selected image details.
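Applying the downloaded updates back to your dataset is a small pandas job. The column names and file name below are assumptions; adjust them to match the file PixPlotML actually produces:

import pandas as pd

meta = pd.read_csv("data/metadata.csv")
updates = pd.read_csv("downloads/updated_labels.csv")  # file name is an assumption

# Apply corrected labels, then drop anything flagged with the 'Remove' label
merged = meta.merge(updates, on="filename", how="left", suffixes=("", "_new"))
merged["label"] = merged["label_new"].fillna(merged["label"])
cleaned = merged[merged["label"] != "Remove"].drop(columns="label_new")

cleaned.to_csv("data/metadata_cleaned.csv", index=False)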

Figure 8: Selecting an image reveals the view page. The bottom right of the screen allows a user to correct a label or use the default ‘Remove’ label for flagging for deletion.

Summary

We’ve seen how to use PixPlotML to visualize an image dataset, get a sense of the overall dataset, identify and fix labeling problems, improve labeling instructions, identify confusing images, investigate low precision, and investigate model confusion!

We’ve seen how fine-tuning a classification model is very helpful in improving the visualization and is an essential prerequisite for getting the best results.

We hope you also find PixPlotML helpful in your image-based Machine Learning projects!

Acknowledgments

This work is heavily based on the original PixPlot by Yale’s DH Lab. PixPlot is a fantastic piece of software that enables interactive visualization of thousands of images.

The fine-tuning of PixPlotML using model training makes heavy use of Pytorch-Accelerated. Pytorch-Accelerated makes it quick and easy to train all kinds of models from Timm and PyTorch without having to write all the boilerplate code for the training loop. For further information please see the article Introducing PyTorch-accelerated | by Chris Hughes | Towards Data Science.

Alexander Hocking is on LinkedIn.
