CVPR 2021 Deadline Looms: Speed Up Your Paper Writing With FiftyOne!

Tips and tricks for speeding up the tedious parts of paper writing so you can spend more time improving your method and polishing your paper.

Eric Hofesmann
Voxel51
6 min read · Oct 7, 2020

The CVPR 2021 deadline is just a week away! From my experience writing computer vision conference papers, the last month before the deadline is prime crunch time for finishing the analysis of your model and getting your paper written up. A number of issues frequently arise at the last minute. For example, you will likely need to:

  • Get results on more datasets, frequently ones that you’ve never used before. GAH!
  • Spend time wrangling data and your model outputs.
  • Visualize your model results to create that important page-1 figure.
  • Compare your model to baselines.
  • Find interesting examples for your paper.

Our new Python tool, FiftyOne, can help with all of these pain points!

Problem spaces that FiftyOne supports

FiftyOne can help you if you are working with image or video data in any capacity. Below are some core computer vision tasks that are common in today’s papers and that are compatible with FiftyOne.

Classification

Classification involves training a model to predict one or multiple categorical labels for each image or video sample.

Image object detection

Object detection is the task of localizing and classifying objects in an image by training a model to produce a set of coordinates and labels for bounding boxes.

Video object detection

Video object detection has the same goal as image object detection except that objects also must be tracked from one frame to another. The additional data dimensionality makes it a lot more difficult to visualize these labels!

Semantic and Instance segmentation

Segmentations are often stored as multi-colored masks where each color represents a different object or class of objects. Semantic segmentation only separates classes of objects, while instance segmentation also separates individual instances of a class.

Polylines/polygons/keypoints

Other detection problems include keypoint detection and segmentation using polylines and polygons.

How can FiftyOne help?

Visualize and analyze model results

Measuring higher than the state-of-the-art on some aggregate metric is definitely helpful in getting your paper accepted (by those pesky CVPR reviewers). However, that does not mean that you can skimp on the analysis of why your model is performing better.

The most surefire way to understand the performance of your model is to go through and look at how it performs on groups of samples and on individual examples. Visualizing your model performance is often a huge pain that involves far too much time scripting just to get your predictions drawn on images. My coworker tells me that until we released FiftyOne, he was using 20-year-old scripts and tools like display and animate. And that doesn’t even begin to address actually parsing through thousands of images or video clips to get a complete story of your model’s performance. For that matter, even if you don’t care to understand why your model works, information like this is direct ammunition for your rebuttal come January!

Luckily, FiftyOne is designed to help alleviate this exact problem. The FiftyOne API is designed to make it easy to load in a dataset and add your model predictions. This allows you to use the FiftyOne App to visualize, browse, and search through your model outputs using any criteria you can think of. See the example below for how easy it is to load and search through predictions from a model:
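As a rough sketch of what loading and searching predictions can look like, assuming a recent FiftyOne release: the record layout, the `predictions` field name, and the helpers `to_relative_box`, `build_dataset`, and `confident_predictions` are hypothetical names chosen for illustration. FiftyOne stores boxes as `[x, y, width, height]` relative to the image size, so pixel corner boxes are converted first.

```python
def to_relative_box(x1, y1, x2, y2, img_w, img_h):
    """Convert pixel corner coordinates to [x, y, w, h] in [0, 1]."""
    return [x1 / img_w, y1 / img_h, (x2 - x1) / img_w, (y2 - y1) / img_h]


def build_dataset(records, name=None):
    """Create a FiftyOne dataset from hypothetical per-image records.

    Each record is assumed to look like:
        {"filepath": ..., "width": ..., "height": ...,
         "predictions": [{"label": ..., "confidence": ...,
                          "box": (x1, y1, x2, y2)}]}
    """
    import fiftyone as fo  # imported lazily; needs `pip install fiftyone`

    samples = []
    for rec in records:
        detections = [
            fo.Detection(
                label=p["label"],
                confidence=p["confidence"],
                bounding_box=to_relative_box(
                    *p["box"], rec["width"], rec["height"]
                ),
            )
            for p in rec["predictions"]
        ]
        sample = fo.Sample(filepath=rec["filepath"])
        sample["predictions"] = fo.Detections(detections=detections)
        samples.append(sample)

    dataset = fo.Dataset(name)
    dataset.add_samples(samples)
    return dataset


def confident_predictions(dataset, label, min_confidence=0.8):
    """Keep only predicted boxes of `label` above a confidence threshold."""
    from fiftyone import ViewField as F

    return dataset.filter_labels(
        "predictions",
        (F("label") == label) & (F("confidence") > min_confidence),
    )
```

With FiftyOne installed, `fo.launch_app(view=confident_predictions(dataset, "person"))` should open that filtered view in the App for browsing.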

Get results on more datasets

The FiftyOne API includes a dataset zoo that can help you quickly get a new dataset to run your model on. With one command, you can download and access any dataset in the zoo:
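For a single dataset the call is `foz.load_zoo_dataset("cifar10", split="test")`, where `foz` is `fiftyone.zoo`. The sketch below wraps that call to pull several benchmarks at once. `load_benchmarks` and its `loader` parameter are hypothetical; the parameter exists only so the function can be tried without downloading anything.

```python
def load_benchmarks(names, split="test", loader=None):
    """Load several zoo datasets and return them as {name: dataset}.

    `loader` defaults to FiftyOne's zoo loader. Some zoo datasets are
    fetched through a backend such as PyTorch or TensorFlow, so one of
    those may need to be installed as well.
    """
    if loader is None:
        import fiftyone.zoo as foz  # needs `pip install fiftyone`

        loader = foz.load_zoo_dataset
    return {name: loader(name, split=split) for name in names}
```

For example, `load_benchmarks(["cifar10", "cifar100"])` would download (on first use) and load the test split of both datasets.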

You can then generate predictions using your model and load them into FiftyOne. If you have a dataset you want to load that’s not in the zoo, the FiftyOne API provides easy functionality to load the ground truth and predicted labels either way.

Wrangle data and outputs — Supports more than 15 dataset formats

Parsing different datasets and model label formats can be a lot of work. Frankly, 3 people out of 5 tell us that they spend the majority of their experimentation time munging data into the required format for their code. That is unbelievable! I am sure you too generally spend a significant amount of time writing scripts to transform your data from one format to another.

FiftyOne can help in this regard by allowing you to import and export images and labels in numerous formats. For example, you can load in your detection model predictions in a custom format and export them in COCO format to run pycocotools evaluation. The following is a list of data import and export formats that FiftyOne supports:
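Format types live under `fo.types` and include, among others, `COCODetectionDataset`, `VOCDetectionDataset`, `KITTIDetectionDataset`, `CVATImageDataset`, and `TFObjectDetectionDataset`. Here is a minimal sketch of the custom-format-to-COCO case, assuming a recent FiftyOne release. The one-line-per-box text format and both helper names are made up for illustration.

```python
def parse_prediction_line(line):
    """Parse one line of a hypothetical custom prediction format:

        <image_path> <label> <confidence> <x> <y> <w> <h>

    The box is top-left [x, y, w, h] relative to the image size, and
    image paths are assumed to contain no spaces.
    """
    path, label, conf, *box = line.split()
    return path, {
        "label": label,
        "confidence": float(conf),
        "bounding_box": [float(v) for v in box],
    }


def export_predictions_as_coco(lines, export_dir):
    """Load custom-format predictions and write them out in COCO format."""
    import fiftyone as fo  # imported lazily; needs `pip install fiftyone`

    # Group the boxes by the image they belong to
    by_image = {}
    for line in lines:
        path, det = parse_prediction_line(line)
        by_image.setdefault(path, []).append(fo.Detection(**det))

    dataset = fo.Dataset()
    for path, detections in by_image.items():
        sample = fo.Sample(filepath=path)
        sample["predictions"] = fo.Detections(detections=detections)
        dataset.add_sample(sample)

    # The images must exist on disk, since the exporter reads them
    dataset.export(
        export_dir=export_dir,
        dataset_type=fo.types.COCODetectionDataset,
        label_field="predictions",
    )
```

The exported directory can then be handed to pycocotools for evaluation.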

Compare to baselines

We all know that it’s crucial to place your method’s performance in the context of baselines and the state-of-the-art. One way to do this is to show higher performance on an aggregate metric across a popular dataset. However, as I mentioned above, you generally also need to look at individual examples and different cases where your model performs better or worse.

FiftyOne is a great way to perform this analysis easily. The following code snippet shows how you can load multiple model classifications into a FiftyOne dataset:
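One way this might look, as a sketch under a few assumptions: predictions arrive as flat `(model, filepath, label, confidence)` rows, each model name is a valid field name (for example `ours` and `baseline`), and file paths match the absolute paths stored on the samples. `group_by_filepath` and `add_classifications` are hypothetical helpers.

```python
def group_by_filepath(rows):
    """Turn flat (model, filepath, label, confidence) rows into
    {filepath: {model: (label, confidence)}}."""
    grouped = {}
    for model, filepath, label, confidence in rows:
        grouped.setdefault(filepath, {})[model] = (label, confidence)
    return grouped


def add_classifications(dataset, rows):
    """Store each model's prediction in its own field on every sample."""
    import fiftyone as fo  # imported lazily; needs `pip install fiftyone`

    grouped = group_by_filepath(rows)
    for sample in dataset:
        per_model = grouped.get(sample.filepath, {})
        for model, (label, confidence) in per_model.items():
            sample[model] = fo.Classification(
                label=label, confidence=confidence
            )
        sample.save()  # persist the new fields to the dataset
```

Once each model has its own field, the App can toggle them independently, which makes side-by-side inspection of the same image straightforward.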

It is then straightforward to extract comparative samples for your paper. For example, if you want to see the samples where your object detection model detected more than 10 true positives while the baseline predicted more than 10 false positives, you can use the following snippet:
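A sketch of that query, assuming a recent FiftyOne release in which `evaluate_detections()` writes per-sample counts to `<eval_key>_tp`, `<eval_key>_fp`, and `<eval_key>_fn`. The field names `ours`, `baseline`, and `ground_truth`, the eval keys, and both helper names are assumptions for illustration.

```python
def count_field(eval_key, kind):
    """Name of the per-sample count field written by an evaluation run.

    For eval_key="eval_ours" the counts are stored in "eval_ours_tp",
    "eval_ours_fp", and "eval_ours_fn".
    """
    if kind not in ("tp", "fp", "fn"):
        raise ValueError(f"unknown count kind: {kind!r}")
    return f"{eval_key}_{kind}"


def wins_over_baseline(dataset, ours="ours", baseline="baseline",
                       gt="ground_truth", threshold=10):
    """Samples where our model has > threshold true positives and the
    baseline has > threshold false positives."""
    from fiftyone import ViewField as F  # needs `pip install fiftyone`

    # Match each model's boxes against the ground truth
    dataset.evaluate_detections(ours, gt_field=gt, eval_key="eval_ours")
    dataset.evaluate_detections(baseline, gt_field=gt, eval_key="eval_base")

    return dataset.match(
        (F(count_field("eval_ours", "tp")) > threshold)
        & (F(count_field("eval_base", "fp")) > threshold)
    )
```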

This tool makes it really easy to compare your results to baselines because it is able to quickly manipulate your data for you.

Find examples for your paper

Generating images exemplifying your model’s performance can be tedious. You first have to sift through countless results to find samples that tell the story that you want. You then have to format and edit the images to fit in your paper.

FiftyOne is designed to let you quickly search through your dataset to find samples in line with various aspects of your model’s performance. For example, if you are working on a detection problem, you can write a short query to find examples of how your model performs on small objects:
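A minimal sketch of such a query, assuming a recent FiftyOne release. Because boxes are stored as relative `[x, y, width, height]`, width times height gives the fraction of the image a box covers. The `predictions` field name, the 1% area threshold, and the helper names are illustrative choices.

```python
def relative_area(bounding_box):
    """Fraction of the image covered by a relative [x, y, w, h] box.

    Plain-Python equivalent of the query expression below, handy for
    spot-checking individual boxes.
    """
    _, _, width, height = bounding_box
    return width * height


def small_object_view(dataset, field="predictions", max_area=0.01):
    """Keep only boxes in `field` that cover less than `max_area` of the
    image; samples left with no boxes are dropped from the view."""
    from fiftyone import ViewField as F  # needs `pip install fiftyone`

    bbox_area = F("bounding_box")[2] * F("bounding_box")[3]
    return dataset.filter_labels(field, bbox_area < max_area)
```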

Another example is if you have a facial recognition dataset and want to compare how well your model performs at recognizing men versus women:
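As a sketch, suppose each sample carries a string attribute field (called `gender` here) plus `ground_truth` and `predictions` classification fields. All three field names and both helpers are assumptions, and a recent FiftyOne release is assumed for `distinct()`.

```python
def accuracy(num_correct, num_total):
    """Fraction correct, or None for an empty group."""
    if num_total == 0:
        return None
    return num_correct / num_total


def accuracy_by_group(dataset, group_field="gender",
                      pred="predictions", gt="ground_truth"):
    """Classification accuracy for each distinct value of `group_field`."""
    from fiftyone import ViewField as F  # needs `pip install fiftyone`

    results = {}
    for value in dataset.distinct(group_field):
        group = dataset.match(F(group_field) == value)
        # Samples in this group whose predicted label matches the truth
        correct = group.match(F(f"{pred}.label") == F(f"{gt}.label"))
        results[value] = accuracy(correct.count(), group.count())
    return results
```

The intermediate `group` and `correct` views can also be opened in the App to look at the failures in each subgroup rather than just the numbers.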

With regard to editing images for your paper, we have a recipe in the FiftyOne documentation showing how it can be used to draw labels and model predictions on individual samples. You can select the images you want and then use the following snippet to generate and save annotated images to disk:
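A minimal sketch, assuming a recent FiftyOne release where datasets and views expose `select()` and `draw_labels()`. The helper name and the default field names are hypothetical.

```python
def save_paper_figures(collection, sample_ids, output_dir,
                       label_fields=("ground_truth", "predictions")):
    """Render the chosen samples with their labels drawn on top.

    `collection` is a FiftyOne dataset or view. Returns whatever
    `draw_labels()` returns, which should be the paths of the
    annotated images.
    """
    selected = collection.select(list(sample_ids))
    return selected.draw_labels(output_dir, label_fields=list(label_fields))
```

The sample IDs can come from `session.selected` after picking samples in the App, so the loop from browsing to paper-ready figure stays short.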

Summary

Writing papers is a necessary part of research in computer vision and machine learning. Some people love it; some people hate it. In any case, understanding and substantiating your model performance with real output and visualizations is critical, but these are often time-consuming to generate. FiftyOne is a lightweight and easy-to-use visualization tool with powerful querying capabilities that can help you better understand your model while also comparing it to the state-of-the-art and producing qualitative, paper-ready examples. Give it a try next time you are working on a paper submission!

Eric Hofesmann
Voxel51

Machine learning engineer at Voxel51, Masters in Computer Science from the University of Michigan. https://www.linkedin.com/in/eric-hofesmann/