Detecto — An object detection library for PyTorch

Simplifying the process of creating custom-trained object detection models

Alan Bi
Apr 17 · 5 min read
[Image: A model trained using Detecto]

Detecto is a Python library built on top of PyTorch that simplifies the process of building object detection models. The library acts as a lightweight package that reduces the amount of code needed to initialize models, apply transfer learning on custom datasets, and run inference on images and videos.

Getting Started

To see how simple it is to get started with Detecto, let’s load in a pre-trained model from torchvision’s model zoo and run inference on the following image:

[Image. Source: Wikipedia]

First, right-click and save the image above to a folder on your computer, then install Detecto by running pip3 install detecto. Afterward, run the following script from within the same folder:
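A minimal sketch of such a script, assuming the helper names read_image, predict_top, and show_labeled_image from Detecto's utils, core, and visualize modules. The wrapper function detect_and_plot is only an illustrative name:

```python
def detect_and_plot(image_path="fruit.jpg"):
    """Detect objects in one image with the default model and plot them."""
    # Detecto is imported inside the function so this file loads without it.
    from detecto.core import Model
    from detecto.utils import read_image
    from detecto.visualize import show_labeled_image

    image = read_image(image_path)  # image as an RGB NumPy array
    model = Model()                 # no class list: COCO pre-trained defaults
    # predict_top keeps only the highest-scoring box for each detected label
    labels, boxes, scores = model.predict_top(image)
    show_labeled_image(image, boxes, labels)
    return labels, boxes, scores


# Usage (the pre-trained weights are downloaded on the first run):
# detect_and_plot("fruit.jpg")
```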

The code above reads in the saved image (in my case named “fruit.jpg”), generates predictions on it from a pre-trained model, and plots the results:

[Image: detection results, cropped from the original image for better visualization]

Detecto’s Model class is built on the Faster R-CNN ResNet-50 FPN architecture from torchvision’s models subpackage, which is pre-trained on the COCO 2017 dataset. By default, it can detect about 80 kinds of objects, such as fruits, animals, vehicles, and kitchen appliances.

Of course, if all you want to do is use a default model, there isn’t much need for a dedicated package. However, if you want to train a model on a custom dataset, that’s where Detecto comes in.

Transfer Learning

There are a couple of tutorials out there that teach you how to use a pre-trained model and apply transfer learning on a custom dataset. However, in many of these scenarios, developers have to define custom classes for their dataset, make modifications to the pre-trained model, or write their own training and visualization methods from scratch. Sometimes, all you want is to quickly whip up some good results. Luckily, doing so with Detecto is easy.

To start off, Detecto comes with a Dataset class (extending PyTorch’s) that accepts any data in the PASCAL VOC format; i.e. each image has an associated XML annotation file (here is a great labeling tool for this format). Your dataset can be laid out in either of the following ways:

# All images and XML files in the same folder:
images/
| image0.jpg
| image0.xml
| image1.jpg
| image1.xml
| ...

# Images and XML files in separate folders:
images/
| image0.jpg
| image1.jpg
| ...
labels/
| image0.xml
| image1.xml
| ...
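
To make the annotation format concrete, here is a sketch of what one of these XML files holds and how its labels and boxes can be read with Python's standard library. This is not Detecto's own parser, and the file name, labels, and coordinates in the sample are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Illustrative annotation: the file name, labels, and coordinates are made up.
SAMPLE_XML = """
<annotation>
  <filename>image0.jpg</filename>
  <object>
    <name>alien</name>
    <bndbox><xmin>569</xmin><ymin>204</ymin><xmax>1003</xmax><ymax>658</ymax></bndbox>
  </object>
  <object>
    <name>bat</name>
    <bndbox><xmin>276</xmin><ymin>144</ymin><xmax>580</xmax><ymax>509</ymax></bndbox>
  </object>
</annotation>
"""


def read_voc_objects(xml_text):
    """Return (filename, [(label, [xmin, ymin, xmax, ymax]), ...])."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        objects.append((obj.findtext("name"), coords))
    return root.findtext("filename"), objects


filename, objects = read_voc_objects(SAMPLE_XML)
print(filename)
for label, coords in objects:
    print(label, coords)
```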

In both cases, reading in your dataset is as simple as the following:
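
A sketch of that loading step, assuming the Dataset constructor takes the folder of XML labels first and an optional image folder second. The helper name load_dataset is hypothetical:

```python
def load_dataset(label_dir, image_dir=None):
    """Build a Detecto Dataset for either folder layout."""
    from detecto.core import Dataset

    if image_dir is None:
        # Layout 1: images and XML files share one folder
        return Dataset(label_dir)
    # Layout 2: XML labels and images live in separate folders
    return Dataset(label_dir, image_dir)


# Usage:
# dataset = load_dataset("images/")             # same folder
# dataset = load_dataset("labels/", "images/")  # separate folders
# image, target = dataset[0]                    # one image-target pair
```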

As you can see, you can then index your dataset to get image-target pairs, where each target holds the labels and locations of the objects in that image. This gives training a structured data format to work with, and training itself can take as few as four lines of code:
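
A sketch of those few lines, wrapped in a function with the hypothetical name train_custom_model so the steps can be reused. It relies on fit's default training settings:

```python
def train_custom_model(image_dir="images/", classes=("alien", "bat", "witch")):
    """Fine-tune a pre-trained Detecto model on a custom dataset."""
    from detecto.core import Dataset, Model

    dataset = Dataset(image_dir)   # images and XML files in one folder
    model = Model(list(classes))   # the labels we want to predict
    model.fit(dataset)             # fine-tune with default settings
    return model


# Usage:
# model = train_custom_model("images/", ["alien", "bat", "witch"])
```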

In the above example, after loading our dataset from the “images” folder, we initialize a Model with a list of classes ['alien', 'bat', 'witch'] telling it what we want to predict. Then, we call fit, which will fine-tune the pre-trained model to learn how to detect our custom objects.

Now, let’s run the model on an image and print out the results:
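
A sketch of that step, assuming Model.predict returns a (labels, boxes, scores) tuple for a single image. The function name print_predictions and the image path in the usage comment are illustrative:

```python
def print_predictions(model, image):
    """Run the model on one image and print labels, boxes, and scores."""
    labels, boxes, scores = model.predict(image)
    print(labels)  # predicted class names
    print(boxes)   # one [xmin, ymin, xmax, ymax] row per prediction
    print(scores)  # confidence of each prediction, from 0 to 1
    return labels, boxes, scores


# Usage with a trained Detecto model:
# from detecto.utils import read_image
# image = read_image("images/image0.jpg")
# labels, boxes, scores = print_predictions(model, image)
```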

Output:

['alien', 'bat', 'witch']
tensor([[ 569.2125,  203.6702, 1003.4383,  658.1044],
        [ 276.2478,  144.0074,  579.6044,  508.7444],
        [ 277.2929,  162.6719,  627.9399,  511.9841]])
tensor([0.9952, 0.9837, 0.5153])

Here, our top prediction was an alien with coordinates [569, 204, 1003, 658] and a confidence of 99.5%. Let’s also plot our predictions:
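
A sketch of the plotting call, assuming show_labeled_image from Detecto's visualize module. The score threshold is an optional extra of my own, added because the third prediction above only reached 0.5153; the helper names and the 0.6 default are illustrative:

```python
def confident_indices(scores, threshold=0.6):
    """Return the positions of predictions whose score meets the threshold."""
    return [i for i, score in enumerate(scores) if float(score) >= threshold]


def plot_confident(image, labels, boxes, scores, threshold=0.6):
    """Plot only the predictions that pass the score threshold."""
    from detecto.visualize import show_labeled_image

    keep = confident_indices(scores, threshold)
    # boxes is a tensor, so indexing it with a list selects those rows
    show_labeled_image(image, boxes[keep], [labels[i] for i in keep])


# Usage:
# plot_confident(image, labels, boxes, scores)       # drops the 0.5153 box
# plot_confident(image, labels, boxes, scores, 0.0)  # plots everything
```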

[Image: predictions plotted on the image]

Detecto’s visualize module comes with many other visualization methods, including ones that run detection on a video file or on a live camera feed. Here’s what inference on a video looks like:

[Image: inference on a video]
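
A sketch of how the video and camera helpers might be called, assuming the detect_video and detect_live functions from Detecto's visualize module. The wrapper names and file names are placeholders:

```python
def annotate_video(model, input_path="input.mp4", output_path="output.avi"):
    """Write a copy of the video with predicted boxes drawn on each frame."""
    from detecto.visualize import detect_video

    detect_video(model, input_path, output_path)


def annotate_webcam(model):
    """Show predictions on the default camera feed in a window."""
    from detecto.visualize import detect_live

    detect_live(model)


# Usage with a trained model:
# annotate_video(model, "input.mp4", "output.avi")
# annotate_webcam(model)
```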

Once you’re done working, you can save your model to a .pth file and load it back later, in typical PyTorch fashion:
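
A sketch of that round trip, assuming Model.save takes a file path and Model.load takes the path plus the same class list the model was trained with. The function name and default file name are illustrative:

```python
def save_and_reload(model, classes, path="model_weights.pth"):
    """Save a trained model's weights, then rebuild the model from the file."""
    from detecto.core import Model

    model.save(path)
    # The class list must match the one used when the model was trained
    return Model.load(path, list(classes))


# Usage:
# model = save_and_reload(model, ["alien", "bat", "witch"])
```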

Advanced Usage

Detecto is great for quickly creating object detection models, but that doesn’t mean it’s limited in functionality. An important part of object detection is data augmentation: applying artificial transformations to images in order to increase the diversity of the dataset. Because Detecto sits on top of PyTorch, developers can make use of the torchvision transforms module to augment their datasets:
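
A sketch of such a pipeline, assuming Dataset accepts a transform argument and that detecto.utils provides normalize_transform. The particular transforms, their values, and the batch size are illustrative choices, not recommendations:

```python
def build_augmented_loader(image_dir="images/", batch_size=2):
    """Create a Dataset with augmentations and a DataLoader over it."""
    from torchvision import transforms
    from detecto.core import DataLoader, Dataset
    from detecto.utils import normalize_transform

    augmentations = transforms.Compose([
        transforms.ToPILImage(),                 # array -> PIL image
        transforms.Resize(800),                  # shorter side becomes 800 px
        transforms.RandomHorizontalFlip(0.5),    # flip half of the images
        transforms.ColorJitter(saturation=0.5),  # vary color saturation
        transforms.ToTensor(),                   # PIL image -> tensor
        normalize_transform(),                   # Detecto's default normalization
    ])
    dataset = Dataset(image_dir, transform=augmentations)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    return dataset, loader


# Usage:
# dataset, loader = build_augmented_loader("images/")
```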

In this example, we describe a series of transformations to apply to our dataset. As we get ready to train another model, we also define a DataLoader object to customize how the fit method, which we call in the next step, iterates over our dataset:
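
A sketch of that training call, assuming fit accepts a DataLoader in place of a Dataset, takes the validation dataset as its second argument, and returns the validation loss for each epoch. The folder name and hyperparameter values are placeholders:

```python
def train_with_validation(loader, val_dir="validation_images/",
                          classes=("alien", "bat", "witch")):
    """Train on a DataLoader while tracking loss on a validation set."""
    from detecto.core import Dataset, Model

    val_dataset = Dataset(val_dir)
    model = Model(list(classes))
    # With a validation dataset, fit returns one loss value per epoch
    losses = model.fit(loader, val_dataset, epochs=10,
                       learning_rate=0.001, lr_step_size=5, verbose=True)
    return model, losses


# Usage:
# model, losses = train_with_validation(loader)
```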

After passing in the DataLoader, we provide a validation dataset to track performance throughout training and customize several other training parameters. Below is the loss against the validation dataset at each epoch:

[Image: validation loss at each epoch]

All in all, Detecto is still a lightweight library, so after training a model, you may need finetuning capabilities that are not yet supported. Thankfully, you don’t need to limit yourself to Detecto’s API: simply use the get_internal_model method to access the underlying PyTorch model, which you can then integrate into your code as if it were any other PyTorch model.
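
As a small sketch of what that looks like, the function below pulls out the underlying model and counts its trainable parameters the same way you would for any torch.nn.Module. The function name is illustrative:

```python
def count_trainable_parameters(model):
    """Count the trainable parameters of the PyTorch model inside Detecto."""
    torch_model = model.get_internal_model()  # the wrapped PyTorch model
    return sum(p.numel() for p in torch_model.parameters() if p.requires_grad)


# Usage with a Detecto model:
# print(count_trainable_parameters(model))
```

From here, torch_model can be passed to your own optimizers, training loops, or export code.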

Conclusion

In this article, I introduce Detecto and show how it can be used to make object detection with PyTorch dramatically easier. To learn more, check out these resources:

Please don’t hesitate to reach out with any questions or submit an issue!


Written by Alan Bi, a student at Duke University studying computer science and statistics.
