Active Learning with PyTorch

Learn how to save up to 50% on your data labeling costs with this step-by-step image classification tutorial

Olga Petrova
Scaleway

--

This blog post is the continuation of Active Learning: the Theory, with a focus on how to apply that theory to an image classification task with PyTorch. It was first published on May 5, 2020 on Scaleway's official blog and is reposted here for your convenience.

Introduction

In part 1 we talked about active learning: a semi-supervised machine learning approach in which the model identifies which of the unlabeled data would be most useful to label.

As the model gets access to more (data, label) pairs, its understanding of which training samples are most informative grows (at least, supposedly), allowing us to get away with fewer labeled samples without compromising the model's final performance. The hardest part of the process is determining this informativeness of the unlabeled samples. The choice is dictated by the selected query strategy, the most common strategies having been discussed in the previous post.
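To make the idea of a query strategy concrete before we dive into the tutorial, here is a minimal sketch of least-confidence sampling, one of the common strategies covered in part 1: rank the unlabeled pool by the model's top predicted class probability and request labels for the samples where that probability is lowest. The function name and the tensor values below are made up for illustration; in practice the logits would come from running your classifier over the unlabeled pool.

```python
import torch
import torch.nn.functional as F

def least_confidence_query(logits: torch.Tensor, k: int) -> torch.Tensor:
    """Return the indices of the k pool samples the model is least
    confident about, i.e. whose top class probability is lowest."""
    probs = F.softmax(logits, dim=1)          # (pool_size, num_classes)
    top_prob, _ = probs.max(dim=1)            # confidence per sample
    # Negate so topk picks the *lowest* confidences
    return torch.topk(-top_prob, k).indices

# Hypothetical logits for a pool of 5 unlabeled samples, 3 classes
logits = torch.tensor([[4.0, 0.1, 0.1],   # very confident
                       [0.5, 0.4, 0.6],   # nearly uniform -> uncertain
                       [3.0, 0.2, 0.1],
                       [0.1, 0.1, 0.1],   # maximally uncertain
                       [2.0, 1.9, 0.1]])
query_idx = least_confidence_query(logits, k=2)
# Samples 3 and 1 are the least confident, so they get sent for labeling
```

Other strategies from part 1 (margin sampling, entropy sampling) differ only in how the per-sample score is computed from `probs`; the select-and-label loop around them stays the same.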

--

Olga Petrova
Scaleway

Former quantum physicist & current techie who also creates art with stories behind it. Based in Paris, France. 🌐 www.olgapaints.net