How to set up an image recognition task properly? (2018 Update)

Víťa Válka
Jul 20, 2018 · 4 min read

The Basic Rules

  • Binary classification (striped / not striped): get 50–100 images per label
  • Up to 20 labels that are hard to recognize: get ~100 images per label
  • Up to 100 well-defined labels: get 100–200 images per label
  • Pattern recognition (structures, x-ray images): get 50–100 images per label
  • Abstract labels, up to 20 categories: get ~100 images per label

What Does Not Work

  • Many labels with a small dataset: tasks with more than 20 labels need at least 100 images per label to achieve solid results.
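A quick way to apply this rule is to count the images per label before training. The sketch below is a minimal illustration, not part of Vize; the function name `undersized_labels`, the sample labels, and the default threshold of 100 are assumptions taken from the guideline above.

```python
from collections import Counter

def undersized_labels(labels, min_per_label=100):
    """Return {label: count} for every label with fewer than min_per_label images."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n < min_per_label}

# Hypothetical dataset: one label string per image.
labels = ["sneaker"] * 120 + ["boot"] * 100 + ["sandal"] * 35

# Labels that still need more images before training is likely to work well.
print(undersized_labels(labels))
```

Any label reported here is a candidate for collecting more images or for merging with a neighbouring label.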

Reliability of the Results

Every client is looking for reliability, which in practice means accuracy. Keep things simple if you aim for high accuracy. The technology is still pretty dumb. Building an image classifier with a limited number of training images currently requires an iterative approach. I recommend following the rules below.

  • Make categories smaller & connect them in some logical manner
  • Use general models for general categories
  • Each label should have a similar number of images
  • Always collect images to extend your dataset
  • Merge very close classes together
  • Use UI/human feedback to improve the data
  • Maintain quality of your dataset
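Two of these rules, balancing the labels and merging very close classes, are easy to check in code. The following is a rough sketch under stated assumptions: the label names, the `merge_map`, and the helper names are hypothetical, and the max/min ratio is just one simple way to measure imbalance.

```python
from collections import Counter

def merge_labels(labels, merge_map):
    """Replace each label found in merge_map with its merged class name."""
    return [merge_map.get(label, label) for label in labels]

def imbalance_ratio(labels):
    """Largest class size divided by smallest class size (1.0 means perfectly balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

# Hypothetical dataset: two near-identical classes and one larger class.
labels = ["boot, brown"] * 40 + ["boot, black"] * 45 + ["sneaker"] * 90
merge_map = {"boot, brown": "boot", "boot, black": "boot"}

merged = merge_labels(labels, merge_map)
print(imbalance_ratio(labels))   # before merging
print(imbalance_ratio(merged))   # after merging
```

In this made-up example, merging the two boot classes also brings the class sizes much closer together, which is the effect the rules above aim for.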

Testing & Production Difference

We allow users of Vize.ai to train tasks with a minimum of 20 images per label. Your data is divided into a training set and a test set. Vize uses the training set to learn optimal parameters for the classifier. During training, we augment these images in several ways to extend the set automatically. The test set is used to compute the accuracy of the classifier: this is the accuracy you see in the Vize app on the Task screen.
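To make the idea concrete, here is a minimal sketch of a per-label train/test split and one simple augmentation (a horizontal flip). It is not Vize's actual pipeline: the 20% test fraction, the function names, and the representation of an image as a nested list of pixel values are all assumptions chosen to keep the example self-contained.

```python
import random
from collections import defaultdict

def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (image_id, label) pairs so every label appears in both sets."""
    by_label = defaultdict(list)
    for image_id, label in samples:
        by_label[label].append(image_id)

    rng = random.Random(seed)
    train, test = [], []
    for label, ids in by_label.items():
        ids = ids[:]                      # copy so the caller's data is untouched
        rng.shuffle(ids)
        n_test = max(1, round(len(ids) * test_fraction))
        test += [(i, label) for i in ids[:n_test]]
        train += [(i, label) for i in ids[n_test:]]
    return train, test

def flip_horizontal(pixels):
    """Mirror an image stored as a list of rows; a basic augmentation."""
    return [row[::-1] for row in pixels]

# Hypothetical task with the minimum of 20 images per label.
samples = [(f"striped_{i}", "striped") for i in range(20)]
samples += [(f"plain_{i}", "not striped") for i in range(20)]

train, test = stratified_split(samples)
print(len(train), len(test))
```

Augmentations such as this flip are applied to training images only; the test set stays untouched so that the reported accuracy reflects real inputs.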

Best Practices

Start with Fewer Categories

If you are building an app that helps people recognise shoes, I recommend starting with ~50 shoe types. This is an easy task to train with 100 images of each shoe. Let users add and upload new shoes in the user interface. Also, let them give you feedback on your classifications. This way you can get an amazing dataset of real images in one month and then update your app.

Use Tasks with Fewer Categories

If you are building a classifier for plane types with a small training dataset, separate your images into “in the air” and “on the ground” images. Build two different models, one for air and one for ground, and get better overall results for both. You can even merge similar planes into one class and train another recogniser to tell them apart. Once you have more images, you can merge these categories back together.
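One possible way to wire such a setup together is a two-stage classifier: a first model decides whether the plane is in the air or on the ground, and the image is then passed to the model trained for that scene. The sketch below assumes this routing design, which the text does not prescribe; the stand-in functions and the dictionary fields (`sky_fraction`, `engines`) are hypothetical placeholders for trained models and real images.

```python
def make_two_stage_classifier(scene_model, plane_models):
    """Return a function that routes each image to the model trained for its scene."""
    def classify(image):
        scene = scene_model(image)           # "air" or "ground"
        plane = plane_models[scene](image)   # model trained only on that scene
        return scene, plane
    return classify

# Stand-in models. Real ones would be trained classifiers; these only read
# fields of a hypothetical image description so the example can run.
def scene_model(image):
    return "air" if image["sky_fraction"] > 0.5 else "ground"

plane_models = {
    "air": lambda image: "glider" if image["engines"] == 0 else "airliner",
    "ground": lambda image: "airliner" if image["engines"] >= 2 else "light aircraft",
}

classify = make_two_stage_classifier(scene_model, plane_models)
print(classify({"sky_fraction": 0.8, "engines": 0}))
print(classify({"sky_fraction": 0.1, "engines": 1}))
```

Each second-stage model sees a narrower range of images, which is why a small dataset can go further with this split.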

Use Binary Classifiers for Important Classes

Creating captions for images in e-commerce? Build a custom task for each tag. One model will classify “rounded” vs. “not rounded”, and so on. This way you get a very reliable, specialised classifier for each tag.
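The per-tag approach can be sketched as a loop over independent binary classifiers, keeping every tag whose score passes a threshold. This is only an illustration: the 0.5 threshold, the tag names other than “rounded”, and the stand-in models that read scores from a hypothetical dictionary are assumptions.

```python
def tag_image(image, tag_models, threshold=0.5):
    """Run one binary classifier per tag and keep the tags that pass the threshold."""
    return sorted(
        tag for tag, model in tag_models.items() if model(image) >= threshold
    )

# Stand-in binary models: each returns the probability that its tag applies.
# Real models would be separately trained tasks, one per tag.
tag_models = {
    "rounded": lambda image: image["roundness"],
    "striped": lambda image: image["stripe_score"],
    "leather": lambda image: image["leather_score"],
}

product = {"roundness": 0.91, "stripe_score": 0.12, "leather_score": 0.77}
print(tag_image(product, tag_models))
```

Because the classifiers are independent, a new tag can be added or an existing one retrained without touching the others.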

Don’t Mix the Input Images

Machine learning performs better when the training images and the images evaluated in production come from the same distribution. This means you need to train on the same kind of images as the ones you are going to evaluate. You can work around this by using internet images in the beginning, but you should start gathering user imagery as soon as possible. These rules are going to make your model robust in the future.

Summary

Building an image classifier is hard not only as a deep learning problem; it also requires a good task definition and a good dataset. If the size of the dataset is a challenge, start simple and iterate towards your goal. If you have any questions, feel free to text me or comment below.

Try Vize for free at app.vize.ai

Vize — custom image recognition blog

Official blog of Vize.ai. We write articles about image recognition, deep learning and artificial intelligence. At Vize we help businesses to extract actionable value from their images.

Written by Víťa Válka

User interface designer who convinced his family to switch from a house to a travel trailer. #digitalnomad
