3 Reasons Why AI-Assisted Labeling Will Destroy the Manual Labor Market

Published in Supervisely, Jul 31, 2019

It’s no secret that the most time-consuming part of any computer vision project is data preparation, especially labeling. It is also the most important part: without high-quality training data, even the most recent neural network architecture will fail to learn.

But as AI becomes widely accepted in many different areas and industries, annotation becomes more and more complex. Ten years ago labeling with bounding boxes for object detection was among the most popular annotation tasks. Today, there are tons of models that solve the task beautifully.

Semantic segmentation is the next challenging problem. Even a small picture requires much more attention and time from a human labeler, who has to annotate it at the pixel level. And even that is not the end!

With each advance in AI, training data will become more and more complex, and human involvement will require many more skills than it does today. In some cases, that time has already come.

Let’s name three main reasons why you should consider choosing an AI-assisted labeling solution over crowdsourced click farms for your next machine learning project.

Label the “unlabelable”

Some images are just impossible to label with traditional tools like the polygon or even the brush. Consider hair segmentation. Modern smartphones actively use machine learning to separate people from the background. But how do you label objects as complex as hair? Some use advanced graphic editors like Photoshop, but it takes many hours to annotate just one photo. And what if we have a video with many frames?

Hair segmentation is an example of a really hard problem

Here at Supervisely we propose a new tool based on a neural network that does all the hard work. The labeler only needs to roughly mark an edge of the object of interest, and the model produces a precise mask with 255 shades of transparency.
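
To see why a mask with shades of transparency matters for something like hair, here is a minimal sketch in plain Python. It is not Supervisely code: the function names `composite` and `binarize`, the threshold of 128, and the tiny one-row "image" are all assumptions made for illustration. It blends a foreground over a background once with a soft 8-bit mask and once with a hard, binarized version of the same mask.

```python
def composite(foreground, background, alpha_mask):
    """Blend two grayscale images pixel by pixel using an 8-bit alpha mask."""
    blended = []
    for fg_row, bg_row, a_row in zip(foreground, background, alpha_mask):
        row = []
        for fg, bg, a in zip(fg_row, bg_row, a_row):
            weight = a / 255  # 0 = pure background, 255 = pure foreground
            row.append(round(fg * weight + bg * (1 - weight)))
        blended.append(row)
    return blended


def binarize(alpha_mask, threshold=128):
    """Collapse a soft mask into the yes/no mask a polygon tool would give."""
    return [[255 if a >= threshold else 0 for a in row] for row in alpha_mask]


# Hypothetical one-row image: solid hair, two thin strands, then background.
foreground = [[200, 200, 200, 200]]
background = [[0, 0, 0, 0]]
soft_mask = [[255, 153, 51, 0]]

soft = composite(foreground, background, soft_mask)
hard = composite(foreground, background, binarize(soft_mask))

print("soft mask result:", soft)
print("hard mask result:", hard)
```

With the soft mask the thin strands fade gradually into the background, while the hard mask either keeps a strand at full strength or drops it completely. That gap is what a labeler would otherwise have to paint by hand.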

It gets better as you go

Sometimes the shape of the desired objects is not too complex, and maybe you are even fine with bounding boxes. But what if there are hundreds of objects in just a single picture?

This is a common case in many areas like medicine, agriculture or factory automation. Manual annotation will take forever.

Manual labeling of every single bacterium is barely possible (photo by Magdalena Wiklund)

The solution here is the so-called Human-in-the-Loop concept. The idea is that a human labeler annotates only a small portion of the data, and the rest is done by AI. The neural network makes its predictions based on the already annotated examples, the labeler corrects the results, and the model is retrained again and again until its predictions need almost no correction and can serve as training data.
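
The loop described above can be sketched in a few lines of plain Python. This is a toy illustration under loud assumptions, not the Supervisely pipeline: a one-dimensional threshold classifier stands in for the neural network, `human_label` is a hypothetical stand-in for the human expert, and the data pool and `TRUE_BOUNDARY` are made up. What it shows is the mechanism: the model proposes labels, the human fixes only the wrong ones, and the model is retrained after every batch.

```python
TRUE_BOUNDARY = 62  # Assumption: known only to the simulated human expert.


def human_label(x):
    """Stand-in for the human labeler, who always gives the correct label."""
    return 1 if x >= TRUE_BOUNDARY else 0


def train(labeled):
    """Fit the toy model: a threshold halfway between the two classes."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    return (max(negatives) + min(positives)) / 2


def predict(threshold, x):
    return 1 if x >= threshold else 0


def human_in_the_loop(seed, pool, batch_size):
    labeled = list(seed)
    threshold = train(labeled)
    corrections_per_round = []
    for start in range(0, len(pool), batch_size):
        corrections = 0
        for x in pool[start:start + batch_size]:
            proposed = predict(threshold, x)  # the model pre-labels the item
            final = human_label(x)            # the human reviews the proposal
            if final != proposed:
                corrections += 1              # only mistakes cost human effort
            labeled.append((x, final))
        corrections_per_round.append(corrections)
        threshold = train(labeled)            # retrain on the corrected data
    return threshold, corrections_per_round


# Two manually labeled seed examples, then 100 unlabeled items in 5 batches.
seed = [(5, human_label(5)), (95, human_label(95))]
pool = [(i * 37) % 100 for i in range(100)]
threshold, corrections = human_in_the_loop(seed, pool, batch_size=20)

print("corrections per round:", corrections)
print("final threshold:", threshold)
```

In this toy run the number of corrections per round drops to zero after a few batches. In a real project the pattern is the same, but the model is a neural network and retraining is far more expensive, so batches are usually larger.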

Essentially, AI adapts to your task and improves over time.

Human-in-the-Loop requires a convenient annotation editor, a data preparation framework and a machine learning platform, and that’s exactly what we are building at Supervisely!

Domain knowledge and privacy

Another downside of crowdsourcing is the lack of domain knowledge. This is a very frequent issue in medicine, but it can happen in any area. For example, one of our customers does semantic segmentation of dental CT scans, and the problem is that only highly trained dentists (with six-figure salaries) can label those frames. Of course, random workers from a human-intelligence crowdsourcing service won’t be able to produce accurate annotations. Such data also often cannot be handed to an outside crowd for privacy reasons, so you need a self-hosted platform.

There is sarcoidosis on this CT. Can you find it? Do you even know what it is? Neither do most labelers

Moreover, every minute of such a specialist’s time is valuable and costly. If we can speed up labeling and make it easier with AI, we save a lot of time and money.

The solution

That’s why our goal at Supervisely is to provide AI-assisted labeling tools designed for building high-quality training data.

One example is the “Smart Tool”. How does it work? You put a bounding box around the object of interest or draw along its edge, and the neural network behind the tool produces a precise segmentation mask with 255 shades of transparency.