Building a dataset of water meter readings in Toloka

Published in Toloka Tech, Apr 5, 2021

Hi! I’m Roman Kucev, a data scientist at Neatsy, Inc. Two years ago, I came across an interesting news episode on TV. They announced that the Department of Information Technologies was building a neural network to recognize water meter readings in photos. The newscaster asked the viewers to help train the neural network and send photos of their water meters to an online portal.

I thought, OK, if you’re a big government agency, you can easily roll out a video on TV and ask people to send photos of their water meters. But what if you’re a small startup and can’t afford that sort of reach? What could you do to collect five thousand images of water meters?

Well, it’s the perfect reason to use Toloka!

Toloka is a crowdsourcing platform where people from all over the world complete micro-tasks and get paid for it. Performers on Toloka, also called Tolokers, can help identify pedestrians in photos, train voice assistants, check search result relevance, and much more. Anyone can join Toloka as a performer or as a requester.

The goal

So, you want to build a neural network that recognizes water meter readings in photos.

To make an MVP, you’ll need one thousand photos of water meters. You’re looking for two parameters: the current reading and the digits’ position on the meter.

Part 1. Collecting images

At first glance, it’s all plain and simple. You create a task in Toloka and ask the performers to open the app on their phone and take a picture of their water meter. If I didn’t have several years of experience working with Toloka, my instructions would look something like this: “Take a picture of your water meter and send it to us.”

Unfortunately, it’s impossible to get a high-quality dataset with that sort of task description. The problem is that it doesn’t really say what a valid submission should look like. As a result, people may submit content that can’t be used to train a neural network, such as:

Blurry images.
Images where you can’t clearly see the readings.
Multiple meters in an image.

Toloka has an excellent tutorial on writing instructions. I followed their advice and put together these instructions:

Now, let’s move on to setting up the task itself. It involves only a few steps.

1. Set the task ID as the input parameter, and an img file as the output.

The task interface is just two lines of code!

2. Next, create a pool and do the following:

Specify how much time the performers have to complete the task.
Set non-automatic acceptance.
Determine how much you’ll be paying for the task. In the example below, it’s $0.01.

3. To make sure performers don’t cheat by sending the same pictures over and over again, prohibit repeat submission in quality control settings.

4. Determine what kind of performers you’re looking for. In this case, it’s Russian-speaking performers who use the Toloka mobile app.

5. Upload the task to the pool.

Part 2. Validating the images

In a few hours, Tolokers will complete your task. Since you set non-automatic acceptance, they won’t get paid right away: first, you need to check if their submissions are valid. You’ll need to accept valid submissions and reject invalid ones, providing the reason for rejection.

Remember, the goal is to get tens of thousands of photos. Imagine checking every single one of them! It would take a huge amount of time and effort. Thankfully, you don’t have to do it yourself.

You can create a new task and ask another group of Tolokers to determine whether each image meets your quality criteria. Let’s call it “Water meter image verification”.

Again, you’ll need to follow a few steps.

1. Define what constitutes a valid photo.

A photo is valid if:

It shows only one water meter (for either hot or cold water).
The readings are clearly visible.

If either of these criteria isn’t met, consider the photo invalid.

2. Write clear instructions.

3. Specify the image URL as the input parameter. On the output side, you’ll have two “yes” or “no” parameters:

check_count — the answer to the first question.
check_quality — the answer to the second question.

The value variable contains the meter reading.

The interface for this task is longer: 14 lines of code.

4. To increase accuracy, set an overlap of 5, meaning five Tolokers check each image independently. You then take the most frequent response as the correct one. This quality control method is called “majority vote” (more on that in a bit). This task uses automatic acceptance.

5. Make the task available to the top 50% of performers.

In tasks with automatic acceptance, everyone gets paid regardless of whether they complete the task correctly. But you want the Tolokers to do a good job. How do you achieve that?

Quality control

Toloka has two main tools for maintaining good quality:

Training. Before completing the main task, you can ask the Tolokers to undergo training. In the training pool, performers are given tasks to which you know the correct answers in advance. If a performer answers incorrectly, they’re told that it’s an error. They’re also shown the correct answer. After the training is finished, you see the percentage of tasks that each performer completed successfully. Based on that, you can make the main task available only to performers with the highest success rate.
Quality control rules. Sometimes we find ourselves in a situation where a performer completes the training with flying colors, gets access to the task, but then immediately leaves to play football and gets their three-year-old brother to sit at the computer and complete the actual tasks for them. Luckily, Toloka has a great number of tools you can use to monitor performers’ actions and check task completion quality, like majority vote or control tasks.

Setting up the training pool is simple. All you need to do is add the tasks, set them up in the Toloka interface, and specify the threshold above which to admit performers to the main task.

Majority vote

We give the task to five independent people. If four people respond with a “Yes” and the fifth responds with a “No”, the fifth person is probably wrong. This way, we can see whether a Toloker’s responses are in line with the others’, and ban performers whose responses differ.
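As a minimal sketch of the idea, assuming yes/no answers and an odd overlap so that ties cannot happen, majority vote and a per-performer agreement rate could be computed like this. The function names and the tuple layout are made up for illustration and are not part of Toloka.

```python
from collections import Counter, defaultdict


def majority_vote(votes):
    """votes: list of (task_id, performer_id, answer). Returns {task_id: answer}."""
    by_task = defaultdict(list)
    for task_id, _performer, answer in votes:
        by_task[task_id].append(answer)
    # most_common(1) returns the most frequent answer for each task.
    return {t: Counter(a).most_common(1)[0][0] for t, a in by_task.items()}


def agreement_rate(votes, consensus):
    """Share of each performer's answers that match the majority answer."""
    hits, totals = Counter(), Counter()
    for task_id, performer, answer in votes:
        totals[performer] += 1
        hits[performer] += answer == consensus[task_id]
    return {p: hits[p] / totals[p] for p in totals}


votes = [
    ("img1", "a", "yes"), ("img1", "b", "yes"), ("img1", "c", "yes"),
    ("img1", "d", "yes"), ("img1", "e", "no"),
    ("img2", "a", "no"), ("img2", "b", "no"), ("img2", "c", "no"),
    ("img2", "d", "yes"), ("img2", "e", "yes"),
]
consensus = majority_vote(votes)
rates = agreement_rate(votes, consensus)
print(consensus)
print(rates)
```

Performers whose agreement rate stays low over many tasks are the candidates for a ban. On only a handful of tasks the rate is too noisy to act on.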

Control tasks

You can mix things up and include tasks which you already know the correct answers to in the pool. That way, quality control tasks look exactly the same as regular tasks. Based on whether a person does the control tasks correctly, we can decide whether they are completing the other tasks (the ones you don’t know the answers to) correctly. If a person gives invalid responses in control tasks, you ban them, and if the person gives valid responses, you give them a bonus.
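The ban-or-bonus decision can be sketched as follows. The thresholds (ban below 60%, bonus from 90%) and the function names are assumptions chosen for illustration. In Toloka you would configure the equivalent rule in the pool's quality control settings rather than write it yourself.

```python
def control_accuracy(responses, golden):
    """Accuracy of one performer on control tasks.

    responses: {task_id: answer} given by the performer.
    golden:    {task_id: correct answer} for the control tasks only.
    Returns None if the performer has not seen any control task yet.
    """
    checked = [t for t in responses if t in golden]
    if not checked:
        return None
    correct = sum(responses[t] == golden[t] for t in checked)
    return correct / len(checked)


def decide(accuracy, ban_below=0.6, bonus_from=0.9):
    """Map control-task accuracy to an action (thresholds are hypothetical)."""
    if accuracy is None:
        return "no_data"
    if accuracy < ban_below:
        return "ban"
    if accuracy >= bonus_from:
        return "bonus"
    return "keep"


golden = {"c1": "yes", "c2": "no"}
careful = {"c1": "yes", "c2": "no", "t7": "yes"}
careless = {"c1": "no", "c2": "yes", "t9": "no"}
print(decide(control_accuracy(careful, golden)))
print(decide(control_accuracy(careless, golden)))
```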

Here’s what the verification task looks like for the performer:

Part 3. Combining the tasks

Now that both your tasks are ready, you need to link them together, so that the second task is launched after the first one.

You could do it by hand in the interface, but there’s a far better option. You can use the Toloka API and a Python script.
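The script itself is not reproduced here, so what follows is only a sketch of the linking logic, not the original code. The `verify`, `accept`, and `reject` callables are hypothetical placeholders for the real Toloka API calls, and the verifier answers are assumed to have been converted to booleans already.

```python
def majority_yes(verdicts, key):
    """True if more than half of the verifiers answered yes to `key`."""
    yes = sum(1 for v in verdicts if v[key])
    return yes * 2 > len(verdicts)


def review(verdicts):
    """Turn the verifier answers for one photo into (decision, comment)."""
    reasons = []
    if not majority_yes(verdicts, "check_count"):
        reasons.append("The photo must show exactly one water meter.")
    if not majority_yes(verdicts, "check_quality"):
        reasons.append("The readings are not clearly visible.")
    if reasons:
        return "reject", " ".join(reasons)
    return "accept", ""


def run_pipeline(submissions, verify, accept, reject):
    """Verify each submitted photo, then accept or reject the submission.

    verify, accept, and reject stand in for platform API calls.
    """
    accepted = []
    for submission_id, image_url in submissions:
        decision, comment = review(verify(image_url))
        if decision == "accept":
            accept(submission_id)
            accepted.append(image_url)
        else:
            reject(submission_id, comment)
    return accepted


# Fake verifier answers stand in for the second task's results.
fake_answers = {
    "a.jpg": [{"check_count": True, "check_quality": True}] * 5,
    "b.jpg": [{"check_count": True, "check_quality": False}] * 3
    + [{"check_count": True, "check_quality": True}] * 2,
}
rejections = {}
dataset = run_pipeline(
    [("s1", "a.jpg"), ("s2", "b.jpg")],
    verify=fake_answers.get,
    accept=lambda sid: None,
    reject=lambda sid, comment: rejections.update({sid: comment}),
)
print(dataset, rejections)
```

Passing the API calls in as arguments keeps the decision logic testable without touching a real pool or spending money.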

All that’s left to do is run the code and get the result you’ve been waiting for: a dataset of 871 water meters! It’s actually pretty amazing: you configure the project once and get a fully automated data collection and validation process. What’s more, the data collection is easily scalable — you can increase the size of the dataset in just a few clicks.

How much does it all cost?

In my example, we offer $0.01 for each image submitted in the first task. But there’s a catch: if you’re offering your performers $0.01, you actually end up paying $0.018 per submission.

Here’s why:

Toloka charges a commission of 20% but no less than $0.005. For a task priced at $0.01, the commission will be 50%.
20% VAT.

In the second task, you pay performers $0.01 for verifying a page of 10 images. But bear in mind that each image is checked 5 times by 5 different people. With the same 50% commission and 20% VAT, the amount you spend per image is (0.01 × 5/10) × 1.5 × 1.2 = $0.009.

Let’s say that out of one thousand image submissions, you accepted 871 and rejected 129. All in all, to get an 871-image dataset, you need to spend $0.018 x 871 + $0.009 x 1000 = $25. Definitely cheaper than launching an ad campaign on TV!
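The arithmetic above can be checked with a few lines of Python. The rates are the ones quoted in this article (20% commission with a $0.005 minimum, 20% VAT) and may not match Toloka's current pricing. The function and constant names are made up for this sketch.

```python
VAT = 0.20
COMMISSION_RATE = 0.20
MIN_COMMISSION = 0.005  # dollars per paid assignment


def requester_cost(reward):
    """What the requester pays for one assignment with the given reward."""
    commission = max(reward * COMMISSION_RATE, MIN_COMMISSION)
    return (reward + commission) * (1 + VAT)


# First task: $0.01 per accepted photo.
per_photo = requester_cost(0.01)

# Second task: $0.01 per page of 10 images, each image checked 5 times.
per_check = requester_cost(0.01) * 5 / 10

# Only the 871 accepted photos are paid for; all 1000 get verified.
total = per_photo * 871 + per_check * 1000

print(round(per_photo, 3), round(per_check, 3), round(total, 2))

# Batching five photos into one $0.05 assignment drops the commission to 20%.
per_photo_batched = requester_cost(0.05) / 5
print(round(per_photo_batched, 4))
```

The unrounded total is about $24.68, which the article rounds to $25. Batching five photos per assignment brings the collection cost down from $0.018 to $0.0144 per photo.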

You can reduce the price even further. Here are a few ways:

Ask the performers in the first task to take several photos instead of just one, and raise the payment amount. Toloka’s commission will then be 20% instead of 50%.
Use dynamic overlap in the second task. If the first 4 performers all give the same answer, there is no need to pass the task on to the 5th performer.
Work with Toloka as a foreign entity so that you don’t have to pay VAT.
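To illustrate the dynamic overlap idea, here is a simplified stopping rule: stop collecting answers as soon as the leading answer can no longer be overtaken by the votes that remain. This is an assumption made for illustration and not Toloka's actual algorithm, which also weighs how much each performer can be trusted.

```python
from collections import Counter


def dynamic_overlap(answers, max_overlap=5):
    """Consume answers one by one and stop once the result is settled.

    Returns (winning_answer, number_of_answers_used).
    """
    counts = Counter()
    used = 0
    for answer in answers:
        if used == max_overlap:
            break
        counts[answer] += 1
        used += 1
        ranked = counts.most_common(2)
        lead = ranked[0][1]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        remaining = max_overlap - used
        # The leader cannot be caught even if every remaining vote goes against it.
        if lead - runner_up > remaining:
            break
    if used == 0:
        raise ValueError("no answers given")
    return counts.most_common(1)[0][0], used


print(dynamic_overlap(["yes", "yes", "yes", "no", "yes"]))
print(dynamic_overlap(["yes", "no", "yes", "no", "yes"]))
```

With an overlap of 5, a unanimous photo is settled after three answers under this rule, so clear-cut images cost less to verify while disputed ones still get all five checks.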

P.S.

I realize this article may seem like it was sponsored by Toloka, but I can assure you it’s not. I didn’t get paid by Toloka and I don’t think I ever will. I just wanted to use a fictional but relevant and interesting example to demonstrate how this crowdsourcing platform allows you to quickly and inexpensively create a dataset for any task, be it kitty image recognition or training autonomous vehicles.