Calculate the Expense of Data Labeling for AI Projects: Annotating Alone

Mariia Krasavina
Published in CVAT.ai
6 min read · Jul 23, 2024

Creating computer vision AI systems demands meticulous training and fine-tuning of deep learning (DL) models using image and video annotation (also known as data labeling). These annotations are what allow an AI product to analyze scenes accurately, make predictions, and produce reliable results. However, image annotation significantly impacts the overall cost of producing such systems.

“Instead of focusing on the code, companies should focus on developing systematic engineering practices for improving data in ways that are reliable, efficient, and systematic. In other words, companies need to move from a model-centric approach to a data-centric approach.”

— Andrew Ng, CEO and Founder of Landing AI

How can you calculate the optimal price for image annotation to include in your budget?

We’ll explore various factors influencing the cost of image and video annotation. More importantly, we’ll discuss why the price of image annotation should not be your only consideration when training and fine-tuning computer vision models.

Can Image Annotation Address Real-Life Issues? Understanding the Importance of Annotation

To better understand the dynamics of daily life, let’s consider a common scenario: life at home. Most of us live in houses, often not alone but with families. These families can vary in size and composition — ranging from small units to large, bustling households with children, pets, and elderly members who require special attention and care.

This variety can lead to issues in every living area: children might leave toys like LEGO pieces scattered on the floor, elderly individuals may misplace their glasses or other medical devices and struggle to find them, and pets could shed fur or leave other surprises around. All these factors contribute to a household’s everyday chaos.

Several solutions are available, such as automatic vacuum cleaners and electric mops. However, let’s assume these devices are not as smart as we need them to be. Imagine a scientist leading a small research team who aims to introduce an innovative product to the market: a smart home assistant robot. This advanced robot will differentiate between actual dirt and valuable items. It will clean up the former and signal the latter’s presence, helping retrieve lost items. This functionality will keep homes cleaner and make it easier to find misplaced objects.

For research purposes, the scientist and their team have gathered a dataset comprising 100,000 images of various rooms with items scattered on the floor. The volume of 100,000 images reflects the average batch size typically seen in robotics projects; it is also in line with publicly available datasets, which usually range from 10,000 to several million images. Let’s assume that each image contains, on average, 23 objects, which means roughly 2,300,000 objects (give or take) need to be annotated.

This series of articles describes four ways to deal with such a task:

  • Case 1: You handle the task yourself or with minimal colleague help.
  • Case 2: You hire annotators and try to build a team yourself.
  • Case 3: You outsource the task to professionals.
  • Case 4: Crowdsourcing.

Case 1: You Annotate Dataset Yourself or with Minimal Help from Colleagues

A small disclaimer: annotating solo is fine for small amounts of data, but not for big datasets. Here is why.

The Data Annotation Speed

The scientist must first select useful frames from the extensive video collection for the robotics project. Objects in the images will be labeled with accurate, precise polygon annotations. Let’s assume that, according to the data annotation specification, 40 classes will be annotated using polygons (for semantic segmentation), with each instance annotated separately. A basic description of how to annotate is necessary, but note that a complete specification can run 30–50 pages and includes detailed instructions on how to annotate each class correctly, with good and bad examples and corner cases. Writing a specification also takes time, typically measured in days or even weeks.

The time required to annotate an object using polygons can vary depending on several factors, including the complexity and size of the object, the clarity of the image, and the expertise of the annotator.

On average, it can take anywhere from a few seconds to several minutes per object. Here are some general estimates:

  • Simple Object (e.g., a rectangular object): 5–10 seconds
  • Moderately Complex Object (e.g., a car): 30–60 seconds
  • Highly Complex Object (e.g., a human with detailed limb annotations): 1–3 minutes or more

Detailed polygon annotation can take significantly longer for precise tasks, especially for objects with intricate details and irregular shapes. If the quality requirements permit, AI tools like the Segment Anything Model can speed up the annotation process. However, these models often lack the precision needed for some tasks and require extensive manual corrections.
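For illustration, below is a minimal sketch of what SAM-assisted pre-annotation might look like, assuming the open-source segment-anything package and a downloaded ViT-H checkpoint; the checkpoint and image file names are placeholders, and the masks SAM produces are class-agnostic proposals that still need labels and manual cleanup.

```python
# Minimal sketch: generating candidate masks with the Segment Anything Model.
# Assumes the open-source `segment-anything` package is installed and a ViT-H
# checkpoint has been downloaded; the file names below are placeholders.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("room_001.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)  # list of dicts: "segmentation", "bbox", "area", ...

# The masks are class-agnostic: an annotator still assigns labels and fixes
# imprecise boundaries, which is where the remaining manual time goes.
print(f"SAM proposed {len(masks)} candidate masks")
```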

Let’s focus on the task at hand. We are dealing with images of rooms scattered with small objects. A skilled annotator can typically label each object in about 40–50 seconds. However, since our scientist does not annotate daily, we’ll assume an annotation speed of approximately 60 seconds (1 minute) per object.

The Data Annotation Cost

Now let’s talk about money and costs. People often assume that annotating data themselves is cheap because they do not account for their own time, which is paid time unless the annotation is done outside of working hours. Let’s assume the robotics engineer is from the USA and annotates during working hours. Looking at job postings on Indeed, the well-known job aggregator site, the average pre-tax salary works out to approximately $42 per hour (as of June 2024).

All that’s left is to add the cost of the annotation tool. This cost can be zero if the scientist is tech-savvy and can install a self-hosted solution. If you plan to annotate yourself, or ask colleagues to help so you can work as a small team, CVAT, for example, will cost you $33 per seat.

Here is a list of the most popular open-source data annotation tools you can use for free*.

Remember that even free tools require time and resources to set up and support, and time is money. So, while we say “free,” that means you can download and install the tool at no cost; the rest depends on your time, expertise, and effort (and how much of your paid time will be spent on it).

Let’s Sum It Up

First, we calculate the total number of hours the scientist will need to annotate all objects:

  • 2,300,000 objects x 60 seconds = 138,000,000 seconds.
  • 138,000,000 seconds / 3,600 = 38,333 hours (rounded to the nearest whole number).

In the best-case scenario, it will take:

  • 4,792 working days
  • about 240 months (at 20 working days per month)
  • or roughly 20 years of one person’s work

And that is only if the scientist drops all other duties and dedicates 8 hours a day solely to annotation.

The cost of the annotation will be:

38,333 hours x $42/hour = $1,609,986, plus the cost of the tool on top.
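To keep these numbers in one place, here is a small back-of-the-envelope calculator that reproduces the estimates above; all the inputs are the assumptions used in this article (100,000 images, ~23 objects per image, 60 seconds per object, $42 per hour, 8-hour days).

```python
# Back-of-the-envelope estimator using the assumptions from this article.
IMAGES = 100_000
OBJECTS_PER_IMAGE = 23
SECONDS_PER_OBJECT = 60
HOURLY_RATE_USD = 42
HOURS_PER_DAY = 8
WORKING_DAYS_PER_YEAR = 240  # ~20 working days per month

total_objects = IMAGES * OBJECTS_PER_IMAGE                # 2,300,000
total_hours = total_objects * SECONDS_PER_OBJECT / 3600   # ~38,333
working_days = total_hours / HOURS_PER_DAY                # ~4,792
years = working_days / WORKING_DAYS_PER_YEAR              # ~20
cost_usd = total_hours * HOURLY_RATE_USD                  # ~$1,610,000

print(f"{total_objects:,} objects -> {total_hours:,.0f} hours, "
      f"{working_days:,.0f} working days (~{years:.0f} years), ~${cost_usd:,.0f}")
```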

Note that the described approach lacks scalability. Maintaining the dataset and addressing any emerging issues will be necessary in the future. Additionally, deployment in a production environment typically requires a significantly larger volume of data. Of course, engineers can ask colleagues to help, which may reduce the time but not the cost.

The Quality Assurance Stage

A quality-control method known as a “honeypot” can be used to ensure quality when annotating data independently.

The Honeypot method is cost-effective but time-consuming. It involves setting aside approximately 3% of your dataset, or about 3,000 images from a set of 100,000, specifically for quality checks.

You must use a previously created specification outlining your annotation requirements and standards. Annotate this selected subset of images yourself to serve as a benchmark. While this method saves time in the long run, it still requires an initial investment of time and resources to set up and perform these annotations, which translates to a monetary cost.
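As a rough sketch of how such a honeypot check might be implemented: reserve about 3% of images, annotate them yourself as the benchmark, then compare every incoming annotation of those images against the benchmark. The agreement metric below (pixel-level mask IoU) and the 0.9 threshold are illustrative assumptions, not part of the article’s method.

```python
import random

import numpy as np

def build_honeypot(image_ids, fraction=0.03, seed=42):
    """Randomly reserve ~3% of the dataset as honeypot images."""
    rng = random.Random(seed)
    return set(rng.sample(image_ids, int(len(image_ids) * fraction)))

def mask_iou(mask_a, mask_b):
    """Pixel-level IoU between two boolean masks (an assumed agreement metric)."""
    union = np.logical_or(mask_a, mask_b).sum()
    return np.logical_and(mask_a, mask_b).sum() / union if union else 1.0

honeypot_ids = build_honeypot([f"img_{i:06d}" for i in range(100_000)])
print(f"{len(honeypot_ids)} honeypot images reserved for quality checks")

# Usage sketch: flag honeypot images whose annotations disagree too much
# with the benchmark masks you labeled yourself.
# flagged = [i for i in honeypot_ids
#            if mask_iou(annotator_masks[i], benchmark_masks[i]) < 0.9]
```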

And that’s it. Feel free to leave any comments on our social networks, and we’ll gladly respond. In our next update, we will answer the question of how much an in-house annotation team costs.
