Exploring the Costs of Outsourcing Annotation to a Data Labeling Service

Mariia Krasavina
CVAT.ai
5 min read · Aug 5, 2024

In the initial segments of this series, we delved into the expenses associated with self-annotating or using an internal team for image and video data. This installment focuses on the financial and resource commitments required to outsource data annotation to specialized services.

First, consider a hypothetical scenario: Picture a prominent robotics expert developing an intelligent home assistant that differentiates between clutter and valuable items within residential settings. Everyday household disarray might include everything from strewn toys and misplaced eyewear to pet hair and assorted debris. The envisioned robot is designed not only to clean effectively but also to help retrieve lost items. Such a device could help seniors manage their belongings more easily, pointing to a distinct market for this kind of innovation.

As the project leader, your role involves steering a small, dedicated research team that has compiled a dataset of 100,000 images, each showcasing various domestic environments with items dispersed around. This dataset volume aligns with what is typically seen in robotics ventures, varying from a few thousand to millions of images. With each image containing an average of 23 items, your team faces the task of labeling about 2.3 million objects. This series aims to review different strategies for handling this extensive annotation workload, including DIY methods, building an in-house team, outsourcing, and employing crowdsourcing techniques.
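The workload figure above follows from a quick back-of-the-envelope calculation (a minimal sketch; the per-image average comes from the scenario description):

```python
# Workload estimate for the hypothetical dataset described above.
images = 100_000
objects_per_image = 23  # average number of items per image

total_objects = images * objects_per_image
print(f"{total_objects:,} objects to annotate")  # 2,300,000 objects to annotate
```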

Welcome to the third installment of our series, which examines the costs associated with outsourcing data annotation to meet the needs of our hypothetical scientist.

Case 3: Delegating to Expert Services

To start with a brief overview: while all data labeling firms follow a similar basic operational model, nuances in their approach can significantly affect the outcome of their services. The small details matter, and CVAT.ai is no exception. Should our hypothetical scientist approach us, we would first gather the necessary information from him and his team.

Time and Phases of the Workflow

Here’s a detailed breakdown of the workflow, segmented by phases and accompanied by time estimates. This could vary across different companies, and our description is based on firsthand experience.

Our company prides itself on extensive experience and stands as a market leader. We offer data labeling services and manage our own annotation platform. This flexibility allows us to continually refine CVAT.ai to enhance both the annotation and validation stages. Our clients benefit from the ability to use this platform internally and expand on annotations seamlessly.

Are you curious about how it works? Try annotating a sample for free, like countless other data scientists worldwide.

But let’s focus on the various stages of annotation.

Stage 1: Annotation Proof of Concept (PoC)

  • If needed, we start by signing a Non-Disclosure Agreement to secure the client’s data.
  • We require a sample of actual data (50–100 images or 1–2 videos) to evaluate and determine the appropriate annotation strategy.
  • We work closely with the client to finalize annotation specifications, address nuances, and establish quality standards.
  • After completing these preliminary steps, we develop a PoC and offer detailed project costs and timelines.

Stage 2: Documentation & Preparation

  • Post-PoC, we propose the optimal annotation method, refine initial specs, and set quality and timeline expectations.
  • We prepare all necessary documentation and agreements outlining our collaborative terms and payment conditions for the client’s review.
  • We are responsible for training the annotation team, and a dedicated manager oversees all project-related communications and operations.

Stage 3: Annotation

  • This phase involves meticulous adherence to the set specifications, although we remain adaptable to any changes that might arise.
  • For large projects, we recommend incremental data delivery, enabling ongoing client experiments and adjustments.
  • Regular client feedback is sought to fine-tune the documentation and processes, ensuring alignment with the project’s final goals.

Stage 4: Validation

  • We are committed to delivering high-quality results by conducting thorough manual and automated quality checks to meet established benchmarks.
  • Final validations and comprehensive quality reports are compiled and delivered within a set timeframe.

Stage 5: Acceptance

  • This final phase is when the client receives and reviews the completed work, processes payments, and provides feedback on the service quality.

As noted in our previous article, assuming no client delays or unexpected events, the whole process for the described project will take approximately 50 work days: 10 weeks, or about 2.3 months. Of course, the exact duration depends on each case's requirements and circumstances.

By entrusting us with your project, you commission a high-quality service with a pre-defined and documented guaranteed outcome. The client’s role is limited to observing the process, accepting recommended changes from our side, reviewing the delivered data, and providing feedback on the results of the validated work. We take on all internal processes and guarantee the project’s quality and timely delivery.

Assessing Labeling Costs

Determining the exact cost of data annotation is complex, as it depends largely on data volume, required quality, annotation type, and project deadlines. To estimate the cost of annotating 2,300,000 objects across 100,000 images, we can only rely on fragments of publicly available industry data from sources like KILI Technology or Mindkosh. The figure will usually exceed $300,000, because semantic segmentation, the annotation type used for this task, is currently one of the most expensive.

At CVAT.ai, our pricing strategy is versatile, based on a per-object, per-image/video, or per-hour basis, making it suitable for various project scopes and complexities. We apply 5% to 30% discounts for extensive projects, reflecting our commitment to fostering long-term client relationships. This could bring the total cost for a large-scale annotation project down to around $225,400, though prices may vary with specific project requirements.
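As an illustration, here is a minimal cost model consistent with the figures above. The per-object rate is a hypothetical assumption chosen so that a 30% volume discount reproduces the roughly $225,400 figure; actual CVAT.ai pricing depends on the project:

```python
# Illustrative per-object pricing with a fractional volume discount.
# rate_per_object is a hypothetical figure, not an actual CVAT.ai rate.

def annotation_cost(num_objects, rate_per_object, discount=0.0):
    """Total cost after applying a fractional volume discount."""
    return num_objects * rate_per_object * (1 - discount)

objects = 2_300_000
rate = 0.14  # hypothetical $/object for semantic segmentation

list_price = annotation_cost(objects, rate)        # ~ $322,000
discounted = annotation_cost(objects, rate, 0.30)  # ~ $225,400
```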

In conclusion, outsourcing your data annotation to a professional service like CVAT.ai offers significant efficiency, quality, and project management advantages. While costs may vary, our flexible pricing and discount options ensure we provide high-quality service at competitive rates.

Next Steps?

Ready to label data with CVAT.ai? Email us: labeling@cvat.ai!

Ensure you have all the necessary information — download our detailed takeaway now!
