TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

How to Build a Scalable Data-Annotation Strategy

8 min read · Dec 14, 2021

Photo by Tom Wilson on Unsplash

As you may know, data science teams spend about 80% of their time creating and managing training data. The usual problems are poor in-house tooling, labeling rework, difficulty finding the right data, and the friction of collaborating and iterating on data across distributed teams.

Frequent workflow changes, large datasets, and the lack of a proper training-data workflow can hinder a company’s development. These issues get worse when the company grows too quickly, as is often the case with startups, regardless of industry.

A good example of the need for a scalable training-data strategy comes from the highly competitive autonomous-vehicle industry. Computer vision for self-driving vehicles is complex, and because of that complexity, the definition and scope of high-quality training data change frequently. If your team cannot adapt, including in how it annotates data, customer dissatisfaction can cost you the whole business.

Identifying the Right Data-Annotation Strategy

Published in TDS Archive

Written by Alexandre Gonfalonieri

AI Consultant — Working on Brain-computer interface and new AI business models — Support my writing: https://alexandregonfalonieri.medium.com/membership