The High Costs of Cheap Text Labeling Software

Nick Adams, Ph.D.
7 min read · Aug 9, 2019


If your text labeling software doesn’t manage your data and team for you, you may need to budget an extra $120,000 per year and plan for a longer project.

I’m pretty suspicious of anyone claiming that “there are two kinds of people in the world…” As a sociologist trained to understand a wide and nuanced range of human behaviors and characteristics, I usually stop listening right then and there, or at least apply a heavy filter. But as we’ve fielded inquiries about TagWorks, we are finding that two types of potential customers keep walking through our (virtual) doors. The first is pretty much sold the moment they come to us. They want to ensure the product is as advertised, then they’re ready to buy. The second type is shopping around, kicking tires. And whether they express it directly or not, they’re often trying to understand why TagWorks doesn’t cost $200 or less like some of the other text labeling software out there.

What differentiates these two types of people? It’s not that the second set are cheapskates. It’s not that the first set are so much smarter (or better looking and more personable, ha!). It’s not even that they’ve read more of these blog posts, much as I hope they’re popular and informative. The one difference between the people who come to us ready to buy and the people who come ready to haggle is that the first set have already tried to manage a text labeling project using the cheaper software.

It’s not that the cheaper software is so terribly awful. Traditional CAQDAS (computer-assisted qualitative data analysis software) and the newer online taggers like LightTag and Doccano work as advertised. The CAQDAS tools are useful for smaller projects, when you just need to tag a few hundred documents by yourself or with a collaborator. And the online taggers are great when you just need to apply a single layer of tags, picking out named entities or marking documents as ‘in’ or ‘out’ of some category. But they were never designed for big, complex language annotation jobs. They were designed so that you (and maybe a collaborator or two) could label a small set of documents more efficiently than you could using the highlighting and commenting features in Microsoft Word.

The people who come to us ready to buy have already tried these tools and tried to adapt them to projects larger and more complex than they can handle. They got the software installed on all their annotators’ machines. They distributed the raw documents to everyone. They figured out how to distribute the codebook and get it loaded into the software. Then they trained their team members, meeting a dozen or more times to get everyone annotating the same way. Then they did a trial run and tried to check all the annotators’ work. Some annotation software packages report reliability metrics. Some don’t. So maybe they hired a data scientist to drop into the command line with some scripts and run the numbers. Few of the cheaper packages show annotators’ output side by side so the project manager can see how people are tagging differently. PMs have to navigate through menus to download each annotator’s work individually and then try to figure out who is out of alignment and how. That takes hours. Then they have to call a meeting to deliver feedback. Then they repeat all of this again, and again, until the team actually starts working pretty well together.
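To give a taste of what “running the numbers” looks like when the tool doesn’t do it for you, here’s a minimal sketch of the kind of one-off script a hired data scientist might write to pull a reliability figure out of two annotators’ exports. The file names, column layout, and the choice of Cohen’s kappa are my own illustrative assumptions, not any particular tool’s workflow.

```python
# Illustrative sketch only: the file names, column layout, and CSV format
# here are hypothetical, not part of any particular tagging tool's export.
import csv
from sklearn.metrics import cohen_kappa_score

def load_labels(path):
    """Read one annotator's export: one row per document with doc_id,label columns."""
    with open(path, newline="") as f:
        return {row["doc_id"]: row["label"] for row in csv.DictReader(f)}

ann_a = load_labels("annotator_a.csv")
ann_b = load_labels("annotator_b.csv")

# Only score the documents that both annotators actually finished.
shared = sorted(set(ann_a) & set(ann_b))
kappa = cohen_kappa_score([ann_a[d] for d in shared],
                          [ann_b[d] for d in shared])
print(f"{len(shared)} shared docs, Cohen's kappa = {kappa:.2f}")
```

And Cohen’s kappa only compares two annotators at a time, so a team of ten means running and interpreting dozens of pairwise scores by hand, every time the codebook changes.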

By then, the semester is over (if the team is working in a university context) and half the team is leaving to work on other projects and take part in other activities. Or a couple of teammates get reassigned or go on family leave (if it’s a corporate context). New annotators need to be trained, and the remaining annotators are growing bored. Raw data needs to be distributed again. And as data comes in, it still has to be reviewed for quality. But that has to wait until the new annotators are trained on the codebook. There’s only so much time in a day. But first they need training on how to use the tool. That takes time. And they need to get moving. Meanwhile, Robert hasn’t sent in his results from last week. He got sidetracked and needs a reminder. And Susan is having trouble loading the right files. She missed the most recent meeting where the team discussed that. The new data can’t get reviewed until it’s all in. And there’s a new module the team needs to be trained on. The inter-rater reliability scores are below expectations, and a few hours of manual inspection shows that Beth is the only one who fully adopted the new procedures, though Catherine’s doing it mostly right. The others need to re-do their work if the inter-rater reliability scores are going to be high enough to publish. But there’s no use asking them to re-do it now. It would just kill morale.

Nagging everyone to get their work in is taking its toll. John’s looking for the exits, and he needs to be careful about the sort of attitude he’s giving off in meetings. The new batch of data actually isn’t half bad, but why did it take so long? At this rate, the project is going to drag on for another three and a half years! And no other work is getting done. Managing this one annotation project has become a full-time job. The semester/season is closing out in only a month, so everyone is starting to focus on their other deadlines. There’s likely to be more turnover after the break, which means more training while keeping the others moving, more data review and refinement, more coaching and managing of team morale, while somehow maintaining sanity. And the team is barely through 12% of the project. This whole process needs to be repeated eight more times to get all the layers of annotation applied to all the documents.

That’s the moment it becomes apparent: those tools were never built for large, complex annotation jobs. And it doesn’t matter if they’re only $150 per license. It doesn’t matter if they’re free. Even if the company were paying you to use them, they would have to pay you enough to hire a project manager and data science consultant for three years before you’d want to try them again.

All the management of people, data, codebooks, training, task delegation, and feedback compounds with those tools. It just gets worse and worse until you give up, or give in and accept thin data, or small data, or shoddy data. (Learn how to avoid these and other pitfalls here.) But with TagWorks, your management tasks scale easily because the software was designed — from the beginning — to ensure they scale easily. We created TagWorks only after we had gone through the herculean management struggles associated with the other tools. We knew there had to be a better way, and we invented it.

We can give you more details in person. (We don’t want our secret sauce sitting in a blog post on the Internet.) But TagWorks was designed so that you (or maybe your assistant, if you like) can manage an entire annotation project from soup to nuts without it ever becoming anyone’s full-time job. It was designed so that you don’t have to train your annotators face to face. You don’t have to review all of their work with a fine-toothed comb through some arduous process. You don’t have to hire or become a data scientist (as some tools require) just to measure the inter-rater reliability of your annotation prompts. And you don’t need to manage the delegation and collection of tasks via emails and late-night phone calls to your colleagues and teammates. Not only that: instead of piling up management headaches with a team of a dozen or fewer annotators, TagWorks reduces your management load while allowing you to enlist the help of hundreds or even thousands of people simultaneously. So your project, even if it’s extremely large and complex, can be completed in months, not years.

What does all this mean? It means TagWorks could save you well over $100,000 in project management costs, not to mention the opportunity costs that come with demoting yourself to full-time annotation PM (if you were to go that route). It means you can gather every bit of information from every single one of your documents without ever worrying about whether you can actually recruit, motivate, and train enough annotators to do it. It means you’re limited only by your imagination, not your workforce, and not your tools.

Ruminate on that. Dream a little. Think big. Go deep. Then, give us a call. And know that we’ll be saving you a lot more than time and money. We’ll be saving you from the regret about what could have been.

For a free consultation, shoot us an email at office@thusly.co, or visit us at https://tag.works.

Nick Adams is an expert in social science methods and natural language processing, and the CEO of Thusly Inc., which provides TagWorks as a service. He holds a doctorate in sociology from the University of California, Berkeley, and is the founder and Chief Scientist of the Goodly Labs, a tech-for-social-good non-profit based in Oakland, CA.
