Resolvable vs. Irresolvable Disagreement: A Study on Worker Deliberation in Crowd Work

Mike Schaekermann
Published in ACM CSCW
3 min read · Oct 5, 2018

This post summarizes our research paper about analyzing and resolving disagreements in classification microtasks through real-time group deliberation. The paper will be presented at the ACM Conference on Computer-Supported Cooperative Work and Social Computing on November 7th.

Classification of data objects (such as text documents and images) into categories is a fundamental task in crowdsourcing as well as many real-world practices. A common assumption in this context is that objects can be unambiguously classified into predefined categories, and that agreement among human annotators corresponds to high classification quality. Conversely, disagreements are often considered “noise in the signal” and, as a result, various methods have been developed to remove this noise.

Figure: Input, output, and stages of our real-time deliberation workflow.

In our paper, we argue that not every disagreement can be resolved, and that many disagreements arise for legitimate reasons and therefore carry valuable information. We conducted a crowdsourcing study to shed light on the circumstances under which inter-rater disagreement can be resolved.

For our study, we designed and implemented a new workflow for classification microtasks that enables groups of crowdworkers to revisit disagreement cases through real-time discussion. Each group either resolved its disagreement or declared it irresolvable if the group members could not all agree on the same answer after multiple rounds of deliberation.
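To make the control flow of such a workflow concrete, here is a minimal sketch in Python. It is not the implementation used in the study: the function names, the `run_round` callback, and the default of three rounds are all assumptions made for illustration (the paper only says "multiple rounds").

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# Hypothetical callback: takes the group's current labels and returns the
# (possibly revised) labels after one round of discussion.
DeliberationRound = Callable[[List[str]], List[str]]


def deliberate(
    initial_labels: List[str],
    run_round: DeliberationRound,
    max_rounds: int = 3,  # assumed limit, not taken from the paper
) -> Tuple[str, Optional[str], int]:
    """Run discussion rounds until the group is unanimous or rounds run out.

    Returns (outcome, agreed_label, rounds_used), where outcome is either
    "resolved" or "irresolvable".
    """
    if not initial_labels:
        raise ValueError("a group needs at least one label")

    labels = list(initial_labels)
    rounds_used = 0
    # Keep discussing while members disagree and rounds remain.
    while len(set(labels)) > 1 and rounds_used < max_rounds:
        labels = run_round(labels)
        rounds_used += 1

    if len(set(labels)) == 1:
        return ("resolved", labels[0], rounds_used)
    return ("irresolvable", None, rounds_used)


def minority_yields(labels: List[str]) -> List[str]:
    """Toy round: everyone adopts the current majority label."""
    majority, _ = Counter(labels).most_common(1)[0]
    return [majority for _ in labels]


def nobody_moves(labels: List[str]) -> List[str]:
    """Toy round: no one changes their answer."""
    return list(labels)
```

With the toy rounds above, `deliberate(["sarcastic", "not sarcastic", "sarcastic"], minority_yields)` resolves after one round, while the same call with `nobody_moves` ends as irresolvable once the assumed round limit is reached.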

Figure: We studied how disagreements can be resolved through group deliberation in both subjective tasks (left) and objective tasks (right).

One distinctive aspect of our study is that we investigated the resolvability of disagreements in both objective and subjective task types. For the objective task, we used a well-defined relation extraction task in which participants decided whether a certain relationship between a person and a place was expressed in a given sentence (e.g., “Nicolas Sarkozy lived in France”, “Pavarotti died in Modena”). For the subjective task, participants decided whether a given product review was sarcastic.

We present empirical evidence suggesting that the resolvability of an ambiguous case depends on the reasons for and level of initial disagreement, the amount and quality of the deliberation activities, as well as the characteristics of the classification task and individual data object.
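As an illustration of what a "level of initial disagreement" could look like in practice, the sketch below scores a set of labels with normalized Shannon entropy. This is one common choice and an assumption on our part here, not necessarily the measure used in the paper. The function name and the `num_categories` parameter are hypothetical, with the default of two reflecting the binary tasks described above.

```python
import math
from collections import Counter
from typing import List


def disagreement_level(labels: List[str], num_categories: int = 2) -> float:
    """Normalized Shannon entropy of the label distribution.

    Returns 0.0 when all annotators agree and 1.0 when the labels are
    spread evenly across all `num_categories` categories.
    """
    if not labels:
        raise ValueError("need at least one label")
    if num_categories < 2:
        raise ValueError("need at least two categories")

    counts = Counter(labels)
    if len(counts) == 1:
        return 0.0  # unanimous: no disagreement

    n = len(labels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(num_categories)
```

Under this measure, an even split such as `["yes", "no"]` scores 1.0, and a two-to-one split such as `["yes", "yes", "no"]` scores roughly 0.92.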

Finally, in support of open science and to enable other researchers to reproduce and extend our results, we publish a novel deliberation dataset including all original and revised classification decisions of our participants, and all discussion dialogues at:

https://github.com/crowd-deliberation/data

Contact mschaeke [at] uwaterloo [dot] ca with additional questions or comments on the study.

Full citation: Mike Schaekermann, Joslin Goh, Kate Larson, and Edith Law. 2018. Resolvable vs. Irresolvable Disagreement: A Study on Worker Deliberation in Crowd Work. Proc. ACM Hum.-Comput. Interact. 2, CSCW, Article 154 (November 2018), 19 pages. https://doi.org/10.1145/3274423
