Crowdsourcing: it’s a complex ecosystem!

At last week’s Moore-Sloan Research Lunch Seminar, Djellel Difallah explained his data-driven approach to analyzing platforms like Amazon Mechanical Turk (MTurk) and Wikidata.

At first, the word ‘crowdsourcing’ conjures up a vague image of a random pool of anonymous people who perform tasks either voluntarily or for monetary rewards.

In reality, however, crowdsourcing is a complex ecosystem underpinned by particular patterns, ones that Djellel Difallah, a CDS Moore-Sloan Data Science Fellow, is trying to understand.

Djellel Difallah, NYU Center for Data Science Moore-Sloan Fellow (2017)

At last week’s Research Lunch Seminar, Difallah explained how he has been working with Panos Ipeirotis, a professor at CDS and NYU Stern, to use data science to study underlying dynamics like supply and demand on crowdsourcing platforms.

For example, after analyzing data on the number of requested human intelligence tasks on Amazon Mechanical Turk (MTurk) from 2009 to 2017, Difallah found that two factors influenced whether a task would be completed: the number of assignments available within a single task, and how long ago the task batch was uploaded.

The latter factor, the “freshness” of a task batch, is important because, as Difallah explained, the platform by default lists tasks by upload time, with the most recent at the top.

Difallah also found that MTurk’s participants as a whole are more drawn to a batch when it has more tasks available. As tasks are completed, batches get smaller and older, and ultimately suffer from stagnation effects.
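To make the “freshness” effect concrete, here is a minimal sketch of a newest-first listing. It is an illustration only, not MTurk’s actual code or Difallah’s model: the `Batch` class, the `default_listing` function, and the sample batches are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Batch:
    name: str            # hypothetical batch label
    uploaded: datetime   # when the requester posted the batch
    tasks_left: int      # assignments still open in the batch


def default_listing(batches):
    """Return batches newest first, mimicking the default view described above."""
    return sorted(batches, key=lambda b: b.uploaded, reverse=True)


# Made-up sample data, for illustration only.
batches = [
    Batch("image-labels", datetime(2017, 3, 1, 9, 0), 40),
    Batch("survey", datetime(2017, 3, 1, 12, 0), 500),
    Batch("transcription", datetime(2017, 2, 27, 15, 0), 5),
]

for position, batch in enumerate(default_listing(batches), start=1):
    print(position, batch.name, batch.tasks_left)
```

In this toy listing the oldest batch, which also has the fewest tasks left, sits at the bottom, where workers browsing from the top are least likely to find it.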

Unearthing these patterns could eventually enable us to predict when a task batch will be completed on MTurk, which would be a big help to researchers who rely on crowdsourcing.

Presently, Difallah is also investigating editing patterns on Wikidata, a collaboratively edited knowledge base, and modeling the skill profiles of the people who populate the platform.

More broadly, Difallah added, analyzing platforms like MTurk and Wikidata as intricate ecosystems will help us understand the behavioral patterns of each platform’s community, improve how they operate, and increase their efficiency.

For more, see Difallah’s co-authored paper with researchers in the UK and Switzerland.

by Cherrie Kwok