Data as Labor

Published in Alethea AI Official Announcements, Nov 19, 2018

Data As Labor: Rethinking Jobs In The Information Age.

“The winning formula is simple: smart machine + human intelligence = clever system” (Floridi, 2014).


The story of how most tech companies generate profit has been a straightforward one: “users are unwaged labourers who produce goods (data and content) that are then taken and sold by the companies to advertisers and other interested parties”. In turn, this data is mainly fed to Artificial Intelligence (AI) systems that deliver services, improve production, drive innovation, etc. In fact, this economic model (digital economy) is probably the main source of innovation today, delivering massive ‘surplus to users and [being] “free” (at point of use) to users’.

Paradoxically, while these systems rely on the quantity and quality of data generated by humans, they are also displacing workers at an unsettling rate; a recent study estimates that AI could automate around 50% of jobs within 10 to 20 years. Another striking figure: although the combined revenues of Detroit’s “Big 3” (GM, Ford and Chrysler) “were almost identical to those of Silicon Valley’s ‘Big 3’ (Google, Apple, Facebook) in 2014, the latter had nine times fewer employees and [were] worth thirty times more on the stock market”. This has prompted many economists to sound the alarm about current distortions in market power and to call for new approaches. This article attempts to summarise an alternative economic and social paradigm.

Current state of affairs

One of the consequences of a world dominated by Information and Communication Technology (ICT) is that humans are needed as a ‘component of the overall mechanism’. As Luciano Floridi puts it in his book The Fourth Revolution, our ICTs sometimes need to understand and interpret what is happening, so they need semantic engines like us to do the job. This is indeed the trend we are seeing today. Amazon introduced such a human-based computation application in 2005 with Amazon Mechanical Turk, which it describes as “Artificial Artificial Intelligence”. The application enables US-based ‘requesters’ to harness human intelligence by asking anyone in the world to perform tasks that are still too complicated for AI and that help train it. These include tagging categories of content in videos, transcribing recordings, and answering surveys. In this example, the workers offering their services are remunerated. Conversely, human intelligence is used for free to train AIs through the infamous reCAPTCHA (CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart). Most of us have been subjected to these annoying tests when logging in to websites such as Binance to prove that we are human. This simple task helps digitize content that might be too hard for machines to read or discern. In 2013, ‘machines have used more than 1 billion users to digitize books in this way […] for an estimated saving (if the work had been outsourced to human workforce) of approximately $500 million a year’.

Clearly, our silent work is generating huge revenues for companies, so why are we not remunerated? Should we not be entitled to a share of the economic value of the production that our data underlies? In response to these questions, the argument of most tech companies goes something like this: if you want to use our services for “free”, we will gather any data generated through your use of our services’ affordances. This data comes mainly from allowing platforms to collect it in the natural course of our consumption: playing a game, translating phrases, etc. However, recent revelations about the value and power of our personal data, such as the Cambridge Analytica scandal, have kindled debates about the social contract that we have tacitly agreed to with tech companies. If our personal data can help a political party win an election, should we not have more bargaining power when exchanging that data, or at least have a say? Is our personal data on Facebook really only worth free access to Facebook? The artist Jennifer Lyn Morone more crudely calls it ‘Data slavery’. In this article we frame the debate in different terms, namely data labor and data capital.

Credit: Jennifer Lyn Morone

Data as capital

A recent article in The Economist asserted that ‘data is the new oil’. This statement is incorrect, but it fits well into the data as capital narrative. Why is it incorrect? In short, data is not a natural resource: it is produced by people, and this should inform our thinking about it. Today, data is treated as capital rather than as labor, and this might be the cause of the aforementioned problems. Data as capital entails viewing data as something that naturally emanates from consumption and is there to be collected. As Ibarra et al. (2017) explain, it implies “channeling pay-offs from data to AI companies and platforms to encourage entrepreneurship and innovation […] [and seeing] the online social contract as free services in exchange for prevalent surveillance”. Taken to its logical conclusion, the ‘data as capital’ paradigm seems to lead either to regrouping all labor into the few sectors where AI cannot produce similar or better results, or to a world in which Universal Basic Income supports unemployed labor. It is not difficult to see how treating user-generated data as capital exacerbates inequality: the share of national income going to low-skilled workers is in peril. If we are to avoid a crisis of technological unemployment, we ought to take this question seriously.

It is important to note at this point that, for most of human history, ‘workers were not properly compensated for labour’, as Glen Weyl notes in “Radical Markets”, and the current state of affairs seems to rhyme quite well with this historical context. We are currently in a state of data monopsony, or inverted monopoly, in which a few buyers account for most of the market. Indeed, this is the case for a few “buyers of data” that enjoyed significant network effects early on and are now able to “keep input prices low and buy less of it”. One potential solution is to treat data as labor.

Data as labor

In short, by treating data as labor, AI could be viewed as a ‘production technology enhancing labor productivity and creating a new class of “data jobs”’, thus feeding it with more and better data and reducing rising inequalities. Data as labor might also contribute to rebalancing power in the market for data. The table below boils down the essential distinctions between the two aforementioned paradigms.

While economists have highlighted that past technological disruptions tended to leave labor’s share of national income stable or even larger, others have pointed out ‘recent secular declines in labor’s share which belie its universal stability’. Treating data as labor would entail treating data as users’ possessions. It would channel the aforementioned pay-offs ‘to individual users to encourage increased quality and quantity of data’. Along with the need for more data comes the need for better-quality data for AI systems, and this need will only grow as these systems improve. It is no secret that incentivised workers produce higher-quality goods; engineers will gather better-quality data from experts and individuals in the right context than from sucking up free data from largely irrelevant individuals. Amazon Mechanical Turk, for example, would most certainly deliver better-quality data than if the same information were gathered for free through surveillance. In other words, a “purely free data economy acts as a drag on productivity growth that continues to lag worldwide (Byrne et al., 2016) despite bold hopes for AI’s potential”.

Data as labor also requires public institutions to ‘check the ability of data platforms to exploit monopsony power over data providers and ensure a fair and vibrant market for data labor.’ These institutions could also be assisted by organised data labor, as already seen in some countries. The perceived relevance of labor unions varies depending on whether you talk to an American or a French person, but the underlying mechanism, in which labor unionises in order to negotiate with its employer, could be transposed to data labor. We could foresee “data labor unions” that would bring tech companies to the negotiating table over the monetisation of their members’ contributions as data generators.

Naturally, the argument of this article is limited by the fact that only two models were presented, but this simplified approach can help us frame our discussions and ask the right questions regarding the monetisation of data. A crucial, yet missing, addition to the “data as labor” paradigm would be an educated labor force. Realising the status that Personal Data (PD) holds legally, normalising extended data rights, acknowledging the full extent of PD’s impact on social stratification, and capturing the ways that it can truly become ‘personal’ data again are all necessary steps still missing from the education of today’s “Onlife” worker, who lives seamlessly online and offline at once. This education does not only come from the usual institutions: the DAIA, among others, has served this purpose and aims to provide anyone with the knowledge necessary to grow as a responsible business and as an enlightened individual in the information age.

Erik Brynjolfsson, Director of the MIT Initiative on the Digital Economy, has envisioned a so-called ‘digital Athens’ for the future of humankind with AI. He draws an analogy with the Athenian citizens of 500 B.C., who enjoyed lives of leisure resulting, inter alia, in art, democratic participation, and philosophy, thanks to the labor of their slaves. This optimistic scenario aside, as long as our data remains trapped within the confines of today’s tech and data oligopolies, it is unlikely that the average citizen in the West will be able to enjoy the real fruits of their labor, i.e. their data.
