Provocation #2: Data creation as communication, not extraction

PROVOCATIONS
6 min read · Apr 12, 2023

A few years ago, pundits and influential magazines promoted the idea that ‘data is the new oil’. This metaphor mobilised an epistemology that portrayed data production as a one-way transmission of information from grassroots to Big Tech — a new resource captured to extract value and profit.

The extractive imaginary contrasts with the experiences of many grassroots groups who deal with data in working towards social justice. Social movements and local communities are embarking on their own journeys to produce data, charting alternative horizons and methods. From the documentation of systemic feminicides in Mexico to citizen-gathered air quality data, grassroots data production shows that extraction is not the only possible data plot.

So how else might we imagine digital data creation to open up new practices and possibilities?

We provoke by shifting decisively away from the extraction model to a model of data creation as communication. In proposing communication as an alternative data epistemology, we advocate for approaching data creation not as a monological interaction (as the extraction model does), but as a dialogue and as a ritual, i.e., the sharing of meaning as well as a means of participating in society.

Data creation as communication highlights that data-gathering and analysis are both subjective practices and communicative practices. This communication, at its best, strives for greater egalitarianism — in terms of norms, epistemologies and control over representation.

Data creation is about process as well as product. It is iterative, and it is fluid.

In this Provocation, we propose three epistemic shifts concerning how we understand data and explain how imagining and practising data as communication makes space for pluralism and solidarity.

Shift one: grassroots data without romanticisation

Looking at the grassroots level provides an excellent vantage point from which to put into practice non-extractive approaches to data creation.

Whereas Big Tech companies are motivated by profit-making, grassroots organisations draw inspiration from a broader range of horizons and tend to orient their work in solidarity with other groups, leaving space for communicative approaches to flourish.

However, the romanticisation of grassroots data is not the way forward either. Orthodox binary frameworks tend to posit Big Tech and the grassroots respectively as malevolent and pure, as powerful and powerless. Before we rush to assume that grassroots data creation always challenges the extractive imperative, praxis suggests that there are important nuances to consider.

Grassroots data should not be romanticised, as grassroots actors might purposely adopt dominant frameworks and practices to achieve their goals. Conversely, outside actors should not rush to label this data as obedient without considering the needs and visions underpinning the data’s generation.

Helena Suárez, a feminist scholar and activist, has described strategic datafication, a position that allows activists to benefit from the dominant epistemological position of quantitative data in order to advance social justice. In some cases, actors might have to bracket off relevant questions in order to comply with the data producers’ expectations.

Ruha Benjamin and Sandra Ristovska have respectively explored strategic exposure and strategic witnessing, both of which speak to the adoption of pragmatic stances vis-à-vis a datafied world.

In sum, grassroots data creation offers a unique inspiration to reject extraction and embrace communication. In practice, however, data is not necessarily communicative simply because it is grassroots. Strategic forms of obedience and disobedience can be put into play in the search for social justice.

Shift two: epistemic pluralism, not erasure

Epistemic pluralism is the way of the worlds — but so, unfortunately, is epistemic injustice. The eclipse of many ways of knowing by a dominant way of knowing is characteristic of extractive, monological data production.

This erasure may be collateral damage, when it arrives alongside the promises of science and progress. It may also be purposive, serving the expansion of neo-colonialism, surveillance capitalism, and patriarchy — in which case we can call it epistemicide.

One area in which this really matters is data collaborations between grassroots and institutions towards social and political change. Data extraction can lead to epistemic erasure when the only grassroots data that gets attention is the data that fits the institutional epistemology, and all other forms of knowledge are ignored.

A communication model of data creation, however, builds bridges between epistemologies that value both grassroots’ and institutions’ ways of knowing. These bridges, of course, acknowledge the power asymmetries between the actors involved.

For example, we can think of the data collaboration of human rights reporting, in which evidence from the grassroots is essential for institutional advocacy against human rights violations. The institutional epistemology here is often about establishing the who, what, where, when and why, and placing these facts in relation to the legal framework of human rights — privileged information not available to all.

Traditional human rights reporting practices — on-the-ground and face-to-face — connect civilian witnesses and human rights NGOs through communication that recognises there are many ways of knowing about the same thing and that allows for the exchange of solidarity and care.

Building pluralism and solidarity, however, is neither fast nor efficient, which begins to matter when big data is posing a volume challenge for the human rights sector, as for so many others.

As a result, we are seeing the rise of potentially more extractive practices in the human rights sector, like the use of computer vision and machine learning to parse digital big data of conflicts filmed and shared by civilian witnesses. The Trojan horse of efficiency gains brings epistemic losses as these machine interventions cut out opportunities for human interaction.

Practitioners have to be wary of committing epistemic erasure by mistake.

Shift three: representation from the bottom up

Too often we think of data as self-evident, transparent, accurate and objective — facts or evidence ready for analysing and mining. However, data stands in for something else out there: it isn’t that thing. Data re-presents phenomena, packaged in crunchable bits and bytes.

Challenging data extraction requires attending to the politics of representation rather than shying away from them. The question is how to represent through communication, so that those represented have a voice and can challenge data about them.

Grassroots data work is about engaging the politics of representation with the people and organisations it relates to. It is about recognition, or being represented on your own terms, rather than detection, namely being represented on the terms of the representer.

In other words, data creation as communication starts with addressing representation from the bottom up.

One particularly worrying trend in the civic action and human rights space is that data often displaces human voices in the name of amplifying them. The work of Africa’s Voices has taught us the need to be attentive and patient in designing tech that values human voices: bracketing off questions of agency, context and empowerment is the first step towards failure.

However, the communication path is not the easiest one to follow, due to a tension at its core. Existing discourses, technologies and business models favour aggregation, automation and the pursuit of scale as ends in and of themselves. Ambiguities are elided and choices on representation are hidden: large-scale patterns, probabilities and speed are privileged.

Many tech tools available to activists are embedded with these extractive logics. Acknowledging this tension is a way of addressing the politics of representation in data creation.

Seeking to overcome the biases and distortions of dominant technology in data creation becomes a call to communicative action. We start by creating communicative space for questions such as who is involved in this data, how these groups’ voices are represented and what channels these groups have for contestation. These questions should be central for designers and programmers involved in data projects.

Due to the tension between representation and programmability, keeping these questions in mind will involve compromise and frustration. And compromise and frustration are part of praxis, which requires making pragmatic choices in a structurally unjust world precisely in order to change it.

From extraction to communication

By thinking of data creation as communication, our Provocation has shown how communicative data creation can make space for pluralism in epistemology.

Many have argued that the extraction model is detrimental due to the exploitative political economy of power and profit behind it. We have shown that the extraction model’s deficits are multiplied by how it engenders epistemic erasure and quashes human interaction.

In arguing for a new model for designing technologies for data creation as communication, we are making a discursive deviation to unsettle the orthodox understanding of data. Beyond making space for pluralism by focusing on data practices (communication) rather than products (oil), we also make space for solidarity and care in data creation, as we will explore further in a forthcoming Provocation.

Is there anything worth retaining from the overused and ill-consequenced extractive imaginary of data as the new oil? Perhaps one thing: the way this metaphor draws attention to the environmental consequences of data creation.


Rethinking tech with rights practitioners and civic activists. By the Centre of Governance and Human Rights at the University of Cambridge.