The subtle art of data gathering with Warm Data

Audrey Lobo-Pulo
Phoensight
Nov 13, 2019
Copyright © Audrey Lobo-Pulo (CC BY-NC-SA), 2019

Big data and data science have become ubiquitous in our society, and they are not limited to technology: they are also pervasive in the share market, advertising, banking, telecommunications and the health sector. They are everywhere we collect data. Collecting more data for better insights and being data-driven is the new economic mantra.

And while data has been referred to as the new oil, gold and currency, what remains unquestionable is that data is growing prolifically. Businesses are acquiring data assets in the hope that more information will bring deeper insights and create bigger business impacts. Research suggests that US firms are likely to have spent over $19 billion on third-party data and data-use solutions in 2018 alone.

We are currently witnessing a race to accumulate as much data as possible to find out as much as possible — in the hope that we can find better solutions. Information is powerful. Yet, there is a leap to be made from sense-making using data to finding solutions for problems.

It’s almost tempting to think that a ‘complete set’ of harvested data is all you really need to find the optimal solution to any problem…

And there we might stop, if it weren’t for one interesting observation — data, no matter how it’s structured, is a representation or metric of something that is perceived, measured or expressed. Now this might sound rather meta, but it’s important for us to reflect a little on this.

As sophisticated as our modern instruments of measurement have become, the data that is ‘captured’ is limited by the tools that create it. And this says something not just about the tool, but about the act of measurement itself. Measurement, by virtue of its intended objectivity, tends to de-contextualise information.

Photo by Priscilla Du Preez on Unsplash

Over the years it’s been fascinating to observe how the verbs we use for creating datasets have evolved. While in the past we mostly measured data, today data is collected, gathered and harvested. And while this reflects our progress in increasing computational storage capacity, there is an underlying sense that this data is being taken out of its natural environment: that data is alive, vibrant and dynamic within its many ecological contexts until it is harvested.

If data is harvested with care, it’s often accompanied by ‘meta-data’. Meta-data provides information about the data: how and when it was created, its structure, and administrative details, amongst other attributes. It’s almost like the ‘packaging’ that comes with your freshly harvested data…
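To make that concrete, here is a loose sketch of what such ‘packaging’ might look like. The field names below are purely illustrative, not a formal metadata standard such as Dublin Core:

```python
# A minimal sketch of meta-data for a freshly harvested dataset.
# All field names and values here are hypothetical, for illustration only.
metadata = {
    "title": "household_energy_usage",
    "created": "2019-11-13T09:30:00+11:00",        # how and when it was created
    "creator": "smart-meter network, provider X",  # the tool that 'captured' it
    "schema": {                                    # the structure of the data
        "household_id": "string",
        "kwh": "float",
        "timestamp": "datetime",
    },
    "licence": "CC BY-NC-SA",                      # administrative information
}
```

Notice what a record like this cannot hold: the relationships and circumstances the data was lifted out of.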

But what about the information that is continually evolving within complex systems across numerous entangled contexts before the ‘data-harvest’? Information that is held between the measurements and is coloured through multiple contexts? How does one even come close to capturing that?

Is it even possible?

The International Bateson Institute (IBI), a non-profit foundation for trans-contextual research, has recently pioneered the concept of “warm data” — information on the interrelationships and interdependencies between elements within complex systems. What’s interesting here is how this warm data or information influences elements within the system, how it interacts and evolves, and what we’re able to learn prior to harvesting the data…

Why big data needs ‘warm data’

“To demand that artificial intelligence be humanlike is the same flawed logic as demanding that artificial flying be birdlike…” — Kevin Kelly

Machine learning and Artificial Intelligence (AI) algorithms are the new agents in our digital ecosystems, and are trained on the data we provide them. The more data that’s available to them, the “better” they perform — and the urge for improved performance creates an even greater demand for data.

And so the expedition for even more granular data continues, and data mashing across thousands of variables and datasets becomes part of the data scientist’s journey.
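As a rough illustration of that ‘mashing’ (the file names and join keys below are invented), a typical step joins datasets harvested from unrelated sources on shared identifiers:

```python
import pandas as pd

# Hypothetical datasets harvested from three different sources.
customers = pd.read_csv("customers.csv")        # demographic attributes
transactions = pd.read_csv("transactions.csv")  # behavioural records
regions = pd.read_csv("regions.csv")            # geographic context

# Join them on shared keys to build one wide table for modelling.
merged = (
    transactions
    .merge(customers, on="customer_id", how="left")
    .merge(regions, on="region_id", how="left")
)
```

Each join widens the table, and each quietly lifts a dataset further out of the context in which it was collected.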

Yet, these algorithms are also limited by the data!

Data scientists are navigating uncharted territory on data privacy, ethics and regulation precisely because of the data they work with and the decisions that flow from it. For example, an AI algorithm may be rendered unacceptable by biases in its training dataset, and the quality, provenance and nature of the data are often brought into question.
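As a simple, hypothetical check (the rows and group labels below are invented for illustration), one common first step is to compare a model’s accuracy across subgroups of the data:

```python
import pandas as pd

# Invented predictions, with a sensitive attribute attached to each row.
df = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B"],
    "actual":    [1, 0, 1, 1, 0, 0],
    "predicted": [1, 0, 0, 1, 1, 1],
})

# Per-group accuracy; a large gap between groups hints at bias
# the model may have absorbed from its training data.
accuracy_by_group = (df["actual"] == df["predicted"]).groupby(df["group"]).mean()
print(accuracy_by_group)
```

A disparity like this says nothing about why the bias arose; that context sits upstream of the harvested data.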

While artificial intelligence and predictive modelling are transforming our society (in the health sector, for example, through early disease diagnosis), we are yet to confidently use machines to apply solutions across different contexts. And this is where the warm data that humans so deftly ‘munge’, ‘process’, ‘assimilate’, ‘prioritise’ and ‘contextualise’ can be the key to discovering new possibilities.

Copyright © Audrey Lobo-Pulo (CC BY-NC-SA), 2019

Our subjectivity and life experiences (learnings from participating within our ecosystems) diversify the information we bring to a problem. Each of us holds a plethora of information, which governs our behaviour and helps us make sense of our systems.

Warm data is invaluable, and data science has recognised the need for it by referring to domain expertise or ‘tacit knowledge’. But warm data is so much more than this: it is trans-contextual, crosses many domains, and is always evolving!

To neglect this information, or to assume it’s contained within ‘higher order effects’ in a modelling process, is to potentially overlook a key piece of the solution; it demotes information that is hard to quantify but may nevertheless be hugely important. Attempting to model it may be likened to searching for endogenous variables in the hope they might proxy for the secret sauce!

But warm data is not as elusive as it may sound, and is also held within our human complexities as opposed to somewhere in the cloud…

Warm data may be found at the ‘interfaces’ where entities interact within their systems, and ‘warm data labs’ are useful processes for working with such information. Developed by the IBI, warm data labs are group processes in which people work with complex issues across multiple contexts to gain a deeper understanding of the system. In doing so, they create the conditions for building problem empathy and for shifts in human perception that open up new insights and possibilities.

Where data-driven methodologies provide new insights based on data that has already been harvested, warm data labs allow warm data insights to inform how data is contextualised, which in turn informs data harvesting methods and practices. A better understanding of the complexity means we can be more discerning about the data we harvest and the implications of data-driven decision-making, producing better algorithms and more effective solutions.

The irony in all this is that although humans are becoming increasingly reliant on big data and artificial intelligence, to really tackle some of the most challenging issues in our society we need to synergise this approach with warm data and human intelligence!

“The way in which sense is made… will differ from how that same sense-making fits tomorrow”

— Nora Bateson

Phoensight is an international consultancy dedicated to supporting the interrelationships between people, public policy and technology, and is accredited by the International Bateson Institute to conduct Warm Data Labs.


Audrey Lobo-Pulo is the founder of Phoensight, a Public Interest Technologist, and a Tech Reg, Open Gov & Public Policy geek, supporting the interrelationships between people, society & tech.