We can’t have nice things (yet).

Professionally, for the past year or so, I’ve been talking about data. Not in the Brent Spiner way, although I am always up for some TNG chat.

More in terms of helping large organisations to think about their business, their staff and their data. I’ve done this for A Large Charity and A Central Government Department That Is To Do With Digital. What it’s led me to is a question about what data promises us, what it can help us achieve and some of the limitations it starts to present us with.

Our bosses want data. Not that data. Not CSVs and queries and endless bloody numbers: they want “actionable data” to make decisions on (an aside: actionable data is usually “our customers get stuck on stage 4 so we should fiddle with that”; anything more complicated is interpretation and arguments). That part is crucial. Decisions. I’ve been looking at data that would allow decisions around children and decisions around government spending. There are reasons why these decisions are difficult, and they’re only half to do with the data.

It boils down to people and the messiness that that brings.

Overpromising of technology

Have you heard someone high up in your organisation or one you (used to) respect spend their time on some manel talking about robotic automation or AI or ML? Me too. The weird thing is that I’ve never heard them talking about the number of temps you need to hire to clean the data first. All of these c-suite boner outputs sound great, but are built on the back of the biggest vapourware product ever: clean data.

Incomplete records

For a start, are you sure you’ve got everything and that it’s all filled out OK? Data in organisations has a real principle-of-subsidiarity problem: the data entry is often done by people in entry-level jobs who might not have a stake in the ongoing clarity of your data. Sometimes data is entered automatically when a transaction happens or when a user registers their own account. But it might also be entered by hand (gift aid receipts when you go to a National Trust property) or taken over the phone if you’re talking to customer services. Are you sure that your records contain few or no dupes, that they are entirely accurate and, for that matter, that they are machine readable?

Different realities

This might be a grand term, but it rests on how you input “nothing”. In my experience, I have seen it represented by “n/a”, “na”, “N-A”, “0”, “ ”, “ ”, “-”, “_” and “this field left intentionally blank”. Open text fields are a real problem for consistent input. Of course, in my example, with some curation you can quickly chalk all of those answers up to empty/nothing, but what if you’re putting in human-readable data like “what did the child say on the phone last time you spoke to them” or “what’s your delivery confidence and risk register for this project”? Both of these are qualitative assessments, so reducing them to quantitative values loses the bit that contains their true business value. Again, you can run some sort of ML over them and find themes, but that builds a taxonomy of issues based on the past rather than the future. You can’t describe the governmental experience of EU Exit by extrapolating the last five years of Whitehall email.
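The easy half of that, the curation of “nothing”, can be sketched in a few lines of Python. This is only a rough illustration: the token list and the function name `normalise_empty` are assumptions made for the example rather than anything from a real system, and “0” is deliberately left alone because zero is sometimes a genuine answer.

```python
# Hypothetical list of spellings of "nothing", based on the examples in the text.
# A real dataset would need its own list, agreed with the people entering the data.
EMPTY_TOKENS = {"", "n/a", "na", "n-a", "-", "_", "this field left intentionally blank"}


def normalise_empty(value):
    """Return None for anything that looks like 'nothing', else the trimmed text."""
    if value is None:
        return None
    text = str(value).strip()          # whitespace-only entries become ""
    if text.lower() in EMPTY_TOKENS:   # case-insensitive match against the list
        return None
    return text                        # "0" falls through: it may be a real value


raw = ["n/a", "NA", "N-A", " ", "  ", "-", "_",
       "This field left intentionally blank", "0", "Called on Tuesday"]
cleaned = [normalise_empty(v) for v in raw]
print(cleaned)
```

Even here a person has to decide whether “0” means nothing or means zero, which is the point: the code only applies a decision someone has already made.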

But we also have to remember that in an organisation, there is no guarantee that the same words mean the same things to everybody. A mental model of data is not imposed by a schema. You cannot change people’s experiential models by sending them an Excel sheet with mandatory fields. It needs more than mandate.

Archives and power

So, I don’t think we are at the point where we can have infinite shiny dashboards that point us with unerring business intelligence into the future. And nor should we. The idea that data exists outside of the social world is a farce. The data we choose to collect and release, how we structure it, where we keep it, who can request it, who can aggregate it and how the qualitative data of words and impressions and thoughts can be appended to or stripped from it are all vital to how we build our future. And it is this question that we need to answer. We can’t have data curated by machines; it has to be people.

It is already people now (but often hidden and wished away), and the future will need more people and more oversight. So the question becomes whether we should think about who those people are, whether they have a code of practice that goes beyond an institution’s individual preference for its relationship with data, above and beyond the legal requirements, and whether, when we talk about data, we could imagine seeing it as poetry, art and diaries rather than roads, oil and APIs.
