The Open Data Hubris

Your goal is social change. Your tool is data. Why do you insist on feeding it to people who don’t care?

During my (almost) 2 years as a volunteer designer at Hasadna, I realized there is a fundamental flaw in many open data projects: they think people care about open data. They don't.

Understanding politics is important. Understanding the economy is important as well. Both are extremely difficult. Most people were never trained to understand them, and are reluctant to invest the time necessary to learn. So why do you think people will care about your context-less open data? Many open data enthusiasts, especially in the realms of politics and the economy, believe that the data they release to the world is so profoundly important that all people must find it deeply interesting. They imagine the average Joe spending his Saturday morning drinking coffee and analyzing the federal budget or Walmart’s corporate structure. This is an admirable vision, but there's still a long way to go before it can become a reality. If we're interested in influencing the present, we need a different strategy. We need to stop aiming for the average Joe.

Data vs. stories

Let's take a step back and think about how we usually learn about the world in our day-to-day lives. Most people have only a handful of topics that interest them personally. These usually include our friends and families, our work and a habit or two. These are fields of knowledge we consume in pure form. We engage with them directly, filter out the irrelevant signals, and mentally aggregate all the data we encounter into coherent stories. All of this happens automatically. If I ask you about your work or your family, you'll have no problem introducing me to these subjects. You'll start by giving a summarizing statement ("I am a marketing manager at a local ice cream chain, and I love every minute of it!"), and if I show any further interest, you'll dive in with lots of facts and stories. It's no wonder you understand your job or your family so well — you've spent so much time engaging with them that you're practically an expert on these topics!

But what about all those other fields that affect our lives? We rely on experts to explain the world to us. If our head aches, we don’t study medicine so we can diagnose ourselves. We go to a doctor or two and rely on their diagnosis. If our car breaks down, we don’t take an auto mechanics course. We take it to a garage we trust. That’s what we do with politics and the economy as well. Both are extremely important, and we realize we shouldn't just ignore them, but they are not a high enough priority to dedicate vast amounts of our personal time to understanding them. We need an expert to mediate them for us. We need someone else to invest the time and effort to dig into the data, filter out all the irrelevant details, and compose an emotionally evocative story that we can understand, and if necessary, act upon. We need to hear it from the storytellers.

The story ecosystem

As with most things that are not a top priority for us, we are "satisficing": choosing the first alternative that sufficiently satisfies our need rather than the best one. So for everything that we want to understand without investing our own personal time, we rely on the storytellers in the media. Traditionally, that meant mass media: newspapers, television and radio. These are still the most effective channels for spreading a message to the masses, but they are no longer the only way. In the past two decades, independent news sites and blogs have become a credible source of stories as well (albeit most draw much smaller crowds). Another type of storyteller that takes part in this ecosystem is academia.

The work division between these three archetypal actors could be roughly portrayed as follows:

  1. Traditional media
    News reporters are required to produce content on a daily basis. As a result, they favor the simple and sensational over the complex and rational. Their stories usually reach the largest crowds, depending on the popularity of the news outlet they work at.
  2. Blogs
    Bloggers publish only when there's a story worth publishing. They can also follow a story closely for longer periods (some of them are dedicated to only a single niche topic). Their stories usually reach their followers and readers of other web outlets that link to them, which is usually a significantly smaller crowd compared to traditional news outlets.
  3. Academia
    Academic researchers study their field of expertise for much longer periods of time. They favor the complex and rational over the simple and sensational. Their work is sometimes newsworthy (read: relevant to the average Joe), but gets published in academic journals, which circulate only within the academic world.

These oversimplified descriptions do not accurately describe reality (some news reporters have the privilege of writing investigative pieces, some bloggers only write gossip, and some researchers jump from channel to channel to promote their latest study). But they are useful for understanding the ecosystem. All three are dedicated to their fields of expertise, with different degrees of publishing requirements and constraints. What’s important to understand is that they feed each other.

Constantly pressured to produce content, news reporters rely more and more on other sources to supply them with news stories. Sadly enough, PR agencies often fill this need, offering "news" that actually promotes the hidden agendas of their clients. However, blog posts and academic studies may contain material that could be edited into a newsworthy story. Occasionally, bloggers write about new studies, adding their own input and expertise, essentially mediating them to the news reporters. If, for the sake of argument, we put aside the hidden agendas of commercial mass media, we can conclude that a story that circulates enough in the blogosphere and academia will eventually get picked up by the mass media and spread to the public. It might take a while, and it might get a lot of exposure online before it reaches the traditional media, but it will eventually get out to the masses. This is how most people get their news and learn about politics and the economy. If we want to reach them, we need to inject the story ecosystem with our open data.

Serving our real clients

If we stop fantasizing about the masses flocking to our open data systems, and only aim for the storytellers, we no longer have to:

  • Explain why our data is important. Our users already understand.
  • Interpret the data for our users. They are already literate.

So this is what you’re going to do next: get in touch with your real clients (reporters, bloggers and researchers), and design for them. As soon as the project begins, look for allies. These are the storytellers who operate in the realm of the data you want to open up. Contact them, tell them about your project, ask them about their data needs. What data are they missing? What would they do with the data if they had it? If you are going to give them valuable, otherwise unattainable data, they will feel indebted and obliged to collaborate with you. You don’t need many allies when you start off. One, two or three are enough to get you started.

Once you understand the data needs, all you have to do is focus on getting the data and presenting it in a clean and aesthetic way, so storytellers can come and fiddle with it. The actual practice of presenting data is out of this article’s scope. Suffice it to say there is a scientifically right way to present data. I urge data visualizers at all levels to ignore the infographics industry altogether and dive into Show Me the Numbers by Stephen Few ASAP. After reading this book, you will know how to present data.

Let your open data be the fertilizer of important stories, and let others, who are better at it, tell these stories instead.

Like what you read? Give Yosef Waysman a round of applause.