Why we need to research open data use and why you should care about it

The sole purpose of this blog post is to kick-start the discussion around open data use research (again). You are probably asking yourself: why?! Well, I believe open data usage is a puzzle that is yet to be solved. Unlike other data sources, one does not have to attribute the use of open data. No attribution makes it hard to track and understand how people use data. It also means that we mostly hear about use when people are loud about it and make noise. However, in a lot of cases, people don’t have to mention the data source, don’t discuss it enough, or don’t know how to promote their tool.

When looking at the Open Data Common Assessment Framework, we look at four aspects: context, data, use and impact. A lot of research has been done on context, data, and impact, but less on the use of the data. For me, this lack of research is a big issue, since use leads to impact.

Some will say that data use is already part of impact research and should not be seen as standalone. I believe that this is not the case. Use is a big part of user research and of understanding needs: it helps us understand the population and who needs a boost in using the data (where gender can be a big factor), and it assists in understanding data quality and in creating better feedback loops. It is not only about impact; it feeds into the data and context aspects as well.

Two years ago @JoshCowls, Corinne Cath and I received a small Nesta grant to research open data use in five different locations. We found four key conditions that promote utilisation of open data: data quality, a supportive community, data literacy, and crisis as a catalyst for use. While the paper we published helps us to understand what promotes data use and innovation, it does not help us to understand one basic thing: how do we track use in the first place?

Image: Matt Taylor

So when I started working at 360Giving as the Data Labs and Learning Manager, I knew that if I wanted to drive more use of our data, I needed to understand what users were currently doing with the data. If I figure out what is done *now* I can understand the challenges and successes, which will allow me to make a plan on how to move it forward.

360Giving data, unlike open government data, is not a compulsory dataset that needs to be published by law. The majority of our publishers are charities and foundations, and they voluntarily release data because they think that sharing their data can help achieve better grant making, and ultimately better influence on social processes. These grantmakers are the publishers of the data, and they don’t work with open data portals; they just publish a link on their own website. This means that we, as the 360Giving staff, do not have access to their sites, so we can’t collect basic metrics like how many downloads of the data are made on a yearly basis. I am therefore not looking for methods that are only applicable to government data; I want methods that are also relevant to my research subject, the 360Giving data.

So, when I want to understand *how* people use the data, this is my approach:

  1. I check in search engines (Google and DuckDuckGo) for the terms “360Giving”, “threesixtygiving” and “GrantNav” to see if I find any mentions in reports, links, visualizations, news pieces, etc. I run this search on the first Monday of every month. The results can be found in this spreadsheet.
  2. As search engines sometimes miss use cases where people actually used our data but didn’t acknowledge it, I asked some of the 360Giving board members and staff to add the examples they know exist.
  3. I have access to GrantNav, a search engine of all published grants data that is open for everyone to use. Here I have access to vast analytics of how people use the data and what they are searching for. I have analysed our GrantNav use, mainly how people search in it, to understand their abilities and their understanding of the tool. We did this to help us improve GrantNav. You can find the preliminary report.
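For anyone who wants to copy the monthly routine from step 1, here is a minimal sketch of how the schedule could be generated. It is not the tool I use; I do the searches by hand. The function names `first_monday` and `monthly_checklist` and the row layout are made up for illustration, and the script only builds a checklist of which term to look up in which search engine on which date. It does not query any search engine.

```python
from datetime import date, timedelta

# The terms and search engines named in step 1 above.
SEARCH_TERMS = ["360Giving", "threesixtygiving", "GrantNav"]
SEARCH_ENGINES = ["Google", "DuckDuckGo"]


def first_monday(year, month):
    """Return the date of the first Monday in the given month."""
    first = date(year, month, 1)
    # date.weekday() returns 0 for Monday, so this is the number of
    # days to move forward from the 1st to reach a Monday.
    offset = (7 - first.weekday()) % 7
    return first + timedelta(days=offset)


def monthly_checklist(year, month):
    """Build one row per (engine, term) pair for that month's search run.

    The row layout is hypothetical; adapt it to your own spreadsheet.
    """
    run_date = first_monday(year, month).isoformat()
    return [
        {"run_date": run_date, "engine": engine, "term": term}
        for engine in SEARCH_ENGINES
        for term in SEARCH_TERMS
    ]


if __name__ == "__main__":
    for row in monthly_checklist(2018, 9):
        print(row["run_date"], row["engine"], row["term"])
```

Each row can then be pasted into a spreadsheet and filled in with whatever mentions turn up, so the log stays comparable from month to month.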

After all of this I still felt stuck. So I did what most people do: I posed this question to the Twitter world to hear from the open data community:

And the replies arrived pretty quickly!

Danny Lammerhirt from Open Knowledge International explained well the factors that we need to consider:

Danny’s analysis is helpful, but we need to remember that the enabling conditions should include not only technocratic barriers like license and quality, but also data literacy and understanding of data processes. If you don’t know what open data use looks like, how will you look for it in the first place?

Bottom line: if we don’t know who our users are and how they use open data, we can’t improve and we can’t be more inclusive. We need to shed some light on this, and we need more than the handful of case studies that we have.

Everyone’s replies were great (and appreciated!). They showed that there is work in progress on the matter of use. I am writing this post in the hope that this will help us to share notes publicly, helping one another to promote more discussion around open data usage.

If you would like to help us, here’s how:

  1. Comment on this post and help us keep the discussion alive. Tell us how you look at data use.
  2. If you have used 360Giving data, help us to record your use by emailing us at mor.rubinstein@threesixtygiving.org. The more we know about the way you use the data, the better we can improve it.

Special thanks to Ruba Ishak for editing my blog