This weekend I got on a train to the very lovely Manchester for the second Open Data Camp. This is a quick post with notes from a session I found particularly interesting at the event.
Note: like all of my writing, this is a bit of a braindump. If you think anything should be changed, please do tweet at me to let me know!
Taking the “O” out of Open Data
This was a really quite odd and interesting set of exercises and conversations around the Open Data community, and whether referring to ourselves as such is potentially alienating others who may feel an aversion to coming if they haven’t already got something to show.
The session started by noting that on the first day, the question “what is data?” was posed and no one gave a clean-cut answer, as we all interact and work with it from different angles. As part of this session it was asked once again, but in a different way: we were asked to write a ‘type’ of data on a sticky note, then try to map the notes on a board, building relationships between the different types to create a ‘taxonomy of data’.
Restricting the Community
The session then took a very different turn, although one that I suspect was planned from the name of the session. We turned to how we refer to ourselves as a community — the Open Data community. If we removed the word “open”, would we open it up to other data enthusiasts? There were two key topics mentioned…
Why are we the Open Data community instead of just the data community?
There is quite a lot of historical context and relevance in the suggested answers. The idea of ‘open data’ was, and still is, pushed by transparency activists, but the data dumps that came as a result of this movement aren’t necessarily what we would refer to as ‘open’ data now. These data dumps would often be treated like a chore, leading to inaccurate or largely incomplete sets.
The ‘open data community’ was born out of the desire to actually do something with that data — pushing for more complete sets, and showing the transformational power of having open data in everyday operations.
Certain data sets, such as financial information that holds public authorities to account, are sets that we should be focussing on in terms of having them delivered well, pushing for their completeness and accuracy. However, operational datasets need more consideration around whether they actually do need the resource spent on them, at the cost of their quality. As a result of the early community, which formed much more around transparency, are we now just pushing for data being publicly available for the sake of it?
Is our community facing an existential crisis? We talk so much about being open and the value in the technical/practical implications of working with data, yet by siphoning ourselves into such a niche, we only ever discuss what can be done with others who already agree. At the same time, we often get bogged down in what “open” actually means. This led on to the second key point:
Should the event stop being “Open Data Camp” and instead just be “Data Camp”?
This discussion started by asking “who else would come?” Would the tax man not come to Open Data Camp because they feel it holds little relevance to them?
Is the event only ever filled with progressive individuals who already buy in to the idea of going back to their organisations and working to open up data? If that’s the case, are we really doing anything of worth in terms of persuading organisations to release more data, and, as a counter, was that ever what we were aiming to do?
We could talk about possibilities and opportunities, but only if we see the data in the first place and know what we could do with it.
What is the point of this session?
It started to feel as if the mapping of the types of data wasn’t terribly useful for the group, and instead we should be focussing on building the network of types, structure and analysis tools to discover how they relate.
To finish up this post, here are three quotes:
The vast majority of the problems that I have with data is knowing whether or not it exists, and how it relates to other sets. Having the conversation of the structure of the different data and how it interlocks, who owns it even if it is closed, would be more useful in case we actually need it and then we know who to approach.
We need to start helping others understand the value in utility, regardless of commercial or otherwise. If we know that there is value, we can target certain data to try and open up.
Perhaps what’s needed is more education on both sides around the mechanisms which are in place to make data available, because then maybe we’d understand that we can’t just make it happen all at once. As a group who deal with data every day and call ourselves experts, it feels odd that we’re discussing this big ideal as if it can somehow happen overnight.