Releasing Open Data

Dyfrig Williams
Doing better things
Jul 19, 2017

I’ve previously blogged on how the Wales Audit Office is looking to challenge our existing use of data and technology as part of the Cutting Edge Audit project. My role on the project has been to look at how we acquire data.

How does Open Data fit into acquiring data?

The diagram below shows the rationale behind my work. As I see it, we need to share the data that we hold in order to develop our relationships with our client bodies, so that we can gather data effectively. Part of this is about “being the change that you want to see.” Auditors are sometimes seen as being risk-averse, but in the Good Practice Exchange we’ve seen that when we work differently, we enable others to do the same. A number of local authorities have told us that, because of how we share knowledge, they’ve been able to challenge the restrictions on the websites that they can visit and the social media that they can access. By making data openly available, we can demonstrate that there is little risk, as long as the process is well managed.

As I mentioned in my original post, finding an appropriate dataset was more challenging than I expected, as we often don’t have the right to share the data that we collect from clients during our audits. However, after a bit of research, we found the data behind the Local Government Financial Statements report, which is a report on local government bodies’ accounts. This was safe data to release because it’s already available on each council’s website as part of their accounts, but we are the only organisation that collates it. The data within the report is analysed on a national basis, but by releasing the dataset we can enable councils and other interested stakeholders to look at the data on a county-by-county basis and to compare their accounts against others. The Wales Audit Office uses the data to support local audit work and for general benchmarking. The report itself looks at the quality of accounts and is based on the data that’s released before amendments; we don’t keep track of prior-period adjustments.

How did we go about making this data open?

Our starting point was a spreadsheet that we use internally, which contains the datasets dating back to 2008–09. We were a bit disappointed to learn that the requirements for local authorities to provide this data in this structure have now changed, so there won’t be comparable data available next year. However, this dataset served as a good test for a future approach. In the longer term it would be worth looking at how we could make continuous data available in order to reduce the burden of reporting requirements. Lucy Knight from Devon County Council gives a really useful example that we can draw on in her lunchtime lecture for the Open Data Institute on Making open data happen in local government.

We used Hendrik Grothuis’s post on making data open and the Open Data Institute’s Consumers Checklist as rough guides for the process. Our first step in cleaning up the data was to look at which data was ours to share, and which data was already available from other sources. We decided to remove the data that was already made available through StatsWales to avoid duplication, but should you want to think about using this dataset with some of the ones that we used internally, these may provide a good starting point:

We then used CSV Lint to check whether the file was readable. We were pleased to discover that we had a valid file, but we also found ways to improve it. We transposed the dataset so that the data items run horizontally (as columns) and the years run vertically (as rows). We also used a null value to indicate where data was unavailable. A quick Google search was enough for us to discover how to note empty cells.
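For anyone who wants to try a similar clean-up, here is a minimal sketch of that reshaping in Python. It is an illustration under assumptions, not a record of how we did it: the data item names, the figures and the `NA` marker are all made up for the example, and the starting layout is assumed to be one row per data item with one column per year.

```python
import csv
import io

# Assumed placeholder for unavailable data; the post does not say which marker was chosen.
NULL_MARKER = "NA"


def transpose_with_nulls(rows, null_marker=NULL_MARKER):
    """Turn a table with one row per data item into a table with one row per year."""
    width = len(rows[0])
    # Pad short rows with empty cells so every row is at least as wide as the header.
    padded = [row + [""] * (width - len(row)) for row in rows]
    # zip(*rows) swaps rows and columns.
    transposed = [list(column) for column in zip(*padded)]
    # Replace empty cells with an explicit marker so gaps are unambiguous.
    return [[cell.strip() or null_marker for cell in row] for row in transposed]


# Hypothetical input: names and figures are invented for illustration only.
original = [
    ["Data item", "2008-09", "2009-10"],
    ["Item A", "10", "12"],
    ["Item B", "", "7"],
]

reshaped = transpose_with_nulls(original)
reshaped[0][0] = "Year"  # the first column now holds the years

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(reshaped)
print(buffer.getvalue())
```

Writing an explicit marker rather than leaving a cell blank means a reader can tell "no data was available" apart from "a value was accidentally deleted".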

As a Welsh public sector organisation, we are required to make the data available bilingually, so we sent it to the translators to make sure that we got each technical term exactly right.

Publishing the data

When it came to publishing the data, we decided to publish it as part of this post on the Good Practice Exchange blog. It would be a very lonely looking dataset on an Open Data platform at the moment, but the hope is that we can identify other datasets to release in future. We looked at potential platforms that could be used, including open source options like CKAN and DKAN (both of which would integrate with our Drupal Content Management System), as well as cloud-based platforms like Socrata. As an organisation we’re moving to the cloud when it makes sense, but there may be things that we could learn from Audit Scotland’s Innovation Zone, which has been set up to allow their staff to test new software, platforms and ways of working in a lightly regulated space.

As per our recent webinar on Open Standards, we’ve chosen to publish the data in CSV instead of a proprietary format like Excel. This means that it can be used by a wide variety of software, and hopefully as wide a variety of people as possible.
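To give a sense of what that openness means in practice, here is a small sketch that reads a CSV file using nothing but Python’s standard library. The sample text, the column names and the `NA` marker are hypothetical stand-ins rather than the contents of the published file.

```python
import csv
import io

# Hypothetical sample standing in for a published CSV file.
sample = "Year,Item A,Item B\n2008-09,10,NA\n2009-10,12,7\n"


def read_rows(text, null_marker="NA"):
    """Read CSV text into dictionaries, mapping the null marker to None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (None if value == null_marker else value) for key, value in record.items()}
        for record in reader
    ]


rows = read_rows(sample)
for row in rows:
    print(row)
```

The same file could just as easily be opened in a spreadsheet package, a statistics tool or a text editor, which is the point of choosing an open format.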

It’s now up to us to ensure that this data is discoverable by tagging it effectively, and we will also publicise the dataset through the networks that we’ve built through our prior work on Open Data. Our next challenge is to track how the data is used, so if you do use the dataset, we’d love to have your feedback about the format and what you used it for.

Learning from the Welsh Government

The Welsh Government were a great help throughout my work on the Cutting Edge Audit project. They shared learning from their approaches, and we also attended meetings together to learn more about Cardiff and Monmouthshire Councils’ approaches. It was fascinating to hear that the Welsh Government’s own staff use StatsWales to share and gather data as it’s open and transparent. This is something for us to think about in our own journey forward — how we can make data more accessible for both internal and external stakeholders.

We ended up using the Welsh Government’s approach to Metadata as a template for our own work. Metadata is a set of data that describes and gives information about other data, and it’s really important because it gives context around the data that is being shared. You can find the metadata at the bottom of this post alongside a link to the data itself.
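As a rough illustration of what a metadata record can look like, the sketch below writes a few descriptive fields out as JSON. The field names are hypothetical and are not taken from the Welsh Government’s template; the values are drawn only from details mentioned in this post.

```python
import json

# Hypothetical field names; values are taken from details mentioned in this post.
metadata = {
    "title": "Local Government Financial Statements",
    "publisher": "Wales Audit Office",
    "format": "CSV",
    "languages": ["Welsh", "English"],
    "coverage_start": "2008-09",
    "notes": "Figures are taken from accounts before amendments.",
}

# ensure_ascii=False keeps any accented characters readable in the output.
metadata_json = json.dumps(metadata, indent=2, ensure_ascii=False)
print(metadata_json)
```

Keeping a record like this alongside the data file means that anyone who picks up the dataset can see where it came from and what its limits are.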

Feedback

Your feedback on our approach here is really important. As this is an initial test of how we might make data open and shareable, your feedback will help to shape how this work progresses. As an organisation, we’re very keen to look at how we can make better use of data to help public services improve, and also to walk the talk in terms of our own digital practice. The Auditor General for Wales talks about enabling innovation through well-managed risks before every one of our shared learning events. We’re looking to share our own learning so that people can learn from our experiences, be they good or bad. We always say that there’s no point reinventing the wheel. By working openly and transparently, we hope that organisations can build on what we’re doing so that they can share data as effectively as possible in order to improve the services that they provide.

Dataset: Local Government Financial Statements

Metadata

Glossary

This post originally appeared on the blog of the Good Practice Exchange at the Wales Audit Office.
