Creating a data standard for infrastructure transparency: building it

Open Data Services
Oct 7, 2020 · 6 min read

We’ve been working with CoST — The Infrastructure Transparency Initiative and the Open Contracting Partnership to develop the Open Contracting for Infrastructure Data Standard, an open standard for the publication of joined-up data about infrastructure projects and contracts.

Read the first part of this series: Creating a data standard for infrastructure transparency: laying the foundations.

Photo by Yancy Min on Unsplash

In the first part of this series we made the case for OC4IDS — a standard for publishing joined-up data about infrastructure projects. We explained how the standard builds on the CoST Infrastructure Data Standard and the Open Contracting Data Standard.

In this blog post, we’re looking at how we built OC4IDS — how we structured the standard, how we thought more broadly about reuse, and the tools and resources we’ve built to support implementation.

Building the schema…

Our aim was to turn the disclosure requirements from the CoST IDS into a schema for publishing open data on infrastructure projects and contracts, which can be joined up with detailed contracting data published in OCDS.

Let’s zoom in and look at an example of how OC4IDS adds structure to one of the disclosure requirements from the CoST IDS and how this makes publishing and using data easier.

OC4IDS represents the firms involved in an infrastructure project using the Organization building block, made up of several fields.

Figure: The Organization building block

For each field in OC4IDS, the schema provides:

  • A title and description, so that publishers know what data to provide and users know how to interpret the data, e.g. “the party’s role(s) in the project, using the open partyRole codelist”
  • A type and format, so that publishers know how to format the data and machines know how to process it, e.g. an array of strings
  • Optionally, an associated codelist, which limits the possible values of the field so that data from different publishers is comparable, and provides titles and descriptions for each possible value, e.g. the partyRole codelist
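To make the three elements above concrete, here is a minimal Python sketch of a field definition and a check against it. It is an illustration under assumptions rather than an excerpt from the OC4IDS schema: the title, the `codelist` and `openCodelist` keys, the three code values and the `check_roles` helper are all hypothetical. Only the description text is taken from this post.

```python
# Illustrative field definition, loosely modelled on the three elements above.
# The title and code values are assumptions, not the authoritative partyRole codelist.
ROLES_FIELD = {
    "title": "Party roles",
    "description": "The party's role(s) in the project, using the open partyRole codelist",
    "type": "array",
    "items": {"type": "string"},
    "codelist": ["publicAuthority", "funder", "supplier"],
    "openCodelist": True,
}


def check_roles(value, field=ROLES_FIELD):
    """Return (errors, warnings) for a candidate value of the roles field."""
    errors, warnings = [], []
    if not isinstance(value, list):
        return ["roles must be an array"], warnings
    for item in value:
        if not isinstance(item, str):
            errors.append(f"{item!r} is not a string")
        elif item not in field["codelist"]:
            message = f"{item!r} is not in the codelist"
            # An open codelist permits extra codes, so only warn about them.
            (warnings if field["openCodelist"] else errors).append(message)
    return errors, warnings
```

Because the codelist in this sketch is treated as open, an unlisted code produces a warning rather than an error, so publishers can still be nudged towards comparable values.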

As well as making data on infrastructure projects easier to publish and use, OC4IDS also opens up new opportunities for using data.

For example, rather than just including the names of the firms associated with an infrastructure project, OC4IDS also encourages publishers to provide organisation identifiers.

Publishing identifiers for organisations makes it possible for users to identify where the same organisation appears under different names, and to connect with other data sources on beneficial ownership, corporate filings, and more. This is important for many types of analysis, including identifying corruption, measuring competition and understanding the market.
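As a rough illustration of why identifiers matter, the Python sketch below groups party records by identifier to find one organisation published under two names. The records, names and identifier values are invented, and the `identifier` object with `scheme` and `id` keys is an assumption about how the data is laid out, not a definitive account of the schema.

```python
from collections import defaultdict

# Hypothetical party records: names, schemes and ids are invented for illustration.
parties = [
    {"name": "Acme Construction Ltd", "identifier": {"scheme": "GB-COH", "id": "01234567"}},
    {"name": "ACME Construction Limited", "identifier": {"scheme": "GB-COH", "id": "01234567"}},
    {"name": "Bridgeworks plc", "identifier": {"scheme": "GB-COH", "id": "07654321"}},
]


def names_by_identifier(records):
    """Group the names used for each (scheme, id) pair."""
    grouped = defaultdict(set)
    for record in records:
        identifier = record.get("identifier") or {}
        scheme, org_id = identifier.get("scheme"), identifier.get("id")
        if scheme and org_id:  # records without an identifier cannot be matched this way
            grouped[(scheme, org_id)].add(record["name"])
    return dict(grouped)


# Keep only the organisations that appear under more than one name.
aliases = {
    key: sorted(names)
    for key, names in names_by_identifier(parties).items()
    if len(names) > 1
}
```

Matching on names alone would treat the first two records as different firms. Matching on the scheme and id pair links them, and the same key can then be used to look the organisation up in other datasets.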

We followed this process for all of the disclosure requirements in the CoST IDS, resulting in a schema that defines the structure, format and meaning of more than 200 individual data elements.

You can read more about the structure of the OC4IDS schema in the Getting Started documentation.

Building the documentation

The OC4IDS documentation site includes introductory materials for new users, reference tables for the schema and codelists, an interactive schema browser, and guidance for publishers and users.

Based on the research phase of the project, we developed guidance on how to include project identifiers in contracting data, along with step-by-step guides on how to publish data from an infrastructure transparency portal and how to use data from procurement systems for infrastructure monitoring.

We also created a fully completed and annotated worked example, which publishers and users can use as a reference to supplement the schema and codelists.

Finally, whilst OC4IDS can be used by anyone who wants to publish data on infrastructure projects, we documented two mappings for the benefit of CoST’s member programmes:

  • A mapping to the CoST IDS, which describes how to use OC4IDS to meet each of the disclosure requirements in the CoST IDS.
  • A mapping to OCDS, which describes how existing OCDS data can be used to populate some of the fields in OC4IDS and thus meet some of the requirements in the CoST IDS.
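The idea behind the second mapping can be sketched in a few lines of Python. This is a simplified illustration, not the documented mapping: the field pairs in `SUMMARY_MAPPING`, the helper functions and all sample values are assumptions made for this example, and the published mapping should be consulted for the actual field paths.

```python
# Hypothetical OCDS release, trimmed to the fields used below.
ocds_release = {
    "ocid": "ocds-a1b2c3-0001",
    "tender": {"title": "Bridge rehabilitation works"},
}


def get_path(data, path):
    """Follow a slash-separated path through nested dicts; return None if absent."""
    for key in path.split("/"):
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


# Assumed field pairs (OCDS path -> key in the contracting process summary).
SUMMARY_MAPPING = {"ocid": "ocid", "tender/title": "title"}


def summarise(release, mapping=SUMMARY_MAPPING):
    """Build a contracting process summary from the OCDS fields that are present."""
    summary = {}
    for source_path, target_key in mapping.items():
        value = get_path(release, source_path)
        if value is not None:
            summary[target_key] = value
    return summary


project = {
    "id": "example-project-1",  # hypothetical project identifier
    "contractingProcesses": [
        {"id": ocds_release["ocid"], "summary": summarise(ocds_release)}
    ],
}
```

Keeping the mapping as data rather than code means new field pairs can be added without touching the conversion logic. Fields that are missing from the OCDS source are simply left out of the summary.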

Connecting the services: helpdesk, resources, the Data Review Tool and training materials

To support implementation, we’ve worked with OCP, CoST and Centro De Desarrollo Sostenible to set up a number of support services.

The bilingual global helpdesk (English and Spanish) is a free service for anyone interested in publishing or using OC4IDS data, through which we provide advice and support at each stage of the implementation process.

Support can include helping potential publishers to scope the options for OC4IDS implementation in their context, advising on mapping existing data sources to OC4IDS, and providing feedback on the quality of published data.

Alongside our activities supporting publishers, we’ve also developed templates and resources to guide them through key stages of the implementation process. These include a scoping template and a field-level mapping template.

A key resource for both implementers and the helpdesk itself is the Data Review Tool, a self-service tool that provides feedback on the quality of OC4IDS data. Implementers can use the tool to get feedback as they work towards publication. The helpdesk also uses the tool to run checks on data shared by publishers.

The Data Review Tool is based on CoVE, which we created to power data review tools for OCDS, 360Giving and the Beneficial Ownership Data Standard. By reusing the core technology from CoVE, we were able to quickly adapt and deploy a fully featured Data Review Tool for OC4IDS without the time and cost involved in starting from scratch.

The final piece of the adoption support package is training. Through the helpdesk we’ve delivered training sessions, workshops and webinars on OC4IDS implementation to CoST members and other implementers. Along the way, we’ve created a library of reusable training resources, including slide decks and interactive workshop activities.

In Part III of this series we’ll be looking at what we’ve learnt from working with implementers to put the standard to use, and the challenges and opportunities for infrastructure procurement transparency.

At Open Data Services we’re always happy to discuss how developing or implementing open data standards could support your goals, or how we could help you publish or use open data. Find out more about our work and get in touch.
