Published in Open Data Services

Why do open organisational identifiers matter?

A screenshot of the homepage of org-id.guide

At Open Data Services we build and support multiple data standards working across complex policy areas. There’s one feature all of the data standards we work on have in common — references to organisations.

In fact, if you’re working with data that deals with just about anything to do with public life or economic activity, you’ll probably come across some kind of reference to some form of organisation — from companies to charities, community organisations to public bodies. But how do you know if one organisation is the same as (or different to) another?

org-id was created in 2017 by a partnership of open data standards groups in order to help solve this problem. It aims to fill a gap in the commons of shared data infrastructure — a way of finding and using open identifiers which are unique to an organisation, stable over time and have clear provenance.

What’s in a name? The benefits of using organisational identifiers

Names alone are slippery and unreliable identifiers; they can be translated into different languages, represented as acronyms or initialisms, or change when organisations restructure.

For example, a recent report by the Centre for Humanitarian Data looked at data published using the International Aid Transparency Initiative (IATI) data standard. It found seven different ways of naming the United Nations Refugee Agency:

  • UNHCR
  • UNHCR/United Nations High Commissioner for Refugees
  • UNITED NATIONS HIGH COMMISSIONER FOR REFUGEES
  • UNO Flüchtlingshilfe
  • United Nations High Commissioner for Refugees
  • United Nations High Commissioner for Refugees (UNHCR)
  • United Nations Office of the United Nations High Commissioner for Refugees

While a human might be able to infer these strings all refer to the same organisation, a computer won’t. In order to use that data to determine the full extent of money flowing from and to the UNHCR, you’d need to spend time cleaning it and creating a common identifier.
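To make the problem concrete, here is a minimal Python sketch that takes the seven names listed above and counts how many "different" organisations a program would see. The variable names are our own, and the clean-up step (trimming whitespace and ignoring case) is just one simple assumption about what a first pass at cleaning might look like.

```python
# The seven name variants listed above, as they might appear in a name column.
names = [
    "UNHCR",
    "UNHCR/United Nations High Commissioner for Refugees",
    "UNITED NATIONS HIGH COMMISSIONER FOR REFUGEES",
    "UNO Flüchtlingshilfe",
    "United Nations High Commissioner for Refugees",
    "United Nations High Commissioner for Refugees (UNHCR)",
    "United Nations Office of the United Nations High Commissioner for Refugees",
]

# Exact string comparison treats every variant as a separate organisation.
distinct_exact = set(names)

# Ignoring case and surrounding whitespace only merges the two variants
# that differ purely in capitalisation.
distinct_casefolded = {name.strip().casefold() for name in names}

print(len(distinct_exact))       # 7
print(len(distinct_casefolded))  # 6
```

Even after that basic clean-up, six apparent organisations remain. Getting down to one would need hand-written rules or a lookup table for acronyms, translations and punctuation.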

The easiest way to simplify this problem is to include organisation identifiers as common element(s) that appear across datasets, making organisational records easier to combine. If an open, unique and stable identifier is published in the dataset to begin with, the amount of time spent cleaning that data is reduced, making it much more useful, usable and in use.

What makes a good organisational identifier?

Most databases use identifiers in some form. For example, Open Data Services Co-operative Ltd is registered at Companies House, the official register for companies in the United Kingdom. The Companies House database has assigned us the company number 09506232. Within the context of the Companies House database this identifier is unique: it’s enough to identify our co-operative alone.

When you need to start combining lists of organisations across multiple registers, things get more difficult. There is a risk that another organisation listed on a different register may also use the identifier 09506232. In this context, we can’t be confident that the Companies House identifier alone can unambiguously refer to our co-op.
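The clash can be sketched in a few lines of Python. Only the Companies House number comes from this post. The second register and its entry are invented purely for illustration, as are the variable names.

```python
# The first register holds the real Companies House number quoted above.
# The second register and its entry are hypothetical, invented to show a clash.
companies_house = {"09506232": "Open Data Services Co-operative Ltd"}
other_register = {"09506232": "Some Other Organisation (hypothetical)"}

# Merging on the bare number silently overwrites one organisation with the other.
merged_by_number = {**companies_house, **other_register}

# Keying each record on (register, number) keeps the two organisations apart.
merged_by_register_and_number = {}
for register_name, register in [
    ("Companies House", companies_house),
    ("Other register", other_register),
]:
    for number, name in register.items():
        merged_by_register_and_number[(register_name, number)] = name

print(len(merged_by_number))               # 1
print(len(merged_by_register_and_number))  # 2
```

The number on its own is only meaningful alongside the register that issued it.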

A screenshot of Open Data Services Co-operative Limited on Companies House. https://find-and-update.company-information.service.gov.uk/company/09506232

A good global identifier draws on an external list, ideally an official source like a government register. It should be machine readable, unique and permanent. And it should have two parts: a part that identifies the external list, and a part that is the unique reference to the organisation on that list.

org-id is essentially a list of organisation lists. Those lists, in turn, provide a common protocol to help create and understand structured references to organisations of all types.

org-id provides a code for each list. In theory, by using that code as a prefix for the identifier, you should be able to uniquely and unambiguously identify an organisation. For example, using org-id our co-operative can be uniquely identified as GB-COH-09506232.

A table showing how org-id constructs an identifier, using Open Data Services Co-operative Ltd as an example. The Companies House list code is GB-COH, and the Companies House identifier is 09506232. Together, these combine to create GB-COH-09506232.
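As a rough illustration, the construction can be sketched in Python. The function and class names here are our own rather than part of any org-id tooling, and the parser assumes that every list code has exactly two hyphen-separated parts, as in the GB-COH example, with everything after the second hyphen belonging to the register's own identifier.

```python
from typing import NamedTuple


class OrgIdentifier(NamedTuple):
    list_code: str  # identifies the register, e.g. "GB-COH"
    local_id: str   # the organisation's identifier on that register


def build_identifier(list_code: str, local_id: str) -> str:
    """Join a list code and a register identifier with a hyphen."""
    return f"{list_code}-{local_id}"


def parse_identifier(identifier: str) -> OrgIdentifier:
    """Split a prefixed identifier back into its two parts.

    Assumption: the list code is always two hyphen-separated parts,
    so only the first two hyphens are treated as separators.
    """
    parts = identifier.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Not a prefixed identifier: {identifier!r}")
    jurisdiction, list_name, local_id = parts
    return OrgIdentifier(f"{jurisdiction}-{list_name}", local_id)


print(build_identifier("GB-COH", "09506232"))  # GB-COH-09506232
print(parse_identifier("GB-COH-09506232"))
```

The register identifier is kept as a string throughout. Treating 09506232 as a number would drop the leading zero and produce a different identifier.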

As further examples, org-id provides prefixes for charity registers in Argentina (AR-CENOC), Bangladesh (BD-NAB) and Canada (CA-CRA_ACR).

This protocol helps people form a reference to either their own organisation, or partners they work with. So long as you know where the registration is held, org-id may be able to help start the process of creating a reference.

The growing case for org-id

The ultimate promise of open data is the value of bringing different types of data together to provide transparency and useful insights, whether that is for communities, governments or businesses.

Alongside making it easier to identify an organisation within a single dataset, organisational references are crucial when you’re trying to join the dots between datasets — for example, combining data about beneficial ownership with data about public procurement.
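Here is a small Python sketch of what joining the dots can look like once both datasets carry the same identifiers. Everything in it is hypothetical: the field names, the XX-EXAMPLE list code, the identifiers, the owners and the contracts are all invented for illustration and are not taken from any real dataset or standard.

```python
# Hypothetical beneficial ownership records (all values invented).
beneficial_ownership = [
    {"org_id": "XX-EXAMPLE-001", "owner": "Example Owner A"},
    {"org_id": "XX-EXAMPLE-002", "owner": "Example Owner B"},
]

# Hypothetical procurement records (all values invented).
procurement = [
    {"supplier_id": "XX-EXAMPLE-001", "contract": "Example contract 1"},
    {"supplier_id": "XX-EXAMPLE-001", "contract": "Example contract 2"},
    {"supplier_id": "XX-EXAMPLE-003", "contract": "Example contract 3"},
]

# Index owners by organisation identifier.
owners_by_org = {}
for record in beneficial_ownership:
    owners_by_org.setdefault(record["org_id"], []).append(record["owner"])

# Attach the known owners to each contract by matching on the identifier.
joined = [
    {
        "contract": contract["contract"],
        "supplier_id": contract["supplier_id"],
        "owners": owners_by_org.get(contract["supplier_id"], []),
    }
    for contract in procurement
]

for row in joined:
    print(row)
```

Because both datasets use the same identifier, the join is a plain lookup with no name matching involved. Suppliers with no ownership record, like the third contract here, show up with an empty list rather than being wrongly matched.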

More and more open data initiatives are recognising the importance of organisational identifiers. For example, the IATI Strategic Plan for 2020–2025 highlights “traceability” as a priority for data about international development. As a result the Aid Transparency Index, which uses IATI data, has created a networked data indicator to assess how organisations provide information about other organisations participating in their activities.

In this context, org-id is a critical piece of open data infrastructure — it’s the only way to unambiguously identify any type of organisation across the globe — including non-profits, trusts, government bodies, charities and foundations.

Next steps for org-id

It’s still early days for org-id. The information about organisations is currently sourced and researched manually by third parties, which is a time-consuming and error-prone process, especially when registers change their systems over time.

org-id doesn’t yet have comprehensive global coverage in terms of its records of registers — in fact, some registers don’t provide standardised identifiers, or aren’t even digitised. This is an ongoing challenge that Open Data Services can’t address alone.

In order to be sustainable org-id will need:

  • Good quality, up-to-date content. For example, well-described organisational identification schemes with complete coverage globally and coverage of all types of organisations across different registers.
  • Transparent, responsive governance. For example, clarity around how organisational identification schemes are added, updated and removed, and where decision making power lies.
  • Usable, useful technology to support management of, and access to, content.
  • Support from the open data community to help shape org-id, alongside contributing organisations and organisation registers.

Stay tuned — we’ll be sharing more about next steps for org-id soon.

At Open Data Services we’re always happy to discuss how developing or implementing open data standards could support your goals, or how we could help you publish or use open data. Find out more about our work and get in touch.
