What do we mean by data standards?

We talk about data standards a lot at Gather. But what do we actually mean?

The United States Geological Survey gives a helpful definition:

“Data standards are the rules by which data are described and recorded. In order to share, exchange, and understand data, we must standardise the format as well as the meaning.”

Data standards are not glamorous. In fact, they may be even less glamorous than toilets and sewage! But data standards are essential. They help to make the complex simple, and help businesses, governments and nonprofits use data to solve real-world problems.

(Photograph of a data scientist at work, taken during our data dive in March 2018.)

We all benefit from data standards every single day. Take buses. We all want to find out how to get from A to B by bus. In the UK, bus data is standardised according to the TransXChange standard. Every bus company across the country abides by TransXChange rules when it records and shares data on bus routes, timetables, fares, accessibility and live bus locations. All of this data is then fed into the Traveline dataset, and can be accessed by anyone through the NextBuses API. Apps like Google Maps and Citymapper use the data in Traveline to tell you how to get to where you want to go. Without a standard for bus data, bus companies would be left to record their data in completely different ways. We would be stuck consulting a different timetable for every bus company we travelled with.

So, how do data standards relate to urban sanitation?

In November 2016 we went looking for a dataset that would show us where all the toilets in the world were — and where they weren’t. By this point, we had interviewed 100 sanitation organisations, and we knew that most of them were collecting data on the locations of toilets in areas where they worked. We searched, but we could not find a comprehensive dataset or map.

There is a reason the dataset does not exist. A lot of data has been collected, but much of it has been recorded and saved in very different ways. Sanitation data is not standardised. And this creates a problem.

Organisations are not recording things like the location, type of toilet or the amount of waste that they collect in the same way. As a result, no one can create a baseline with information on toilets from a variety of sources. This has led to huge duplication and wasted effort. Too often, the same areas have been surveyed and the same data collected over and over again in slightly different ways. This is unsustainable and wasteful in a sector with already limited resources.
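To make the problem concrete, here is a minimal sketch of what combining two organisations' records involves. Everything in it is an assumption for illustration: the field names, the toilet type vocabulary, the two source formats and the sample values are hypothetical, and `ToiletRecord` is not an agreed sanitation standard.

```python
from dataclasses import dataclass

LITRES_PER_US_GALLON = 3.785411784


@dataclass(frozen=True)
class ToiletRecord:
    """Hypothetical common record, not an agreed sanitation standard."""
    latitude: float      # decimal degrees
    longitude: float     # decimal degrees
    toilet_type: str     # term from a shared vocabulary
    waste_litres: float  # volume of waste collected, in litres


# Hypothetical mapping from each organisation's local labels to shared terms.
TOILET_TYPES = {
    "pit": "pit_latrine",
    "pit latrine": "pit_latrine",
    "flush": "flush_toilet",
    "wc": "flush_toilet",
}


def normalise_type(label: str) -> str:
    """Translate a local label into the shared vocabulary."""
    return TOILET_TYPES[label.strip().lower()]


def from_org_a(row: dict) -> ToiletRecord:
    # Organisation A (hypothetical): one "lat, lon" string, volume in litres.
    lat, lon = (float(part) for part in row["gps"].split(","))
    return ToiletRecord(lat, lon, normalise_type(row["kind"]), float(row["litres"]))


def from_org_b(row: dict) -> ToiletRecord:
    # Organisation B (hypothetical): separate fields, volume in US gallons.
    return ToiletRecord(
        float(row["Latitude"]),
        float(row["Longitude"]),
        normalise_type(row["Toilet Type"]),
        float(row["gallons"]) * LITRES_PER_US_GALLON,
    )


# Made-up sample rows describing a toilet at the same location.
record_a = from_org_a({"gps": "-1.2921, 36.8219", "kind": "Pit Latrine", "litres": "40"})
record_b = from_org_b(
    {"Latitude": "-1.2921", "Longitude": "36.8219", "Toilet Type": "pit", "gallons": "10"}
)
```

Without a shared standard, every pair of organisations needs its own converters like `from_org_a` and `from_org_b`. With one, each organisation records data in the common shape from the start and the records can be combined directly.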

Imagine if sanitation organisations working in the same city were able to share data points and create a map of that city's sanitation needs. It would immediately empower local decision makers to better understand the problem in their cities, advocate for investment, and track progress towards achieving sanitation for all.

(Screenshot of our demo platform for global data sharing.)

Data standards do not just happen. They need to be practical, useful and endorsed by key stakeholders if they are to be implemented. This is one of the reasons we are launching the Sanitation Data Commission.

We want to bring together leading, expert voices from the worlds of data, sanitation and technology to oversee the creation of a data standard for urban sanitation data. To start with, we will look to standardise how location, type of toilet and volume of waste are recorded.

We have written before about our belief in the power of partnership. It is our hope that this new Commission will help create a standard that can be applied globally. If you are interested, find out how you can get involved in the Sanitation Data Commission.