City data marketplaces are a distraction: let’s improve data infrastructure

I chaired a debate at the Open Data Institute (ODI) titled “What does a good data market look like?” on 29 April. It was a timely debate.

Data is infrastructure at city, national and global levels. It is vital to our societies. It is important to strengthen it. Stable, reliable and well-maintained data infrastructure helps us make better decisions, brings us new services and supports innovation.

There are voices arguing that we need to move beyond the open data portal. Nesta have argued that the Mayor of London should build a city data marketplace and Hitachi are building one in Copenhagen.

At the ODI we are keen to learn, and to share what we learn, so we arranged a debate to discuss the idea of a city data marketplace. The panellists were Eddie Copeland, the Director of Government Innovation at Nesta; Leigh Dodds, a founder of Bath:Hacked and ODI associate; and Yodit Stanton, the founder of Open Sensors, a startup building IoT data infrastructure. Questions were taken from the audience in the room, whilst those watching the live-stream asked questions via the #ODIFridays hashtag.

Chairing the debate firmed up my views.

A city data marketplace is just another centralised website. We need to move beyond that model and improve how we discover data on the web. At the same time we can build a better market for data and spend more time experimenting and learning about the role of governments and cities in that market. A better market for data can help strengthen our data infrastructure.

Better and more open data infrastructure will help both cities and the organisations that work in them solve problems and deliver services.

A city data marketplace is different to a market for data

A city data marketplace has previously been described as “an online marketplace that connects organisations and individuals that have useful data with those that want it.” During the debate it was also described as an “appstore for data” and a “TaskRabbit or eBay for data”. The marketplace would support both open data and data shared for a fee. It would support data publication and exchange between the public and private sector. A city data marketplace is in one place, on one platform and focussed on a city.

Figure: The data spectrum.

By contrast, the market for data is the web itself, where organisations publish data and APIs. The market for data supports the full data spectrum. Like the city data marketplace, it includes both open data and data shared for a fee, and it supports data publication and exchange between the public and private sector. The traditional open data portal helps people discover public sector open data in the market for data, but the market for data is mostly decentralised, just like the web. The market for data already exists.

Data marketplaces have failed before

Leigh Dodds shared his experience as product manager of a company that tried and failed to build a sustainable data marketplace.

He related the tale of multiple other organisations that have tried and failed, with only those focussed on specific sectors, such as social media data, living to tell the tale. The debate did not surface other examples of successful data marketplaces, and certainly not one that supported more than a single sector. A city supports multiple sectors.

The debate recognised that existing web publishing capabilities and search techniques do not always make it easy to find data in the market, whether it is published by governments, city authorities or businesses.

Laura Koesten, a PhD student based at the ODI, is researching this problem. It is a hard one. As Benedict Evans has observed, “All curation grows until it requires search. All search grows until it requires curation”.

There were some user needs that people thought weren’t being met by data portals

The debate discussed some user needs that existing open data portals may not be adequately meeting. I was unconvinced that a city data marketplace would help meet these needs any more than the existing market for data.

Even seemingly simple needs such as discovery are affected by a number of factors, including the literacy of the person searching for data and how the data was described when it was published.

More complex needs, such as sharing a common problem to get others to help fix it (whether the problem be in banking, housing, jobs or education), are harder still to meet. Meeting them requires a range of online and offline activities that can take years to complete. Whole new institutions might need to be built to fully address a problem.

There is a team at the UK’s Government Digital Service (GDS) that is researching the user needs for the data.gov.uk portal. Opening up that research and considering it alongside research on city data portals will help everyone learn more about what needs exist so we can design better services to meet them.

The role of government was unclear

The panel discussed the role of government. The views included government building and hosting a city data marketplace, using the marketplace to publish its own data, using the marketplace to buy data, encouraging the use of open standards, and recognising that the data government holds is held on behalf of society.

In general, the role of government in a data marketplace or a market for data was under-discussed. As the chair I take full responsibility as we ran out of time! I think this would have been the most important bit of the debate.

I believe we need to think harder about government’s role

Our governments have chosen to actively shape technology markets: for example by encouraging the uptake of open standards and open source. They are also being active by choosing to use and publish open data.

The UK government, like many around the world, recognises that “It is critical that businesses have the ability to create new and innovative products without being hampered by cost, by licensing conditions, or the inertia caused by uncertainty and doubt.”

But the plans in Copenhagen and the ones that Nesta have floated for London include a marketplace that helps people buy and sell data. Paid-for data often has a licence that restricts how you can use it: for example, some Surrey councils can’t publish planning applications as open data because of their data supplier. This reduces the value that people can create from the data. Governments and cities that are open-by-default should not be encouraging paid data models.

As Jeni Tennison recently said:

Using data to make a decision is like travelling on a series of roads. To get from point A to point B without open data is like stopping at toll booths at each road junction.
Some journeys you just wouldn’t want to make because they are too much of a pain. So some decisions you will not be able to make because that data is too difficult to access.

Perhaps people think it is necessary to pay to get the private sector to provide data, but many businesses, whether large enterprises or startups like Guru Systems and Open Sensors, are making the same choice as governments. So, as Yodit pointed out, are the customers of Open Sensors. They choose to publish some data as open data because it helps their businesses grow, solve problems and deliver services.

Good governments and cities will, where it is useful, use this open data to improve their services. Better ones will go further and encourage more open data.

Government could encourage mobile phone operators to publish the aggregated footfall data that they use for network planning, or credit card companies to publish the consumer spend data that they already collect and aggregate. Cities could choose to encourage taxi firms or, in the future, driverless car operators to publish aggregated open data such as traffic congestion or road maps. More open and collaborative mapping models can reduce costs for businesses and the public sector.

To take London as a specific example: the Mayor has responsibility for transport. Wouldn’t aggregated open data from private sector transport firms help a city meet the needs of citizens regardless of who owns the tube train, bus, black cab, car, or bicycle they happen to use? A more efficient transport market is better for the citizens that use it, the public and private sector firms that provide services in it, and the politicians that have democratic responsibility for it.

By encouraging a more open data infrastructure, governments and cities won’t just deliver more efficient public services; they will support innovation, transparency and accountability, help everyone get better services, and grow our economies in the process.

We need to think more about our market for data

I came away from the debate unconvinced by the idea of a data marketplace as it was described. I do not think that our market for data needs another centralised website owned by a single organisation. The web is at its best when it is as open and decentralised as possible; so is our data infrastructure.

The debate did help me deepen my thinking about the market for data and its importance for the future, though. If we are to strengthen our data infrastructure and make it as reliable and open as possible, so that it can help support innovation whilst respecting better principles for personal data usage, then we do need to improve that market and encourage it towards openness. Rather than building city data marketplaces, perhaps our cities should experiment and learn how to improve the market and their data infrastructure by getting open data out of more organisations and making it easier to discover.

I’d love to hear more thoughts on this topic and talk to other people thinking about and, ideally, helping build better markets for data and more open data infrastructures.

Drop me a note or leave a comment if you can help.