Open Data Lost Ground in 2015: An Analysis of the 2015 OKFN Open Data Index

Alice Corona
Published in SILK STORIES
Dec 24, 2015

The Open Knowledge Foundation just published the 2015 edition of its Open Data Index. The Index provides “the most comprehensive snapshot available of the global state of open data”. (Notes on methodology are at the bottom of the article.)

Using Silk’s data publishing platform and OKFN’s dataset, I’ve made a resource that lets you explore patterns and get fresh insights into the current state of government open data. Some interesting points? Taiwan ranks as the best open data country around. Most governments are happy to make data public, but they fail to apply open licenses and implement machine-readability. Government spending and land ownership datasets, two of the most critical data types controlled by governments, are typically hard to find and difficult to access. Here are the details.
Note: Silk has been discontinued as of Dec. 15th 2017 so links are broken and visualizations are static. Will replace them asap.

Score of assessed datasets per country (out of 100). Filtered for Americas and for Land Ownership Datasets

Taiwan ranks first as the most open data-friendly country. UK, Denmark, and Colombia follow

With 78 points (out of a possible 100), Taiwan wins the medal of most open country. Taiwan jumped ten positions from 2014 and has jumped 35 since 2013. The US ranks 8th, tied with the Netherlands.

None of the top 15 ranking countries is in Middle East/North Africa. The “most open” country in the region, Oman, comes in only at 66th place (out of 122).

Top 15 countries

UK, Germany, and Nigeria lost the most points compared with last year’s evaluation.

The United Kingdom, the second most open country in the rankings, actually scored 21 points fewer in 2015 than it did last year. That’s the largest drop on the list. Scores for Germany and Nigeria both fell by 20 points. This doesn’t necessarily mean that the countries ranked much lower. For example, the UK lost only one position. Most countries actually saw a poorer score this year than in the previous year. In fact, 55 of the 80 countries that were assessed in both 2014 and 2015 scored lower this year. That is almost 70% of them. In other words, open data is actually losing ground.
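For anyone who wants to redo this comparison on the raw scores, here is a minimal Python sketch. The function name and the dictionary layout (country name mapped to score) are my own assumptions, not the structure of OKFN’s dataset; the only figures taken from the article are 55 and 80.

```python
def share_with_lower_score(scores_2014, scores_2015):
    """Compare only countries assessed in both years.

    Both arguments are assumed to be dicts mapping country name -> score.
    Returns (number that scored lower, number compared, share that scored lower).
    """
    common = set(scores_2014) & set(scores_2015)
    lower = [c for c in common if scores_2015[c] < scores_2014[c]]
    return len(lower), len(common), len(lower) / len(common)


# Check of the share quoted above, using the article's own counts.
countries_compared = 80
countries_lower = 55
share = countries_lower / countries_compared
print(f"{share:.2%}")  # 68.75%
```

Restricting the comparison to countries present in both years matters: countries assessed for the first time in 2015 have no 2014 score to fall from.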

Country’s difference in 2015 score vs. 2014 score

Interactive: Find the Least and Most Open Datasets for Your Country

Adjust the filters in the following chart to find which datasets score best and worst in your country.

The Biggest Obstacles to Publishing Open Data: Machine-Readability and Open Licenses

More than half of the 1,586 datasets assessed this year are public and are available online. This shows governments’ intention towards openness.

That said, only a few datasets manage to pass an even more critical test — machine readability. This is important because cleaning up data that is not machine readable is incredibly time consuming and expensive. Another key failing of most published open data is that it lacks a so-called “open license”. In fact, only 24% of the 2015 datasets are machine-readable and only 12% are released with an open license.

Click on the image for the interactive version

Extracting Data from PDFs is Terribly Complicated. Yet the PDF is the #3 Favorite Format for Publishing Government Data

PDF stands for Portable Document Format. It is the default way to publish static information in a portable, downloadable format. The trouble is, PDFs are like roach motels. Data goes in but it’s nearly impossible to get it out. Converting tables or, even worse, unstructured data trapped in PDFs to machine-readable information requires tools, skills and time. This serves to reduce the utility of that data for professionals, let alone for the general public. All told, 106 key datasets are published in this format (although some are simultaneously published in other formats).
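To make that cost concrete, here is a minimal Python sketch contrasting the two cases. The sample rows, column names and figures are invented for illustration and do not come from the Index. The point is only that a CSV file parses with one standard-library call, while text copied out of a PDF table needs hand-written, fragile splitting rules.

```python
import csv
import io
import re

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = "ministry,amount\nHealth,1200\nPublic Works,850\n"

# The same table as it might look after text extraction from a PDF:
# columns are separated by a variable number of spaces.
PDF_TEXT = (
    "Ministry          Amount\n"
    "Health            1200\n"
    "Public Works       850\n"
)


def rows_from_csv(text):
    """Machine-readable case: the csv module recovers the structure."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["ministry"], int(row["amount"])) for row in reader]


def rows_from_pdf_text(text):
    """PDF case: guess that the last run of digits on each line is the amount."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        match = re.match(r"^(.*?)\s+(\d+)\s*$", line)
        if match:  # lines that do not fit the guess are silently dropped
            rows.append((match.group(1).strip(), int(match.group(2))))
    return rows
```

Both functions return the same rows for this tiny sample, but the second one only works as long as every line fits the guess. Thousands separators, cells that wrap onto a second line, or a second numeric column would all break it, and each new PDF layout needs its own rules.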

Number of 2015 datasets published in each format

What are the Least Open Datasets? Probably those on Government Spending, Land Ownership, and Water Quality

Only three countries out of 122 assessed by the OKFN had a dataset on government spending that scored at least 70 out of 100 in openness: Greece, UK, and Brazil. This makes government spending the hardest dataset to access as open data. That’s ironic because government spending is perhaps the most important type of open data for citizens, journalists, academics and even regulators who want to monitor government spending and activities. Interestingly, countries are far more likely to publish government budget data, which does not show in line-item detail how the money is actually spent. Government budget data is the second most open data category, according to OKFN, after national statistics.

Governments are also reluctant to release open data on land ownership. Only Denmark, Uruguay, Georgia and Jamaica scored 70 points or higher.

Number of 2015 datasets with a score over 70 out of 100

Datasets with a score over 70 out of 100. Filtered for Government Spending Datasets

The Most Active Submitters of the Community: Mor Rubinstein, Bruce Hoo Fung and Tryggvi Björgvinsson

Top 10 submitters for number of datasets submitted

Number of datasets reviewed by each reviewer. Filtered for Italy

Interactive: Find Links to the Datasets You Can/Can’t Access in Your Country

Adjust the filters in the following chart to find which datasets score best and worst in your country. Consider lobbying for closed or uncollected key datasets in your country!

About the Open Data Index

The Open Knowledge Foundation just published the 2015 edition of its Open Data Index. The Index provides “the most comprehensive snapshot available of the global state of open data”.

The Index is compiled from crowdsourced information: volunteers submit datasets for each country, which are then reviewed. Each country is evaluated on the release of 13 different key datasets.

(Note: The Index assesses governments’ openness in the publication of their datasets. It “does not look at the broader societal context — for example the legal or policy framework, (FOI, etc.) — and it also does not seek to assess use or impact in a systematic way. Lastly, it does not assess the quality of the data.”)

Datasets Assessed in 2015: Company Register; Election Results; Government Budget; Government Spending; Land Ownership; Legislation; Location Datasets; National Map; National Statistics; Pollutant Emissions; Procurement Tenders; Water Quality; Weather Forecast

Each of these datasets is then assigned a score from 0 to 100, based on the following criteria:

  • Does this dataset exist in the country (5 points)?
  • Is it public (5 points)?
  • Is it in digital format (5 points), available online (5 points) in bulk downloads (10 points)?
  • Is the data free (15 points) and timely updated (10 points)?
  • Is it machine readable (15 points) and released under an open license (30 points)?
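The point values above can be turned into a small scoring sketch in Python. The key names are my own shorthand for the questions, and I am assuming a plain additive score where every “yes” earns its points independently; the Index itself may apply dependencies between questions that this sketch ignores.

```python
# Points per criterion, as listed above (they sum to 100).
# The key names are shorthand chosen for this sketch, not OKFN identifiers.
CRITERIA_POINTS = {
    "exists": 5,
    "public": 5,
    "digital": 5,
    "online": 5,
    "bulk": 10,
    "free": 15,
    "up_to_date": 10,
    "machine_readable": 15,
    "open_license": 30,
}


def dataset_score(answers):
    """Sum the points of every criterion answered True; missing keys count as False."""
    return sum(
        points
        for name, points in CRITERIA_POINTS.items()
        if answers.get(name, False)
    )


# A dataset that meets everything except machine readability and open licensing.
answers = {name: True for name in CRITERIA_POINTS}
answers["machine_readable"] = False
answers["open_license"] = False
print(dataset_score(answers))  # 55
```

Machine readability and open licensing together account for 45 of the 100 points, so a dataset that is public, online, free and up to date but misses both cannot score higher than 55.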
