Well-structured political data for the whole world: impossible utopia, or Wikidata at its best?

The challenges of getting well-structured political data into the commons via Wikidata, and how we are trying to overcome them.

--

This is a belated post accompanying a talk at WikidataCon 2017 and providing some of the key links to the things mentioned in the talk.

While the title says this post is about political data, much of it applies to other domains on Wikidata. I hope it will be useful to people tackling data completeness and data consistency in other topic areas.

Firstly — a link to the talk itself:

And a link to the slides on Wikimedia Commons:

Overview of the slides

Quick summary:

  • We need to find ways to make it easier for domain experts who are not Wikidata experts to edit the data. Listeria combined with wd_edit may be useful for this, as together they allow users to edit simple tables directly in context.
  • A common challenge for users and editors of Wikidata is how to approach data quality, completeness and accuracy. Prompts offer one approach by allowing people to easily compare the output of a SPARQL query with a reference CSV file.
  • Some power-user tools for Wikidata don’t work for certain types of political data. For example, QuickStatements is useful for updating data in bulk, but it doesn’t handle a very common scenario in politics: a person holding the same position more than once. The PositionStatements bot performs a very similar role to QuickStatements, except that it can add a position even if the person has held it before.
  • The PositionHolderHistory bot makes inconsistencies, overlaps, gaps and missing links in a chain visible. See an example for Estonian Prime Ministers.
  • Looking for a simple task to get involved? If your country uses a unique identifier to refer to individual politicians in databases, make sure there is a property for that identifier in Wikidata so that the records can be matched up to external data sources.
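
To make the idea of comparing query output with a reference CSV file more concrete, here is a minimal sketch in Python. It is not how Prompts itself is implemented. The function name `compare_with_reference`, the column names (`id`, `name`, `start`) and the sample records are all made up for illustration, and the query output is assumed to have already been fetched and flattened into a list of dictionaries.

```python
import csv
import io


def compare_with_reference(query_rows, reference_csv, key="id", fields=("name",)):
    """Compare query output with a reference CSV, matching rows on a shared key.

    Hypothetical helper: `query_rows` is a list of dicts (for example, flattened
    SPARQL results) and `reference_csv` is the text of the reference CSV file.
    """
    reference = {row[key]: row for row in csv.DictReader(io.StringIO(reference_csv))}
    found = {row[key]: row for row in query_rows}

    # Keys present in the reference file but absent from the query output.
    missing = sorted(set(reference) - set(found))
    # Keys present in the query output but absent from the reference file.
    extra = sorted(set(found) - set(reference))

    # For keys present on both sides, report every field whose values differ.
    mismatched = []
    for k in sorted(set(reference) & set(found)):
        for field in fields:
            expected = reference[k].get(field, "")
            actual = found[k].get(field, "")
            if expected != actual:
                mismatched.append((k, field, expected, actual))

    return {"missing": missing, "extra": extra, "mismatched": mismatched}


# Made-up sample data, purely for illustration.
REFERENCE_CSV = """id,name,start
A1,Alice Example,2015-03-01
B2,Bob Example,2016-04-09
C3,Carol Example,2016-11-23
"""

QUERY_ROWS = [
    {"id": "A1", "name": "Alice Example", "start": "2015-03-01"},
    {"id": "B2", "name": "Bob Example", "start": "2016-04-10"},
    {"id": "D4", "name": "Dan Example", "start": "2017-01-01"},
]

report = compare_with_reference(QUERY_ROWS, REFERENCE_CSV, fields=("name", "start"))
print(report)
```

With this sample data the report lists `C3` as missing from the query output, `D4` as extra, and a differing start date for `B2`. Matching on a shared identifier rather than on names is also why the last point above matters: without a property for that identifier in Wikidata, there is no reliable key to join on.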

Other links:

--

Lucy Chambers
mySociety.org

Plain language talker on tech. Twitter = ⚡️reactions.