api.parliament.uk

Features, May 2018

Samu Lang
May 8, 2018 · 5 min read

Dereference API

The base URI for our resources is https://id.parliament.uk/.
The base URI for our ontology resources is https://id.parliament.uk/schema/.

Resources can be dereferenced over HTTP.

We 303 (See Other) redirect GET requests for known resources to a resource lookup endpoint on the query API, which returns a canonical representation based on a SPARQL DESCRIBE query.

We 301 (Moved Permanently) redirect incorrect HTTP resource URI requests to HTTPS as a courtesy, and we return an HSTS header for dereferenced URIs.

Developers are likely to navigate to the base URI of ontology resources (as opposed to the ontology URI) intending to find an OWL document, so we 302 (Found) redirect it to an ontology query endpoint which returns an OWL graph based on this SPARQL query.

Ontology resources (the ontology itself, its classes and properties) are dereferenced in the same way as instances.

We reply 404 (Not Found) to well-formed unknown resource URIs and 400 (Bad Request) to anything else.

The dereference API is implemented purely with XML-based declarative policies in Azure API Management (similar to Amazon API Gateway).
Its sole dependency is an endpoint on our query API that determines whether a resource exists using a SPARQL ASK query.
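
The redirect rules above can be sketched as a single decision function. This is an illustration in Python, not our Azure API Management policy: the endpoint URLs, the pattern for a well-formed path and the order of the checks are all assumptions, and `resource_exists` stands in for the SPARQL ASK endpoint.

```python
import re
from typing import Callable, Optional, Tuple
from urllib.parse import quote, urlsplit

HOST = "id.parliament.uk"
ONTOLOGY_BASE_PATH = "/schema/"

# Hypothetical query API endpoints; the real URLs are not given in this post.
LOOKUP_ENDPOINT = "https://query.example/resource?uri="
ONTOLOGY_ENDPOINT = "https://query.example/ontology"

# Assumption: a well-formed resource path is one alphanumeric segment,
# optionally under "schema/" for ontology resources.
WELL_FORMED = re.compile(r"^/(schema/)?[A-Za-z0-9]+$")


def dereference(
    url: str, resource_exists: Callable[[str], bool]
) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header or None) for a GET request."""
    parts = urlsplit(url)
    if parts.netloc != HOST or parts.scheme not in ("http", "https"):
        return 400, None
    if parts.scheme == "http":
        # Courtesy redirect from HTTP to HTTPS.
        return 301, f"https://{HOST}{parts.path}"
    if parts.path == ONTOLOGY_BASE_PATH:
        # Base URI of ontology resources: send to the OWL graph endpoint.
        return 302, ONTOLOGY_ENDPOINT
    if not WELL_FORMED.match(parts.path):
        return 400, None
    canonical = f"https://{HOST}{parts.path}"
    if resource_exists(canonical):
        # Known resource: send to the lookup endpoint on the query API.
        return 303, LOOKUP_ENDPOINT + quote(canonical, safe="")
    return 404, None
```

Passing the existence check in as a callable keeps the sketch free of network calls and mirrors the single dependency described above.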

Query API

The main consumer of our data service at the moment is Parliament’s new website. It consumes the data via an API whose endpoints correspond to parameterized SPARQL queries.
One might consider these SQL “views” or “stored procedures”.

For example, a person detail page uses data from a person_by_id endpoint, which runs this SPARQL query.
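
To make the "stored procedure" analogy concrete, here is a minimal sketch of a parameterized query in Python. The query text, the `render_person_by_id` helper and the identifier pattern are hypothetical; the real query API is a C# application and its queries may differ.

```python
import re
from string import Template

# Hypothetical query text, not the actual person_by_id query.
PERSON_BY_ID = Template("""\
CONSTRUCT { ?person ?predicate ?object }
WHERE {
  BIND(<https://id.parliament.uk/$person_id> AS ?person)
  ?person ?predicate ?object .
}""")

# Assumption: resource identifiers are plain alphanumeric strings.
ID_PATTERN = re.compile(r"[A-Za-z0-9]+")


def render_person_by_id(person_id: str) -> str:
    """Substitute a validated identifier into the query template."""
    if not ID_PATTERN.fullmatch(person_id):
        # Reject anything that could break out of the IRI and alter the query.
        raise ValueError(f"invalid person id: {person_id!r}")
    return PERSON_BY_ID.substitute(person_id=person_id)
```

Validating the parameter before substitution matters because the value ends up inside the query text, much like an unescaped value in SQL.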

View the source of pages on beta.parliament.uk to find alternate links pointing to the query API endpoints serving them.

This API is formally described by an OpenAPI document (formerly Swagger).

Most queries are SPARQL CONSTRUCTs, but we have some SELECTs (e.g. government positions, instances of a class), a DESCRIBE and an ASK.

Several of our endpoints are federated SPARQL queries.

A minority of endpoints cannot be expressed in SPARQL against our triplestore, and are implemented as C# methods that return RDF data.

The query API negotiates content via (1) standard Accept headers (including quality values), (2) “file extensions” or (3) the format query string parameter.

For graph endpoints (those with CONSTRUCT and DESCRIBE queries) we support the default N-Triples, JSON-LD, Turtle, RDF/XML, CSV, HTML, RDF/JSON, TSV, Dot and GraphML.

For dataset endpoints (SELECT and ASK) we support the default XML, JSON, HTML, CSV and TSV.
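
The three negotiation mechanisms could be combined roughly as follows. This Python sketch covers a handful of graph formats only. The precedence (format parameter, then extension, then Accept header), the extension names and the fallback to the default instead of a 406 response are assumptions, not a description of the real implementation.

```python
from typing import List, Optional, Tuple

# Assumed extension names; the media types are the registered ones.
GRAPH_FORMATS = {
    "nt": "application/n-triples",
    "ttl": "text/turtle",
    "jsonld": "application/ld+json",
    "rdf": "application/rdf+xml",
    "csv": "text/csv",
}
DEFAULT = "application/n-triples"


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media type, q) pairs, best first."""
    ranges = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranges.append((media_type, quality))
    # The sort is stable, so header order breaks ties between equal q values.
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def negotiate(
    path: str, query_format: Optional[str] = None, accept: Optional[str] = None
) -> str:
    """Pick a response media type for a graph endpoint."""
    if query_format in GRAPH_FORMATS:
        return GRAPH_FORMATS[query_format]
    _, dot, extension = path.rpartition(".")
    if dot and extension in GRAPH_FORMATS:
        return GRAPH_FORMATS[extension]
    supported = set(GRAPH_FORMATS.values())
    for media_type, quality in parse_accept(accept or ""):
        if quality <= 0:
            continue
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return DEFAULT
    return DEFAULT
```

For example, `negotiate("/person_by_id", accept="text/turtle;q=0.5, application/ld+json;q=0.9")` picks JSON-LD because of its higher quality value.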

The query API is implemented as an ASP.NET Web API application (similar to a Spring REST service in Java), running on an Azure App Service (similar to Amazon Elastic Beanstalk). It depends on our SPARQL endpoint.

The query API (like the others) relies heavily on an open source .NET RDF library: dotnetrdf.
Some of us have contributed to the project with bug reports and pull requests: improper implementation of default SPARQL prefixes, unexpected datetime coercion in JSON-LD parser, a new SKOS API feature, incorrect implementation of lists in RDF/XML.

Search API

The first feature to be replaced by the new website was search. The new feature is driven by our search API, an ASP.NET Web API application running on an Azure App Service.

The search API uses an external, web-wide search provider (Bing Custom Search, similar to Google Custom Search) to search the entirety of parliament.uk.

Michael and Robert have written extensively about the philosophy behind this approach, fixing search by fixing browse.

The search API exposes a standard OpenSearch interface which facilitates browser and operating system integration.
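
Browsers and operating systems discover an OpenSearch interface through a description document. As a rough illustration, the Python sketch below builds a minimal one. The short name, description, response type and URL template are placeholders, not the values our search API serves.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OpenSearch 1.1 specification.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def description_document(short_name: str, description: str, template: str) -> str:
    """Build a minimal OpenSearch description document as an XML string."""
    ET.register_namespace("", OPENSEARCH_NS)
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    # Clients replace {searchTerms} in the template with the user's query.
    ET.SubElement(
        root,
        f"{{{OPENSEARCH_NS}}}Url",
        {"type": "application/atom+xml", "template": template},
    )
    return ET.tostring(root, encoding="unicode")


# Placeholder values for illustration only.
document = description_document(
    "Example search",
    "Search an example site",
    "https://search.example/query?q={searchTerms}",
)
```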

A comprehensive list of 1000+ search hints (labels extracted from search result URLs by regular expressions) has been compiled but shelved.

OData API

Our OData API exposes all data on the platform over an ISO-approved, OASIS-standard REST API that speaks URL and JSON.

The OData API is implemented as an ASP.NET Web API application running in an Azure App Service. It translates OData queries into SPARQL queries (using a mapping layer based on our ontology) and runs them natively against our SPARQL endpoint.
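
To give a feel for this kind of translation, here is a heavily simplified Python sketch that turns an entity set, a `$select` list and a `$top` value into a SPARQL SELECT. It assumes that OData names match local names under the ontology base URI, which is a stand-in for the real mapping layer, and the `odata_to_sparql` function and the example names are hypothetical.

```python
import re
from typing import List, Optional

SCHEMA = "https://id.parliament.uk/schema/"

# Assumption: OData names are simple ASCII identifiers that are also
# valid local names in the ontology namespace.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def odata_to_sparql(
    entity_set: str, select: List[str], top: Optional[int] = None
) -> str:
    """Translate a tiny subset of OData into a SPARQL SELECT query."""
    for name in [entity_set, *select]:
        if not NAME_PATTERN.fullmatch(name):
            raise ValueError(f"invalid name: {name!r}")
    variables = " ".join(f"?{name}" for name in select)
    lines = [
        f"PREFIX : <{SCHEMA}>",
        f"SELECT ?s {variables}".rstrip(),
        "WHERE {",
        f"  ?s a :{entity_set} .",
    ]
    # Selected properties are optional so that sparse resources still match.
    lines += [f"  OPTIONAL {{ ?s :{name} ?{name} . }}" for name in select]
    lines.append("}")
    if top is not None:
        lines.append(f"LIMIT {int(top)}")
    return "\n".join(lines)
```

A request shaped like `Person?$select=personGivenName&$top=5` would then map to `odata_to_sparql("Person", ["personGivenName"], top=5)`.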

A service document lists the endpoints supported by the API.
A metadata endpoint describes the model used by the API.

Jianhan’s blog post gives an overview and technical details about the OData API.

Some sample calls:

Photo API

Images of member and peer portrait photos appear on multiple pages on the new website (member, party members, house members, constituency).
A media page lists common formats and crops.

Images are generated and served by our photo API (e.g. 1, 2, 3, 4, 5).

We store the centrepoint of each portrait image so we can crop it automatically.
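
The cropping arithmetic can be sketched as follows, assuming pixel coordinates with the origin at the top left. The function name and the clamping behaviour near the edges are illustrative choices, not a description of the photo API's code.

```python
from typing import Tuple


def crop_box(
    image_width: int,
    image_height: int,
    centre_x: int,
    centre_y: int,
    crop_width: int,
    crop_height: int,
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a crop around the centrepoint."""
    if crop_width > image_width or crop_height > image_height:
        raise ValueError("crop is larger than the image")
    # Centre the crop on the stored point, then shift it back inside the
    # image if it would otherwise overflow an edge.
    left = min(max(centre_x - crop_width // 2, 0), image_width - crop_width)
    top = min(max(centre_y - crop_height // 2, 0), image_height - crop_height)
    return left, top, left + crop_width, top + crop_height
```

Clamping keeps the requested crop size constant, at the cost of the centrepoint drifting off centre when it sits close to an edge.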

The image app embeds XMP metadata from the triple-store into the generated images (some of it can be viewed online).
XMP is RDF. Statements include location, license (Dublin Core, CC BY), attribution (IPTC) and linking to the subject of the photo (“This is an Image of something that is a Person”).
An XMP sidecar file endpoint is also available which shows the actual RDF/XML representing the image metadata.

Our query API has an index endpoint that runs this SPARQL query to return a list of all person images in a tabular format (HTML, CSV, XML, JSON, TSV).

Additional (non-technical) information about the portraits project is available on the blog.
There’s a cool project calculating the average face of a member of Parliament using these images.

Infrastructure

The platform is hosted on Microsoft Azure (a cloud hosting service similar to Amazon Web Services).

The API applications are ASP.NET Web API applications running on Azure App Services.

Our ETL (ingest, import) processes are driven by Logic Apps triggered by Scheduler. Transformation work is done by Functions (similar to AWS Lambda) fed from Service Bus.

The triple store is a cluster of Ontotext GraphDB (similar to Neo4j and MarkLogic) instances running on Azure virtual machines.

All of our infrastructure is code written by the development team. There is no external support or administration. Computing resources are deployed by VSTS (similar to Jenkins and TeamCity). All of our code is integrated continuously (CI) and most of it is deployed continuously (CD).

The architecture for the platform was independently accredited to meet the National Cyber Security Centre’s 14 Cloud Security Principles.

Visualization links

  • One class from the ontology visualized in 2D, 3D and in OWL (WebVOWL)
  • The ontology resource (2D, 3D, OWL)
  • Ontology reconstructed from instance data (2D, 3D, OWL)
  • Viewer friendly ontology (2D, 3D, OWL)
  • One instance (2D, 3D, OWL)
  • One instance with schema (2D, 3D, OWL)
  • A shaped query (2D, 3D, OWL)
  • A shaped query with schema (2D, 3D, OWL)

Assorted links

Some things we know about constituencies

Query API

OData

People

Website by Jamie Tetlow
Backend by Matt Rayner and team
Frontend by Usman Azfal and team

Analysis by Liz Thomas and Sara Reis
Vocabulary by Anya Somerville
Search by Robert Brook
Domain model by Michael Smethurst

Built by Chris Alcock, Matthieu Bosquet, Alex Howes, Raphael Leung, Mike Marcus, Kunal Patel, Wojciech Stawiarski and Jianhan Zhu

Directed by Samu Lang

Written by

Samu Lang

Technical Director building a new open linked data platform for UK Parliament
