What Can the Semantic Web Do for GraphQL?

Or: GraphQL + URIs = global knowledge graphs

Szymon Klarman
Dec 17, 2019 · 7 min read
Source: https://www.skyandtelescope.com/interactive-sky-chart/

What if GraphQL resources were annotated with URIs — global (Semantic Web / linked data) identifiers denoting concepts from shared vocabularies, such as schema.org or other dedicated ontologies?

Say, for instance, we associate this fragment of a GraphQL schema:

type Person {
  name: String
  children: [Person]
}

with the following mapping to schema.org URIs:

"Person": "http://schema.org/Person"
"name": "http://schema.org/name"
"children": "http://schema.org/children"

Such an association states that the type Person within this specific GraphQL service denotes the concept of a person as defined at http://schema.org/Person, that the field name denotes the property http://schema.org/name, and that the field children denotes the property http://schema.org/children.

The availability of such mappings, analogous to the so-called context in JSON-LD objects (“LD” stands for linked data), could help lift separate, local GraphQL services, and the data they expose, into a globally shared semantic space, whether in public or closed data environments.

This post is intended to make this precise claim:

GraphQL + URIs = global knowledge graphs


GraphQL and the Semantic Web share some key aspects stemming from their essentially overlapping motivation, namely: organizing distributed data resources into connected graph structures, with the semantics provided via machine-accessible metadata models — (Semantic Web) ontologies or (GraphQL) schemas. And, yes, the focus of GraphQL is in principle much more “operational”, being set on how to efficiently fetch and bring data together from different sources. Meanwhile, the gist of the Semantic Web vision is strongly “declarative” — concerned with how to properly describe and publish connected data in distributed environments, in order to facilitate its meaningful consumption by automated agents. Still, the commonalities between the two technologies prevail and have been apparent from quite early on.

As a result, a number of analyses and technical proposals have been contributed in the recent past, suggesting ways of bridging the two territories (see, e.g., [1]-[9]). The common perspective dominating these efforts so far has, quite unsurprisingly, been that of the Semantic Web, which considers GraphQL mostly as a convenient, alternative query interface for accessing linked data. That is obviously a justified and valid strategy, which follows a broadly emerging trend of endorsing GraphQL as a universal database access layer [10]. In this specific case, the use of GraphQL wrappers additionally promises to lower the notoriously high entry threshold for mainstream developers who are unacquainted with RDF, SPARQL and related W3C standards but happy to work with GraphQL.

This, however, is not the only possible perspective, and arguably not the one with the greatest potential.

A converse approach, likewise of a highly pragmatic appeal, would be to let GraphQL benefit from some core Semantic Web mechanisms and existing resources in order to enhance its own data integration and publishing capabilities.

Consider again the example above. By merely adding that global semantic context (i.e., the mapping from the GraphQL schema to URIs) two correlated goals are achieved:

  1. the meaning of the schema resources is disambiguated and made interpretable by third-party machine agents, such as search engines that “speak” schema.org. Note that the inherent semantics of GraphQL schemas, accessible via so-called introspection queries, is by default local and limited to the confines of that single service;
  2. external semantic leverage is provided for federating schemas, queries and data across different GraphQL services, making it possible to connect their resources into larger, coherent knowledge graphs, simply by reusing existing linked data resources.

For example, another GraphQL service containing a seemingly different type to Person, namely:

type Human {
  fullName: String
  parentOf: [Human]
}

could still be seamlessly aligned with the previous one, given the context:

"Human": "http://schema.org/Person"
"fullName": "http://schema.org/name"
"parentOf": "http://schema.org/children"

Based on both mappings, one can determine that data about people can be found in both services, in one classified under the Person type and in the other under Human.
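The alignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two dictionaries copy the mappings shown earlier, while the names `SERVICE_A`, `SERVICE_B` and `align` are hypothetical and not part of any GraphQL or linked data tooling.

```python
# Contexts of the two services, copied from the mappings above.
# The variable names are hypothetical.
SERVICE_A = {
    "Person": "http://schema.org/Person",
    "name": "http://schema.org/name",
    "children": "http://schema.org/children",
}
SERVICE_B = {
    "Human": "http://schema.org/Person",
    "fullName": "http://schema.org/name",
    "parentOf": "http://schema.org/children",
}


def align(source: dict[str, str], target: dict[str, str]) -> dict[str, str]:
    """Map each term of `source` to the term of `target` sharing its URI."""
    # Invert the target context: URI -> local term name.
    by_uri = {uri: term for term, uri in target.items()}
    return {
        term: by_uri[uri]
        for term, uri in source.items()
        if uri in by_uri  # terms without a counterpart are skipped
    }


alignment = align(SERVICE_A, SERVICE_B)
print(alignment)
```

Because the lookup goes through the shared URIs only, neither service needs to know the other's local naming; a term that has no counterpart in the other context is simply left out of the result.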

But how invasive would such an addition be from the perspective of GraphQL? Not at all. The philosophy we suggest here could be summed up as:

“pure GraphQL inside” and (optional) “linked data outside”

with:

  • no extensions to existing GraphQL tooling required;
  • no interference with the GraphQL service development practices;
  • complete agnosticism regarding the back-end data sources or storage technology.

The linked data layer is only an external add-on here and can be engaged only when needed, to solve the problems it is really intended for, i.e., reconciling semantic connections between different data resources in distributed environments.

Schema.org — GraphQL — JSON-LD

As a final takeaway, let us serve a slightly bigger example generated with two live playgrounds: the Staple API playground and the JSON-LD playground.

The former is a simple Node.js API that automatically spins up a GraphQL service from ontologies expressed in the schema.org data model, by casting their structure and semantics into GraphQL schemas. The latter converts JSON-LD objects, i.e., plain JSON objects augmented with a semantic context, between different, semantically equivalent formats, including RDF.

The example shows how arbitrary data served with a GraphQL service can be virtualized as linked data and connected into a knowledge graph, grounded in the global semantic concepts of schema.org, given only a mapping of GraphQL schema to the corresponding URIs.

We consider a schema fragment:

type Query {
  Person: [Person]
}

type Person {
  _id: ID!
  _type: [String]
  name: Text
  children: [Person]
  parent: [Person]
}

type Text {
  _value: String!
}

and a query fetching objects of type Person with their names, children and parents:

{
  Person {
    _id
    _type
    name {
      _value
    }
    children {
      _id
    }
    parent {
      _id
    }
  }
}

As (a part of) the response to that query we might get a JSON object of the following shape:

{
  "data": {
    "Person": [
      {
        "_id": "henry",
        "_type": ["Person"],
        "name": { "_value": "Henry Borrow" },
        "children": [],
        "parent": [{ "_id": "richard" }]
      },
      {
        "_id": "margaret",
        "_type": ["Person"],
        "name": { "_value": "Margaret Moneypenny" },
        "children": [],
        "parent": [{ "_id": "richard" }]
      },
      {
        "_id": "richard",
        "_type": ["Person"],
        "name": { "_value": "Richard Borrow" },
        "children": [{ "_id": "henry" }, { "_id": "margaret" }],
        "parent": []
      }
    ]
  }
}

The use of the schema.org-driven data structures is incidental here. What matters is just the GraphQL schema and its mapping to URIs, which takes the form:

{
  "Thing": "http://schema.org/Thing",
  "Person": "http://schema.org/Person",
  "name": "http://schema.org/name",
  "children": "http://schema.org/children",
  "parent": "http://schema.org/parent"
}

with just a little extra JSON-LD sugar on top:

{
  "_id": "@id",
  "_value": "@value",
  "_type": "@type",
  "@base": "http://example.com"
}

Such a mapping is maintained by the service and is accessible via the _CONTEXT query. Let us then combine the above pieces into a single JSON-LD object of the form below and feed it into the JSON-LD playground:

{
  "@context": {
    "_id": "@id",
    "_value": "@value",
    "_type": "@type",
    "Thing": "http://schema.org/Thing",
    "Person": "http://schema.org/Person",
    "name": "http://schema.org/name",
    "children": "http://schema.org/children",
    "parent": "http://schema.org/parent",
    "@base": "http://example.com"
  },
  "@graph": [
    {
      "_id": "henry",
      "_type": ["Person"],
      "name": { "_value": "Henry Borrow" },
      "children": [],
      "parent": [{ "_id": "richard" }]
    },
    {
      "_id": "margaret",
      "_type": ["Person"],
      "name": { "_value": "Margaret Moneypenny" },
      "children": [],
      "parent": [{ "_id": "richard" }]
    },
    {
      "_id": "richard",
      "_type": ["Person"],
      "name": { "_value": "Richard Borrow" },
      "children": [{ "_id": "henry" }, { "_id": "margaret" }],
      "parent": []
    }
  ]
}
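The combination step itself is mechanical and can be sketched in Python as follows. This is a minimal sketch, not the playground's or the service's code: the function name `to_jsonld` and its `root_field` parameter are hypothetical, and the response is abbreviated to a single object.

```python
import json


def to_jsonld(response: dict, context: dict, root_field: str) -> dict:
    """Wrap the objects returned under `root_field` in a JSON-LD document."""
    return {
        "@context": context,
        "@graph": response["data"][root_field],
    }


# Context as listed above: schema.org mapping plus JSON-LD keyword aliases.
context = {
    "_id": "@id",
    "_value": "@value",
    "_type": "@type",
    "Person": "http://schema.org/Person",
    "name": "http://schema.org/name",
    "children": "http://schema.org/children",
    "parent": "http://schema.org/parent",
    "@base": "http://example.com",
}

# Abbreviated GraphQL response containing a single object.
response = {
    "data": {
        "Person": [
            {
                "_id": "henry",
                "_type": ["Person"],
                "name": {"_value": "Henry Borrow"},
                "children": [],
                "parent": [{"_id": "richard"}],
            }
        ]
    }
}

document = to_jsonld(response, context, "Person")
print(json.dumps(document, indent=2))
```

Note that the data objects are passed through untouched: all of the semantics is carried by the context, which is why the GraphQL service itself does not need to change.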

In the JSON-LD playground, the object is translated into RDF and then serialized in the Turtle syntax, yielding the final RDF dataset: a knowledge graph with semantics grounded in global URIs:

@prefix schema: <http://schema.org/> .

<http://example.com/henry>
  schema:name "Henry Borrow" ;
  schema:parent <http://example.com/richard> ;
  a schema:Person .

<http://example.com/margaret>
  schema:name "Margaret Moneypenny" ;
  schema:parent <http://example.com/richard> ;
  a schema:Person .

<http://example.com/richard>
  schema:children <http://example.com/henry>, <http://example.com/margaret> ;
  schema:name "Richard Borrow" ;
  a schema:Person .
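To make the translation step more tangible, here is a minimal Python sketch that derives N-Triples-style statements from a JSON-LD object of exactly the shape used in this example. It is not a JSON-LD processor and not what the playground runs: the names `resolve` and `to_ntriples` are hypothetical, base resolution is simplified to string concatenation, and only string values and `_id` references are handled.

```python
import json

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def resolve(base: str, ref: str) -> str:
    """Simplified IRI resolution: absolute IRIs pass through unchanged."""
    if "://" in ref:
        return ref
    return base.rstrip("/") + "/" + ref.lstrip("/")


def to_ntriples(document: dict) -> list[str]:
    """Emit one N-Triples-style line per statement in `document`."""
    context = document["@context"]
    base = context["@base"]
    # Find the local aliases of the JSON-LD keywords, e.g. "@id" -> "_id".
    alias = {kw: term for term, kw in context.items() if kw.startswith("@")}
    id_key, type_key, value_key = alias["@id"], alias["@type"], alias["@value"]

    lines = []
    for node in document["@graph"]:
        subject = f"<{resolve(base, node[id_key])}>"
        for key, value in node.items():
            if key == id_key:
                continue
            items = value if isinstance(value, list) else [value]
            if key == type_key:
                # Type names are looked up in the context like any other term.
                for name in items:
                    lines.append(f"{subject} <{RDF_TYPE}> <{context[name]}> .")
                continue
            predicate = f"<{context[key]}>"
            for item in items:
                if isinstance(item, dict) and id_key in item:
                    obj = f"<{resolve(base, item[id_key])}>"  # link to a node
                else:
                    raw = item[value_key] if isinstance(item, dict) else item
                    obj = json.dumps(str(raw))  # quoted string literal
                lines.append(f"{subject} {predicate} {obj} .")
    return lines


document = {
    "@context": {
        "_id": "@id",
        "_value": "@value",
        "_type": "@type",
        "Person": "http://schema.org/Person",
        "name": "http://schema.org/name",
        "children": "http://schema.org/children",
        "parent": "http://schema.org/parent",
        "@base": "http://example.com",
    },
    "@graph": [
        {
            "_id": "henry",
            "_type": ["Person"],
            "name": {"_value": "Henry Borrow"},
            "children": [],
            "parent": [{"_id": "richard"}],
        }
    ],
}

for line in to_ntriples(document):
    print(line)
```

Empty lists such as `"children": []` produce no statements, which matches the Turtle listing, where Henry and Margaret have no `schema:children` entries. For real data, a full JSON-LD library is the safer choice.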

The resulting graph can be easily visualized in a triple store, such as GraphDB.


The adoption rate of GraphQL as a framework of choice for building structured data APIs, and the momentum behind its open-source tooling ecosystem, are impressive by many standards. The power of data modelling, integration, and serving, placed in the hands of mainstream developers, can have a highly transformative impact, not only on the way modern Web applications are built but, more broadly, on the way data is shared, published and consumed in diverse distributed information spaces. The Semantic Web, with its primary value proposition as Web-based data integration middleware, can greatly contribute to that transformation if used sensibly, symbiotically and without imposing its technical overhead on the engineering practices of GraphQL developers.

So here’s the message from the Semantic Web to GraphQL: put your services in context to help others understand your data better and relate it across your endpoints, across different applications, and further across the Web.


Thanks are due to Anna Konieczna, Artur Haczek and Peter Dresslar for comments and feedback on this post. And furthermore, to Artur Haczek for his lead role in the implementation of the current version of the Staple API playground.


Written by Szymon Klarman

knowledge representation and reasoning | linked open data | AI | logic (http://epistemik.co)
