On the Mutually Beneficial Nature of DBpedia and Wikidata


DBpedia and Wikidata are two related and similar, but still very different, Linked Data projects, both built around Wikipedia.

DBpedia’s focus is on generating Linked Open Data from Wikipedia documents.

Wikidata’s focus is on creating Linked Open (meta)Data to supplement Wikipedia documents.

Both projects provide access to their respective Linked Open Data via SPARQL Query Service endpoints.

Best of all, DBpedia and Wikidata are extremely complementary — filling in different gaps — when it comes to collecting a broad range of information about topics of interest.

Here, I offer a simple demonstration of the mutually beneficial nature of these two projects, through a Federated SPARQL Query that will combine data from both of these rich data sources.

Demo Tools

What is a SPARQL Query Service Endpoint?

A SPARQL Query Service Endpoint is an HTTP-based access point (i.e., a Web Service) to which SPARQL Queries are dispatched in order to perform data definition or data manipulation operations against an underlying relational database. "Relational" is meant here in the broad sense: behind a SPARQL endpoint, the database holds relations (entity relationship types) represented as a collection of RDF Language statements, rather than as Records in a Table, as would be the case behind an SQL endpoint.
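To make "dispatched over HTTP" concrete, here is a minimal Python sketch of how a client might package a query for an endpoint. It assumes the common "query via GET" style of the SPARQL Protocol, where the query text travels in a URL parameter named query. The helper name build_sparql_request and the tiny sample query are my own illustrations; the request is only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query,
                         accept="application/sparql-results+json"):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query.

    Hypothetical helper for illustration: the query text is URL-encoded
    into the 'query' parameter, and the Accept header names the result
    format the client would like back.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept}, method="GET")


# Illustrative query only; any SPARQL SELECT would be packaged the same way.
request = build_sparql_request(
    "http://query.wikidata.org/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
)

print(request.get_method())
print(request.full_url)
print(request.get_header("Accept"))
# Sending it would be a call to urllib.request.urlopen(request).
```

Pointing the same helper at http://dbpedia.org/sparql changes nothing but the endpoint argument, which is the point: the protocol is the same whichever store sits behind it.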

Database Management Systems (DBMS) and other Stores that support the functionality described above are generally referred to as supporting the SPARQL Query Language, Protocol, and Results Serialization formats (HTML Tables, CSV, JSON, XML, RDF-Turtle, RDF-XML, JSON-LD, HTML+Microdata, HTML+JSON-LD, HTML+Turtle, HTML+RDF+XML, etc.).
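As an example of one of these serialization formats, the sketch below flattens a SPARQL JSON results document into plain rows. The document shape (head.vars plus results.bindings, each binding carrying a type and a value) follows the SPARQL JSON results format; the sample payload, its example.org identifiers, and the function name rows_from_sparql_json are made up for illustration and are not real query output.

```python
import json

# Hand-written sample in the SPARQL JSON results shape (not real endpoint output).
SAMPLE = """
{
  "head": {"vars": ["item", "itemLabel"]},
  "results": {"bindings": [
    {"item": {"type": "uri", "value": "http://example.org/entity/E1"},
     "itemLabel": {"type": "literal", "xml:lang": "en", "value": "Example object"}},
    {"item": {"type": "uri", "value": "http://example.org/entity/E2"}}
  ]}
}
"""


def rows_from_sparql_json(text):
    """Turn a SPARQL JSON results document into a list of {variable: value} rows.

    Variables left unbound in a solution (e.g. from an OPTIONAL pattern)
    are reported as None.
    """
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    return [
        {var: (binding[var]["value"] if var in binding else None)
         for var in variables}
        for binding in doc["results"]["bindings"]
    ]


rows = rows_from_sparql_json(SAMPLE)
for row in rows:
    print(row)
```

The second sample row has no itemLabel binding, which is how an unmatched OPTIONAL pattern typically shows up in this format.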

Example Scenario

Recently, I stumbled across a pretty cool page about Galaxies that indicated its data was sourced from Wikidata. Knowing that related data is also found in DBpedia, I decided to build a query that would look up relevant DBpedia data en route to creating a broader view.

Wikidata SPARQL Query

#defaultView:ImageGrid
# Items in the Messier Catalog
SELECT DISTINCT ?item
                ?itemLabel
                ?numero
                ( SAMPLE(?pic) AS ?picture )
WHERE
  {
    ?item             p:P528   ?catalogStatement .
    ?catalogStatement ps:P528  ?numero .
    ?catalogStatement pq:P972  wd:Q14530 .
    OPTIONAL { ?item wdt:P18 ?pic } .
    SERVICE wikibase:label
      { bd:serviceParam wikibase:language "en" }
  }
GROUP BY ?item ?itemLabel ?numero
ORDER BY ?numero

Here’s the Live Query Definition Link running on Wikidata, and a Federated SPARQL Query variant on URIBurner.

Federated SPARQL Query, incorporating data from both DBpedia & Wikidata

PREFIX wd:       <http://www.wikidata.org/entity/>
PREFIX wdt:      <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p:        <http://www.wikidata.org/prop/>
PREFIX ps:       <http://www.wikidata.org/prop/statement/>
PREFIX pq:       <http://www.wikidata.org/prop/qualifier/>
PREFIX bd:       <http://www.bigdata.com/rdf#>
PREFIX owl:      <http://www.w3.org/2002/07/owl#>
PREFIX rdfs:     <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf:     <http://xmlns.com/foaf/0.1/>
PREFIX dct:      <http://purl.org/dc/terms/>
PREFIX xsd:      <http://www.w3.org/2001/XMLSchema#>

SELECT DISTINCT ( ?dbpediaID AS ?href )
                ( xsd:string(?label) AS ?name )
                ?description
                ?subjectText
                ( ?item AS ?wikidataID )
                ?dbpediaID
                ?image
                ?picture
WHERE
  {
    # Messier Catalog items, with one sample picture each, from Wikidata
    SERVICE <http://query.wikidata.org/sparql>
      {
        SELECT DISTINCT ?item
                        ?itemLabel
                        ?numero
                        ( SAMPLE(?pic) AS ?picture )
        WHERE
          {
            ?item             p:P528   ?catalogStatement .
            ?catalogStatement ps:P528  ?numero .
            ?catalogStatement pq:P972  wd:Q14530 .
            OPTIONAL { ?item wdt:P18 ?pic } .
            SERVICE wikibase:label
              { bd:serviceParam wikibase:language "en" }
          }
        GROUP BY ?item ?itemLabel ?numero
        ORDER BY ?numero
      }

    # Matching DBpedia entities, linked to the Wikidata items via owl:sameAs
    SERVICE <http://dbpedia.org/sparql>
      {
        SELECT ?item
               ?dbpediaID
               ?label
               ?image
               ?description
               ?subjectText
        # FROM inside a subquery is accepted by Virtuoso; it is not standard SPARQL 1.1
        FROM <http://dbpedia.org>
        WHERE
          {
            ?dbpediaID owl:sameAs     ?item ;
                       rdfs:label     ?label ;
                       foaf:depiction ?image ;
                       rdfs:comment   ?description ;
                       dct:subject    [ rdfs:label ?subjectText ] .
            FILTER ( LANG(?label) = "en" )
            FILTER ( LANG(?description) = "en" )
          }
      }
  }

Here’s the Live Query Definition Link (whitespace adjusted from the above, to accommodate limits of Medium.com’s link redirector in some browsers).
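Conceptually, the federated query asks each SERVICE for its own set of solutions and then joins them on the variable the two blocks share, ?item. The Python sketch below mimics that join on two small hand-made result sets. The rows, the example.org identifiers, and the helper name join_on are illustrative assumptions, and a real query engine is free to evaluate the federation differently (for instance, by pushing bindings from one service into the other).

```python
def join_on(left_rows, right_rows, key):
    """Inner-join two lists of solution rows (dicts) on a shared variable."""
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    joined = []
    for left in left_rows:
        # Rows with no partner on the other side are dropped (inner join).
        for right in index.get(left[key], []):
            joined.append({**left, **right})
    return joined


# Hand-made stand-ins for the two SERVICE result sets (not real data).
wikidata_rows = [
    {"item": "http://example.org/entity/E1", "numero": "M 1"},
    {"item": "http://example.org/entity/E2", "numero": "M 2"},
]
dbpedia_rows = [
    {"item": "http://example.org/entity/E1",
     "dbpediaID": "http://example.org/resource/First_Object"},
]

merged = join_on(wikidata_rows, dbpedia_rows, "item")
for row in merged:
    print(row)
```

Only the first item survives, because it is the only one both sides know about. The same holds for the federated query: a Messier item appears in the merged view only when DBpedia has an owl:sameAs link to its Wikidata identifier, along with the label, depiction, comment, and subject that the DBpedia pattern requires.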

I can now use the federated query to produce a nice drill-down-oriented data visualization page in OpenLink’s HTML5 PivotViewer, which enables:

  • Visually-aided filtering, using Entity Relationship Types (i.e., attributes) to group the merged data
  • Isolating items of interest, once found, with exit available to either DBpedia or Wikidata
  • Sharing chosen aspects of this visualization using hyperlinks

Here are some screenshots that illustrate what I’ve outlined above.

Click here to access Live Edition — Selecting a tile leads to a link to DBpedia
Click here to access Live Edition — Selecting a tile leads to a link to Wikidata
Visually Filtering by Subject Matter Heading
Specific Galaxy of Interest

Conclusion

Given the general confusion that still swirls around RDF, Linked Data, and a Semantic Web, it’s quite easy to inaccurately perceive DBpedia and Wikidata as rivals in a competitive zero-sum death match. Far from it! That perception utterly obscures the mutually beneficial nature of these (and other) Linked Open Data projects, as together they can deliver:

  • Powerful cross references that provide richer insights
  • Application portability — i.e., we have an HTML5 based PivotViewer that visualizes data from either (or both!) data sources, without any source-specific application modification or tuning
  • Significant demonstrations of the power of Federated SPARQL Queries

