Running Basic SPARQL Queries Against DBpedia

Daniel Heward-Mills
OpenLink Virtuoso Weblog
Feb 13, 2020

What is DBpedia?

DBpedia is a community project that creates and provides public access to critical structured data for what’s commonly referred to as the Linked Open Data Cloud.

— Kingsley Idehen, What is DBpedia, and why is it important?

DBpedia provides a globally accessible Knowledge Graph derived from Wikipedia content. You can query this Knowledge Graph using the powerful SPARQL query language.

Getting Started

DBpedia is accessible through its public SPARQL query editor interface.

Standard SPARQL Editor Interface

Knowledge in DBpedia takes the form of a collection of RDF triples, which structure data using a subject→predicate→object model (essentially an Entity→Attribute→Value model, plus the use of IRIs and URIs as identifiers). The most basic query is a SELECT query whose Query Body contains a single triple pattern made up of a subject, a predicate, and an object variable. In the example below, the Query Solution is limited to 10 records, which the editor presents in an HTML table:

SELECT *
WHERE
{
?s ?p ?o
}
LIMIT 10

DBpedia can be easily queried both with and without a deep knowledge of the DBpedia ontology. We can start building a query by searching on language-tagged literal values that are the objects of rdfs:label properties:

SELECT *
WHERE
{
?athlete rdfs:label "Cristiano Ronaldo"@en
}

@en is a language tag for declaring the national language component of a literal value (i.e., English). This feature makes both RDF and SPARQL multilingual in nature.
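
The query above also uses the rdfs: prefix without declaring it. This works because the public DBpedia endpoint appears to predefine common prefixes. As a minimal sketch, assuming you want the query to be portable to an endpoint that does not predefine them, the prefixes used throughout this post can be declared explicitly:

```sparql
# Explicit namespace declarations. The public DBpedia endpoint is assumed
# to predefine these, so they mostly matter on other SPARQL endpoints.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX dbr:  <http://dbpedia.org/resource/>

SELECT *
WHERE
{
  ?athlete rdfs:label "Cristiano Ronaldo"@en
}
```

Declaring a prefix that a given query does not use (dbo: and dbr: here) is harmless; they are listed because the later queries rely on them.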

We can click on the resulting URI for Cristiano Ronaldo in the “athlete” column to view each relation associated with his specific URI. This process is known as dereferencing the DBpedia Identifier (an HTTP URI) that identifies the entity literally labeled as “Cristiano Ronaldo”.

Any property listed in the Property column can also be directly added to the query for further expansion.

We can directly copy the dbo:number property and add it into our previous SPARQL Query. Additional statements about the ?athlete variable can be added quickly using “;”, which repeats the subject of the preceding statement:

SELECT *
WHERE
{
?athlete rdfs:label "Cristiano Ronaldo"@en ;
dbo:number ?number .
}
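
For comparison, the “;” shorthand is only an abbreviation. A sketch of the same query with each triple pattern written out in full, which should return the same results:

```sparql
SELECT *
WHERE
{
  # Each pattern repeats the ?athlete subject instead of using ";"
  ?athlete rdfs:label "Cristiano Ronaldo"@en .
  ?athlete dbo:number ?number .
}
```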

We have his kit/jersey number now! Let’s also add in the dbo:birthPlace property:

SELECT *
WHERE
{
?athlete rdfs:label "Cristiano Ronaldo"@en ;
dbo:number ?number ;
dbo:birthPlace ?place .
}
How can somebody have two birth places?!

Both results are correct, since Funchal is a city within the Autonomous Region of Madeira. We can narrow the place results to only include cities by scoping the query body to instances of the dbo:City class, while being lax about the national language of the City Name:

SELECT *
WHERE
{
?athlete rdfs:label "Cristiano Ronaldo"@en ;
dbo:birthPlace ?place .
?place a dbo:City ;
rdfs:label ?cityName .
}
Query results without a language tag in place

We can also use a language tag to designate “English” as the language associated with the literal values in the Query Solution, via a FILTER on the ?cityName variable:

SELECT *
WHERE
{
?athlete rdfs:label "Cristiano Ronaldo"@en ;
dbo:birthPlace ?place .
?place a dbo:City ;
rdfs:label ?cityName .
FILTER ( LANG ( ?cityName ) = 'en' )
}
Much better.
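
The equality test above matches only labels tagged exactly 'en'. As a hedged alternative, assuming some labels might carry a regional tag such as en-GB, the standard langMatches function can be used in place of the equality test:

```sparql
SELECT *
WHERE
{
  ?athlete rdfs:label "Cristiano Ronaldo"@en ;
           dbo:birthPlace ?place .
  ?place a dbo:City ;
         rdfs:label ?cityName .
  # langMatches accepts the basic tag and any subtag, e.g. "en" or "en-GB"
  FILTER ( langMatches ( LANG ( ?cityName ), "en" ) )
}
```

Whether this changes the results depends on how the labels in the dataset are tagged; if every English label is tagged plain en, both filters behave the same.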

For good measure, let’s double-confirm that Funchal is in Madeira, by using the dbo:region property and its value:

SELECT *
WHERE
{
?athlete rdfs:label "Cristiano Ronaldo"@en ;
dbo:number ?number ;
dbo:birthPlace ?place .
?place a dbo:City ;
rdfs:label ?cityName ;
dbo:region ?region .
FILTER ( LANG ( ?cityName ) = 'en' )
}

Confirmed! Now, we can decide how to present the results.

Query Result Presentation

SELECT * has been used in the previous examples to return all of the variables declared in the WHERE clause of the SPARQL queries.

When preferred, we can replace * in the SELECT List with a specific list of variables to be projected in the Query Solution Page:

SELECT
?athlete
?number
?cityName
WHERE
{
?athlete rdfs:label "Cristiano Ronaldo"@en ;
dbo:number ?number ;
dbo:birthPlace ?place .
?place a dbo:City ;
rdfs:label ?cityName ;
dbo:region ?region .
FILTER ( LANG ( ?cityName ) = 'en' )
}
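
One thing to keep in mind when projecting fewer variables than the WHERE clause matches: rows that differ only in a hidden variable (for example, a place with more than one ?region value) show up as duplicates. A minimal sketch, assuming duplicates are not wanted, adds the DISTINCT keyword:

```sparql
# DISTINCT collapses rows that are identical in the projected variables
SELECT DISTINCT
  ?athlete
  ?number
  ?cityName
WHERE
{
  ?athlete rdfs:label "Cristiano Ronaldo"@en ;
           dbo:number ?number ;
           dbo:birthPlace ?place .
  ?place a dbo:City ;
         rdfs:label ?cityName ;
         dbo:region ?region .
  FILTER ( LANG ( ?cityName ) = 'en' )
}
```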

An additional starter query

Rather than starting a search against literal values, we can also use terms from the DBpedia Ontology (and others) to explore the knowledge graph. Users can leverage this human- and machine-readable terminology to speed up query building.

In the example that follows, we have a query that lists soccer players from Funchal, Madeira, alongside their kit/jersey numbers:

SELECT *
WHERE
{
?athlete a dbo:SoccerPlayer ;
dbo:birthPlace [ rdfs:label "Funchal"@en ;
dbo:state dbr:Madeira] ;
dbo:number ?number .
}
Soccer players from Funchal, Madeira; and their kit/jersey numbers

Using the dbo:SoccerPlayer class from the DBpedia ontology narrows the results down to entities that are instances of this particular class, via the rdf:type property, which takes a class as its value.
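
The keyword a in these queries is SPARQL shorthand for rdf:type. As a small sketch, the class restriction can be spelled out with the full property name; the rdf: prefix is declared explicitly here in case the endpoint does not predefine it:

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT *
WHERE
{
  # Same meaning as: ?athlete a dbo:SoccerPlayer
  ?athlete rdf:type dbo:SoccerPlayer .
}
LIMIT 10
```

The LIMIT is added only to keep this illustration small, since the class has many instances.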

The square brackets represent blank nodes, which can be read as declaring an indefinite pronoun such as something, someplace, etc. With that in mind, the query above is really searching for soccer players from someplace called Funchal in the Autonomous Region of Madeira.
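
To make the blank-node reading concrete, here is a sketch of the same query with the square brackets replaced by a named variable. The variable name ?place is an arbitrary choice for illustration:

```sparql
SELECT ?athlete ?number
WHERE
{
  ?athlete a dbo:SoccerPlayer ;
           dbo:birthPlace ?place ;
           dbo:number ?number .
  # These two patterns were previously written inside [ ... ]
  ?place rdfs:label "Funchal"@en ;
         dbo:state dbr:Madeira .
}
```

The SELECT list names ?athlete and ?number explicitly because a blank node is never returned in the results, whereas SELECT * would now also return ?place.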

While searching with plain text is generally the best approach when you are unsure of what relations exist, using DBpedia and other ontology terms provides richer information and more fully leverages the power of semantic relations and hyperlinks.

The queries above enable you to start a basic search into the vast information contained in DBpedia.

We will also be publishing an additional how-to on how to blend what DBpedia offers with disparate data sources such as Wikidata and/or local data sources.
