Introduction to Resource Description Framework and SPARQL (RDF 101)

Atakan Güney
Oct 1, 2019


Well, in this blog post I try to explain what the Resource Description Framework is and what it is used for. I will cover the following sections:

  1. What does RDF stand for?
  2. Why do we need such a framework?
  3. RDF Structure
  4. RDF Example
  5. What is SPARQL?
  6. SPARQL “Code” Examples

1. What is RDF?

The Resource Description Framework (RDF) is a standard data model for the Web. It facilitates data interchange across websites. Its most fascinating feature is that, by using RDF data, websites can share information even if they use different schemas, thanks to data modeling languages like OWL and RDFS, which are built on top of RDF.

For a detailed explanation of RDF, read W3C’s wiki page.

2. Why do we need such a framework?

With the ubiquitous spread of the Internet and social media, the data collected from users has been growing at a rapid pace. Besides collecting data, how to benefit from it is another problem. The basic reason is that every website has its own data model for storing user or object information, which makes it hard to connect data across sites.

As you can guess, some clever people at the W3C were concerned with this problem before, and in 1999 they recommended a data model that we know today as the Resource Description Framework. In 2004, they published the RDF 1.0 specification; the latest version is RDF 1.1, published in 2014.

3. RDF Structure

The RDF data model is based on three parts: subject, predicate, and object. The aim of this model is to relate a subject to an object through a predefined predicate. For example:

Illustration of RDF Structure

As the graph above shows, subjects are resources located on the web; predicates are predefined property types (they can come from well-known ontologies like FOAF, or we can define our own relationships using extensions of RDF like OWL or RDFS); and objects can be either resources or literals.
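To make the three parts concrete, here is a minimal sketch in plain Python (no RDF library) that models a statement as a named tuple. The `IRI`, `Literal`, and `Triple` types are my own illustration and not part of any RDF toolkit; the IRIs are the ones from the Spiderman example used later in this post.

```python
from typing import NamedTuple, Optional, Union


class IRI(NamedTuple):
    """A resource identified by an IRI (hypothetical helper type)."""
    value: str


class Literal(NamedTuple):
    """A plain value such as a string, optionally tagged with a language."""
    value: str
    lang: Optional[str] = None


class Triple(NamedTuple):
    subject: IRI              # a resource (RDF also allows blank nodes, omitted here)
    predicate: IRI            # a property, e.g. taken from FOAF
    obj: Union[IRI, Literal]  # either a resource or a literal


FOAF_NAME = IRI("http://xmlns.com/foaf/0.1/name")
ENEMY_OF = IRI("http://www.perceive.net/schemas/relationship/enemyOf")
spiderman = IRI("http://example.org/#spiderman")
green_goblin = IRI("http://example.org/#green-goblin")

triples = [
    Triple(spiderman, ENEMY_OF, green_goblin),                    # object is a resource
    Triple(spiderman, FOAF_NAME, Literal("Spiderman")),           # object is a literal
    Triple(spiderman, FOAF_NAME, Literal("Человек-паук", "ru")),  # literal with a language tag
]

for t in triples:
    kind = "resource" if isinstance(t.obj, IRI) else "literal"
    print(t.predicate.value, "->", t.obj.value, f"({kind})")
```

A real library such as rdflib offers richer versions of these types, but the shape of a statement stays the same: three terms in a fixed order.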

There are several serialization formats for RDF. By serialization format, I basically mean the format in which the data is stored in a file. Commonly used formats include Turtle, N-Triples, N-Quads, JSON-LD, Notation3 (N3), and RDF/XML.

Although I use the Turtle format, I encountered a lot of examples online that are in RDF/XML. It seems that before Turtle, RDF/XML was the most widely used one.

4. Example

That is enough description 😃 Let's see some example RDF data.

@base <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

<#green-goblin>
    rel:enemyOf <#spiderman> ;
    a foaf:Person ; # in the context of the Marvel universe
    foaf:name "Green Goblin" .

<#spiderman>
    rel:enemyOf <#green-goblin> ;
    a foaf:Person ;
    foaf:name "Spiderman", "Человек-паук"@ru .

The example above contains records for Spiderman and Green Goblin, who are enemies. Green Goblin is defined first: it is an “enemy of” Spiderman, it is “a person”, and its name is “Green Goblin”. The other record is about Spiderman: it is an “enemy of” Green Goblin, it is “a person”, and its name is “Spiderman”, which is “Человек-паук” in Russian. The example is written in Turtle, one of the serialization formats used for RDF data.

As you can observe, RDF has a very basic structure for expressing the relationship between a subject and an object. Using this format, many resources on the web can be related to each other and knowledge graphs can be constructed. Beyond that, the most profound feature of RDF is its machine-interpretable structure: machines can draw inferences from data collected all over the web, as long as that data is stored in RDF.

5. SPARQL

Until now, I have only talked about the data format and the beauty of it 😅 But without a way to get at specific parts of the data, how can thousands of gigabytes of data be useful? Can you imagine such a mass 😧 Again, thanks to some smart people who recommended a query language for RDF data, we have “SPARQL”. If you are familiar with SQL, you will find similarities between them. For example, they both have SELECT and WHERE keywords.

Let's consider the SPARQL query below:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name
WHERE {
    ?x foaf:name ?name
}

This query should return “Green Goblin”, “Spiderman”, and “Человек-паук” if we run it against the example dataset above.

Let's dive in a little bit. First of all, the RDF example can also be written as follows:

@base <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

<#green-goblin> rel:enemyOf <#spiderman> .
<#green-goblin> a foaf:Person .
<#green-goblin> foaf:name "Green Goblin" .

<#spiderman> rel:enemyOf <#green-goblin> .
<#spiderman> a foaf:Person .
<#spiderman> foaf:name "Spiderman" .
<#spiderman> foaf:name "Человек-паук"@ru .

Our query matches triples of the form “?x foaf:name ?name”. The corresponding records are:

<#green-goblin> foaf:name "Green Goblin" .
<#spiderman> foaf:name "Spiderman" .
<#spiderman> foaf:name "Человек-паук"@ru .

By the way, “@ru” specifies the language of the literal “Человек-паук”.
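The matching idea can be sketched in a few lines of plain Python. This is only an illustration of how a single triple pattern binds its variables, not how a real SPARQL engine is implemented; `match_pattern` and the tuple layout are hypothetical names chosen for this sketch.

```python
# Each triple is (subject, predicate, object); prefixed names are kept as plain strings.
data = [
    ("<#green-goblin>", "rel:enemyOf", "<#spiderman>"),
    ("<#green-goblin>", "a", "foaf:Person"),
    ("<#green-goblin>", "foaf:name", '"Green Goblin"'),
    ("<#spiderman>", "rel:enemyOf", "<#green-goblin>"),
    ("<#spiderman>", "a", "foaf:Person"),
    ("<#spiderman>", "foaf:name", '"Spiderman"'),
    ("<#spiderman>", "foaf:name", '"Человек-паук"@ru'),
]


def match_pattern(pattern, triples):
    """Yield one {variable: value} dict per triple that fits the pattern.

    Terms starting with "?" are variables; anything else must match exactly.
    """
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated in the pattern must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            # No term failed, so this triple is a solution.
            yield bindings


solutions = list(match_pattern(("?x", "foaf:name", "?name"), data))
for s in solutions:
    print(s["?x"], s["?name"])
```

Only the three `foaf:name` triples survive, and each one contributes a value for `?x` and `?name`; SELECT then keeps just the `?name` column.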

6. SPARQL in Python

To see all code and some results, please visit my repo on GitLab.

Now, let's see some coding practice. I am most comfortable with Python for now. Therefore, I implemented all the code in Python 3.6. You also need to install the rdflib, rdfextras, and sparqlwrapper libraries, but don’t worry, you will find the “Requirement.txt” in my humble repo 😅

Before doing some real-life queries, let's implement the example query we talked about before.

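import rdflib
import rdfextras

# Registers the SPARQL plugins from rdfextras. Recent rdflib releases bundle
# SPARQL support themselves, so there these rdfextras lines may be unnecessary.
rdfextras.registerplugins()

# Load the Turtle example from above into an in-memory graph.
file_path = "sample.ttl"
g = rdflib.Graph()
g.parse(file_path, format="turtle")

# The foaf: prefix is declared explicitly so the query does not depend on
# the prefixes picked up from the parsed file.
results = g.query("""
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name
WHERE {
    ?x foaf:name ?name
}
""")

# Each row holds the selected variables in order; ?name is the only one.
for r in results:
    print(r[0])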

The result would be (the row order is not guaranteed):

Green Goblin
Spiderman
Человек-паук

Let's see some real-life queries 😃 By this, I mean querying DBpedia. It also has its own SPARQL query client, so you can run queries directly in the browser. So, let's continue:

  • SELECT type queries

from SPARQLWrapper import SPARQLWrapper, JSON

# Public DBpedia SPARQL endpoint
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?x
WHERE { ?x dbo:location dbr:Turkey }
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()  # parsed JSON as a Python dict

# Write one resource URI per line
with open("select-results.txt", "w", encoding="utf-8") as f:
    for result in results["results"]["bindings"]:
        f.write(result["x"]["value"] + "\n")

Sample results are:

http://dbpedia.org/resource/Anadolu_Agency
http://dbpedia.org/resource/Anadolu_Airport
http://dbpedia.org/resource/Syrian_National_Council
http://dbpedia.org/resource/Cebeci_İnönü_Stadium
http://dbpedia.org/resource/Köprübaşı_Dam
http://dbpedia.org/resource/Lake_Nemrut
http://dbpedia.org/resource/Mado_(food_company)
http://dbpedia.org/resource/Selimiye_Barracks
http://dbpedia.org/resource/Selimiye_Tunnel
http://dbpedia.org/resource/Telsiz_ve_Radyo_Amatörleri_Cemiyeti
http://dbpedia.org/resource/Yıldız_Palace
http://dbpedia.org/resource/İstinye_Park
http://dbpedia.org/resource/45th_Antalya_Golden_Orange_Film_Festival
http://dbpedia.org/resource/ANKAmall
...

In this query, we got the resources that are related to “Turkey” through the “location” relation. Cool, huh 😍
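The nested `results["results"]["bindings"]` access follows the SPARQL JSON results layout, where each row maps a variable name to a small dict with a `type` and a `value`. The sketch below works on a hand-written sample shaped like such a response (it is not captured DBpedia output; the two URIs are taken from the sample results above), and `column` is a hypothetical helper name.

```python
import json

# Hand-written sample in the shape of a SPARQL JSON result; not real endpoint output.
sample_response = json.loads("""
{
  "head": {"vars": ["x"]},
  "results": {
    "bindings": [
      {"x": {"type": "uri", "value": "http://dbpedia.org/resource/Anadolu_Agency"}},
      {"x": {"type": "uri", "value": "http://dbpedia.org/resource/Lake_Nemrut"}}
    ]
  }
}
""")


def column(results, var):
    """Collect the values bound to one variable, skipping rows where it is unbound."""
    return [row[var]["value"] for row in results["results"]["bindings"] if var in row]


uris = column(sample_response, "x")
print(uris)
```

Skipping unbound rows matters once a query uses OPTIONAL, because then a variable can be missing from some rows.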

So, let's continue with my favorite:

  • DESCRIBE type queries

from SPARQLWrapper import SPARQLWrapper, TURTLE
from rdflib import Graph

# Describe query
sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
DESCRIBE <http://dbpedia.org/resource/Turkey>
""")

sparql.setReturnFormat(TURTLE)
results = sparql.query().convert()  # raw Turtle returned by the endpoint

# Load the Turtle into a graph
g = Graph()
g.parse(data=results, format="turtle")

# Passing a destination lets rdflib write the file itself, which avoids
# depending on whether serialize() returns bytes or str.
g.serialize(destination="describe-results.ttl", format="turtle")

The results will look like the following:

@prefix dbc: <http://dbpedia.org/resource/Category:> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix dbp: <http://dbpedia.org/property/> .
@prefix dbpedia-cs: <http://cs.dbpedia.org/resource/> .
@prefix dbpedia-eu: <http://eu.dbpedia.org/resource/> .
@prefix dbpedia-fr: <http://fr.dbpedia.org/resource/> .
@prefix dbpedia-id: <http://id.dbpedia.org/resource/> .
@prefix dbpedia-it: <http://it.dbpedia.org/resource/> .
@prefix dbpedia-nl: <http://nl.dbpedia.org/resource/> .
@prefix dbpedia-pl: <http://pl.dbpedia.org/resource/> .
@prefix dbpedia-pt: <http://pt.dbpedia.org/resource/> .
@prefix dbpedia-wikidata: <http://wikidata.dbpedia.org/resource/> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix georss: <http://www.georss.org/georss/> .
@prefix lgdt: <http://linkedgeodata.org/triplify/> .
@prefix ns10: <http://dbpedia.org/resource/Crossing_the_Bridge:> .
...

The result is huge 👌 A DESCRIBE query returns RDF data about the resource we explicitly asked for, so here we got RDF data about “Turkey”.

  • ASK type queries

This type of query returns true or false based on our query. Let's see.

from SPARQLWrapper import SPARQLWrapper, XML

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK WHERE {
    <http://dbpedia.org/resource/Turkey> rdfs:label "Türkei"@de
}
""")

sparql.setReturnFormat(XML)
results = sparql.query().convert()  # an xml.dom.minidom Document
print(results.toxml())

This returns

<?xml version="1.0" ?><sparql xmlns="http://www.w3.org/2005/sparql-results#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3.org/2001/sw/DataAccess/rf1/result2.xsd">
<head/>
<boolean>true</boolean>
</sparql>
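To turn that XML into a Python bool, the `<boolean>` element can be read with the standard library. This is a minimal sketch that works on the response text shown above, trimmed to the parts that matter; `ask_result` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The ASK response shown above, trimmed to the parts that matter here.
ask_xml = """<?xml version="1.0" ?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head/>
<boolean>true</boolean>
</sparql>"""

# Elements live in the SPARQL results namespace, so lookups need it.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def ask_result(xml_text):
    """Return the ASK answer as a Python bool."""
    root = ET.fromstring(xml_text)
    node = root.find("sr:boolean", NS)
    if node is None:
        raise ValueError("no <boolean> element found")
    return node.text.strip() == "true"


answer = ask_result(ask_xml)
print(answer)
```

Forgetting the namespace is the usual pitfall here: a plain `root.find("boolean")` finds nothing.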

The last one is

  • CONSTRUCT

In this query type, a new RDF graph is constructed from the query results, following the template given in the CONSTRUCT clause. Let's see an example:

from SPARQLWrapper import SPARQLWrapper, XML

sparql = SPARQLWrapper("http://dbpedia.org/sparql")
sparql.setQuery("""
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX schema: <http://schema.org/>
CONSTRUCT {
    ?lang a schema:Language ;
          schema:alternateName ?iso6391Code .
}
WHERE {
    ?lang a dbo:Language ;
          dbo:iso6391Code ?iso6391Code .
    FILTER (STRLEN(?iso6391Code)=2) # to filter out non-valid values
}
""")

# For CONSTRUCT queries with the XML return format, convert() gives an rdflib graph.
sparql.setReturnFormat(XML)
results = sparql.query().convert()

# Let rdflib write the Turtle file directly.
results.serialize(destination="construct-results.ttl", format="turtle")

Sample result:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://dbpedia.org/resource/Abkhaz_language> a schema:Language ;
schema:alternateName "ab" .

<http://dbpedia.org/resource/Afar_language> a schema:Language ;
schema:alternateName "aa" .

<http://dbpedia.org/resource/Afrikaans> a schema:Language ;
schema:alternateName "af" .

<http://dbpedia.org/resource/Akan_language> a schema:Language ;
schema:alternateName "ak" .

<http://dbpedia.org/resource/Albanian_language> a schema:Language ;
schema:alternateName "sq" .

<http://dbpedia.org/resource/Amharic> a schema:Language ;
schema:alternateName "am" .
...

So, we got the languages and their codes, and a new RDF graph based on our CONSTRUCT template.

Conclusion and Future Blogs

Actually, I wanted to write a blog post about SPARQL queries on the distributed system Apache Spark. But I realized that it is too early to write such a complex post 😅 In the coming weeks, I may write about this topic.

To conclude, RDF is an exciting tool, or model language, or whatever you want to call it. It is definitely an exciting idea to gather all the data on the web and make it machine-understandable. For more interesting posts about Social Semantic Web tools, please stay tuned 😃

References

  1. https://www.w3.org/RDF/
  2. https://www.dataversity.net/introduction-to-sparql/
  3. https://rdflib.github.io/sparqlwrapper/
  4. https://www.w3.org/TR/turtle/
  5. https://www.wikiwand.com/en/Resource_Description_Framework#/Serialization_formats
  6. https://www.w3.org/TR/rdf-sparql-query/
  7. http://www.linkeddatatools.com/semantic-web-basics
