Customizing R2RML Scripts in Virtuoso

Generating RDF Linked Data from SQL Relational Data

Hugh Williams
OpenLink Virtuoso Weblog
13 min readApr 19, 2018

--

Why

There are a wide range of shared ontologies/vocabularies (Friend-Of-A-Friend [FOAF], Organization, Bibliographic, etc.) available to any data integration practioner seeking to create RDF-based Linked Data. Naturally, this availability also extends to those seeking to create RDF Views over existing data, now represented as Tables in a typical ODBC/JDBC-accessible RDBMS that supports the SQL Query Language.

From the SQL RDBMS perspective, there are also cases where data straddles multiple columns across a variety of tables as part of a data alignment effort that underlies the creation of effective RDF Views.

In this post, I’ll describe how Virtuoso leverages that semantic richness in R2RML, creating RDF Views over SQL Tables using terms from multiple shared ontologies while also handling the structural disparity that exists in a typical RDBMS schema.

What

The Virtuoso Conductor can produce R2RML scripts through an automated wizard. These auto-generated scripts build a generic ontology from the SQL relational table(s).

Virtuoso users can customize these basic scripts to map automatically-generated column-types to class-properties from an external ontology, or to map multiple table columns to a single RDF relation, among other things.

How

Prerequisites

  • This article will use the Virtuoso sample Schools database (schools_db_dav.vad). This VAD can be downloaded (for Virtuoso 7.x or for Virtuoso 8.x) and installed through the System Admin → Packages tab of the Virtuoso Conductor. After installation, we’ll be focusing on two of its tables — "School"."DBA"."Course" and "School"."DBA"."Department".
  • Three other VADs — the Faceted Browser (fct_dav.vad), the R2RML Processor Module (rdb2rdf_dav.vad), and the Linked Open Data Transformation Middleware (“Sponger”) (cartridges_dav.vad) — also need to be installed. The latest version of each of these VADs can be downloaded from the OpenLink website, and installed through the System Admin → Packages tab of the Virtuoso Conductor.

Generate base R2RML script

First the base R2RML script needs to be generated for the two tables, which can be done using the Virtuoso Conductor from the Linked Data → Views tab by:

[1] Select the School Database (qualified by the literal identifier "School") from the drop-down menu.

[2] Tick the checkboxes for the "School"."DBA"."Course" and "School"."DBA"."Department" tables.

Tables to be used in R2RML mapping

[3] Click on the Generate via Wizard button.

[4] Then click the Prepare to Execute button to generate a collection of RDF sentences/statements (a/k/a, an R2RML Graph) that describes how the selected tables are to be transformed. Note: at this point, you have an R2RML Graph that describes mapping rules using terms from the W3C’s R2RML ontology.

Virtuoso generated base R2RML Graph

[5] Copy the content of the area labeled R2RML Graph to a local text editor for customization purposes. (Note: Whitespace in the code-blocks below has been adjusted to improve readability. Any sequence of multiple whitespace characters — spaces, tabs, line-feeds, etc. — may be collapsed to a single whitespace character, if you wish.)

Note: This article uses our live demo instance at http://demo.openlinksw.com, hence that domain is used in the prefix declarations used to qualify terms used in this particular R2RML Graph.

Customize Mappings using terms from a Shared Ontology

In this section, we’ll enhance the basic mapping template using a third party ontology. Here, we’ll map theSchool.DBA.Department table to a Class in the FOAF Ontology.

FOAF Ontology Class Mappings

Virtuoso includes a number of preloaded ontologies. You can obtain a list of Virtuoso’s “known” ontologies (i.e., those imported into the database instance at some point in time) from the Linked Data → Schemas tab of the Conductor, or by running a SPARQL query (as shown below) scoped to entities of type owl:Ontology. Note: ontologies only need to be loaded for use in the Conductor Linked Data View wizard to perform the mappings, such that classes and properties of the ontology can be presented for selection in the wizard.

The instances of owl:Ontology are identified by the variable ?onto. For one of these instances, the variable ?onto resolves to the constant http://xmlns.com/foaf/0.1/, which is the HTTP URI that identifies the Friend of a Friend (FOAF) vocabulary. Thus, we can proceed to mapping the Name column of the School.DBA.Department table to the foaf:name property associated with the foaf:Organization class.

[1] Add the foaf: prefix declaration to the preamble of the R2RML script:

[2] Add the foaf:Organization class to the list of objects (values) for the rr:class relation (attribute) of the School.DBA.Department mapping:

[3] Replace School:name with foaf:name as object of the rr:constant relation (attribute) thereby mapping it to the "Name" column of the School.DBA.Department table:

[4] Having made these changes, we can see their effect in the generated RDF Linked Data Views— the foaf:name property mapping — with a query like the one below. Note: This query uses SPASQL, a/k/a SPARQL-in-SQL, through the iSQL interface. A similar query could be run through the /sparql endpoint, by dropping the leading SPARQL keyword and the trailing semicolon, which are part of SPASQL syntax, leaving only SPARQL syntax.

Organization Ontology Class Mappings

An ontology does not have to be preloaded into Virtuoso to be used in R2RML mapping scripts.

Here we’ll use steps similar to the prior section to apply additional customization to the mapping template, this time using terms from the Organization Ontology to map theSchool.DBA.Department table to the org:Organization property of the org:OrganizationalUnit class.

[1] Add the org: prefix declaration to the preamble of the R2RML script:

[2] Add the org:OrganizationalUnit class to the rr:class attribute of the School.DBA.Department R2RML mapping. Note: one table column may be mapped to multiple relations (attributes) in RDF.

[3] Here’s the final mapping:

[4] As before, having made these changes, we can see their effect in the generated RDF Linked Data Views — the org:Organization property mapping — with a query like the one below.

Map multiple columns to a single RDF relation

Multiple columns, from one or multiple tables, can be mapped to a single RDF relation by using the R2RML rr:sqlQuery specifier to construct a custom SQL query to merge the column names together.

Here, we’ll combine the "School"."DBA"."Department"."Name" column with the "School"."DBA"."Course"."Title" column to produce a composite of the Department Name and Course Title, i.e., "Department — Title" with the query:

A R2RML mapping can also be created directly from a query, rather than going through a table definition, by using the rr:sqlQuery relation instead of the rr:logicalTable relation:

The following triple pattern is the mapping of the DeptTitle column alias, which is a concatenation of the two columns (i.e., concat(d.Name,'-', c.Title)) in the query specified in rr:sqlQuery:

Having made these latest changes, we can see their effect in the generated RDF Linked Data Views — the School:depttitle mapping of the concatenated column values — with a query like the one below.

Complete Script for Creation of Linked Data Views

The Conductor’s Linked Data View wizard also generates a set of URL rewrite rules for generating dereferenceable Linked Data View URIs for the generated triples, enabling them to be viewed with a Browser or other user agent using Virtuoso’s built-in /describe or /about proxy services.

The following SQL script shows how the custom R2RML mappings created in this post can be used to repeatably generate the dereferenceable Linked Data Views for the two sample tables.

Note: This script was generated for execution against our live demo instance at demo.openlinksw.com, which listens at the default HTTP port, 80. To execute this script against another Virtuoso instance, simply replace all occurrences of http://demo.openlinksw.com/ in the script with thehttp://hostname:portnumber/ of your Virtuoso instance.

Sample Linked Data View Pages

The screenshots below show some of the HTML pages generated from these Linked Data Views. Note: The caption of each screenshot is a live hyperlink that identifies a sample entity generated from these mappings.

http://demo.openlinksw.com/School/department/DepartmentID/1#this
http://demo.openlinksw.com/School/DeptCourse/CourseID=4061#this

Related

--

--