Magic Sets and Custom Inference Rules in Virtuoso 8.x

Kingsley Uyi Idehen
OpenLink Virtuoso Weblog
Apr 23, 2018
Source: http://slideplayer.com/slide/8842504/

The ability to materialize relations (i.e., sets of entity relationships grouped by a predicate) on a deductive basis is a critical feature of the genre of RDBMS application known as Deductive Database Management Systems — i.e., RDBMS equipped with Reasoning & Inference functionality. In recent times, this entire DBMS genre and its functionality have been obscured by the fanfare associated with NoSQL and Graph Databases.

In SQL RDBMS applications, materialized relations take the form of VIEWs which are described by SQL Queries and/or (in modern variants) by SQL Stored Procedures (a/k/a Persistent Stored Modules).

In RDF data management, materialized relations take the form of Inference Rules that inform the process of producing dynamically-materialized relations, colloquially referred to as Magic Sets. Magic Sets are basically the same concept as SQL VIEWs, but significantly enhanced by the semantic fidelity of relations represented as RDF sentence/statement graphs.

What?

Virtuoso supports the use of custom (magic) predicates to produce these Magic Sets or VIEWs or dynamically-materialized relations.

In Virtuoso 7.x and previous releases, the base product included only a few built-in magic predicates such as bif:contains. In those older versions, extending the collection of magic predicates required dropping down to SQL Stored Procedures — unnatural territory for those used to working with data represented as RDF sentence/statement graphs — under the misconception that RDF and SQL representations of structured data are mutually exclusive. (That is, it was thought that some structured data could be represented only as RDF sentence/statement graphs and not as SQL tabular relations; and that other structured data could be represented only as SQL tabular relations and not as RDF sentence/statement graphs. This has now been proven false; any structured data representable as SQL tabular relations may also be represented as RDF sentence/statement graphs, and vice versa.)

Virtuoso 8.0 introduced a simpler approach — an extension of its SPARQL-BI query functionality, implemented via a Macro Library abstraction. A Macro Library comprises custom inference rules that describe how to construct new Magic Sets, subject to conditions defined in those rules. A pragma declaration may now be used in the preamble of a SPARQL query to identify the desired Macro Library.

How?

Commonly used rule definitions can be grouped into Macro Libraries which enable a System’s Administrator to —

  1. Load a macro library into system metadata using a SPARQL CREATE MACRO LIBRARY or SPARQL CREATE SPIN LIBRARY statement, for invocation in SPARQL queries through the define input:macro-lib <macro-lib-name> pragma.
  2. Attach a macro library to specific RDF storage using the SPARQL ALTER QUAD STORAGE <storage-name> { ATTACH MACRO LIBRARY <macro-lib-name> } statement, thereby making the custom reasoning and inference context provided by that macro library available to every query without explicit pragma declaration per query.
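For the first option, the pragma goes in the preamble of each query, as in the usage examples later in this post. Here is a minimal Python sketch of a client-side helper that prepends it. The helper name with_macro_lib is hypothetical; only the DEFINE input:macro-lib <...> syntax comes from the text above.

```python
def with_macro_lib(query: str, macro_lib_iri: str) -> str:
    """Prepend a macro-library pragma to a SPARQL query string."""
    # Hypothetical helper, not part of Virtuoso. It keeps the pragma on its
    # own line so it cannot fuse with the PREFIX or SELECT that follows.
    if any(ch in macro_lib_iri for ch in "<> \n\t"):
        raise ValueError("macro library IRI must not contain '<', '>' or whitespace")
    return f"DEFINE input:macro-lib <{macro_lib_iri}>\n{query.lstrip()}"


query = with_macro_lib(
    "SELECT ?s ?area WHERE { ?s <http://example.org/shapes#area> ?area }",
    "urn:spin:rule:geometry:lib",
)
print(query)
```

With the second option (attaching the library to a quad storage), no such per-query rewriting is needed.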

Simple Example

Given an RDF Named Graph (Document or RDF Statements Container) identified by the IRI urn:spin:rule:geometry:lib, comprising RDF statements that describe a Custom Inference Rule for calculating the area of a rectangle as outlined below:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix shapes: <http://example.org/shapes#> .
@prefix spin: <http://spinrdf.org/spin#> .
@prefix sp: <http://spinrdf.org/sp#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

shapes:Rectangle
    rdf:type owl:Class ;
    rdfs:label "Rectangle class" ;
    spin:rule [
        a sp:Construct ;
        sp:text """
            CONSTRUCT { ?this <http://example.org/shapes#area> ?area }
            WHERE {
                {
                    SELECT ?this ?area
                    WHERE {
                        ?this
                            <http://example.org/shapes#width> ?width ;
                            <http://example.org/shapes#height> ?height .

                        BIND ((xsd:float(?height) *
                               xsd:float(?width))
                              AS ?area) .
                    }
                }
            }
            """
    ] .
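To make the rule's effect concrete, here is a rough Python sketch of what it derives, operating on plain (subject, predicate, object) tuples. It assumes ?this ranges over instances of shapes:Rectangle, as is usual for SPIN rules attached to a class. The sample IRIs under urn:example: are hypothetical, and this models the rule's logic only, not how Virtuoso evaluates it.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SHAPES = "http://example.org/shapes#"
RECTANGLE = SHAPES + "Rectangle"
WIDTH = SHAPES + "width"
HEIGHT = SHAPES + "height"
AREA = SHAPES + "area"


def infer_areas(triples):
    """Return the (subject, AREA, value) triples the rule would construct."""
    # Assumption: the rule only fires for subjects typed as shapes:Rectangle.
    rectangles = {s for s, p, o in triples if p == RDF_TYPE and o == RECTANGLE}
    widths, heights = {}, {}
    for s, p, o in triples:
        if s not in rectangles:
            continue
        if p == WIDTH:
            widths.setdefault(s, []).append(float(o))
        elif p == HEIGHT:
            heights.setdefault(s, []).append(float(o))
    # Mirrors BIND(xsd:float(?height) * xsd:float(?width) AS ?area):
    # a subject needs both a width and a height to get an area.
    return {
        (s, AREA, h * w)
        for s in widths.keys() & heights.keys()
        for w in widths[s]
        for h in heights[s]
    }


# Hypothetical sample data.
triples = [
    ("urn:example:r1", RDF_TYPE, RECTANGLE),
    ("urn:example:r1", WIDTH, "3"),
    ("urn:example:r1", HEIGHT, "4.5"),
    ("urn:example:r2", RDF_TYPE, RECTANGLE),
    ("urn:example:r2", WIDTH, "2"),   # no height, so no area is derived
    ("urn:example:c1", WIDTH, "1"),   # not typed as a Rectangle
    ("urn:example:c1", HEIGHT, "1"),
]
inferred = infer_areas(triples)
print(inferred)
```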

Here’s a one-liner (shown here on multiple lines, just for easier reading; each sequence of line feeds, spaces, tabs, and other whitespace may be collapsed to a single space character) for adding the rule described above to the Virtuoso Macro Library (where Inference Rules are stored):

EXEC 
( 'SPARQL '||
SPARQL_SPIN_GRAPH_TO_DEFSPIN
('urn:spin:rule:geometry:lib')
);

Basically, the example above leverages a built-in Stored Procedure that reads the Rule Definition from the Virtuoso Named Graph identified by the IRI <urn:spin:rule:geometry:lib>. The RDF document containing the rule must first be loaded into that Named Graph, courtesy of the following SPARQL LOAD command:

DEFINE get:soft "no-sponge"
LOAD <http://kingsley.idehen.net/DAV/home/kidehen/Public/Linked%20Data%20Documents/Tutorials/inference-rules/area-of-a-rectangle.ttl> INTO <urn:spin:rule:geometry:lib>

To reveal the lower-level code generated by Virtuoso, use the following SQL command:

SELECT 
SPARQL_SPIN_GRAPH_TO_DEFSPIN
('urn:spin:rule:geometry:lib')

That outputs the following lower-level Rules Definition, expressed using Virtuoso’s Macro Library extension to SPARQL:


CREATE SPIN LIBRARY <urn:spin:rule:bind:geometry:lib>
{
  CLASS <http://example.org/shapes#Rectangle>
  {
    RULE <urn:spin:rule:bind:geometry:lib#bnode-b2256179>
    {
      CONSTRUCT { ?this <http://example.org/shapes#area> ?area }
      WHERE {
        {
          SELECT ?this ?area
          WHERE {
            ?this <http://example.org/shapes#width> ?width ;
                  <http://example.org/shapes#height> ?height .
            BIND ((xsd:float(?height) * xsd:float(?width)) AS ?area) .
          }
        }
      }

      UNION STORAGE
    }
  }
}
ASK WHERE { <nosuch> <nosuch> <nosuch> }

Note: the UNION STORAGE clause ensures fusion of calculated and existing values when inferring objects of the <http://example.org/shapes#area> relation, identified by the variable ?area. If the clause is excluded, the inferred objects of the <http://example.org/shapes#area> relation comprise only calculated values.
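The difference can be pictured with a tiny Python sketch. It models only the behavior described in the note, not Virtuoso's implementation. The function name area_values and the sample numbers are hypothetical.

```python
def area_values(stored, calculated, union_storage=True):
    """Values a query would see for ?area on one rectangle."""
    # With UNION STORAGE: calculated values are fused with stored ones.
    # Without it: only the calculated values are returned.
    if union_storage:
        return set(calculated) | set(stored)
    return set(calculated)


stored = {12.0}       # hypothetical shapes:area value already in the graph
calculated = {13.5}   # hypothetical value produced by the rule

with_union = area_values(stored, calculated)
without_union = area_values(stored, calculated, union_storage=False)
print(sorted(with_union), sorted(without_union))
```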

Here are links to Virtuoso SQL scripts for each of the Custom Inference Rules presented below:

SPARQL Query to List Existing Rules across Macro Libraries

DEFINE output:valmode "LONG"
DEFINE input:storage ""
PREFIX virtrdf: <http://www.openlinksw.com/schemas/virtrdf#>

SELECT DISTINCT ?sml AS ?macroLib
                ?o AS ?ruleDefinition
FROM virtrdf:
WHERE
  {
    ?sml a virtrdf:SparqlMacroLibrary ;
         virtrdf:smlSourceText ?o .
  }
ORDER BY ASC(STR(?sml))
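Since Virtuoso also accepts queries over HTTP, a query like this can be sent to a SPARQL endpoint as the query parameter defined by the SPARQL Protocol. The sketch below only builds the request URL and does not contact a server. The endpoint address is an assumption (a local instance on port 8890); substitute your own.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: a local Virtuoso instance; adjust host and port as needed.
ENDPOINT = "http://localhost:8890/sparql"

LIST_LIBRARIES = """\
DEFINE output:valmode "LONG"
DEFINE input:storage ""
PREFIX virtrdf: <http://www.openlinksw.com/schemas/virtrdf#>
SELECT DISTINCT ?sml AS ?macroLib ?o AS ?ruleDefinition
FROM virtrdf:
WHERE { ?sml a virtrdf:SparqlMacroLibrary ; virtrdf:smlSourceText ?o . }
ORDER BY ASC(STR(?sml))
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query as the 'query' parameter."""
    # urlencode percent-encodes quotes, newlines, '<', '>' and '#'.
    return f"{endpoint}?{urlencode({'query': query})}"


url = build_request_url(ENDPOINT, LIST_LIBRARIES)
# Round-trip check: decoding the URL yields the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == LIST_LIBRARIES)
```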

Rule Usage Query Examples

British Royal Family Relationships Rule

urn:spin:nanotation:demo:royal:family:lib2

Query Demonstrating Usage of British Royal Family Relationships Rule

DEFINE input:macro-lib <urn:spin:nanotation:demo:royal:family:lib2>
PREFIX rel: <http://purl.org/vocab/relationship/>

WITH <urn:spin:nanotation:demo:royal:family>
SELECT DISTINCT ?person
                ?hasUncle
WHERE
  {
    ?person a <#RoyalPerson> ;
            <#hasUncle> ?hasUncle
  }
ORDER BY ASC(?person)

Rule for Calculating Area of a Rectangle

urn:spin:rule:geometry:lib

Query Demonstrating Usage of Rule for Calculating Area of a Rectangle

DEFINE input:macro-lib <urn:spin:rule:geometry:lib>
PREFIX shapes: <http://example.org/shapes#>

SELECT ?s
       xsd:float(?w) AS ?width
       xsd:float(?h) AS ?height
       xsd:float(?area) AS ?area
FROM <http://geometry>
WHERE
  {
    ?s a shapes:Rectangle ;
       shapes:width ?w ;
       shapes:height ?h .
    OPTIONAL { ?s shapes:area ?area }
  }

Rule for Geospatial Data Cleansing

urn:spin:rule:geospatial:lib4 

Query Demonstrating Usage of Rule for Geospatial Data Cleansing

DEFINE input:macro-lib <urn:spin:rule:geospatial:lib4>

SELECT DISTINCT ?x
                ?y
                ?z
FROM <http://bostonopendata-boston.opendata.arcgis.com/datasets/465e00f9632145a1ad645a27d27069b4_2.csv>
WHERE
  {
    ?x a sioc:Item ;
       <#hasLatitude> ?y ;
       <#hasLongitude> ?z
  }

Identity Reconciliation Rule

urn:spin:rule:foaf:ifp:lib1

Query Demonstrating Usage of Identity Reconciliation Rule

DEFINE input:macro-lib <urn:spin:rule:foaf:ifp:lib1>

SELECT DISTINCT ?s
                ?sameAs
                ?mainEntityOfPage
FROM <urn:kidehen:ifp:test:2>
WHERE
  {
    ?s a foaf:Person ;
       owl:sameAs ?sameAs ;
       schema:mainEntityOfPage ?mainEntityOfPage .
    FILTER ( ?s = <https://twitter.com/kidehen#this> )
  }
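The body of this rule is not reproduced in this post. Judging only by the library name (foaf:ifp), it presumably derives owl:sameAs relations from shared values of an inverse functional property (IFP). The Python sketch below shows that general technique under that assumption; it is not the rule itself. foaf:mbox stands in for whichever IFPs the real rule uses, and the urn:example: IRIs are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
# Assumption: foaf:mbox (an IFP in the FOAF vocabulary) is the only
# property considered here.
IFPS = {"http://xmlns.com/foaf/0.1/mbox"}


def infer_same_as(triples, ifps=IFPS):
    """Link subjects that share a value for an inverse functional property."""
    by_value = defaultdict(set)
    for s, p, o in triples:
        if p in ifps:
            by_value[(p, o)].add(s)
    inferred = set()
    for subjects in by_value.values():
        # Every ordered pair of distinct subjects, so sameAs is symmetric.
        for a, b in permutations(sorted(subjects), 2):
            inferred.add((a, SAME_AS, b))
    return inferred


MBOX = "http://xmlns.com/foaf/0.1/mbox"
# Hypothetical sample data: two profiles share a mailbox, a third does not.
triples = [
    ("urn:example:profile-a", MBOX, "mailto:alice@example.org"),
    ("urn:example:profile-b", MBOX, "mailto:alice@example.org"),
    ("urn:example:profile-c", MBOX, "mailto:carol@example.org"),
]
links = infer_same_as(triples)
print(sorted(links))
```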

Conclusion

As a multi-model RDBMS, Virtuoso offers the best of multiple worlds.

As the examples above show, the familiar concept of a tabular SQL VIEW has been extended to exploit the fine-grained semantics offered by entity relationship types represented as RDF sentence/statement collections that coalesce around a specific magic predicate.

Most importantly, Virtuoso delivers all of this functionality — without undue complexity or compromised performance — to any application that supports HTTP, ODBC, JDBC, ADO.NET, or OLE DB.

Kingsley Uyi Idehen
OpenLink Virtuoso Weblog

CEO, OpenLink Software —High-Performance Data Centric Technology Providers.