Conceptual Relational Data Virtualization, using Existing Open Standards

Exploiting Data Virtualization using Virtuoso Relational Views over Structured Data represented as SQL Tables or RDF Graphs

Situation Analysis

Remember: “Data” is how we express “Observation” in reusable form. The "Observation" being expressed is one or more entity relationship types (relations).

Tabular Expressions of Relations

Comparison of SQL, SPARQL, and Cypher Query Languages from Stackoverflow Trends
  • Data Representation — via Entity Relationship Types (Relations) represented in Tabular form (and easily depicted as Tables)
  • Declarative Data Interaction —using the Structured Query Language (SQL) for Data Definition and Manipulation
Generic DBMS Relation Schema
SQL RDBMS Schema for Customer Relation

Graphical Expressions of Relations

RDF DBMS Schema for a Customer Class

Relational Data Virtualization

  • Data Representation — Tabular and/or Graphical Relations
  • Declarative Data Interaction — SQL or SPARQL
  • Rules Declaration — SQL (as in SQL Views) or SPARQL (as in RDF Views, Custom Reasoning & Inference Rules, and Attribute-based Access Controls [ABAC] for Data Security)

Virtuoso Relational Data Virtualization Example

Relational Data Virtualization built using existing open standards (HTTP, SQL, SPARQL, RDF, R2RML, ODBC, JDBC, etc.)

Steps to Generate R2RML Script(s)

  1. Attach ODBC-accessible Data Sources to Virtuoso. (If your preferred data source has no ODBC driver but does have a JDBC driver, just add an ODBC-to-JDBC Bridge Driver like our Lite Edition or Enterprise Edition to the mix.)
  2. Use the SQL Tables (RDBMS Tabular Relations) to RDF Sentence/Statement Relations Wizard (more commonly known as the RDB2RDF Wizard) to select SQL Tables from which RDF Views are to be generated. (This step is discussed elsewhere.)
  3. Rather than executing the script that is generated, copy and paste the generated R2RML sentences to a text/plain or text/turtle document, like northwind-8890-tables-r2rml.ttl (downloadable).

Steps to Generate RDF View(s)

  1. Use SPARQL 1.1 INSERT orLOAD, or the Virtuoso Sponger with no-sponge option set, to load the content of northwind-8890-tables-r2rml.ttl into a target named graph (e.g., urn:northwind:8890:tables:r2rml:mappings). For example:
    LOAD <northwind-tables-r2rml.ttl> INTO <urn:northwind:8890:tables:r2rml:mappings>
  2. Run Virtuoso’s built-in R2RML Processor, DB.DBA.R2RML_MAKE_QM_FROM_G, as in:
    EXEC ('SPARQL '|| DB.DBA.R2RML_MAKE_QM_FROM_G ('urn:northwind:8890:tables:r2rml:mappings'));
  3. Use one of Virtuoso’s URI-Rewrite Stored Procedures (DB.DBA.URLREWRITE_CREATE_REGEX_RULE) to create a Rewrite Rule. This amounts to leveraging the power of SPARQL to produce a specific Entity Description Document for the Entity identified in the HTTP URI lookup request.
DB.DBA.URLREWRITE_CREATE_REGEX_RULE
(
'demo_rule2',
1,
'(/[^#]*)',
vector('path'),
1,
'/sparql?query=DESCRIBE+%%3Chttp%%3A//^{URIQADefaultHost}^%U%%23this%%3E+FROM+%%3Chttp%%3A//^{URIQADefaultHost}^/Demo%%23%%3E&format=%U',
vector('path', '*accept*'),
null,
'(text/rdf.n3)|(application/rdf.xml)|(text/n3)|(application/json)',
2,
null
);
http://demo.openlinksw.com/Demo/suppliers/SupplierID/12#this

Understanding RDF View Generation Output

Note: This applies to both local and remote data. All data managed by a Virtuoso RDBMS is represented internally as relational Tables, implicitly contradicting the notion that SQL and SPARQL are mutually exclusive while also disproving the notion that Data Virtualization across Relational Models isn’t performant or scalable.

Exploring Information about RDF Views generated by a Virtuoso instance using SPARQL

PREFIX virtrdf: 
<http://www.openlinksw.com/schemas/virtrdf#>
SELECT DISTINCT ?s1 as ?quadMap
?s3 as ?rdfPredicate
?s2 as ?sqlTable
WHERE
{
?s1 a virtrdf:QuadMap .
?s1 virtrdf:qmTableName ?s2 .
?s1 virtrdf:qmPredicateRange-rvrFixedValue ?s3 .
FILTER (! contains(str(?s2),"/csv"))
FILTER (! contains(str(?s2),'csv.'))
FILTER (contains(str(?s2),"supplier"))
}
LIMIT 200
Using a Table to align a Quad Map to RDF Predicates mapped to columns in an SQL Table
PREFIX virtrdf: 
<http://www.openlinksw.com/schemas/virtrdf#>
PREFIX rdf:
<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?s2 AS ?sqlTable
?s1 AS ?rdfQuadMap
?s6 AS ?rdfSubjectURITemplate
?s3 AS ?rdfPredicate
?s10 AS ?rdfObjectValue
?s8 AS ?rdfObjectDataType
WHERE
{
?s1 a virtrdf:QuadMap .
?s1 virtrdf:qmTableName ?s2 .
?s1 virtrdf:qmPredicateRange-rvrFixedValue ?s3 .
?s1 virtrdf:qmSubjectMap ?s4 .
?s4 virtrdf:qmvFormat ?s5 .
?s5 virtrdf:qmfCustomString1 ?s6 .
?s1 virtrdf:qmObjectMap ?s7 .
?s7 virtrdf:qmvFormat ?s8 .
?s7 virtrdf:qmvColumns ?s9 .
?s9 rdf:_1 ?s10

FILTER (! contains(str(?s2),"/csv"))
FILTER (! contains(str(?s2),'csv.'))
FILTER (contains(str(?s2),'"northwind"."suppliers"'))
}
Detailed Mapping from Tabular Relations to RDF Sentence Collection (Graph) Relations

What’s Happening Here?

What is R2RML?

Visualizing information conveyed by RDF sentences that leverage R2RML terms, focusing on the "Demo"."Demo"."Suppliers" table [Live Link]

Using SPARQL to Explore R2RML

PREFIX virtrdf: 
<http://www.openlinksw.com/schemas/virtrdf#>
PREFIX rdf:
<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?s2 as ?sqlTable
?s1 as ?rdfQuadMap
?s6 as ?rdfSubjectURITemplate
?s3 as ?rdfPredicate
?s10 as ?rdfObjectValue
?s8 as ?rdfObjectDataType
WHERE
{
?s1 a virtrdf:QuadMap .
?s1 virtrdf:qmTableName ?s2 .
?s1 virtrdf:qmPredicateRange-rvrFixedValue ?s3 .
?s1 virtrdf:qmSubjectMap ?s4 .
?s4 virtrdf:qmvFormat ?s5 .
?s5 virtrdf:qmfCustomString1 ?s6 .
?s1 virtrdf:qmObjectMap ?s7 .
?s7 virtrdf:qmvFormat ?s8 .
?s7 virtrdf:qmvColumns ?s9 .
?s9 rdf:_1 ?s10 .
FILTER (! contains(str(?s2),"/csv"))
FILTER (! contains(str(?s2),'csv.'))
FILTER (contains(str(?s2),'"northwind"."suppliers"'))
}

How Does Virtuoso Exploit R2RML?

How are Tables transformed into RDF Graphs?

  • Each Table Name functions as a Relation Identifier — a collection of Tuples representing entity relationships grouped by a predicate (in this case, the Table Name)
  • A Relation Header is derived from a set of Attribute Names — depicted as Header Rows (Column or Field identifiers)
  • A Row (or Record) is derived from a set of intersections between Tuples and Attribute Names
  • A specific Row is uniquely identified by a selection of Tuple and Attribute Name intersections (usually one, but may be multiple), subject to the worldview of the relation modeler
Illustrated Relation (except Relation Header/Heading)
Illustrated Relation (including Relation Header/Heading)
  • Each SQL Tabular Relation is comprised of one or more Tuples and a Relation Header/Heading for delineating Columns (Fields)
  • Each RDF Relation is comprised of three Tuples — a Subject, a Predicate, and an Object
  • Each SQL Table identifies an RDF Entity Type (or RDF Class)
  • One or more designated SQL Column(s) uniquely identify each RDF Entity — an instance of the RDF Entity Type or RDF Class that identifies the Subject of an RDF sentence
  • Each SQL Column Name (or SQL Field) identifies an RDF Entity Attribute or RDF Entity Relationship Type — an instance of an RDF Property Class that’s identified by the Predicate of an RDF sentence
  • Each SQL Column Value identifies an RDF Entity Attribute Value associated with a specific RDF Data Type— an instance of an RDF Entity Type that identifies the Literal Object of an RDF sentence
  • Each SQL Column Value functioning as a Foreign Key identifies an RDF Entity Attribute Value associated with an RDF Reference Type — an instance of an RDF Entity Type or RDF Class that identifies the Object of an RDF sentence via a URI (e.g., an HTTP URI)

Deploying RDF using Linked Data Principles

  1. Identify the Subject of each RDF sentence with a Hyperlink (specifically, a HTTP URI).
  2. Identify the Predicate of each RDF sentence with an HTTP URI.
  3. Identify the Object of each RDF sentence with an HTTP URI or a Literal (which may be typed or untyped).
Effect of using HTTP URIs (Hyperlinks) to Identify Entities Unambiguously

From Linked Data to a Semantic Web

Using HTTP URIs (Hyperlinks) to identify the Subject, Predicate, and Object [optionally] of a sentence

Live Examples

Conclusion

Related

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Kingsley Uyi Idehen

CEO, OpenLink Software —High-Performance Data Centric Technology Providers.