Using Microsoft Power BI as a Launch Point for Semantic Web Exploration

with ODBC and Virtuoso SPASQL (SQL extended by SPARQL)

Situation Analysis

In a typical SQL RDBMS, all Keys — whether Candidate, Primary, or Foreign — are scoped to a specific RDBMS instance. Basically, these identifiers lose their unique identification properties once you enter the realm of distributed databases.

What is a Super Key?

A Super Key is an identifier that unambiguously names its referent — the entity it identifies — regardless of the context in which it is encountered. When looked up (dereferenced), a Super Key always resolves to a description of its referent, without any operating system, RDBMS instance, platform, or other dependency.

On an HTTP-based network, whether public or private, hyperlinks (HTTP URIs) can serve as such Super Keys, and provide a powerful mechanism for enhancing Reports and Dashboards produced by Business Intelligence and Analytics oriented productivity tools.

Super Keys and Business Intelligence & Analytics

Reports and Dashboards are fundamentally insight-oriented projections of data in context. Using hyperlinks — Super Keys — as entity names enables every reader to perform additional ad-hoc exploration without the need to create yet another report.

Virtuoso can add these identification and exploration capabilities to any existing ODBC-compliant Business Intelligence or Analytics tool, through a simple ODBC connection.

The rest of this post demonstrates the use of Virtuoso to inject Super Keys into Microsoft Power BI reports.

Enhancing Power BI, with Super Keys from Virtuoso


Getting Data from Virtuoso into Power BI

(1) Start your Power BI instance.

(2) Tell Power BI to Get Data from an ODBC Data Source.

(3) Choose your Virtuoso-targeting DSN, and input the SQL query (detailed below) that will get the data you want.

Virtuoso makes this work by extending SQL with SPARQL; in this case, by accepting SPARQL queries in the SQL FROM clause. The effect is a blending of entity relationship types represented as “Records in a Table” (SQL) with entity relationship types represented as “RDF sentence/statement Graphs” (SPARQL).

Here, we will query against DBpedia (i.e., Wikipedia content transformed to Structured Data form, and published in line with Linked Data principles).

This is the SPASQL (SQL and SPARQL fusion) query we’ll use —

SELECT DISTINCT AS "Actor Label", AS "DBpedia ID", AS "School ID",
DBpedia.schoolLabel AS "School Label"
PREFIX dct: <>
SELECT ?actor
dct:subject <> ;
<> ?school ;
foaf:name ?name .
rdfs:label ?schoolLabel .
FILTER ( LANG(?name) = "en" )
FILTER ( LANG(?schoolLabel) = "en" )
ORDER BY ASC(?school)
) AS DBpedia

(4) When prompted, authenticate against the ODBC Data Source.

(5) Power BI will show a preview of the query result set (also known as the query solution).

(6) Click the Load button to retrieve the complete result set.

Having retrieved data from our ODBC Data Source, we can proceed to make the DBpedia URIs (hyperlinks) live — i.e., make them clickable links — en route to exposing Platform Agnostic Super Key functionality.

Enabling Ad-hoc Exploration

(7) From the Modeling menu, select New Table. A pop-up menu in the formula entry box will help you select the Fields (a/k/a columns) to be displayed. We used —

Table = ALL(Query1[DBpdia ID], Query1[School ID])

(8) In the Home Table section of the toolbar, change Data Category: Uncategorized to Data Category: Web URL. This tells Power BI to treat the strings as links instead of literal text.

(9) You now have two columns of live links.

(10) Clicking on any link will result in your web browser retrieving a page (in this case from DBpedia) that describes whatever is identified by that link.

Video Demonstration