Using Microsoft Excel as a Launch Point for Semantic Web Exploration

Situation Analysis

In a typical SQL RDBMS, all Keys are scoped to a specific RDBMS instance — whether Candidate, Primary, or Foreign — . When the realm of distributed databases is entered, these identifiers lose their unique identification properties.

What is a Super Key?

A Super Key is an identifier that unambiguously names its referent — the entity it identifies — regardless of the context in which it is encountered. When looked up (dereferenced), a Super Key always resolves to a description of its referent, without any operating system, RDBMS instance, platform, or other dependency.

On an HTTP-based network, whether public or private, hyperlinks (HTTP URIs) can serve as such Super Keys, and provide a powerful mechanism for enhancing Reports and Dashboards produced by Business Intelligence and Analytics oriented productivity tools.

Super Keys, Business Intelligence, and Analytics

Reports and Dashboards are fundamentally insight-oriented projections of data in context. Using hyperlinks — Super Keys — as entity names enables every reader to perform additional ad-hoc exploration without the need to create yet another report.

Virtuoso can add these identification and exploration capabilities to any existing ODBC-compliant Business Intelligence or Analytics tool, through a simple ODBC connection.

The rest of this post demonstrates the use of Virtuoso to inject Super Keys into Microsoft Excel spreadsheets.

Enhancing Excel, with Super Keys from Virtuoso

Prerequisites:

Getting Data from Virtuoso into Excel

(1) Start your Excel instance and create a new workbook. Navigate to Data 
-> Get External Data -> From Database

(2) Choose an ODBC Data Source Name (DSN) that targets your Virtuoso instance

(3) When challenged, provide authenticate credentials for the selected ODBC Data Source.

Now that you are connected to your ODBC Data Source, the stage is set for you to construct SQL Queries that return data directly into Excel.

Injecting Super Keys into Query Results

Virtuoso makes this work by extending SQL with SPARQL; in this case, by accepting SPARQL queries in the SQL FROM clause. Fundamentally, this is simply a case of blending entity relationship types represented as “Records in a Table” (SQL) with entity relationship types represented as “RDF sentence/statement Graphs” (SPARQL).

In the example that follows we are going to be querying against DBpedia (i.e., Wikipedia content in Structured Data form, that has been published in line with Linked Data principles).

Here is the SPASQL (SQL and SPARQL fusion) query that we will be using —

SELECT DISTINCT 
DBpedia.name AS "Actor Label",
DBpedia.actor AS "DBpedia ID",
DBpedia.school AS "School ID",
DBpedia.schoolLabel AS "School Label"
FROM
(
SPARQL
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?actor
?school
?name
?schoolLabel
WHERE
{
SERVICE <http://dbpedia.org/sparql>
{
?actor
dct:subject <http://dbpedia.org/resource/Category:Tony_Award_winners> ;
<http://dbpedia.org/ontology/almaMater> ?school ;
foaf:name ?name .
?school
rdfs:label ?schoolLabel .
FILTER ( LANG(?name) = "en" )
FILTER ( LANG(?schoolLabel) = "en" )
}
}
ORDER BY ASC(?school)
LIMIT 100
) AS DBpedia

(4) Here’s the query pasted into Excel’s SQL Query Builder.

(5) Excel showing a preview of the query result set (also known as the query solution).

For this example, keep Existing Sheet selected and click OK to retrieve the complete result set.

(6) Click the Load button to retrieve the complete result set.

(7) Having retrieved data from our ODBC Data Source, we can proceed to make the DBpedia URIs (hyperlinks) live — i.e., make them clickable links — en route to exposing Platform Agnostic Super Key functionality.

Enabling Ad-hoc Exploration

(8) Create a new column (Ex: Actor URI in this example) and use the formula builder to create Hyperlinks using the DBpedia ID column for URLS, and the Actor Label column for link text. This is best done using:

=HYPERLINK([@[URL COLUMN]], [@[LINK TEXT COLUMN]])

For this example, the statement is:

=HYPERLINK([@[DBpedia ID]],[@[Actor Label)]])

(9) Populate the remaining rows with the hyperlink formula by dragging from the first cell to the last relevant cell of the “Actor URI” column.

(10) Repeat the steps with School ID and School Label columns

Those columns also include hyperlinks. Notice the effect of labels on the aesthetics of hyperlinks that have been generated. Basically, the effect delivered to you via the <a/> (anchor) control of HTML, but within Excel.

(11) Click on a hyperlinked-name of interest e.g., “Peter Weiss” and the “Super Key” effect is invoked i.e., you end up with a page from the DBpedia database that describes the entity labeled “Peter Weiss” .

Conclusion

In this post I’ve demonstrated how Virtuoso’s ODBC Driver enhances an existing productivity tool (Microsoft Excel) by injecting Super Keys into SQL Query Results — dispatched over ODBC.

As you can see, ODBC, SQL, and Excel don’t need to be victims of costly “rip and replace” decision. Instead, you can leverage the combined prowess of Virtuoso, Linked Data, and a Semantic Web as powerful foundation for making cost-effective decisions that exploit new technology innovations using existing IT investments.

Related