Visualizing SPARQL Results with Airtable

Using Airtable as a Semantic Web of Linked Data Exploration Launch Point

Daniel Heward-Mills
OpenLink Virtuoso Weblog
6 min readJan 14, 2018

--

Why is this important?

In a typical SQL RDBMS, all Keys — whether Candidate, Primary, or Foreign — are scoped to a specific RDBMS instance. Once you enter the realm of distributed databases, these identifiers suffer from diminishing utility as you move from a monolithic to a heterogeneous RDBMS setup.

What is a Super Key?

A Super Key is an identifier that unambiguously names its referent — the entity it identifies — regardless of the context in which it is encountered. When looked up (dereferenced), a Super Key always resolves to a description of its referent, without any operating system, RDBMS instance, platform, or other dependency.

On an HTTP-based network, whether public or private, hyperlinks (HTTP URIs) can serve as such Super Keys, and thereby provide a powerful mechanism for enhancing Reports and Dashboards produced by Business Intelligence, Analytics, and other oriented productivity tools associated with Data Science and/or Informatics.

This powerful capability is a result of the HTTP Fragment Identifier (“#”) which enables an HTTP URI to identify anything, rather than being restricted to documents as in the case of of HTTP URLs.

Source: https://www.slideshare.net/fabien_gandon/wimmics-research-team-overview-2017/21

Super Keys and Business Intelligence & Analytics

Reports and Dashboards are fundamentally insight-oriented projections of data in context. Using hyperlinks — Super Keys — as entity names enables every reader to perform additional ad-hoc exploration without the need to create yet another report or write custom code.

OpenLink Virtuoso, a multi-model RDBMS, adds powerful identification and exploration capabilities to existing Business Intelligence or Analytics tools, as long as those tools support open standards such as HTTP, ODBC, JDBC, ADO.NET, or OLE DB .

The rest of this post demonstrates the use of Virtuoso to inject Super Keys into Airtable reports.

Enhancing Airtable, with Super Keys from Virtuoso

Prerequisites:

  • A free or premium account on Airtable
  • Access to a public or private SPARQL endpoint

Creating a CSV Document with Virtuoso for Airtable Public

We will use a Virtuoso-generated CSV document for this exercise as neither ODBC nor JDBC connections are currently supported by Airtable, which is typical of a cloud hosted application.

[1] Navigate to DBPedia’s public SPARQL endpoint and enter the query below (or use this pre-populated form):

[2] Ensure that the Results Format menu selection is HTML (Faceted Browser Links), and click the Run Query button.

[3] An HTML table will be populated with the SPARQL query’s results.

[4] Click on any ActorID URI (for example, http://dbpedia.org/resource/Al_Pacino) to confirm that the URIs resolve. Each URI will open a page from DBpedia that describes that actor using a collection of attribute and value pairings, as depicted below.

[5] Use the Back button in your browser to return to the query editor, and change the Results Format menu selection to CSV; then click Run Query.

[6] A CSV document populated with results from the SPARQL query will be either displayed in your browser or saved to your normal downloads folder, depending on your browser’s capabilities and settings.

Airtable Data Usage

[7] Bring up Airtable in your browser, click to the Bases section, and click New Table Import from Spreadsheet.

[8] Paste your CSV data, or choose a CSV document on your local machine or an Airtable-supported host (e.g., Google Drive).

[8a] Note: Virtuoso also enables you to leverage WebDAV to access your CSV document, if it’s been stored on a host that’s not Airtable-supported.

CSV Document in a local folder mounted via WebDAV scoped to a Virtuoso instance

[9] Click on the newly generated icon for your Spreadsheet (here, it’s the blue box labeled with "Ac").

[10] Because Airtable considers ActorID to be the Primary Field (more commonly called a Primary Key) that uniquely identifies records in this document, the HTTP URIs are not clickable as such. To enable this functionality, we will need to copy these values to a new column. Right-click (or control-click) on the ActorID column name, and select Duplicate field.

[11] Ensure Duplicate cells is selected, and click Duplicate field.

[12] Once the duplication process is completed, click the header of the ActorLabel column, and copy the entire column to your clipboard.

[13] Click the header of the ActorID column and click Paste. Confirm the content replacement when prompted.

[14] All of the cells in the column will be replaced, including the column name. Right-click (or control-click) on the ActorLabel column header, and choose Delete field.

[15] Double-click on the ActorID column header; select URL from the drop down menu; and click Save.

[16] Double-clicking on a cell within the ActorID column will now open a new browser tab containing the content of the URI described.

[17] Repeat step 15 on the SchoolID column.

[18] Double click on any URI in the SchoolID column to confirm that it resolves to a DBPedia page.

Final Product:

Airtable Query Result that include DBpedia Super Keys

Conclusion

This post demonstrates how Virtuoso can be used to enhance an existing productivity tool (in this case, Airtable) by injecting Super Keys (in the form of DBpedia identifiers [HTTP URIs]) into SPARQL Query Results saved to a CSV document.

This evolves Airtable into a productivity tool for non-programmers that provides a Semantic Web of Linked Data exploration launch-point as an added enhancement to its built data visualization functionality.

Finally, this exercise demonstrates how an existing end-user productivity tool can harness the power of a Semantic Web of Linked Data without the need for expensive and failure-prone “rip and replace” misadventures.

Related

--

--