How Do You Know You Need a Multi-Model Relational Database Management System (RDBMS)?

Kingsley Uyi Idehen
OpenLink Virtuoso Weblog
Jun 14, 2018
Source: https://www.linkedin.com/feed/update/urn:li:activity:6477955590613196800 — LinkedIn post by Scott Taylor

Historically, organizations have looked to Relational Database Management Systems (RDBMS) to serve the needs of their line-of-business (or systems-of-record) applications. Unfortunately, the Single-Model nature of the typical RDBMS (most commonly based on a Tabular Relational model, and queried using SQL) isn’t immediately obvious, because marketing-driven usage has long conflated "SQL" with "RDBMS".

Since the emergence of the World Wide Web, and the realization of the myriad ways its core architecture may be applied to enable richer interactions with data, organizations have increasingly discovered the shortcomings inherent to their Single-Model RDBMS deployments. At the same time, they have been victimized by industry hype and buzzwords that encourage them to simply exchange one model for another: to rip-and-replace, swapping their Tabular RDBMS for a Graph, Key-Value, Document, or other non-Tabular RDBMS.

Fundamentally, an RDBMS is a software application which may (or may not!) support SQL for declaratively performing data definition and/or manipulation operations on data represented as entity relationship types (relations).

There is no rule or reality that implicitly or explicitly limits an RDBMS to supporting one kind of relation modeling or one query language; that’s just a shorter and more pragmatic route to implementation — exploited by most major SQL-RDBMS vendors at a time when the costs of volatile and persistent storage (memory and disk, respectively) were prohibitively high (which is no longer the case today!).

What is a Multi-Model RDBMS?

Database models apart from the tabular approach include Predicate Graphs (e.g., RDF; a directed graph without a root) and Trees (e.g., XML; a directed graph with a root).

As may be obvious, a Multi-Model RDBMS should allow you to organize your data using a variety of tuple-structures that manifest as Relations (Sets of Entity Relationship Types).
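
As a rough illustration of these shapes, the sketch below holds the same facts as a tabular row, as subject→predicate→object triples, and as a rooted tree. Every name in it (`customers`, `ex:alice`, `row_to_triples`) is hypothetical, and plain Python structures stand in for a real storage engine.

```python
# Hypothetical data; plain Python structures stand in for a real storage engine.

# Tabular relation: a fixed set of columns, one tuple per entity.
columns = ("id", "name", "city")
rows = [("alice", "Alice", "Boston")]

# Predicate graph (RDF-style): one (subject, predicate, object) triple per fact.
triples = {
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:city", "Boston"),
}

# Tree (XML/JSON-style): a single root with nested children.
tree = {"customers": [{"id": "alice", "name": "Alice", "city": "Boston"}]}


def row_to_triples(columns, row, prefix="ex:"):
    """Turn one tabular row into triples, using the first column as the subject."""
    subject = prefix + str(row[0])
    return {(subject, prefix + col, val) for col, val in zip(columns[1:], row[1:])}


# The row and the triple set carry the same information in different shapes.
assert row_to_triples(columns, rows[0]) == triples
```

The point of the sketch is only that the shapes are interchangeable views of one relation; how a given engine stores or converts them is outside its scope.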

Illustration by John O Gorman shared via comment thread

A Multi-Model RDBMS may support the query language associated with only one of its supported Models (e.g., SQL for SQL/Tabular Relations; or SPARQL for RDF/Graph Relations; or XPath or XQuery for XML/Tree Relations), but would optimally support multiple such languages — likely at least one per model — for declaratively performing data definition and manipulation operations on those relations.
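
To make the "one language per model" idea concrete, here is a minimal sketch that asks the same question (which customers are in Boston?) of three shapes of the same data. It uses only the Python standard library: `sqlite3` stands in for a SQL engine, `xml.etree.ElementTree` provides a limited XPath subset, and the graph case is a hand-rolled pattern match with the roughly equivalent SPARQL shown in a comment. The table, element, and `ex:` names are assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Tabular relation queried with SQL (sqlite3 is only a stand-in engine).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id TEXT, city TEXT)")
db.executemany("INSERT INTO customer VALUES (?, ?)",
               [("alice", "Boston"), ("bob", "Lagos")])
sql_ids = [r[0] for r in db.execute(
    "SELECT id FROM customer WHERE city = ? ORDER BY id", ("Boston",))]

# Tree relation queried with the XPath subset that ElementTree supports.
doc = ET.fromstring(
    "<customers>"
    "<customer id='alice'><city>Boston</city></customer>"
    "<customer id='bob'><city>Lagos</city></customer>"
    "</customers>")
xpath_ids = [c.get("id") for c in doc.findall("./customer[city='Boston']")]

# Graph relation: a hand-rolled triple-pattern match, roughly what the SPARQL
#   SELECT ?s WHERE { ?s ex:city "Boston" }
# would express against an RDF store.
triples = [("ex:alice", "ex:city", "Boston"), ("ex:bob", "ex:city", "Lagos")]
graph_ids = [s for (s, p, o) in triples if p == "ex:city" and o == "Boston"]

print(sql_ids, xpath_ids, graph_ids)
```

In a Multi-Model RDBMS these three queries would run against one engine rather than three separate libraries.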

Beyond the above, actual documents may function as data sources for this kind of RDBMS, such as —

  • CSV — Tabular Relations (Tables)
  • XML — Trees (Directed Graph with a Root), or RDF sentences/statements (Directed Graph without a Root via subject→predicate→object triples)
  • JSON — Trees (Directed Graph with a Root), or RDF sentences/statements (Directed Graph without a Root via subject→predicate→object triples)
  • RDF-Turtle, RDF-N-Triples, RDF-N-Quads, JSON-LD — RDF sentences/statements (Directed Graph without a Root via subject→predicate→object triples)

— as depicted below.
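
The mappings in the list above can be sketched in a few lines. The example reads a CSV document as a table and flattens a JSON document into subject→predicate→object triples. The sample data and the `json_to_triples` helper are hypothetical, and the way nested objects are named is an assumption of this sketch rather than a standard mapping; JSON arrays are deliberately left out.

```python
import csv
import io
import json

csv_text = "id,name\nalice,Alice\nbob,Bob\n"
json_text = '{"id": "alice", "name": "Alice", "address": {"city": "Boston"}}'

# CSV maps naturally onto a tabular relation: header = columns, lines = rows.
table = list(csv.DictReader(io.StringIO(csv_text)))


def json_to_triples(node, subject):
    """Flatten a JSON object tree into (subject, predicate, object) triples.

    Nested objects get a derived subject name; this naming scheme is an
    assumption made for the sketch, not a standard mapping.
    """
    triples = []
    for key, value in node.items():
        if isinstance(value, dict):
            child = f"{subject}/{key}"
            triples.append((subject, key, child))
            triples.extend(json_to_triples(value, child))
        else:
            triples.append((subject, key, value))
    return triples


triples = json_to_triples(json.loads(json_text), "alice")
for t in triples:
    print(t)
```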

What Challenges can a Multi-Model RDBMS Solve?

1. Data Virtualization

A typical enterprise is driven by an ever-increasing plethora of line-of-business applications, each of which introduces its own preferred Single-Model RDBMS that ultimately manifests as a data silo.

A Multi-Model RDBMS has the ability to provide a conceptual abstraction that shields users and developers of newer applications and services from the underlying structural disparity across existing data silos. For instance, it can immediately reduce or eliminate expensive Data Wrangling that is otherwise necessary to —

  • Harmonize Co-references — by recognizing identifiers for the same entity across different databases through reasoning and inference, informed by the nature of various entity relationship types (e.g., equivalence, inversion, subsumption, etc.)
  • Consolidate Various Data Shapes — by presenting data as Tables, Graphs, and/or Trees, as requested and/or required by the applications and services consuming that data
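
Co-reference harmonization can be pictured as grouping identifiers that have been declared equivalent. The sketch below covers only the equivalence case (in the spirit of an `owl:sameAs` assertion), using a small union-find; inversion and subsumption are not covered. The identifiers (`crm:42`, `erp:A7`, `web:alice`) and the `canonical_ids` function are hypothetical.

```python
def canonical_ids(same_as_pairs):
    """Group identifiers declared equivalent and map each to one canonical ID."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller ID as the representative.
            lo, hi = sorted((root_a, root_b))
            parent[hi] = lo

    return {x: find(x) for x in list(parent)}


# Three silos refer to the same person under different identifiers.
ids = canonical_ids([("crm:42", "erp:A7"), ("erp:A7", "web:alice")])
print(ids["web:alice"])
```

Once every identifier resolves to one representative, records from different silos can be joined on that representative without copying or rewriting the source data.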

2. Cost-Effective Digital Transformation

The fundamental goal of Digital Transformation is to make data access and interaction better serve the needs of an enterprise, so that it can track its ever-changing agility needs more closely. In other words, the goal is to make technology work for its users, rather than the other way around, which has been the all-too-common pattern to date.

Not only may a Multi-Model RDBMS harmonize data disparity via conceptual abstraction, it also has the potential to facilitate natural interaction patterns that can lead to both serendipitous discovery of new insights and surreptitious curation of data. For instance, the familiar Create File→Save File→Share File approach to textual documents is not confined to word-processors and other text editing; rather, it can also be used to work naturally with structured data, as a complement to application-specific data entry forms.

3. Maximization of Return on Previous DBMS Investments

Most important of all, deployment of a Multi-Model RDBMS should not require “ripping and replacing” existing Single-Model (or the rare existing Multi-Model) RDBMS installations that have driven — and may still drive — mission-critical business operations. It should simply enable a company to make better use of such older systems, by providing middleware shims that enable them to feed newer solutions, all aimed at delivering systems-of-intelligence that handle —

  • the temporal nature of entity-state, via reasoning and inference — that is, the evolution of an entity from state to state, like “suspect” to “evaluator” to “customer”, or “customer” to “partner”, or even “partner” to “competitor”
  • temporal queries that can provide rich insights — that is, queries that have a timeframe context (e.g., describe contact_x circa June 2014)
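
A temporal query of the kind described above can be sketched as a lookup over a dated state history. The history below is invented for illustration, and `state_as_of` is a hypothetical helper; a real system would derive the states through reasoning and inference rather than from a hard-coded list.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical state history for one entity: (valid_from, state), sorted by date.
history = [
    (date(2013, 1, 10), "suspect"),
    (date(2014, 3, 1), "customer"),
    (date(2016, 9, 15), "partner"),
]


def state_as_of(history, when):
    """Return the state in effect on `when`, or None if it predates the history."""
    starts = [valid_from for valid_from, _ in history]
    idx = bisect_right(starts, when) - 1
    return history[idx][1] if idx >= 0 else None


# "Describe contact_x circa June 2014"
print(state_as_of(history, date(2014, 6, 15)))  # customer
```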

Why Virtuoso is the Multi-Model RDBMS for You!

First shipped in 1998, OpenLink Virtuoso is a time-tested product, uniquely designed from our deep understanding of the challenges presented by heterogeneously-shaped and otherwise disparate data sources.

Virtuoso’s Technical Architecture

Virtuoso’s key Multi-Model features include support of:

  • Conceptual Virtualization of Entity Relationship Types (Relations) — local and remote data represented as SQL Tables or as RDF Graphs may be virtualized as either or both
  • Both SPARQL and SQL — for declarative operations on data represented as SQL Tables and/or as RDF Graphs, combined with coherent integration of XQuery and XPath into either language
  • SPASQL — a hybrid implementation of SPARQL-within-SQL, enabling traditional SQL-based tools to operate on RDF Graphs without direct modification of those tools
  • Attachment of 3rd party ODBC- or JDBC-accessible data sources — via drivers (a.k.a. connectors or providers); note that attachment of JDBC data sources requires use of an ODBC-to-JDBC Bridge
  • Mapping data files (CSV, TTL, etc.) to SQL Tables or RDF Graphs — including those spread over conventional or distributed file systems
  • Uncompromised yet Maximally Flexible Security — via sophisticated attribute-based access controls (ABAC) rather than simple role-based access controls (RBAC)
  • Existing Open Standards (ODBC, JDBC, ADO.NET, OLE DB, SQL, SPARQL, HTTP, R2RML, SPIN, etc.) — employed to deliver all of the above
  • High Performance and Scalability — as a foundation of all of the above
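
The difference between role-based and attribute-based access control mentioned in the list above can be shown with a generic sketch. This is not Virtuoso's actual access-control mechanism or API; the functions, the policy, and the user and graph records are all hypothetical and exist only to contrast the two approaches.

```python
def rbac_allows(user, required_role):
    """Role-based check: access depends only on membership in a role."""
    return required_role in user["roles"]


def abac_allows(user, resource, action, policies):
    """Attribute-based check: any policy may inspect user, resource, and action."""
    return any(policy(user, resource, action) for policy in policies)


# Hypothetical policy: analysts may read graphs belonging to their own department.
def same_department_read(user, resource, action):
    return (action == "read"
            and "analyst" in user["roles"]
            and user["department"] == resource["department"])


alice = {"roles": {"analyst"}, "department": "sales"}
sales_graph = {"name": "urn:sales", "department": "sales"}
hr_graph = {"name": "urn:hr", "department": "hr"}

policies = [same_department_read]
print(rbac_allows(alice, "analyst"))                      # role alone says yes
print(abac_allows(alice, sales_graph, "read", policies))  # attributes match
print(abac_allows(alice, hr_graph, "read", policies))     # attributes differ
```

The role check gives the same answer for every graph, while the attribute check can distinguish between resources that the same role would otherwise treat identically.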

This unparalleled combination of pedigree and features assures that all users will benefit substantially by adding Virtuoso to their toolkit for data access, management, and integration.

Kingsley Uyi Idehen
OpenLink Virtuoso Weblog

CEO, OpenLink Software — High-Performance Data Centric Technology Providers.