Northwind SQL vs SPARQL

Marcelo Barbieri
Feb 10 · 14 min read

Introduction

The purpose of this demonstration is to allow hands-on experience on Knowledge Graph databases by executing SPARQL queries side-by-side with their SQL counterparts. It is a learning by example experience, as per the one below.

SQL

Azure Data Studio

SPARQL

Stardog Studio

Northwind is a well-known properly normalized online e-commerce database which is largely used for training purposes across many database platforms. Refer to w3schools.com for more info.

The version we are using in this demonstration has gone through some design updates and may look slightly different from the one mentioned above.

This document contains instructions on how to set up the Northwind database on the following products and platforms:

  1. SQL Server 2019 running on a Linux docker container (Relational Database)
  2. Stardog RDF Graph database* running on a Linux docker container (RDF Triplestore)
  3. GraphDB RDF Graph database* running on a desktop installation (RDF Triplestore)

The SQL Server and RDF Graph databases contain the same data.

* “These semantic technologies are the core technologies for any Enterprise Knowledge Graph (EKG)”

You only need one of the RDF Graph databases above (items 2 or 3) to be able to execute the SPARQL queries.

Northwind database

Northwind ER Diagram (relational)

Northwind Schema (RDF Graph Database)

Visualization of the Northwind Graph in Stardog.

Stardog Studio

Class Count in GraphDB.

GraphDB

Northwind Class Relationships in GraphDB.

GraphDB

An Order example:

GraphDB
GraphDB

For more information on how to explore the Northwind RDF Graph Database, please refer to the following article:

Setting up the sample databases

Setting up the databases is straightforward, which are made available in the form of Docker Linux images or local install.

The client query tools are also available on multiple platforms and installation steps are easy to follow. You should be up and running in a few minutes.

The first step is to download and Install Docker Desktop.

Setting up the sample databases on SQL Server

For the SQL Server database, you will pull a Linux image with the sample databases already configured and loaded with data.

Executing the following command from terminal/command line will pull the image and start the container for you.

docker run -it -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=Monday*1" -p 1433:1433 --name sql1 -d mbarbieri77/sqlserver2019ubuntu18:latest

The image above contains the following MS Sample Databases. We are going to use the Northwind database in our demonstrations.

  • Revised version of Northwind.
  • AdventureWorks2017
  • WideWorldImporters

Follow the instructions on the screen.

Download and Install Azure Data Studio (desktop SQL query tool).

To connect to the Northwind database in the running container, follow the next steps.

Create a new connection.

Azure Data Studio

Fill up the connection details as per screenshot below.

The password for SA user is: Monday*1

Azure Data Studio

After establishing a connection, you can browse the structure of the database in the tree view on the left-hand side.

Azure Data Studio

Execute a test query.

Azure Data Studio
Azure Data Studio

You can stop and start the container when required using the following commands, however, you don’t need to do it now, as the container is already running.

docker stop sql1docker start sql1

Note that changes to the databases running inside the container won’t be lost between restarts.

Setting up Northwind database on Stardog

For Stardog, you will pull a Linux image with the latest version of Stardog Installed and then load the Northwind data contained in a N-Triple file.

Download and Install Docker Desktop, if not already done previously.

From terminal/command line execute the following command to pull the latest Stardog image and start the container:

docker run -it -p 5820:5820 -p 5806:5806 -e STARDOG_EXT=/opt/stardog/ext -e STARDOG_HOME=/var/opt/stardog -v $STARDOG_HOME:/var/opt/stardog --name stardog1 stardog/stardog:latest

Note: port mappings represent -p <host>:<container>.

You will be prompted to answer a few terms and trial license questions before Stardog Server starts.

Note that you must enter a business email if you are not requesting an academic trial.

Install Stardog Studio (browser SPARQL query tool) on the local machine. Select the option “Access Studio Now”, and fill up the form to get access to it.

Open Studio and connect to the database in the running container, by following the next steps.

To connect to the database, fill up the connection details as per screenshot below.

Password: admin

Stardog Studio

Create a new repository called Northwind.

Select the appropriate icon on the left hand side bar, as per screenshot below.

Stardog Studio

Fill up the name of the repository and leave all the remaining configurations with their default values and click on “Create”.

Stardog Studio

Download the Northwind N-Triple file (RDF Dump) from GitHub repository, which contains the data to be loaded into the sample database.

Select “Download”, as per screenshot below and unzip the file on a local folder.

Github

Click on “Load Data” and select the N-Triple file unzipped in the previous step. Set the file format to “Turtle” and click on “Load”.

Stardog Studio
Stardog Studio

Replace the http://api.stardog.com/ namespace with http://www.mysparql.com/resource/northwind/ and add the foaf namespace http://xmlns.com/foaf/0.1/, as illustrated below.

Stardog Studio

To stop and start the Stardog container when required, use the commands below.

You do not need to execute them now, as the container is currently running.

docker stop stardog1docker start stardog1

Setting up Northwind database on GraphDB

For GraphDB, we chose a local installation, to give you more options on how to set up your RDF Graph Database.

Download GraphDB Free here.

GraphDB

Install the GraphDB desktop application and then run the App, which will open the following page on the browser:

http://localhost:7200

MacOS

Select “Create new repository”.

GraphDB

Select GraphDB Free option.

GraphDB

Fill up the create repository form as highlighted below.

GraphDB

Select the Northwind Repository.

GraphDB

Download the Northwind N-Triple file from GitHub repository, which contains the data to be loaded into the sample database.

Select “Download”, as per screenshot below and unzip the file on a local folder.

Github

Follow the steps below to import the dumpdataNTRIPLE7.nt data file.

GraphDB
GraphDB
GraphDB
GraphDB

Execute a simple test query.

GraphDB

Add the foaf http://xmlns.com/foaf/0.1/ and empty namespace http://www.mysparql.com/resource/northwind/ one by one as per illustration below.

GraphDB
GraphDB

Note that you may need to supply prefixes when executing queries, as per illustration below.

SQL vs SPARQL Queries

The queries used in this demonstration can be downloaded from github here: SQL and SPARQL.

Individual queries can be executed in Azure Data Studio by highlighting the query and clicking on the green “Run” button, and in Stardog Studio by right-clicking and choosing “Run selected” (Ctrl+Shift+E on Windows or Cmd+Shift+E on MacOS). In GraphDB you must paste and execute queries in the web console one by one.

Our plan is to keep adding queries to this list to cover the most functionality possible between these two query languages.

Note that the result sets of the initial queries are not sorted, as ORDER BY is only introduced later in this tutorial.

Basic select with specified columns (not sorted)

Filtering data

Testing for existence of rows/graph patterns

Searching by string comparison

Joins

Note that despite the results being cropped in the screenshot, they match on both queries.

Inner Join

As an exercise, try and add the “ContactName” and “Address” columns missing in the SPARQL query.

Left Join

Using logical operators

Filtering on ranges

Filtering on list of values

Working with Nulls

Sorting data

Eliminating duplicates

Column alias and string concatenation

Limiting results

Pagination

Counting

Distinct counting

Grouping and aggregating data

Recommendation (Introduction to Property Paths in SPARQL)

Refer to the end of this topic for more details on how the property path above was built.

The property path used in the query above can be easily identified in the Graph diagram below.

The following are the 6 steps taken to develop the compact version of the query above.

Step 1: Graph pattern traversing order, orderDetail, product and customer bound variables (nodes).

# List order items that contain product-61SELECT *WHERE {        ?orderDetail :hasProduct ?product ;        :belongsToOrder ?order .        ?order :hasCustomer ?customer .        FILTER (?product = :product-61)}

Step 2: Invert the hasProduct path expression to match the following direction: product → orderDetail → order → customer

# The same result is returnedSELECT *WHERE {        ?product ^:hasProduct ?orderDetail . # Invert direction        ?orderDetail :belongsToOrder ?order .        ?order :hasCustomer ?customer .        FILTER (?product = :product-61)}

Step 3: Use sequence path to omit the binding of the ?orderDetail variable.

SELECT *WHERE {       ?product ^:hasProduct/:belongsToOrder ?order .       ?order :hasCustomer ?customer .       FILTER (?product = :product-61)}

Step 4: Use sequence path to omit the binding of the ?order variable.

# All customers that bought product-61
# Distinct eliminates duplicates in case the same customer bought a product more than once
SELECT DISTINCT * WHERE { ?product ^:hasProduct/:belongsToOrder/:hasCustomer ?customer . FILTER (?product = :product-61)}ORDER BY ?product

Step 5: Remove the ?product variable and apply the filter (IRI) to the subject directly.

SELECT DISTINCT *WHERE {  :product-61 ^:hasProduct/:belongsToOrder/:hasCustomer ?customer .}ORDER BY ?product

Step 6: And finally, omit the binding of the ?customer variable and invert the full path back to product.

SELECT (COUNT (1) AS ?Count)WHERE {
:product-2 ^:hasProduct/:belongsToOrder/:hasCustomer/
^(^:hasProduct/:belongsToOrder/:hasCustomer) :product-61
}

For more details on Property Path, please refer to this tutorial.

Combining multiple result sets using Union

As an exercise, try and add the “Type” column missing in the SPARQL query.

Subqueries

The following SQL queries produce the same results as the one above.

Another example of subquery, this time using aggregate functions.

Inserting and updating data

Inserting a new customer

Checking the new customer

Updating the new customer

Checking the updated data.

Northwind RDF N-Triple File

The Northwind N-Triple file (RDF Dump) can be downloaded from GitHub repository here.

More Queries

Getting started with the SPARQL (the basics)

Reference

agnos.ai

The Enterprise Knowledge Graph Company

agnos.ai

agnos.ai is a specialist consultancy that designs and implements Enterprise Knowledge Graphs. We harness the power of semantic technology to solve your most complex enterprise data challenges.

Marcelo Barbieri

Written by

Knowledge Graph Engineer - agnos.ai

agnos.ai

agnos.ai is a specialist consultancy that designs and implements Enterprise Knowledge Graphs. We harness the power of semantic technology to solve your most complex enterprise data challenges.

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store