Jupyter Notebooks with Elixir and RDF

Using IElixir in JupyterLab with the SPARQL.Client package

“white bridge with lights” by Tyler Easton on Unsplash

Now one can’t really claim to be a bona fide data scientist these days if one isn’t brandishing some flavour of notebook to save and run one’s calculations. And arguably one of the best known names in this breed of application is the Jupyter Notebook which is an excellent vehicle for data processing with Python. And Python as a language has become almost synonymous with data wrangling.

But these days there’s much more to Jupyter than the original name-begetting trio of languages – Julia, Python, and R. And notebooks themselves are useful for much more than just data processing alone, e.g. iterative code development and exploration, and multimedia code description and sharing.

So in this post we’ll take a quick look at the IElixir Jupyter kernel from Piotr Przetacznik which will allow us to explore Elixir from within a rich media, interactive compute framework. And following on from some recent posts we’ll use the SPARQL.Client package for Elixir for querying RDF datastores.

Jupyter Notebooks

Jupyter Notebooks (formerly IPython Notebooks – and hence the usual .ipynb file extension) are computational notebooks which integrate live code, text, math and media and are the result of a long line of development. Since Project Jupyter was started up in 2014 the use of Jupyter Notebooks has all but exploded. And just as recently as yesterday, Nature magazine published an article ‘Why Jupyter is data scientists’ computational notebook of choice’ in which it reports that there were ‘more than 2.5 million public Jupyter notebooks in September 2018, up from around 200,000 or so in 2015’.

The notebook pedigree goes at least as far back as the literate programming notions first introduced by Knuth with his WEB system in which he famously programmed TeX. This approach pairs off a document formatting language with a programming language so that one source file may be unbundled either as a document for reading or as a program for executing.

Later developments led to the nascent notebook format itself by which Mathematica, Maple and MATLAB supported interactive numerical and mathematical computations.

More recently the genre has been picked up and expanded by the data science community using IPython for data manipulation with packages such as pandas. And IPython has subsequently been generalized into Jupyter to support multiple compute engine backends, initially with support for the core languages of Julia, Python and R. There are now dozens of Jupyter kernels available with most major programming languages represented.

And just earlier this year the Jupyter Notebook interface itself has been significantly elaborated and upgraded into the much more powerful and extensible JupyterLab.

In essence with notebooks we have something akin to a webified command loop (or, as it’s sometimes known, a REPL) which can be annotated with rich text, math and media and which can be replayed on a cell-by-cell basis or as a whole. The inputs (and outputs) are stored as JSON files which can be edited and replayed, shared directly with other users, or hosted for wider community access. The kernels, or compute engines, can be either locally or remotely hosted. The schematic below gives a general idea of the overall architecture.

From Jupyter documentation.

IElixir and installation

IElixir is a Jupyter kernel specifically built for running Elixir programs, or program fragments.

See the IElixir project for information and help on installation. You would normally need to have both JupyterLab and Elixir pre-installed.

As an alternative, note that there is a Docker image available. I have not tried this myself but there is additional information on the IElixir project page.


Now for this walkthrough we will use the project developed earlier in this series.

And we will also be walking through the notebook which accompanies this post.

For this walkthrough we will be using JupyterLab as our notebook environment. So, we’ll assume now that we already have JupyterLab installed. And we’ll also either download the project locally or just follow along with the GitHub rendition.

We can invoke JupyterLab from the command line as:
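The standard launcher command is simply:

```shell
jupyter lab
```

This starts the local notebook server and opens the JupyterLab interface in the default browser.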

This is our main window on opening JupyterLab and navigating to the project home page using the file explorer, from which we can then open up the notebook.

Let’s hide the file explorer by toggling the tab and just focus on the notebook pane itself:

We’ll follow the plan outlined in the notebook here.

1. Creating the project

I’ve first raised a couple of caveats relating to my own specific lack of knowledge about how best to interact with an Elixir project. (And hopefully somebody can set me straight on this.) We’ll start by copying the project developed in the earlier post into a new project so that we can make any adjustments necessary.

2. Setting up the environment

IElixir uses the concept of virtual environments for managing packages. It uses Boyle as its package manager.

We use Boyle to install the SPARQL.Client package and then check that its modules are available.
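As a rough sketch, assuming the Boyle API as documented in the IElixir README (the environment name and the version constraint here are my own choices):

```elixir
# Create and activate a fresh virtual environment for this notebook
Boyle.mk("sparql_env")
Boyle.activate("sparql_env")

# Install the SPARQL.Client package from Hex into the environment
Boyle.install({:sparql_client, "~> 0.2"})

# List available environments to confirm
Boyle.list()
```

Once the environment is active, any installed packages are available to subsequent notebook cells.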

3. Running basic SPARQL.Client queries

Before going further we’ll just try out a basic SPARQL query to retrieve one RDF triple. And this indicates that there is a problem with this service when using the default POST method with URL escaping.

Instead, we need to call the SPARQL.Client.query function with the GET method. This time around it works fine. And if we run it again and capture the function return we can inspect the SPARQL query result set.
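A minimal sketch of such a call, assuming the DBpedia endpoint and the option names as given in the SPARQL.Client documentation:

```elixir
query = "select * where {?s ?p ?o} limit 1"
service = "http://dbpedia.org/sparql"

# Use HTTP GET with the SPARQL 1.1 protocol instead of the
# default POST method, which fails against this service
{:ok, result} =
  SPARQL.Client.query(query, service,
    request_method: :get,
    protocol_version: "1.1"
  )

# Inspect the result set (a SPARQL.Query.Result struct)
result.results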

Well, OK, that works now. So let’s try a more meaningful query.

4. Installing our TestQuery modules

Let’s first confirm that none of our TestQuery module’s functions are available, and then load the module source directly.

This fails, and if we look into the source we can see that the call is failing since the application is unknown. (See the caveats raised earlier.)

Replacing that with the relative path to the directory in our project we can try importing again. And this time it works as confirmed by checking the exported functions.
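By way of illustration only – the attribute name and the exact call here are my reconstruction, not necessarily the project’s own code – the failure and fix look something like:

```elixir
# Fails in the notebook: the application (here hypothetically
# named :test_query) is not known to the runtime, so its priv
# directory cannot be resolved
# @priv_dir :code.priv_dir(:test_query)

# Works: point at the directory relative to where JupyterLab
# was launched
@priv_dir "priv"
```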

The same thing happens when we attempt to import our other module. This time a module attribute needs to be replaced.

Replacing that with an absolute path to the directory in our project we can try importing again. And again this time it succeeds as confirmed by checking the exported functions.

Now we’ve successfully imported our modules, but as we noted earlier we are going to have to update our query functions to use a GET method with the right SPARQL protocol version. We’ll do this by introducing a new module attribute which sets the request method and protocol version keywords, and we’ll add this module attribute to all our query calls.
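A sketch of this change, with the attribute and function names being my own illustrative choices:

```elixir
defmodule TestQuery.Client do
  @service "http://dbpedia.org/sparql"

  # New module attribute collecting the SPARQL protocol options:
  # query via HTTP GET using the SPARQL 1.1 protocol
  @query_opts [request_method: :get, protocol_version: "1.1"]

  def run_query(query) do
    # Pass the options on every client call
    SPARQL.Client.query(query, @service, @query_opts)
  end
end
```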

We’re good to go.

5. Testing it out

To test out our SPARQL client functions let’s first import the modules for ease of use. (Strictly we only need one of the modules, but no matter.)

We’ll first try out our function which queries DBpedia for the ‘Hello World’ resource and returns the English-language label. This is returned as a list containing an RDF literal struct but is easily unpacked by accessing the struct’s value field.
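Unpacking the returned list might look like this (assuming an RDF.ex literal struct with a value field, as in the RDF.ex versions current at the time of writing):

```elixir
# e.g. results = [%RDF.Literal{value: "Hello World", language: "en"}]

# Pattern match the single-element list and read the value field
[label] = results
label.value

# Or, equivalently, in pipeline style
results |> List.first() |> Map.get(:value)
```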

And as that works, we’ll follow up with one of our stored queries and run the query function on that. Again this returns a list containing an RDF literal struct.

And finally we can try out the demo function which performs multiple queries for some 2018 Atlantic hurricanes and stores the results in separate ETS tables. These tables can be inspected using the Observer tool, as described in the earlier post. (In short, the :observer.start() call opens up the Observer interface in a standalone window, and by selecting the ‘Table Viewer’ tab one can see the ETS tables we created and inspect them by double-clicking the respective rows.)
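For reference, here is the Observer call together with a direct ETS alternative (the table name is hypothetical):

```elixir
# Open the Observer GUI in a standalone window
# (select the 'Table Viewer' tab to browse the ETS tables)
:observer.start()

# Or dump a table's contents directly, without the GUI
:ets.tab2list(:hurricane_harvey)
```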


I’ve shown in this post how we can use the IElixir kernel to run a Jupyter Notebook with Elixir as our language of choice.

Specifically we have taken the TestQuery modules developed in an earlier post and shown how they can be used within a Jupyter Notebook context. We are able to run SPARQL queries using the SPARQL.Client package directly from the notebook and to process the result sets. Moreover, since we are using a notebook we can better experiment with and annotate our findings.

The tooling around Elixir is improving all the time and with its proven support for managing distributed compute Elixir stands out as a most fascinating and intriguing language for building semantic web applications.

See here for the (modified) project code with notebook.

This is the fifth in a series of posts. See my previous post ‘Robust compute with RDF queries’.

You can also follow me on Twitter as @tonyhammond.
