Nodebooks: Introducing Node.js Data Science Notebooks

Python and Node.js in the same Jupyter notebook (part 1)

I am a developer, as in computer code. My job is to persuade computers to do my bidding by typing gibberish into a text file and presenting its contents to the computer like a sacrificial oblation.

The contents of the text files have changed over the years as my computer and I have communicated in several languages: BASIC, Pascal, C, C++, Forth, Java, Objective-C, PHP, Python. But the language we share most often these days is JavaScript, either inside web pages or to run server-side apps and command-line tools using Node.js.

If I had a gun to my head and had to program my way out of it (which is, let’s face it, unlikely), I’d choose Node.js. It’s the language I have to Google least to remember the syntax.

Editor’s note: Parts 1, 2, and 3 were published in fall 2017. A more recent 2018 article updates the variable-sharing features described in part 2 of this series.

Notebooks

Notebooks (that’s Jupyter/IPython Notebooks, not Moleskine® notebooks) are where data scientists process, analyse, and visualise data in an iterative, collaborative environment. Like many developers, I am not a data scientist, but I do like the idea of having a scratchpad where I can write some code, iteratively work on some algorithms, and visualise the results quickly.

To that end, David Taieb and I created pixiedust_node, an add-on for Jupyter notebooks that allows Node.js/JavaScript to run inside notebook cells. It’s built on the popular PixieDust helper library. So let’s get started!

Installing

Install both pixiedust and pixiedust_node using pip, the Python package manager. In a Jupyter Notebook cell:

!pip install pixiedust
!pip install pixiedust_node

Using pixiedust_node

Now we can import pixiedust_node into our notebook:

import pixiedust_node

And then we can write JavaScript code in cells whose first line is %%node:

%%node
var date = new Date();
print(date);
// "2017-05-15T14:02:28.207Z"

The JS code and its output, as rendered in an IPython Notebook cell.

It’s that easy! We can have Python and Node.js in the same notebook. Cells are Python by default, but simply starting a cell with %%node indicates that the next lines will be JavaScript.

Printing JavaScript variables

Calling the print function within your JavaScript code is the same as calling print in your Python code.

%%node
var x = { a:1, b:'two', c: true };
print(x);
// {"a": 1, "b": "two", "c": true}

Using PixieDust display() to visualise data

You can also use PixieDust’s display function to render data graphically:

%%node
var data = [];
for (var i = 0; i < 1000; i++) {
  // i is an angle in degrees; x is the same angle in radians
  var x = 2 * Math.PI * i / 360;
  var obj = {
    x: x,
    i: i,
    sin: Math.sin(x),
    cos: Math.cos(x),
    tan: Math.tan(x)
  };
  data.push(obj);
}
display(data);

PixieDust presents visualisations of data frames using Matplotlib, Bokeh, d3, Google Maps, and Mapbox. No code is required on your part because PixieDust presents simple pull-down menus and a friendly point-and-click interface, allowing you to configure how the data is presented:

Using PixieDust’s display UI to refine a visualisation
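The loop that builds the data can also be wrapped in a small reusable function, so the number of rows and the step size are not hard-coded. This is only a sketch: buildTrigTable and its parameters are names made up for illustration, and the last few lines assume display is a global inside a pixiedust_node cell, falling back to a console summary when the code runs outside a notebook. In a notebook, the cell would start with %%node.

```javascript
// Build `count` rows of trigonometric values, one row per `stepDegrees` degrees.
// buildTrigTable is a hypothetical helper, not part of pixiedust_node.
var buildTrigTable = function(count, stepDegrees) {
  var rows = [];
  for (var i = 0; i < count; i++) {
    var x = 2 * Math.PI * (i * stepDegrees) / 360; // degrees to radians
    rows.push({
      x: x,
      i: i,
      sin: Math.sin(x),
      cos: Math.cos(x),
      tan: Math.tan(x)
    });
  }
  return rows;
};

var data = buildTrigTable(1000, 1);

// Assumption: display() exists only inside a pixiedust_node cell.
if (typeof display === 'function') {
  display(data);
} else {
  console.log(data.length + ' rows, first row:', data[0]);
}
```

Keeping the row-building logic in a function makes it easy to try a different resolution, for example buildTrigTable(360, 0.5), without editing the loop.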

Adding npm modules

There are thousands of libraries and tools in the registry of npm, Node.js’s package manager. It’s essential that we can install npm libraries and use them in our notebook code.

Let’s say we want to make some HTTP calls to an external API service. We could deal with Node.js’s low-level HTTP library, or an easier option would be to use the ubiquitous request npm module.

Once we have pixiedust_node set up, installing an npm module is as simple as running npm.install in a Python cell:

npm.install('request');

Once installed, you may require the module in your JavaScript code:

%%node
var request = require('request');
var r = {
  method: 'GET',
  url: 'http://api.open-notify.org/iss-now.json',
  json: true
};
// the callback receives (error, response, body); json: true means body is already parsed
request(r, function(err, res, body) {
  print(body);
});
// {"timestamp": 1494857069, "message": "success", "iss_position": {"latitude": "24.0980", "longitude": "-84.5517"}}

As an HTTP request is an asynchronous action, the request library calls our callback function when the operation has completed. Inside that function, we can call print to render the data.
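For comparison, here is a minimal sketch of the lower-level route mentioned earlier, using only Node.js's built-in http module and the same error-first callback convention. The name getJSON is invented for this example, and the status-code check is an assumption about how you might want to treat failed responses; it is not how the request module behaves internally.

```javascript
var http = require('http');

// Fetch `url` over HTTP and parse the response body as JSON.
// Calls callback(err, data) exactly once (error-first convention).
// getJSON is a hypothetical helper written for this sketch.
var getJSON = function(url, callback) {
  var done = false;
  var finish = function(err, data) {
    if (!done) {
      done = true;
      callback(err, data);
    }
  };
  http.get(url, function(res) {
    if (res.statusCode !== 200) {
      res.resume(); // discard the body so the socket is released
      return finish(new Error('HTTP status ' + res.statusCode));
    }
    var chunks = [];
    res.setEncoding('utf8');
    res.on('data', function(chunk) { chunks.push(chunk); });
    res.on('end', function() {
      var data;
      try {
        data = JSON.parse(chunks.join(''));
      } catch (e) {
        return finish(e);
      }
      finish(null, data);
    });
  }).on('error', function(err) {
    finish(err);
  });
};

// Usage (left commented out so the sketch makes no network call on its own):
// getJSON('http://api.open-notify.org/iss-now.json', function(err, data) {
//   print(err ? err.message : data);
// });
```

The amount of plumbing here (collecting chunks, parsing, handling two kinds of error) is the reason a helper module such as request is the easier option.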

We can organise our code into functions to encapsulate complexity and make it easier to reuse code. We can create a function to get the current position of the International Space Station in one notebook cell:

%%node
var request = require('request');
var getPosition = function(callback) {
  var r = {
    method: 'GET',
    url: 'http://api.open-notify.org/iss-now.json',
    json: true
  };
  request(r, function(err, res, body) {
    var obj = null;
    if (!err) {
      // the API returns latitude and longitude as strings, so convert them to numbers
      obj = body.iss_position;
      obj.latitude = parseFloat(obj.latitude);
      obj.longitude = parseFloat(obj.longitude);
      obj.time = new Date().getTime();
    }
    callback(err, obj);
  });
};

And use it in another cell:

%%node
getPosition(function(err, data) {
  print(data);
});
// {"latitude": 50.5736, "longitude": -99.3493, "time": 1494422942373}
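One fragile spot in getPosition is that it assumes the response always contains iss_position. The sketch below pulls the conversion into a separate function that returns null when the body does not have the expected shape. The names toPosition and now are hypothetical, chosen for this example; the sample body is the response shown earlier in this article.

```javascript
// Convert the API response body into { latitude, longitude, time }.
// Returns null if the body lacks a usable iss_position.
// toPosition and its optional `now` argument are hypothetical, for illustration only.
var toPosition = function(body, now) {
  var pos = body && body.iss_position;
  if (!pos) {
    return null;
  }
  var latitude = parseFloat(pos.latitude);
  var longitude = parseFloat(pos.longitude);
  if (isNaN(latitude) || isNaN(longitude)) {
    return null;
  }
  return {
    latitude: latitude,
    longitude: longitude,
    // allow the caller to supply a timestamp, which makes the function easy to test
    time: now === undefined ? new Date().getTime() : now
  };
};

// Sample body copied from the API response shown earlier.
var sample = {
  timestamp: 1494857069,
  message: 'success',
  iss_position: { latitude: '24.0980', longitude: '-84.5517' }
};

var show = typeof print === 'function' ? print : console.log;
show(toPosition(sample, 0));
```

Unlike the original, this version builds a new object instead of modifying the response body in place, so the raw response stays available if you want to inspect it later.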

Promise me a miracle

If you prefer to work with JavaScript Promises when writing asynchronous code, then that’s okay too. Let’s rewrite our getPosition function to return a Promise. First we're going to install the request-promise module from npm:

npm.install( ('request', 'request-promise') )

Notice how you can install multiple modules in a single call. Just pass in a Python list or tuple.

Then we can refactor our function a little:

%%node
var request = require('request-promise');
var getPosition = function() {
  var r = {
    method: 'GET',
    url: 'http://api.open-notify.org/iss-now.json',
    json: true
  };
  return request(r).then(function(body) {
    var obj = body.iss_position;
    obj.latitude = parseFloat(obj.latitude);
    obj.longitude = parseFloat(obj.longitude);
    obj.time = new Date().getTime();
    return obj;
  });
};

And call it in the Promises style:

%%node
getPosition().then(function(data) {
  print(data);
});
// {"latitude": 20.7734, "longitude": -81.5809, "time": 1494857142842}

Or call it in a more compact form:

%%node
getPosition().then(print);
// {"latitude": 20.7734, "longitude": -81.5809, "time": 1494857142842}
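If you already have callback-style code, you do not have to rewrite it around request-promise; a callback function can be wrapped in a Promise instead. The sketch below is an illustration under stated assumptions: getPositionCb is a stand-in that returns fixed sample values after a short delay (so it runs without network access), and getPositionP and show are names invented here. It also adds a .catch, since a rejected Promise without one is easy to miss.

```javascript
// Stand-in for a callback-style getPosition: fixed sample values, no network.
// (Hypothetical; a real version would make the HTTP request.)
var getPositionCb = function(callback) {
  setTimeout(function() {
    callback(null, { latitude: 20.7734, longitude: -81.5809, time: 1494857142842 });
  }, 10);
};

// Wrap the error-first callback API in a Promise.
var getPositionP = function() {
  return new Promise(function(resolve, reject) {
    getPositionCb(function(err, data) {
      if (err) {
        reject(err);
      } else {
        resolve(data);
      }
    });
  });
};

// Use the notebook's print when available, otherwise console.log.
var show = typeof print === 'function' ? print : console.log;

getPositionP()
  .then(show)
  .catch(function(err) {
    show('request failed: ' + err.message);
  });
```

Node.js 8 and later also ship util.promisify, which performs the same wrapping for any function that follows the error-first callback convention.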

Next time

In the next part of this three-part series, we’ll look at sharing variables between Node.js and Python code and interacting with databases from our notebook.
