Nodebooks: Introducing Node.js Data Science Notebooks
Python and Node.js in the same Jupyter notebook (part 1)
I am a developer, as in computer code. My job is to persuade computers to do my bidding by typing gibberish into a text file and presenting its contents to the computer like a sacrificial oblation.
The contents of the text files have changed over the years as my computer and I have communicated in several languages: BASIC, Pascal, C, C++, Forth, Java, Objective-C, PHP, Python. But the language we share most often these days is JavaScript, either inside web pages or to run server-side apps and command-line tools using Node.js.
If I had a gun to my head and had to program my way out of it (which is, let’s face it, unlikely), I’d choose Node.js. It’s the language I have to Google least to remember the syntax.
Editor’s note: Parts 1, 2, and 3 published in fall 2017. A more recent 2018 article updates the variable-sharing features described in part 2 of this series.
Notebooks
Notebooks (that’s Jupyter/IPython Notebooks, not Moleskine® notebooks) are where data scientists process, analyse, and visualise data in an iterative, collaborative environment. Like many developers, I am not a data scientist, but I do like the idea of having a scratchpad where I can write some code, iteratively work on some algorithms, and visualise the results quickly.
To that end, David Taieb and I created pixiedust_node, an add-on for Jupyter notebooks that allows Node.js/JavaScript to run inside notebook cells. It’s built on the popular PixieDust helper library. So let’s get started!
Installing
Install both pixiedust and pixiedust_node using pip, the Python package manager. In a Jupyter Notebook cell:
!pip install pixiedust
!pip install pixiedust_node
Using pixiedust_node
Now we can import pixiedust_node into our notebook:
import pixiedust_node
And then we can write JavaScript code in cells whose first line is %%node:
%%node
var date = new Date();
print(date);
// "2017-05-15T14:02:28.207Z"
It’s that easy! We can have Python and Node.js in the same notebook. Cells are Python by default, but simply starting a cell with %%node indicates that the next lines will be JavaScript.
Printing JavaScript variables
Calling the print function within your JavaScript code is the same as calling print in your Python code.
%%node
var x = { a:1, b:'two', c: true };
print(x);
// {"a": 1, "b": "two", "c": true}
Using PixieDust display() to visualise data
You can also use PixieDust’s display function to render data graphically:
%%node
var data = [];
for (var i = 0; i < 1000; i++) {
  var x = 2 * Math.PI * i / 360;
  var obj = {
    x: x,
    i: i,
    sin: Math.sin(x),
    cos: Math.cos(x),
    tan: Math.tan(x)
  };
  data.push(obj);
}
display(data);
PixieDust presents visualisations of data frames using Matplotlib, Bokeh, d3, Google Maps, and Mapbox. No code is required on your part because PixieDust presents simple pull-down menus and a friendly point-and-click interface, allowing you to configure how the data is presented.
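The loop above treats i as an angle in degrees and converts it to radians before calling the trigonometric functions, so 1000 rows cover just under three full turns. Here is a minimal sketch of how one row is built, written for plain Node.js (in a %%node cell you would call print rather than console.log). The helper names toRadians and row are mine, not part of PixieDust.

```javascript
// toRadians and row are hypothetical helpers, named here for illustration.
var toRadians = function(degrees) {
  return 2 * Math.PI * degrees / 360;
};

// Build one row of the data set for an angle of i degrees.
var row = function(i) {
  var x = toRadians(i);
  return { x: x, i: i, sin: Math.sin(x), cos: Math.cos(x), tan: Math.tan(x) };
};

console.log(row(0));
console.log(row(90).sin);
// The tangent is enormous near 90 degrees, which explains the spikes
// you will see if you chart the tan column.
console.log(Math.abs(row(90).tan) > 1e6); // true
```

If the tan series swamps the chart, plotting only sin and cos against x is a reasonable first view.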
Adding npm modules
There are thousands of libraries and tools in the npm repository, Node.js’s package manager. It’s essential that we can install npm libraries and use them in our notebook code.
Let’s say we want to make some HTTP calls to an external API service. We could deal with Node.js’s low-level HTTP library, or an easier option would be to use the ubiquitous request npm module.
Once we have pixiedust_node set up, installing an npm module is as simple as running npm.install in a Python cell:
npm.install('request');
Once installed, you may require the module in your JavaScript code:
%%node
var request = require('request');
var r = {
  method: 'GET',
  url: 'http://api.open-notify.org/iss-now.json',
  json: true
};
// the callback receives an error (or null), the response, and the parsed body
request(r, function(err, res, body) {
  print(body);
});
// {"timestamp": 1494857069, "message": "success", "iss_position": {"latitude": "24.0980", "longitude": "-84.5517"}}
As an HTTP request is an asynchronous action, the request library calls our callback function when the operation has completed. Inside that function, we can call print to render the data.
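The callback above follows the Node.js error-first convention: the first argument is an error (or null on success) and the data comes after it. Here is a minimal sketch of that pattern, assuming a made-up fetchFake function that simulates the HTTP call with a timer so it runs without network access. It uses console.log for plain Node.js; in a %%node cell you would call print instead.

```javascript
// fetchFake is a hypothetical stand-in for an HTTP call: it waits briefly,
// then invokes the callback in Node's error-first style.
var fetchFake = function(shouldFail, callback) {
  setTimeout(function() {
    if (shouldFail) {
      callback(new Error('simulated network failure'), null);
    } else {
      callback(null, { message: 'success' });
    }
  }, 10);
};

// Check the error argument before touching the data.
var handle = function(err, body) {
  if (err) {
    return 'failed: ' + err.message;
  }
  return 'ok: ' + body.message;
};

fetchFake(false, function(err, body) {
  console.log(handle(err, body)); // ok: success
});
fetchFake(true, function(err, body) {
  console.log(handle(err, body)); // failed: simulated network failure
});
```

Checking err first matters because the body is not usable when the request fails.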
We can organise our code into functions to encapsulate complexity and make it easier to reuse code. We can create a function to get the current position of the International Space Station in one notebook cell:
%%node
var request = require('request');
var getPosition = function(callback) {
  var r = {
    method: 'GET',
    url: 'http://api.open-notify.org/iss-now.json',
    json: true
  };
  request(r, function(err, res, body) {
    var obj = null;
    if (!err) {
      // the API returns coordinates as strings, so convert them to numbers
      obj = body.iss_position;
      obj.latitude = parseFloat(obj.latitude);
      obj.longitude = parseFloat(obj.longitude);
      obj.time = new Date().getTime();
    }
    callback(err, obj);
  });
};
And use it in another cell:
%%node
getPosition(function(err, data) {
  print(data);
});
// {"latitude": 50.5736, "longitude": -99.3493, "time": 1494422942373}
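The conversion step inside getPosition can also be pulled out into a small pure function, which makes it easy to try on a saved response without calling the API. Here is a minimal sketch, assuming the payload shape shown earlier. The name toPosition and the choice to return null for an unexpected payload are my own, not part of pixiedust_node.

```javascript
// toPosition is a hypothetical helper: it converts the API's string
// coordinates to numbers and returns null when the payload is not as expected.
var toPosition = function(body) {
  if (!body || !body.iss_position) {
    return null;
  }
  var latitude = parseFloat(body.iss_position.latitude);
  var longitude = parseFloat(body.iss_position.longitude);
  if (isNaN(latitude) || isNaN(longitude)) {
    return null;
  }
  return { latitude: latitude, longitude: longitude, time: new Date().getTime() };
};

// A saved copy of the response printed earlier in this article.
var sample = {
  timestamp: 1494857069,
  message: 'success',
  iss_position: { latitude: '24.0980', longitude: '-84.5517' }
};

var position = toPosition(sample);
console.log(position.latitude, position.longitude); // 24.098 -84.5517
console.log(toPosition({ message: 'failure' }));    // null
```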
Promise me a miracle
If you prefer to work with JavaScript Promises when writing asynchronous code, then that’s okay too. Let’s rewrite our getPosition function to return a Promise. First we're going to install the request-promise module from npm:
npm.install( ('request', 'request-promise') )
Notice how you can install multiple modules in a single call. Just pass in a Python list or tuple.
Then we can refactor our function a little:
%%node
var request = require('request-promise');
var getPosition = function() {
  var r = {
    method: 'GET',
    url: 'http://api.open-notify.org/iss-now.json',
    json: true
  };
  // request-promise returns a Promise that resolves with the parsed body
  return request(r).then(function(body) {
    var obj = body.iss_position;
    obj.latitude = parseFloat(obj.latitude);
    obj.longitude = parseFloat(obj.longitude);
    obj.time = new Date().getTime();
    return obj;
  });
};
And call it in the Promises style:
%%node
getPosition().then(function(data) {
  print(data);
});
// {"latitude": 20.7734, "longitude": -81.5809, "time": 1494857142842}
Or call it in a more compact form:
%%node
getPosition().then(print);
// {"latitude": 20.7734, "longitude": -81.5809, "time": 1494857142842}
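If a library only offers callbacks, you can wrap it in a Promise yourself, and a .catch handler gives you one place to deal with failures. Here is a minimal sketch, assuming a made-up getPositionCb function that simulates the HTTP call with a timer so it runs without network access. It uses console.log for plain Node.js; in a %%node cell you would call print instead.

```javascript
// getPositionCb is a hypothetical callback-style function that simulates
// the HTTP call so the sketch runs without network access.
var getPositionCb = function(callback) {
  setTimeout(function() {
    callback(null, { latitude: 20.7734, longitude: -81.5809 });
  }, 10);
};

// Wrap the callback API in a Promise: reject on error, resolve with the data.
var getPositionPromise = function() {
  return new Promise(function(resolve, reject) {
    getPositionCb(function(err, data) {
      if (err) {
        reject(err);
      } else {
        resolve(data);
      }
    });
  });
};

getPositionPromise()
  .then(function(data) {
    console.log(data.latitude, data.longitude); // 20.7734 -81.5809
  })
  .catch(function(err) {
    console.log('request failed: ' + err.message);
  });
```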
Next time
In the next part of this three-part series, we’ll look at sharing variables between Node.js and Python code and interacting with databases from our notebook.
Links
- sample notebook: https://github.com/ibm-watson-data-lab/nodebook-code-pattern
- pixiedust: https://github.com/ibm-watson-data-lab/pixiedust
- pixiedust_node: https://github.com/ibm-watson-data-lab/pixiedust_node