Scaling Node.js with recluster

Simple multi-process services


Servers written in Node.js are typically pretty fast, in part due to its non-blocking I/O, and in part due to the very optimised v8 JavaScript engine. One of the limiting factors, of course, is that JavaScript is single-threaded, and will only make use of a single processor core. However, it is possible to take advantage of multi-core environments via the recluster library (Github: doxout/recluster, MIT).


Using recluster

Recluster builds on the internal clustering capabilities of Node.js, providing a simple API for spawning child processes and scaling out a program to all available CPU cores. It also adds some niceties, like exponential backoff and hot reloading.

var cluster = recluster("/path/to/worker.js", {
workers: 4,
backoff: 10
});
cluster.run();

This will create 4 worker processes, respawning whenever one dies (subject to any timeouts or backoff). The worker script is responsible for initialising the application and binding any sockets or interfaces. For this, it often makes sense to use Node.js domains to correctly encapsulate the application and propagate any exceptions:

var app = domain.create();
app.on("error", function(err) {
cluster.worker.disconnect();
});
app.run(function() {
http.createServer(function(req, res) {
res.end("Hello World\n");
}).listen(8000);
});

Using domains, any exceptions thrown by the service will be caught by the domain’s error handler and the worker instructed to gracefully die. This is often preferable to trying to keep the worker alive in an uncertain state.

You will notice that the server is bound inside the worker, not the parent, which implies that we attempt to bind to the same port multiple times. This would not work, of course, but Node’s clustering takes care of this for us: any code that binds to an interface or socket inside a worker actually bubbles up to the parent process; the workers will all share the same socket.

These simple alterations to any Node.js application make it easy to deploy a multi-process service, but there are some limitations.


State and shared memory

Node.js offers little in the way of shared state, and has no native shared memory model. There are a few projects that augment Node.js to provide a shared memory binding, but these tend to eschew expressiveness in favour of low-level flexibility.

Once an application is scaled beyond a single process, maintaining state becomes increasingly less trivial. In an ideal world, the application would be totally stateless, the client not concerned with which of your processes is handling its requests. This is not always practical, however: maintaining things like user sessions and connection pools falls well within the remit of a useful web application. So, what are our options?

Natively, Node.js exposes a messaging API that can be used to pass messages between processes. For very simple applications, this would mean that all the workers could be notified of a state change.

// Inside Worker
http.createServer(function(req, res) {
// do something to mutate state, and then...
process.send("State changed!");
}).listen(8000);
// Inside Parent
worker.on("message", function(message) {
// hypothetical connection pool object
pool.broadcast(message);
});

This could work very well where the scope of the application is small, but it does not guarantee state integrity and could easily cause inconsistencies in an application. If a worker misses a broadcast or is killed before it can send a message, we will have an inconsistent state that is hard to reason about, and even harder to debug.

This is not to say that inter-process messaging is not useful, but it becomes insufficient as the scope of an application grows. Indeed, once we scale from a single master process to multiple clusters on different servers, message passing will not address the problem at all.

A better alternative is to delegate application state to a shared backend. Typically, where this backend is a replacement for a in-memory state, we need the fastest option available: something like Redis, for example. (Getting started with Redis in Node.js is outside the scope of this post, but it is a fairly trivial exercise.) By doing this we are, in effect, moving state away from the application layer into a persistence layer, which has the nice bonus of cleanly separating concerns.

Having taken care of the stateful requirements of the application, we can now confidently deploy our application cluster and enjoy our new resilient, scalable, high-performance architecture.


Gary Chambers is a Software Engineer at Football Radar in London, specialising in JavaScript development. Read more about the work done at Football Radar.