Having your Node.js Cake and Eating It Too

This post was written by Rob Tweed who is the director of M/Gateway Developments Ltd, a consultancy and software development company in the UK that has focused on web and NoSQL database technologies, particularly in the healthcare industry, since the mid-90s. Cycling, photography and listening to and recording music are what keep him sane away from the keyboard! This post first appeared on Rob’s blog.

Imagine all the benefits of Node.js: one language and technology for both front-end and back-end development, plus its outstanding performance, BUT without the concerns of concurrency and heavy CPU processing, AND with high-level database abstractions. With some interesting parallels to Amazon Web Services’ Lambda, that’s what the QEWD.js framework is designed to deliver.

I’ve worked with Node.js since its early days in 2011. I’ve also worked for many years more with conventional server-side languages, so I’m aware of the differences with the Node.js philosophy, and with what I’d like to do versus how Node.js wants/expects me to do it. Additionally, I’ve worked recently with Java developers who have made (or tried to make) the transition to Node.js, which has been an interesting and revealing exercise.

Whilst Node.js has become hugely popular, it is not without its many critics. Probably most of the criticisms centre around things that are the very consequences of the deliberately-chosen technical design of Node.js: namely that all user activity takes place within a single process. So, when writing server-side code in JavaScript, Node.js crucially requires you to understand that everything you do can affect every user, and expects you to write your code accordingly. Node.js is therefore all about asynchronous coding and non-blocking I/O. Block or even slow down the process and all other concurrent users suffer and you can bring a service to its knees.
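
To make that concrete, here is a minimal sketch of one blocking call starving everything else in the process. The `busyWork` helper and the zero-delay timer standing in for another user's request are illustrative assumptions, not code from any real application.

```javascript
// A pending callback stands in for "another user's request".
let otherRequestHandled = false;
setTimeout(() => { otherRequestHandled = true; }, 0);

// Synchronous, CPU-bound work: nothing else in this process can run until it returns.
function busyWork(ms) {
  const end = Date.now() + ms;
  let iterations = 0;
  while (Date.now() < end) iterations++;
  return iterations;
}

busyWork(100);

// The timer was due after 0 ms, yet it has not fired: the event loop never got a turn.
const handledWhileBusy = otherRequestHandled;
console.log('other request handled during busy work?', handledWhileBusy);
// prints: other request handled during busy work? false
```

The timer callback only runs once the synchronous work has returned control to the event loop, which is exactly the delay every other concurrent user would experience.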

Whereas other, more conventional server-side languages such as Java and Python provide optional syntax to perform asynchronous logic where it makes sense and is more efficient to do so (e.g. to access multiple remote services in parallel), the norm in those languages is to write synchronous logic, even when accessing databases or files. The multi-threaded nature of these languages’ technical architectures means that the developer doesn’t have to be concerned about concurrency. So when developers with a background in languages such as Java or Python are faced with moving to the single process environment of Node.js, its unavoidable and mandatory asynchronous logic comes as quite a culture shock.

Some learn to love Node.js, and some grudgingly accept it, but many just don’t “get it” at all and give up, and many others dislike it with a vengeance. That’s a problem if Node.js is to continue growing in popularity: if it’s to extend further into the Enterprise, it’s going to require developers who currently use Java, Python, .Net, etc. to comfortably migrate to and adopt Node.js and JavaScript.

Of course, recent developments in JavaScript have tried to make life easier for the developer: first in the form of Promises, and more recently in the form of Async/Await. These syntax enhancements aim to provide a more synchronous and therefore intuitive feel to asynchronous logic. Nevertheless they’re not the complete answer. The fact that all users are being handled by the one process means you can still bring a Node.js application to its knees with CPU-intensive code: something that understandably rings alarm bells when Node.js is considered for the Enterprise.
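
A small sketch shows why the newer syntax doesn't change this. The function name `sumTo` and the `order` log are made up for illustration; the point is only that marking a function `async` changes how waiting is written, not where the computation runs.

```javascript
const order = [];

// Marked async, but the loop is ordinary synchronous, CPU-bound code.
async function sumTo(n) {
  order.push('sumTo started');
  let total = 0;
  for (let i = 1; i <= n; i++) total += i;
  order.push('sumTo finished');
  return total;
}

// The call returns a Promise, but only after the whole loop has run on the main thread.
const pendingResult = sumTo(5000000);
order.push('caller resumed');

console.log(order.join(' -> '));
// prints: sumTo started -> sumTo finished -> caller resumed

pendingResult.then((total) => console.log(total)); // prints 12500002500000
```

The caller, and with it every other user of the process, resumes only once the loop has finished, so the single process is still blocked for the duration of the computation.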

As a result, numerous articles have been written that recommend the use of Node.js for only certain kinds of application. One such article by Tomislav Capan is pretty typical, suggesting: “Where Node.js really shines is in building fast, scalable network applications, as it’s capable of handling a huge number of simultaneous connections with high throughput, which equates to high scalability”. Like many others, he concludes:

  • You definitely don’t want to use Node.js for CPU-intensive operations; in fact, using it for heavy computation will annul nearly all of its advantages
  • The [WebSocket-based] chat application is really the sweet-spot example for Node.js: it’s a lightweight, high traffic, data-intensive (but low processing/computation) application that runs across distributed devices
  • If you’re receiving a high amount of concurrent data, your database can become a bottleneck. He recommends that data gets queued through some kind of caching or message queuing (MQ) infrastructure (e.g. RabbitMQ, ZeroMQ) and digested by a separate database batch-write process, or computation intensive processing backend services, written in a better performing platform for such tasks
  • Don’t use Node.js for server-side web applications with relational databases (use Rails instead)
  • Don’t use Node.js for computationally heavy server-side applications

All well and good, but I would like to be able to have my cake and eat it too:

  • I’d like to just use one language — JavaScript — for everything
  • I’d like to avoid a mash-up of a separate message queue such as RabbitMQ and multiple languages. The less complexity and the fewer moving parts the better from the point of view of maintainability and stability.
  • In my experience it’s almost impossible to avoid some amounts of CPU-intensive processing on the server-side of most web applications, so I’d like to be able to handle such processing without fear of grinding a Node.js application to a halt for everyone.
  • I’d also like to not have to worry about concurrency, and write my “userland” code as if it wasn’t an issue. I know that this is the promise (pun not intended) of Async/Await, but such pseudo-synchronous syntax still limits my ability to write, for example, higher-level, properly-chainable database functions in JavaScript. In my opinion, JavaScript should be just as capable as Rails for handling relational databases, and being able to create higher-level database abstractions in JavaScript is a key step to achieving this.

I’m sure I’m not alone in having this wish-list. So, a question I had from my earliest days of using Node.js was: Couldn’t it be possible for me to have my cake and eat it, and get all the advantages of Node.js while avoiding all the downsides?

Interestingly, we’ve seen the emergence of one use of Node.js where this is the case, and even its creator, Amazon Web Services, seems unaware that this is what they’ve made possible. Their Lambda service provides what is referred to as a “serverless” environment — more accurately a “function as a service” environment. You create and upload a function, and it is run on demand by services and technical means you neither know nor care about, and you simply get charged per invocation of that function. The first language offered for Lambda was Node.js/JavaScript, and although you can now use other languages including Java, Node.js is still the primary offering.

What sets Lambda apart from the normal Node.js environment is that your functions are executed in an isolated runtime container where they don’t compete with other users’ requests for the process’s attention, so concurrency isn’t actually an issue. Nevertheless, look at the published example functions and they all use the usual asynchronous logic.

That doesn’t make sense to me. It’s fair enough to use asynchronous logic if it makes sense or is more efficient to do so, such as when you’re making multiple, simultaneous requests to remote S3 or EC2 services. However, for many Lambda functions you’ll maybe be making only a few accesses to remote resources which, if they could be done truly synchronously, wouldn’t affect performance or cost, but conversely would simplify the logic considerably. Put it this way: no Java, Python or .Net developer that I know of would go out of their way to use asynchronous logic if they didn’t have to, so why should a Node.js developer?

Of course one of the reasons why Node.js Lambda developers continue to use asynchronous logic is that they believe there’s no alternative: pretty much all the standard interfaces for databases and remote HTTP-based services are asynchronous. Until things like Lambda came along, there was no point in having synchronous APIs for Node.js. Hopefully that can and will change. For example, the tcp-netx module, which provides synchronous as well as asynchronous APIs for basic TCP access, ought to provide the underpinnings for a new breed of synchronous APIs for use in a Node.js environment such as Lambda, where concurrency isn’t an issue. Indeed there’s already such an interface available for MongoDB.

Not everyone, of course, will want to move their applications to Amazon’s “serverless” Lambda service. Prevailing wisdom would suggest that it’s not possible for them to “have their cake and eat it too”, but actually that’s not entirely true. Take a look at a Node.js project known as QEWD.js and you’ll see a way to achieve something similar to Lambda’s isolated execution containers, but running on your own servers.

QEWD.js is a server-side platform for REST and browser-based applications, built on top of a module called ewd-qoper8 which implements a Node.js-based message queue. Incoming messages to ewd-qoper8 are queued and dispatched to pre-forked Node.js child processes for processing. However, the key, unique feature is that each child process only handles a single message at a time, so the handler function for that message does not need to be concerned about concurrency: like Lambda, the handler function is executed in an isolated runtime environment. After handling the message and returning the response to the master ewd-qoper8 process, the child process does not shut down, but immediately makes itself available to handle the next available message in the queue. So there are no child process start-up and tear-down costs.
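
The dispatch pattern described above can be sketched roughly as follows. This is a simplified, in-process model, not the ewd-qoper8 API: the class name `WorkerPoolQueue`, its methods, and the plain objects standing in for pre-forked child processes are all assumptions made for illustration.

```javascript
// Simplified model of a queue feeding a fixed pool of workers.
// Hypothetical names throughout; real child processes are replaced by plain objects.
class WorkerPoolQueue {
  constructor(poolSize, handler) {
    this.queue = [];
    this.handler = handler;
    // Each entry stands in for one pre-forked child process.
    this.workers = Array.from({ length: poolSize }, (_, id) => ({ id, busy: false }));
  }

  // Add a message to the queue, then try to dispatch it.
  enqueue(message, onResponse) {
    this.queue.push({ message, onResponse });
    this.dispatch();
  }

  // Hand queued messages to idle workers: at most one message per worker at a time.
  dispatch() {
    for (const worker of this.workers) {
      if (this.queue.length === 0) return;
      if (worker.busy) continue;
      const { message, onResponse } = this.queue.shift();
      worker.busy = true;
      // The handler calls finished() when its work for this message is complete.
      const finished = (response) => {
        worker.busy = false;      // the worker is not torn down, just made available again
        onResponse(response, worker.id);
        this.dispatch();          // pick up the next queued message, if any
      };
      this.handler(message, finished, worker.id);
    }
  }
}

// Demo: two workers, three messages. The handler parks each job so that
// completion can be triggered by hand, simulating slow work.
const inProgress = [];
const pool = new WorkerPoolQueue(2, (message, finished, workerId) => {
  inProgress.push({ message, finished, workerId });
});
const responses = [];
for (const message of ['a', 'b', 'c']) {
  pool.enqueue(message, (response) => responses.push(response));
}

console.log(inProgress.length, pool.queue.length); // two in progress, one still queued
inProgress[0].finished('done a');                  // worker 0 finishes message 'a'
console.log(inProgress[2].message, inProgress[2].workerId); // 'c' went to worker 0
```

Because a worker never receives a second message until it has finished the first, the handler can be written as if it had the whole process to itself, which is the property the rest of this post relies on.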

When developing ewd-qoper8 I looked at the possibility of using one of the standard message queues such as ZeroMQ or RabbitMQ, but found that there were no benefits in doing so. ewd-qoper8 turns out to be a very fast and reliable message queue, and allows me to avoid a mash-up of technologies and moving parts, and instead implement everything in Node.js and JavaScript.

QEWD.js builds on top of ewd-qoper8, integrating its master process as an Express middleware to provide a complete back-end development environment for web applications and REST/Web Services. A pretty good analogy for QEWD.js is a Node.js-based equivalent to Apache & Tomcat. QEWD’s fully asynchronous, non-blocking master process, incorporating Express, socket.io and the ewd-qoper8 message queue is, in many ways, a perfect Node.js networked application: it’s really lightweight, doing little else than ingesting incoming HTTP and WebSocket messages, putting them on a queue and dispatching them to an available child process. It’s therefore capable of handling large amounts of activity. All the “userland” processing happens in the isolated environment of a separate child process. QEWD allows you to configure as many child processes as you wish to meet the demands of your service and to make optimal use of your available CPU cores. If a back-end message handler function uses synchronous logic and blocks the child process, it affects nobody else. If it uses a lot of CPU, then it affects other concurrent users no more directly than it would in, say, a Java or .Net environment. Meanwhile, the master process continues to ingest, queue and dispatch incoming messages unabated.

Therefore with QEWD, I feel I have my ideal environment:

  • I just have one technology — Node.js — for the entire back-end.
  • I use just one language — JavaScript — for everything: front-end and back-end.
  • As a developer I don’t have to worry about concurrency. That’s all handled for me by the QEWD/ewd-qoper8 master process which is just a “black box” that handles the external-facing HTTP and WebSocket interface as far as I’m concerned. My code will be executed in an isolated Node.js run-time container that has its entire process to itself, so I don’t need to worry about blocking I/O or CPU intensive processing.
  • I can and still do use asynchronous APIs, but only where it makes sense and is more efficient to do so. But for most of the time I can access resources such as databases synchronously, which makes my logic simpler, more intuitive and therefore more maintainable.
  • I can build powerful higher-level database abstractions entirely in JavaScript, so I don’t have to resort to using other languages and mixed-technology environments for this area of work. For example, the ewd-redis-globals module is used by QEWD to abstract the Redis database into not only a Document Database, but also a very powerful, high-level concept that I call Persistent JavaScript Objects that can be manipulated and modified directly within the database.
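
As a rough illustration of why synchronous access makes chainable database abstractions straightforward, here is a minimal sketch. It is not the ewd-redis-globals API: `SyncStore` is a hypothetical in-memory stand-in for a database with a synchronous interface, and `StoreNode` with its `$()` method is an invented name for the chaining idea.

```javascript
// Hypothetical in-memory stand-in for a database with a synchronous interface.
class SyncStore {
  constructor() { this.data = new Map(); }
  has(key) { return this.data.has(key); }
  get(key) { return this.data.get(key); }
  set(key, value) { this.data.set(key, value); }
}

// A node addresses one path in the store. $() returns a child node, so calls chain.
class StoreNode {
  constructor(store, path = []) {
    this.store = store;
    this.path = path;
  }
  $(name) { return new StoreNode(this.store, [...this.path, String(name)]); }
  get key() { return JSON.stringify(this.path); }
  get exists() { return this.store.has(this.key); }
  // Reading and writing look like plain property access because nothing is awaited.
  get value() { return this.store.get(this.key); }
  set value(newValue) { this.store.set(this.key, newValue); }
}

const store = new SyncStore();
const patients = new StoreNode(store, ['patients']);

patients.$(123).$('name').value = 'Smith';
patients.$(123).$('visits').value = 2;
patients.$(123).$('visits').value += 1; // read-modify-write in a single expression

console.log(patients.$(123).$('name').value);   // prints Smith
console.log(patients.$(123).$('visits').value); // prints 3
```

With an asynchronous interface, each read in a chain like this would need its own `await` or callback, and the getter/setter style would not be available at all, since property accessors cannot be awaited on assignment.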

In many ways the “proof of the pudding” with QEWD.js has been to watch how Java developers take to it. I’ve been very encouraged by their reaction. Yes, they need to learn the differences in syntax of JavaScript and its many quirks, but otherwise they’ve told me they like that their code runs in a much more familiar way and that they don’t need to worry about concurrency.

If you’re interested in finding out more about QEWD.js, there’s a pretty comprehensive online training course available on Slideshare. QEWD.js is an Apache 2-licensed Open Source project, and will run on all platforms (even a Raspberry Pi). It’s built around the best of breed Node.js modules such as Express and socket.io. It works with any front-end JavaScript framework including Angular, React.js and React Native. You can use any standard Node.js modules in your back-end message handler functions, and any database using either conventional asynchronous interfaces or, ideally, synchronous ones.

I think that the time has come to begin to question the conventional wisdom regarding Node.js. Amazon Web Services’ Lambda and the QEWD.js project are challenging the ideas about the types of task for which Node.js is best avoided, providing solutions to what were previously seen as deficiencies without the need for other technologies and languages, and changing how server-side JavaScript can be written. I’m not saying that Lambda and QEWD.js will suit everyone or fit all use-cases, but they add a new dimension to and new opportunities for Node.js.

Yes, I like to think you can now have your Node.js cake and eat it too.

*A special thank you to technical editor Simeon Vincent for reviewing this post.