Concurrent JavaScript: The Talk

Abdullah Ali
9 min read · Sep 9, 2016

--

First of all, I want to thank the Nordic.js team for their hospitality, and I want to thank every single person I’ve met so far! Thank you for being supportive and awesome!

If you’ve attended Nordic.js 2016 or watched the live stream, you’ll know how I struggled with my presentation.

I’d never realised that I had a problem facing crowds until that moment of my life. It was my very first public talk, and I sort of stood frozen, faced with the enormity of my situation. When my laptop was hooked up to the screens on stage before the talk started, it wouldn’t work, so we had to use a screen-mirroring setup. That meant my presentation slides were always visible on the screen, and thus I had no access to my notes while talking.

That was my biggest mistake. Lesson learnt!

And so, I have decided to publish what I could not speak aloud, so that at least you might read what I had to say but could not express at the time.

The slides are now publicly available here:

Good afternoon everyone! My name is Abdullah Ali. This is my first time here, and I want to thank you all for coming today.

Today I’m going to talk about my new project: Nexus.js, which aims to bring multi-threading to the world of JavaScript.

But to talk about Nexus, we first have to talk about Node.

And to talk about Node, we first have to talk about JavaScript itself.

Ever since its conception, JavaScript has had a single-threaded model, and since it was designed to run in the browser (and in a single tab, in most cases), this was the right choice.

Because JavaScript was originally designed to enrich the user’s browsing experience on the web, it was always strongly tied to browser UI elements. Most platforms handle UI code through a single loop (coincidentally also called the event loop) that dispatches calls and reacts to user input, so JavaScript was designed to fit into that architecture, and some things, like garbage collection, lent themselves to this approach.

Because Node was designed to run on the server, it had to bring along that part of JavaScript into a different environment: the console.

The event loop is a single-threaded construct that allows your code to have “side-effects”. Things like `setTimeout` and `setInterval` are a good example of side-effects.

Such functions break the linearity of execution: they allow you to tell JavaScript to execute a function at a later time, which is not a standard feature in most other programming languages.
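As a minimal sketch of that non-linearity (plain Node-style JavaScript, nothing Nexus-specific):

```javascript
// Demonstrates how setTimeout breaks the linearity of execution:
// the callback only runs after all synchronous code has finished.
const order = [];

order.push('first');

// Even with a delay of 0 ms, this callback is deferred to a later
// turn of the event loop, after the synchronous code below runs.
setTimeout(() => order.push('deferred'), 0);

order.push('second');

console.log(order); // at this point: ['first', 'second']
```

The program does not exit at the end of the script; it keeps running until the deferred callback has fired.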

Let’s look at a typical console program: it starts up, executes a single task, and then exits with a return code.

This is different in the case of JavaScript, where a programmer might schedule multiple functions to run long after the main program executes.

And so, Node does not exit while such a task is queued on the event loop; it keeps running in a loop until the program produces no more such side-effects, and only then does it exit.

Node introduces pseudo multi-threading: it implements a thread-pool to handle asynchronous IO. The application is still single-threaded, and no calls into JavaScript are made from the other threads, but any IO is performed in the background through the thread-pool. This has drastically sped up IO operations, but the thread-pool still has its limitations: it must schedule everything on the central event loop, and so everything must pass through this bottleneck to interact with your JavaScript code.

So how would you introduce true concurrency to such a system?

Nexus started out as a simple experiment. I asked myself: are there any JavaScript engines that allow you to call into JavaScript from multiple threads in parallel?

V8 was out of the picture, since it specifically prohibits this through something called isolates: interacting with JavaScript code requires locking the entire context, so no two threads can execute JavaScript in the same context at the same time.

My second option was Mozilla’s SpiderMonkey, and I won’t go into that. Suffice it to say, it went badly: everything crashed, the heap was corrupted, and nothing worked for more than a few seconds without ultimately crashing.

At this point I was considering writing my own JavaScript runtime from scratch. Nothing I’d tried so far was satisfactory: the single-threaded event loop was heavily ingrained into every single JavaScript engine out there. It was a design assumption of every library I came across; they were all built from the ground up this way.

Then I tried JavaScriptCore — the engine used by WebKit — and I daresay it is the most advanced JavaScript engine I’ve encountered to date! And it miraculously worked like a charm!

To put it simply: everything in JavaScriptCore is atomic, which means that if two threads try to access the same variable concurrently, one will acquire it and the second will have to wait for the first to release it.

This allows us to call into any JavaScript context from multiple threads in parallel, safely, and with no fear for the stability of the program.

My next step was repurposing the event loop, a single-threaded construct (essentially a while loop), into something that can work in parallel.

This is where I dug out an old implementation of a thread-pool scheduler I wrote in 2013 for a game I was working on; it was written in C++ using Boost.

The thread-pool uses a lock-free queue to schedule tasks. Being lock-free means that threads never block on a mutex while enqueueing or dequeueing work, so scheduling itself carries very little synchronization overhead.

Then I integrated the thread-pool into Nexus. Instead of a single-threaded while loop, it starts many loops in multiple threads: each thread picks the first available task and executes it, while another simultaneously picks the next, and so on and so forth.

The end result is a truly parallel JavaScript run-time.

Now enough talking about the C++ side of things, I want to talk about the JavaScript side of things.

Using Node, you can either schedule tasks to start immediately on the next iteration of the event-loop using `setImmediate` or `process.nextTick`, or (classically) at a specific time or interval using `setTimeout` and `setInterval`.

In Nexus, everything is scheduled on the thread-pool, and to use it directly, you use `Nexus.Scheduler.schedule()`.

This function takes a single argument — a function to schedule — and puts it on the event queue, to be executed by the next available thread.

`setTimeout` and `setInterval` also schedule on the thread-pool’s event queue, but allow you to set a time or interval for execution.

This is the most basic and lowest-level method to access the task scheduler, but you should not have to use it directly.
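To make the shape of that API concrete, here is a sketch. `Nexus.Scheduler.schedule()` is the name given in the talk; since Nexus itself can’t be assumed here, a tiny stand-in built on `setTimeout` imitates the scheduling behaviour (minus the parallelism):

```javascript
// Hypothetical stand-in for Nexus.Scheduler: in Nexus, the task
// would be queued on the thread-pool and picked up by the next
// available thread. Here, setTimeout merely defers it.
const Scheduler = {
  schedule(task) {
    setTimeout(task, 0);
  }
};

const log = [];

// The scheduled task does not run inline; it runs later,
// after the current synchronous code has finished.
Scheduler.schedule(() => log.push('scheduled task'));

log.push('main code');
// log is still ['main code'] at this point.
```

The key point is the same in both runtimes: `schedule()` puts the function on a queue and returns immediately.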

Since Nexus is based on ES6, you do have a much better weapon: truly asynchronous Promises that run on all cores in parallel.

ES6 promises are, in essence, a way to represent isolated asynchronous operations.

By design, promises truly lend themselves to multi-threading: each handler is a single operation that accepts a single value and returns a single result. Ideally, it should have no dependencies aside from the argument passed to it; after performing its operation, it returns a single value, which is then chained to the next handler, and so on.

This leads to an interesting observation: promises running in parallel rarely interact or compete for access to resources (as long as you avoid using global or shared variables).

This gives them a very big performance advantage. When CPUs do not contend for access to resources, CPU utilization is maximized.

One aspect of a multi-threaded promise implementation is that calls to `Promise.all()` and `Promise.race()` will schedule everything at once, then wait for the results to resolve.

When you schedule 1000 promises in Node, it actually evaluates them sequentially, one by one, on the event loop, because of its inability to parallelise JavaScript.

With Nexus, you automatically benefit from however many logical processors your system has; with eight logical processors, for instance, eight promises will resolve at a time.
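The pattern itself is plain ES6 and runs in Node as well; the difference is purely in how many handlers execute at once. A sketch (the `work` function is illustrative):

```javascript
// Schedule a whole batch of promises at once with Promise.all().
// In Node, the handlers run one by one on the event loop;
// in Nexus, each would be picked up by the next free thread.
function work(n) {
  return Promise.resolve(n).then((value) => value * value);
}

const batch = [1, 2, 3, 4].map(work);

Promise.all(batch).then((results) => {
  console.log(results); // [1, 4, 9, 16]
});
```

Note that `Promise.all()` preserves the order of results regardless of which handler finishes first, which is what makes the parallel version a drop-in replacement.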

And this was why I decided to implement the entire Nexus API using promises, including the Event Emitter.

The Event Emitter is a core construct of Nexus, and it is somewhat different from Node’s default event emitter implementation.

All basic functionality is the same, but the major difference is that it allows you to wait for all event handlers to finish, and even captures the results returned from each handler for you.

It does this by returning a promise when you call `emit`; this promise resolves the moment all handlers return, allowing you to chain events.

Another major difference is that all handlers are invoked in parallel, thus leveraging multi-threading to the fullest extent.

This gives it a performance edge, which directly impacts asynchronous IO.
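A rough sketch of such an emitter in plain JavaScript (the class and its internals are my illustration, not Nexus source): `emit` returns a promise that resolves with every handler’s result once they have all finished.

```javascript
// Minimal promise-based event emitter: emit() resolves when all
// handlers have returned, collecting their results in order.
class PromiseEmitter {
  constructor() {
    this.handlers = new Map();
  }

  on(event, handler) {
    if (!this.handlers.has(event)) this.handlers.set(event, []);
    this.handlers.get(event).push(handler);
    return this;
  }

  emit(event, ...args) {
    const handlers = this.handlers.get(event) || [];
    // Promise.resolve() wraps synchronous handlers too; in Nexus,
    // each handler would run on a separate thread in parallel.
    return Promise.all(handlers.map((h) => Promise.resolve(h(...args))));
  }
}

const emitter = new PromiseEmitter();
emitter.on('compute', (n) => n * 2);
emitter.on('compute', (n) => n + 1);

emitter.emit('compute', 10).then((results) => {
  console.log(results); // [20, 11]
});
```

Because `emit` returns a promise, an event can be awaited or chained into the next one, which is the behaviour described above.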

IO in Nexus is a little different from Node’s, since it shares CPU time with JavaScript on the thread-pool.

On the C++ side, Nexus uses cooperative coroutines to do IO: it interleaves data by reading from files and sockets asynchronously and scheduling the callbacks that handle that data on the thread-pool.

This could present a problem when you’re doing a lot of IO work. Therefore, Nexus provides you with two different input models: the Pull model, which allows you to read data by calling a `read` method, and the Push model, which is event-based, has high throughput, and works by calling `resume` on the input device.

But wait right there a second. What’s an input device?

The IO interface for Nexus is loosely modelled after the Boost.Asio library.

In fact, a lot of the design ideas used in Nexus are modelled after Boost, which provides time-proven concrete models for things like IO, networking, and IPC.

Well, here’s the premise: you have devices and you have streams.

Devices are the very basic building blocks of any IO graph. There are three types of devices: readable, writeable, and bidirectional.

Streams allow for a higher level of data manipulation. Each stream requires a device to function. There are two types of streams: readable and writeable.

A readable stream requires a readable or bidirectional device.

A writeable stream requires a writeable or bidirectional device.

Both types of streams accept filters that manipulate any buffers passing through.

There’s one thing worthy of note here: everything coming out of or going into devices is binary. Nexus uses ArrayBuffer objects to pass along the packets of data.

This allows Nexus to eliminate memory copying overhead. ArrayBuffer objects take ownership of any memory buffers passed into them.

On this slide you can see a basic IO graph that reads from a FilePushDevice and pipes the output to a FileSinkDevice through streams.

So let’s take a look at some code.

This example reads a file from disk, converts it from UTF-8 to UTF-16, and pipes the output to four output streams simultaneously.

It waits for the operation to complete, then prints the total time the operation has taken.

Whenever the ReadableStream receives a packet of data, it applies the encoding-conversion filter to it, then forwards it to the WritableStreams, where each of them writes to a different output file simultaneously.

So what are the big benefits of this design?

  • Higher performance.

Nexus.js applications can fully utilise all logical CPUs.

Promises and event handlers should never contend for resources. They lend themselves to multithreading by design.

  • Dynamic scaling on modern hardware.

The thread-pool can start new threads on demand, and stop them when they are no longer needed.

The maximum number of threads can be controlled via a command-line argument, and defaults to the number of logical processors available.

  • Maximum CPU and memory utilisation.

The memory requirements to scale an application are greatly reduced.

For an application using ~1 GB of RAM on a 4-core system:

Node: 4 × ~1 GB ≈ ~4 GB of RAM for 4 processes

Nexus: ~1 GB for a single process

Thank you for reading!

Questions?

Do you have any questions? I would be happy to answer them. Send them here in the comments or to @voodooattack on Twitter, and as always: you can review the code for the project on GitHub.
