Today we’ll be talking about some of the differences in performance between Nexus and Node, and I will answer some questions I’ve received over the past week.
If you don’t know what Nexus.js is, please read this first:
The mantra of Nexus.js is simple: Utilize everything.
Not a single CPU cycle on any available CPU core is to be wasted. If there are more tasks queued and we’re below the maximum concurrency threshold, then we start a new thread. If there are no tasks available, we exit the thread. If all threads exit, we exit the program.
At the centre of it all lies the scheduler, which makes this happen:
It’s as simple as that!
What this means in practice is that we have a dynamic thread-pooling model, where the system adjusts to the workload it is given within its capacity.
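The policy described above (start a thread when work is queued and we are below the limit, exit a thread when the queue is empty) can be sketched in a few lines of C++. This is only a minimal illustration under my own assumptions, not the actual Nexus.js scheduler: `DynamicScheduler`, `schedule`, `wait`, `active_threads` and `peak_threads` are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Illustrative only: a hypothetical class, not part of Nexus.js.
class DynamicScheduler {
public:
    explicit DynamicScheduler(std::size_t max_threads) : max_threads_(max_threads) {
        assert(max_threads > 0 && "need at least one worker thread");
    }

    ~DynamicScheduler() { wait(); }

    // Queue a task. If we are below the concurrency threshold, start a thread.
    void schedule(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(mutex_);
        tasks_.push_back(std::move(task));
        if (active_ < max_threads_) {
            ++active_;
            if (active_ > peak_) peak_ = active_;
            threads_.emplace_back(&DynamicScheduler::worker, this);
        }
    }

    // Block until every worker has exited, including workers started by tasks.
    // Must not be called from inside a task.
    void wait() {
        for (;;) {
            std::vector<std::thread> exiting;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                exiting.swap(threads_);
            }
            if (exiting.empty()) return;
            for (std::thread& t : exiting) t.join();
        }
    }

    std::size_t active_threads() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return active_;
    }

    std::size_t peak_threads() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return peak_;
    }

private:
    void worker() {
        for (;;) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (tasks_.empty()) {  // no tasks available: exit the thread
                    --active_;
                    return;
                }
                task = std::move(tasks_.front());
                tasks_.pop_front();
            }
            task();  // run outside the lock so other workers can proceed
        }
    }

    const std::size_t max_threads_;
    mutable std::mutex mutex_;
    std::deque<std::function<void()>> tasks_;
    std::vector<std::thread> threads_;  // finished threads stay here until wait()
    std::size_t active_ = 0;
    std::size_t peak_ = 0;
};
```

A caller would construct it with the desired limit, call `schedule` for each task, and call `wait` before exiting. `std::thread::hardware_concurrency()` is one way to pick the limit, but it may return 0, so a fallback value is needed. Checking the queue and decrementing the active count under the same lock is what keeps a task from being stranded when a worker is about to exit.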
This way, a system with 16 cores will not be running 32 threads all the time and wasting system resources and CPU cycles on idle threads. At the same time, if such a system is hit by 1000 requests per second, all 32 threads will spin up and start aggressively serving requests as soon as they arrive. You can control the maximum number of threads using the ‘concurrency’ command line flag.
It’s even possible to run Nexus with a single thread by passing a ‘concurrency’ value of 0, whereupon it will act like Node, emulating a single-threaded event loop; albeit a little slower, since it won’t perform any asynchronous I/O in the background.
So, what difference does this make? Node can scale using the Cluster API easily.
While that’s true, it won’t be utilizing hyper-threading (or logical processors) to the maximum extent. You will also have to multiply your application’s memory requirements by the number of instances you start.
Let’s say your application consumes 700 megabytes of memory at its peak capacity.
Now multiply that by 4. That’s 2.8 GB of RAM just to utilize four cores. On a small VPS, this means a lot.
Imagine a real-world application that consumes 4GB of RAM to serve live video streams. If you try to scale that using Node’s Cluster API, you’ll have to multiply 4GB by the number of cores.
So let’s say you deploy it on 16 cores. That would be 64GB of RAM just to utilize half the CPU capacity, with no means of utilizing the other half (if you take logical processors into account).
Now imagine that same scenario with Nexus. You’ll start a single instance, which will consume… let’s say 6GB of RAM. It will start 32 threads, and start serving requests.
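To make the arithmetic in the two scenarios explicit, here is a small sketch. The function names are hypothetical, and the 6GB figure is only the guess from the paragraph above, not a measurement.

```cpp
// One process per core: every instance carries its own copy of the heap.
constexpr unsigned long long cluster_footprint(unsigned long long per_instance,
                                               unsigned instances) {
    return per_instance * instances;
}

// One process, many threads: the heap is shared, so the footprint does not
// scale with the thread count (assumption: per-thread overhead is already
// included in the single-instance figure).
constexpr unsigned long long shared_footprint(unsigned long long instance,
                                              unsigned /*threads*/) {
    return instance;
}

// The numbers used in the text.
static_assert(cluster_footprint(700, 4) == 2800, "700 MB x 4 instances = 2.8 GB");
static_assert(cluster_footprint(4, 16) == 64, "4 GB x 16 instances = 64 GB");
static_assert(shared_footprint(6, 32) == 6, "one 6 GB instance with 32 threads");
```

The point is simply that the process-per-core model grows linearly with the number of instances, while the threaded model stays roughly flat.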
The number of threads will adjust dynamically, and if you’re not serving a high load, some of the threads will exit, thereby saving some system resources. (This is not easily done with Node’s Cluster API, as far as I know.)
The threads will share the memory, and depending on your implementation, you can even reduce memory usage further.
Instead of 16+ open connections to your database server, you’ll have one (it ultimately depends on your pooling solution, but you can now at least control how many connections you want).
Most important of all: Nexus will utilize logical processors to their maximum capacity.
Here’s a basic explanation of how that works:
LLInt, short for Low Level Interpreter, executes the bytecodes produced by the parser.
Baseline JIT kicks in for functions that are invoked at least 6 times, or take a loop at least 100 times (or some combination — like 3 invocations with 50 loop iterations total).
DFG JIT kicks in for functions that are invoked at least 60 times, or that took a loop at least 1000 times.
FTL JIT kicks in for functions that are invoked thousands of times, or loop tens of thousands of times. See FTLJIT for more information.
Each of the JIT compilers is invoked to optimize and re-optimize the code depending on the criteria above. Read more about that here.
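The thresholds quoted above can be turned into a toy tier selector. This is a deliberate simplification and not how JavaScriptCore actually counts executions: the weighting (one invocation counts as much as 100/6 loop iterations) is my own assumption, chosen so that the quoted combination of 3 invocations with 50 loop iterations reaches the Baseline tier. FTL is left out because the text gives no exact figure for it. `Tier`, `score` and `tier_for` are hypothetical names.

```cpp
enum class Tier { LLInt, Baseline, DFG };

// Weighted execution score, kept in integers to avoid rounding:
// 6 invocations or 100 loop iterations both give a score of 600.
constexpr unsigned long long score(unsigned long long invocations,
                                   unsigned long long loop_iterations) {
    return invocations * 100 + loop_iterations * 6;
}

// Pick the highest tier whose threshold the function has reached.
constexpr Tier tier_for(unsigned long long invocations,
                        unsigned long long loop_iterations) {
    return score(invocations, loop_iterations) >= 6000 ? Tier::DFG
         : score(invocations, loop_iterations) >= 600  ? Tier::Baseline
                                                       : Tier::LLInt;
}

// The cases mentioned in the text.
static_assert(tier_for(5, 0) == Tier::LLInt, "below every threshold");
static_assert(tier_for(6, 0) == Tier::Baseline, "6 invocations");
static_assert(tier_for(0, 100) == Tier::Baseline, "100 loop iterations");
static_assert(tier_for(3, 50) == Tier::Baseline, "3 invocations + 50 iterations");
static_assert(tier_for(60, 0) == Tier::DFG, "60 invocations");
static_assert(tier_for(0, 1000) == Tier::DFG, "1000 loop iterations");
```

Cold code never pays for compilation at all, which is where the lack of start-up overhead for simple cases comes from.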
What this means for Nexus is absolute superiority when it comes to runtime code optimization. It also means no start-up overhead for simple cases.
Okay, so some of you were wondering how all variables worked atomically in Nexus. Here is how, for the technically inclined:
Locking in WebKit
Back in August 2015 we replaced all spinlocks and OS-provided mutexes in WebKit with the new (WTF stands for Web…
I won’t get into details; the article explains it better than I ever could.
For that we have NX::Globals::Promise::createPromise which is essentially a wrapper around new Promise(executor). It’s too big to post here, and contains a massive amount of C++ code that makes it possible, but here is how to use it:
You’ll notice that I’m using C++14 lambdas. I always use them when possible. They make code a lot simpler.
Package Manager and require()
Yes, I plan on implementing a package manager in the future. It will look a lot like JSPM.
As for require(): I don’t plan on implementing it. Nexus will be module-based and will use the ES6 import statement, as well as System.import().
I would still like your opinion on this. Is require() really needed?
Do you have more questions? I would be happy to answer them. Send them here in the comments or to @voodooattack on Twitter, and as always: you can review the code for the project on GitHub.
Next: Part VI: Server