Starting Parallel Programming In Node.js With Napa.js

Dumindu Buddhika
Published in Code Insights
4 min read · Nov 4, 2017

So you are a Node.js programmer, and you want to do some CPU-intensive work inside your Node.js server? If so, this is something you need to check out.

Microsoft recently released a new Node module called Napa.js on their GitHub. Napa.js is a multi-threaded JavaScript runtime built on V8. It was originally designed for developing highly iterative services with non-compromised performance in Bing.

Let's talk a little about the Napa.js architecture. Napa.js has a concept called a zone. A zone can execute a specified block of code (the same zone cannot execute multiple blocks of code; to do that, you need to create multiple zones). A zone consists of multiple JavaScript workers (each worker is a separate thread). A process may contain multiple zones. There are two types of zones in Napa.js, each with specific characteristics:

  • Napa zone
  • Node zone

Each worker in Napa.js manages its own heap space. Values passed from one worker to another have to be serialized and deserialized. In Napa.js you can share data between workers by using the ‘broadcast’ and ‘execute’ functionality of a zone, or by using the Store API. Each time you do that, the data is copied into the receiving thread’s heap space. There are future plans to implement different memory sharing patterns so that data can be shared across workers without copying. These memory sharing patterns will eventually be the cornerstones of high-performance multi-threading solutions in JavaScript.
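To make the copy semantics concrete, here is a minimal sketch in plain JavaScript. It does not use Napa.js at all: `copyAcrossBoundary` is a hypothetical helper that stands in for the serialize/deserialize step with a JSON round trip, which is an assumption about the general mechanism and not Napa.js's actual transport code.

```javascript
// Hypothetical stand-in for crossing a worker boundary: the value is
// serialized to a string and rebuilt on the other side (JSON round trip).
function copyAcrossBoundary(value) {
  const wire = JSON.stringify(value); // serialize in the sending worker
  return JSON.parse(wire);            // deserialize in the receiving worker
}

const original = { rows: [[1, 2], [3, 4]] };
const received = copyAcrossBoundary(original);

// The receiver mutates its own copy; the sender's data is untouched.
received.rows[0][0] = 99;

console.log(original.rows[0][0]);   // 1
console.log(received.rows[0][0]);   // 99
console.log(received === original); // false
```

The point of the sketch is that the receiver never sees the sender's object, only a rebuilt copy, so every hand-off costs both CPU time and extra memory.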

To learn more about Napa.js, head over to this URL.

The following diagram shows the Napa.js architecture:

Napa.js architecture

I recently got to know about Napa.js via a friend of mine. I was really excited about multi-threaded programming coming to Node.js, so I gave it a try. In the rest of the article, I will talk about the simple program I wrote and the results I obtained.

Installation

First of all, let's see how to install Napa.js. You install it as you would any other npm module.

npm install napajs

However, I ran into an error when running my application.

/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.22' not found

You can resolve this error with the following steps.

sudo add-apt-repository ppa:ubuntu-toolchain-r/test 
sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade

Some people have resolved the error without doing the dist-upgrade, but in my case I had to do it.

Program

The GitHub repo for the program I wrote to try out Napa.js can be found here.

Following is the serial version,

Following is the parallel version

In the code above,

for (const currentTime  = new Date().getTime() + 5; new Date().getTime() < currentTime;);

is used to simulate some CPU bound work.
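As a rough illustration of how such a busy-wait fits into a serial run, here is a minimal sketch. It is not the code from the repo: `busyWait`, `processSerially`, and the row-sum computation are hypothetical placeholders, and the delay is a parameter so it can be set to whatever per-row cost you want to simulate.

```javascript
// Busy-wait for roughly `ms` milliseconds to simulate CPU-bound work.
// Same idea as the one-line loop above, wrapped in a function.
function busyWait(ms) {
  const deadline = Date.now() + ms;
  while (Date.now() < deadline) {
    // spin: keeps the thread busy on purpose
  }
}

// Hypothetical serial driver: process every row one after another.
// The row sum is only a placeholder for the real per-row work.
function processSerially(matrix, msPerRow) {
  const start = Date.now();
  const rowSums = matrix.map((row) => {
    busyWait(msPerRow);
    return row.reduce((sum, x) => sum + x, 0);
  });
  return { rowSums, elapsedMs: Date.now() - start };
}

// Small 4x4 example with 5 ms of simulated work per row.
const matrix = Array.from({ length: 4 }, (_, i) => [i, i, i, i]);
const serialResult = processSerially(matrix, 5);
console.log(serialResult.rowSums);
console.log(`elapsed: ${serialResult.elapsedMs} ms`);
```

Because the rows are handled one after another on a single thread, the elapsed time grows linearly with the number of rows.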

The number of threads can be configured by changing “NUMBER_OF_WORKERS”.
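One common way to use a worker count like this is to split the matrix rows into one contiguous chunk per worker. The sketch below shows only that splitting step in plain JavaScript. `partitionRows` is a hypothetical helper, not a Napa.js function, and the repo may divide the work differently.

```javascript
const NUMBER_OF_WORKERS = 4; // the same knob the article mentions

// Split row indices 0..rowCount-1 into at most `workers` contiguous chunks
// whose sizes differ by no more than one row.
function partitionRows(rowCount, workers) {
  const chunks = [];
  const base = Math.floor(rowCount / workers);
  const extra = rowCount % workers; // the first `extra` chunks get one more row
  let start = 0;
  for (let w = 0; w < workers; w++) {
    const size = base + (w < extra ? 1 : 0);
    if (size === 0) break; // more workers than rows: stop early
    chunks.push({ start, end: start + size }); // `end` is exclusive
    start += size;
  }
  return chunks;
}

console.log(partitionRows(10, NUMBER_OF_WORKERS));
// [ { start: 0, end: 3 }, { start: 3, end: 6 },
//   { start: 6, end: 8 }, { start: 8, end: 10 } ]
```

Each chunk could then be handed to one worker, so that no worker receives more than one row above the average.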

Results

For the serial version, using a 100x100 matrix with 100 ms of CPU-bound work per matrix row (100 rows x 100 ms), the latency is as follows:

10000ms

The following table shows the results for the parallel version.

Results for the parallel version

The results can be charted as follows:

Speedup against number of workers

The following is the lscpu output of my computer:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 61
Model name: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Stepping: 4
CPU MHz: 1551.085
CPU max MHz: 2700.0000
CPU min MHz: 500.0000
BogoMIPS: 4390.19
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3

As we can observe, we get a very good speedup up to 4 workers, which is also the number of logical CPUs that lscpu reports for this machine. That is really cool.
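For reference, speedup is the serial time divided by the parallel time. The sketch below computes the ideal (best-case) numbers for the 100-row, 100 ms workload described earlier, assuming every worker gets its own free CPU and ignoring all copying and scheduling overhead. These are theoretical bounds, not the measured results, and `idealParallelMs` and `speedup` are hypothetical helper names.

```javascript
// Best-case parallel time: the busiest worker handles ceil(rows / workers) rows.
function idealParallelMs(rows, msPerRow, workers) {
  return Math.ceil(rows / workers) * msPerRow;
}

// Speedup = serial time / parallel time.
function speedup(serialMs, parallelMs) {
  return serialMs / parallelMs;
}

const ROWS = 100;
const MS_PER_ROW = 100;
const serialMs = ROWS * MS_PER_ROW; // 10000 ms, as in the serial run

for (const workers of [1, 2, 4]) {
  const parallelMs = idealParallelMs(ROWS, MS_PER_ROW, workers);
  const s = speedup(serialMs, parallelMs).toFixed(2);
  console.log(`workers=${workers} ideal=${parallelMs}ms speedup=${s}`);
}
// workers=1 ideal=10000ms speedup=1.00
// workers=2 ideal=5000ms speedup=2.00
// workers=4 ideal=2500ms speedup=4.00
```

Comparing a measured speedup against this bound shows how much time is lost to overhead such as serialization and thread scheduling.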

Problems

When you keep increasing the matrix size (say, for example, to 10000x10000), you will run out of memory pretty soon. The reason is that when you share data with the workers, the data is copied into each worker’s heap space. This makes Napa.js consume a lot of memory. As I mentioned above, different memory sharing patterns are planned for the future to improve the performance of Napa.js.
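A back-of-the-envelope estimate shows why this happens. The sketch below assumes each matrix element costs 8 bytes (a 64-bit float) and that, in the worst case, every worker receives a full copy of the matrix in addition to the original. Both are simplifying assumptions: real V8 arrays carry extra overhead, and serialization creates temporary data on top of this. `estimateMatrixBytes` and `toMiB` are hypothetical helpers.

```javascript
const BYTES_PER_NUMBER = 8; // assumption: each element stored as a 64-bit float

// Raw payload size of `copies` full copies of a rows x cols matrix.
function estimateMatrixBytes(rows, cols, copies) {
  return rows * cols * BYTES_PER_NUMBER * copies;
}

const toMiB = (bytes) => bytes / (1024 * 1024);

const rows = 10000;
const cols = 10000;
const workers = 4;

// One copy: just the original matrix.
const single = estimateMatrixBytes(rows, cols, 1);
// Worst case: the original plus one full copy per worker.
const worstCase = estimateMatrixBytes(rows, cols, 1 + workers);

console.log(`one copy:   ${toMiB(single).toFixed(0)} MiB`);    // one copy:   763 MiB
console.log(`worst case: ${toMiB(worstCase).toFixed(0)} MiB`); // worst case: 3815 MiB
```

Even under these optimistic assumptions, a 10000x10000 matrix shared with 4 workers needs several gigabytes.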

Conclusion

As we have seen, we can obtain really good speedups by using Napa.js for simple parallel algorithms. Napa.js is still young, but we can see a lot of potential in it for the future.

Happy multi-threading with Napa.js!
