Threads in NodeJs — Performance Optimization

Danish Siddiq
Published in tajawal
7 min read, Aug 15, 2018
Image Courtesy Google

Before reading this article, I recommend reading up on clustering and event-driven architecture in NodeJs; both make threading much easier to understand.

Threading was recently introduced in NodeJs on an experimental basis, in version 10.5.0. You may be surprised to hear of threading in NodeJs, since the platform is famous for delivering strong performance on a single thread. I chose to write about this new feature because it could prove helpful in the near future, especially in areas such as machine learning and AI, where NodeJs still has to work hard to earn its place.

Different Programming Methodologies/Concepts:

Before discussing threading, I would like to explain a few interesting concepts in computer programming: parallel, concurrent, and asynchronous programming. I am borrowing the theme of a Quora answer, which is simple and easy to follow.

These guys are juggling in parallel

This guy is concurrently juggling balls

A plane is refueled asynchronously while it carries its load to the destination

Asynchronous programming in NodeJs needs no explanation; plenty has been written about it, since this feature is considered the spine of NodeJs. NodeJs also introduced the “Cluster” module to utilize multiple CPU cores, but it does so through child processes. Processes and threads are different in nature; in case you have not come across the “Cluster” module, follow the link shared at the beginning of the article.

For What, Why and How?

With that background covered, let us focus entirely on threading. Note that threading is not recommended for I/O operations, since the built-in asynchronous I/O of NodeJs is already better optimized for tasks such as file operations. NodeJs introduced threads for CPU-heavy manipulation and for processing large chunks of data. To try an application that uses the worker threads module, you need to run it with an experimental flag:

node --experimental-worker index.js

Just stop talking and take my money :) Example:

The following example has a fairly simple project structure with two Js files: one for the main thread and one for the worker thread.

index.js for the main thread and workerCode.js for the worker thread

The main thread creates a list of 100 elements ranging from 0 to 99. This list is passed to the worker thread by cloning, along with a random multiple factor that is also generated in the main thread.

The worker thread receives the list and the multiple factor from the main thread. It multiplies the value at each index by the factor, stores the result back at that index, and also sends it to the main thread for output on the console. Once all items have been multiplied, the worker sends the whole updated list back to the main thread. The updated list and the original list (generated at the beginning of the main thread) are then printed to prove that the two lists are independent.

The main thread also traverses its own list at a fixed interval and logs the value at each index. This is done to show the overlapping messages from both threads.

Main thread / index.js code:
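
As a rough guide, the main thread described above could be sketched as follows. This is a minimal sketch and not the original code: the method names come from the dissection below, but their bodies, the integer rounding in getRandomArbitrary and the interval length are my assumptions. The worker source is also inlined and started with eval: true, purely so that the sketch fits in one file instead of needing a separate workerCode.js.

```javascript
// index.js (sketch): main thread
const { Worker } = require('worker_threads');

// Assumption: stands in for workerCode.js so that the sketch is a single file.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  parentPort.once('message', ({ multipleFactor }) => {
    workerData.forEach((value, index) => {
      workerData[index] = value * multipleFactor;
      parentPort.postMessage({ index, val: workerData[index], isInProgress: true });
    });
    parentPort.postMessage({ val: workerData, isInProgress: false });
  });
`;

// Random integer from min (inclusive) to max (exclusive); the rounding is assumed.
function getRandomArbitrary(min, max) {
  return Math.floor(Math.random() * (max - min)) + min;
}

// Create the worker and route its events to a single callback.
function startWorker(source, lst, callback) {
  const w = new Worker(source, { eval: true, workerData: lst });
  w.on('message', (msg) => callback(null, msg));
  w.on('error', (err) => callback(err));
  w.on('exit', (code) => console.log(`[main] worker exited with code ${code}`));
  return w;
}

// Log the main thread's own list at an interval while the worker is busy.
function processDataInMainThread(lst, intervalMs) {
  let index = 0;
  const timer = setInterval(() => {
    if (index >= lst.length) return clearInterval(timer);
    console.log(`[main] index ${index}: ${lst[index]}`);
    index += 1;
  }, intervalMs);
}

function mainBody(size, onDone) {
  const lst = Array.from({ length: size }, (_, i) => i);
  const myWorker = startWorker(workerSource, lst, (err, result) => {
    if (err) return onDone(err);
    if (result.isInProgress) {
      console.log(`[worker] index ${result.index}: ${result.val}`);
    } else {
      // Final message: the worker's updated copy of the list.
      onDone(null, { original: lst, updated: result.val });
    }
  });
  myWorker.postMessage({ multipleFactor: getRandomArbitrary(3, 9) });
  processDataInMainThread(lst, 10);
}

mainBody(10, (err, lists) => {
  if (err) throw err;
  console.log('original:', lists.original);
  console.log('updated: ', lists.updated);
});
```

On NodeJs 10.5.0 this would be started with the --experimental-worker flag. The original list stays untouched because the worker only ever sees a clone of it.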

Main thread code dissection:

  1. The “worker_threads” module is included for creating threads. It provides a way to create multiple environments running on independent threads and to create message channels between them. It can be accessed by running with the --experimental-worker flag and requiring it:
const { Worker } = require('worker_threads');
  2. “mainBody” requests a thread through the “initiateWorker” method. This is the most important method: it contains the callback that handles the data and errors received later from the worker thread. It calls a “startWorker” method, which contains the following code:
let w = new Worker(path, { workerData: lst });

3. As per the documentation, Worker expects the path to the worker’s main script. The path must be either absolute or relative (starting with ./ or ../). The second parameter carries the worker data, which is cloned into the worker thread; here we pass the list.

4. After the script file for the thread is specified, event listeners are defined on the worker. The message, error and exit events deliver the worker’s messages to the main thread and report when the worker fails or exits. A better understanding of event-driven architecture can be obtained from the link shared at the beginning.

5. Sending messages/data from the main thread to the worker thread is fairly easy, as this code shows:

myWorker.postMessage({ multipleFactor: getRandomArbitrary(3,9)});

The main thread sends a random number between 3 and 9 to the worker thread, which multiplies each item of the list by it.

6. The callback handles the data received from the worker thread. It expects the multiplication result for each item of the list and, once all items have been processed, the whole updated list.

7. Finally, the main body also calls “processDataInMainThread” at a fixed interval, so the main thread keeps processing in parallel with the worker thread.
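
If a callback feels awkward, the message, error and exit listeners described above could also be folded into a Promise. This is a swapped-in variation of mine and not what the example code does: runInWorker and the inlined doubling worker are hypothetical names, and the worker source is passed with eval: true only to keep the sketch in one file.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical helper: run `source` in a worker and settle on its first event.
// If the worker exits cleanly without posting anything, the promise stays pending.
function runInWorker(source, workerData) {
  return new Promise((resolve, reject) => {
    const w = new Worker(source, { eval: true, workerData });
    w.once('message', resolve); // first message becomes the result
    w.once('error', reject);    // uncaught exception inside the worker
    w.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}

// Hypothetical worker body: doubles every item of its cloned workerData.
const doubler = `
  const { parentPort, workerData } = require('worker_threads');
  parentPort.postMessage(workerData.map((n) => n * 2));
`;

runInWorker(doubler, [1, 2, 3]).then((result) => console.log(result)); // [ 2, 4, 6 ]
```

A Promise only fits a worker that answers once; the per-item progress messages of the article's example still need a plain message listener.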

Worker thread / workerCode.js code:
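
The worker side could look roughly like the sketch below. Again this is my reconstruction under assumptions, not the original file: “registerForEventListening” and “processDataAndSendData” are named in the dissection that follows, while processItem, the send parameter and the 100 ms interval are hypothetical details added to keep the logic easy to test.

```javascript
// workerCode.js (sketch): worker thread
const { parentPort, workerData } = require('worker_threads');

// Multiply one item in place and build the progress message for the main thread.
function processItem(list, index, multipleFactor) {
  list[index] = list[index] * multipleFactor;
  return { index, val: list[index], isInProgress: true };
}

// Process one item per tick so that messages from both threads can interleave.
// `send` is injected (an assumption) so the function does not depend on parentPort.
function processDataAndSendData(list, multipleFactor, intervalMs, send) {
  let index = 0;
  const timer = setInterval(() => {
    if (index < list.length) {
      send(processItem(list, index, multipleFactor));
      index += 1;
    } else {
      clearInterval(timer);
      // All items done: send the whole updated list back.
      send({ val: list, isInProgress: false });
    }
  }, intervalMs);
}

// Wait for the multiple factor, then start working on the cloned list.
function registerForEventListening() {
  parentPort.once('message', ({ multipleFactor }) => {
    processDataAndSendData(workerData, multipleFactor, 100, (msg) =>
      parentPort.postMessage(msg)
    );
  });
}

// parentPort is null on the main thread, so loading this file there does nothing.
if (parentPort) registerForEventListening();
```

Using once for the message listener lets the worker exit on its own after the final list is sent, since nothing keeps its event loop alive any longer.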

Worker thread code dissection:

  1. parentPort is imported from worker_threads to send messages to and receive messages from the parent thread. The code in “registerForEventListening” registers listeners on the parent port for messages and errors, along with a callback method.
  2. The multiple factor from the main thread arrives in the “message” event, but where is the list from the main thread? It was already sent to the worker thread through the constructor (point 2 of the main thread). The list is available in the worker thread by importing “workerData”, which the documentation describes as an arbitrary JavaScript value that contains a clone of the data passed to this thread’s Worker constructor.
  3. The callback expects the multiple factor from the main thread. As soon as it receives the random number, it calls “processDataAndSendData” at a fixed interval to process every item in the list. On each call, the worker thread sends the multiplication result to the main thread with the following code:
parentPort.postMessage({ index, val: workerData[index], isInProgress:true });
  4. Once all items are processed, the cloned list with the updated values is sent to the main thread. There, at the end, the original list formed in the main thread and the updated list from the worker thread are both logged to confirm that the two threads were working on different lists.

Output:

I filled the list with 10 items, so the following console output shows overlapping calls from the main and worker threads. The multiple factor was 6. You can adjust the list size and intervals according to your patience level :)

Main and Worker thread having different set of data
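
The independence of the two lists can also be checked in isolation. The sketch below is only an illustration under my own assumptions: compareLists and the inlined worker source are hypothetical, and eval: true is used so that no second file is needed.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical worker body: changes its own copy of the list and sends it back.
const source = `
  const { parentPort, workerData } = require('worker_threads');
  workerData.push(99); // touches the worker's clone only
  parentPort.postMessage(workerData);
`;

// Hypothetical helper: hand `list` to a worker, then report both versions.
function compareLists(list, callback) {
  const w = new Worker(source, { eval: true, workerData: list });
  w.once('message', (fromWorker) => callback(null, { main: list, fromWorker }));
  w.once('error', (err) => callback(err));
}

compareLists([1, 2, 3], (err, lists) => {
  if (err) throw err;
  console.log(lists.main);       // [ 1, 2, 3 ]
  console.log(lists.fromWorker); // [ 1, 2, 3, 99 ]
});
```

The push inside the worker never reaches the main thread's array, which is the same behavior the console output above demonstrates with the multiplied list.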

Conclusion:

  1. Implementing threads in NodeJs is fairly simple but, as on other platforms, data needs to be synchronized properly.
  2. Data passed to the worker’s constructor is cloned and independent of the main thread.
  3. Threading relies heavily on event-based architecture for the flow of data, although streams can also be used for this purpose.
  4. Creating worker instances inside other worker instances is possible.
  5. The asynchronous model of NodeJs is effective, robust and performs very well, so choosing to use threads really needs an architectural justification.
  6. Further details on this feature are available in the NodeJs documentation:

CodeBase:

Hit me up for any suggestions and improvisation,

“The thing about improvisation is that it’s not about what you say. It’s listening to what other people say. It’s about what you hear.” — Paul Merton
