Clustering in NodeJs — Performance Optimization — Part I

This article covers the native “cluster” module in depth and explains how it executes. In production, clustering is usually implemented through process managers, which are covered in Part II.

Danish Siddiq
tajawal
4 min read · Sep 22, 2018

The programmer only needs to worry about code

This article is part of a series on application performance in NodeJs. I was initially skeptical about the single-threaded behavior of NodeJs, but it keeps surprising me, much like Mother Nature.

Cluster Module — Cores Engagement:

Because NodeJs is single-threaded, an application uses a single processor core by default. NodeJs therefore provides the cluster module, which scales an application across multiple cores by spawning worker processes. The worker processes share a single port, so all incoming requests arrive through that one port and are distributed among the workers.
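To make the shared port concrete, here is a minimal sketch (not the code from the article's repository). It assumes two workers and four requests, both arbitrary demo values, and lets the operating system pick a free port. The master sends a few requests to the shared port, prints which worker answered each one, and then shuts the cluster down so the demo ends on its own.

```javascript
const cluster = require('cluster');
const http = require('http');

const WORKER_COUNT = 2;  // assumption: small fixed number for the demo
const REQUEST_COUNT = 4; // assumption: just enough requests to see the output

// Runs in the master: send a few requests to the shared port, then shut down.
function sendRequests(port) {
  let finished = 0;
  for (let i = 0; i < REQUEST_COUNT; i++) {
    http.get({ host: '127.0.0.1', port, agent: false }, (res) => {
      res.setEncoding('utf8');
      res.on('data', (chunk) => process.stdout.write(chunk));
      res.on('end', () => {
        finished += 1;
        // Every response has arrived: disconnect all workers so the demo exits.
        if (finished === REQUEST_COUNT) cluster.disconnect();
      });
    });
  }
}

if (cluster.isWorker) {
  // Every worker calls listen() with the same arguments, and the cluster
  // module makes them share one listening port. Port 0 asks the OS for a
  // free port (demo only; a real application would use a fixed port).
  http
    .createServer((req, res) => res.end(`handled by worker ${process.pid}\n`))
    .listen(0, '127.0.0.1');
} else {
  // This branch runs in the master (called the primary in newer NodeJs versions).
  let listening = 0;
  for (let i = 0; i < WORKER_COUNT; i++) cluster.fork();

  // 'listening' fires in the master each time a worker's server is ready.
  cluster.on('listening', (worker, address) => {
    listening += 1;
    if (listening === WORKER_COUNT) sendRequests(address.port);
  });
}
```

Which worker handles which request depends on the platform's scheduling policy, so the process ids in the output will vary from run to run.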

Process Manager:

Clustering is a native module in NodeJs, but process managers are also available. If you are looking for a process manager implementation, follow the article mentioned below (Part II). I still advise understanding clustering first, since process managers do the same thing under the hood.

Master communication with workers:

Worker processes and the master communicate through IPC (inter-process communication). The previous article in this series gives an idea of inter-thread messaging, which is similar to IPC.
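A minimal sketch of this message exchange, assuming a made-up message shape (`task`, `numbers`, `result`) and a hypothetical `sum` helper: the master sends a task with `worker.send()`, and the worker replies with `process.send()`.

```javascript
const cluster = require('cluster');

// Hypothetical unit of work, used only to have something to send back.
function sum(numbers) {
  return numbers.reduce((total, n) => total + n, 0);
}

if (cluster.isWorker) {
  // Worker side: messages from the master arrive on the process object.
  process.on('message', (msg) => {
    if (msg.task === 'sum') {
      process.send({ pid: process.pid, result: sum(msg.numbers) });
    }
  });
} else {
  // Master side (called the primary in newer NodeJs versions).
  const worker = cluster.fork();

  // Messages sent by the worker with process.send() arrive here.
  worker.on('message', (msg) => {
    console.log(`master received ${msg.result} from worker ${msg.pid}`);
    worker.disconnect(); // demo only: lets both processes exit
  });

  // Once the worker is running, hand it a task over the IPC channel.
  worker.on('online', () => {
    worker.send({ task: 'sum', numbers: [1, 2, 3] });
  });
}
```

Messages are serialized when they cross the process boundary, so plain data such as objects, arrays, strings and numbers is the safe thing to send.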

Pros:

  1. If a spawned process dies, whether planned or unplanned, a new process can be started right away as a replacement, without manual intervention.
  2. All available cores can be used to run the application, which increases application performance.
  3. Wasted resources are greatly reduced because the full capacity of the processor is used.
  4. It is easy to implement, since all the work is handled by a NodeJs module and no additional dependency is needed.

Cons:

  1. Session management is not built in; the developer has to manage alternatives, at the cost of added complexity.
  2. Managing an application through IPC is tedious, so it is not a preferred practice for complex applications.

Example:

A GitHub link to the example is shared at the end of the article. The Express framework is used in the example only to show that the cluster module does not care which framework an application uses. In fact, it works well independently of the framework.

The project contains a few files related to Babel (worth reading about) and to defining a route. The important files are app.js, where the server is set up along with the worker processes, and workerCode.js, which contains a method on a defined route that populates an array with a million items.

app.js:

Code Dissection:

Covering a few points from the code comments:

  1. The setupWorkerProcesses method forks one process for each available core on the system.
  2. A reference to each process is saved in the workers array so that messages from that process can be received.
  3. If a process dies, either accidentally or because the user killed it, a new process is forked when the exit event is received. This approach shows how to avoid server downtime.
  4. The “Express” server is set up in the normal manner, as in any other application. The only difference is that the setup runs once in every worker process and all processes share the port.
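The four points above can be sketched roughly as follows. This is not the app.js from the repository: it swaps Express for the built-in http module so the sketch has no dependencies. Only `setupWorkerProcesses` and `workers` come from the article; `forkWorker`, `setupServer`, `restartsLeft` and `shuttingDown` are names assumed for illustration. The last block is demo scaffolding that kills one worker to show the replacement and then stops the cluster.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

const PORT = 0;           // 0 = let the OS pick a free port (demo only)
const workers = [];       // references to the live worker processes
let restartsLeft = 5;     // hypothetical safety budget against crash loops
let shuttingDown = false; // demo-only flag so the sketch can stop itself

function forkWorker() {
  const worker = cluster.fork();
  workers.push(worker);
  // Keep the reference so messages from this worker can be received.
  worker.on('message', (msg) => {
    console.log(`message from worker ${worker.process.pid}:`, msg);
  });
  return worker;
}

// Fork one worker per core by default and replace workers that exit.
function setupWorkerProcesses(count = os.cpus().length) {
  console.log(`master ${process.pid} is forking ${count} worker(s)`);
  for (let i = 0; i < count; i++) forkWorker();

  cluster.on('exit', (worker, code, signal) => {
    const index = workers.indexOf(worker);
    if (index !== -1) workers.splice(index, 1);
    if (shuttingDown || restartsLeft <= 0) return;
    restartsLeft -= 1;
    console.log(`worker ${worker.process.pid} exited (${signal || code}); forking a replacement`);
    forkWorker();
  });
}

// Runs in every worker; all workers share the same port.
function setupServer() {
  http
    .createServer((req, res) => res.end(`hello from worker ${process.pid}\n`))
    .listen(PORT, '127.0.0.1');
}

if (cluster.isWorker) {
  setupServer();
} else {
  const count = Math.min(os.cpus().length, 2); // capped to keep the demo light
  setupWorkerProcesses(count);

  // Demo only: kill the first worker that starts listening, wait for its
  // replacement to come up, then shut the whole cluster down.
  let listening = 0;
  cluster.on('listening', (worker) => {
    listening += 1;
    if (listening === 1) worker.process.kill();
    if (listening === count + 1) {
      shuttingDown = true;
      cluster.disconnect();
    }
  });
}
```

In a real application the demo block would be reduced to a plain `setupWorkerProcesses()` call, the port would be fixed, and the server would keep running. The restart budget is my own addition so that a worker crashing on startup cannot trigger an endless fork loop.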

workerCode.js:

Code Dissection:

  1. Once the array is loaded with a million elements, a message is sent to the master, along with the process id, to say that the work is done.
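A rough sketch of what such a worker method could look like. The names `populateArray` and `handleRequest` and the shape of the message are assumptions, not the repository's code.

```javascript
const ITEM_COUNT = 1000000; // a million items, as described in the article

// Fill an array with `size` items to simulate CPU-bound work.
function populateArray(size) {
  const items = [];
  for (let i = 0; i < size; i++) items.push(i);
  return items;
}

// Hypothetical route method: do the work, then tell the master it is done.
function handleRequest() {
  const items = populateArray(ITEM_COUNT);
  const message = { pid: process.pid, status: 'done', count: items.length };
  // process.send only exists when this process was forked with an IPC
  // channel, which is the case for cluster workers.
  if (typeof process.send === 'function') process.send(message);
  return message;
}
```

In the example application a method like `handleRequest` would be called from the route handler, and the master would receive the message through the listener it attached to the worker reference.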

Performance Metrics:

Apache Benchmark (ab), a load-testing tool, is used to measure application performance. The example was tested in two ways: first with a server without clustering, and second with a cluster. The results were astonishing: performance jumped by 66%. Take a look at the statistics:

Time per request in milliseconds

Conclusion:

A picture is worth a thousand words. As the number of concurrent requests increases, the time per request climbs steeply on the single-process server, while the results of the cluster approach are promising.

“Alone we can do so little; together we can do so much.” — Helen Keller

Source Code:

Part II:

Clustering is a native module in NodeJs, but process managers are also available and offer easy implementation and rich features. The next article covers process manager implementation.
