Why is Node.js unfit for industry-standard cloud applications operating at scale?

Published in

TechStreet

5 min read · Apr 2, 2019

Hello! Since you are interested in reading this post, I would guess that you belong to one of the following two categories.

1. A die-hard Node.js fan who is pissed off about the bad things being said about Node.js.

2. Someone who is curious to learn about this new stack.

Well, if you are neither of the above but technology still thrills you, then yes, you are at the right door!

In this article I assume that you have a pretty good understanding of the software engineering aspects of at least one programming language.

Without further delay, let’s go over some core features of Node.js.

Node.js is a JavaScript runtime built on Chrome’s V8 JavaScript engine. It is popular for the following reasons:

1. Ease of deployment

2. Fast handling of I/O operations, thanks to non-blocking I/O driven by its single-threaded event loop.

3. Improved productivity and lower cost, because JavaScript developers can now write code from frontend to backend.

4. Ease of coding

5. A dynamic Node Package Manager (npm)

6. Growing adoption in Platform as a Service offerings from cloud service providers

7. Good community support

Now let’s see how it fits when we are serving millions of end users at a really high pace in a microservices environment, where the services are divided by domain or business entity of the organization.

To work at scale, each service has to run several instances of itself. That brings in the need for containers, along with load balancers and service discovery frameworks like Consul, which we will cover in an upcoming article on designing scalable, resilient systems.

After setting up a container, we need to register it with the load balancer. Each container also has to expose a health endpoint so that the load balancer can periodically check the container’s health and act according to your failover or failback mechanism. In most cases the strategy is to restart the unhealthy container.

We also need to define a threshold: the maximum time the load balancer waits for a container to respond on its health endpoint.
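As a rough illustration of this setup, here is a minimal sketch in Node.js of a health endpoint together with the kind of timed probe a load balancer performs. The `/health` path, the names `handleRequest` and `probeHealth`, and the 2000 ms threshold are assumptions made for this example, not details of our actual service or of any particular load balancer.

```javascript
const http = require("node:http");

// Hypothetical handler: GET /health answers 200 with a small JSON body.
function handleRequest(req, res) {
  if (req.method === "GET" && req.url === "/health") {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ status: "ok" }));
    return;
  }
  res.writeHead(404, { "Content-Type": "text/plain" });
  res.end("not found");
}

// Stand-in for the load balancer's probe: the container counts as healthy
// only if /health answers 200 within thresholdMs.
function probeHealth(port, thresholdMs) {
  return new Promise((resolve) => {
    const req = http.get(
      { host: "127.0.0.1", port, path: "/health", agent: false },
      (res) => {
        clearTimeout(timer);
        res.resume(); // discard the response body
        resolve(res.statusCode === 200 ? "healthy" : "unhealthy");
      }
    );
    // Give up once the threshold has passed; destroy() triggers the error handler.
    const timer = setTimeout(
      () => req.destroy(new Error("threshold exceeded")),
      thresholdMs
    );
    req.on("error", () => {
      clearTimeout(timer);
      resolve("unhealthy");
    });
  });
}

// Demo: start the service on a free port, probe it once, then shut down.
const server = http.createServer(handleRequest);
server.listen(0, "127.0.0.1", async () => {
  const status = await probeHealth(server.address().port, 2000);
  console.log("health check result:", status);
  server.close();
});
```

In a real deployment the probe lives in the load balancer, not in the service. It is included here only so the sketch can be run as a single file.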

In one of my products we architected our service in the aforementioned way. We opted for Node.js because the service was only doing I/O operations at that time, so we thought of it as a good fit for the product.

But most products are built in an agile environment, where we check in to production daily through CI/CD pipelines and the requirements keep changing. That happened with our product too, and it led us to write some logic on the Node side that traversed all the records from the database and performed an operation on each of them at runtime.

The service performed really well with medium-scale data, but suddenly all the containers started behaving unexpectedly: they kept getting restarted in an endless loop, which led to an outage of the service.

After a detailed diagnosis we found that the health endpoint itself was not able to serve requests. The load balancer therefore flagged the container as unhealthy and restarted it, and the service responded to the end user with HTTP status code 504.

The reason lies in the event loop of Node.js. The event loop is what allows Node.js to perform non-blocking I/O operations, by offloading work to the multi-threaded kernel of the underlying operating system whenever possible. The threads on the kernel side handle multiple operations simultaneously, and whenever one of those operations completes, the kernel tells Node.js so that the callback for the processed operation can eventually be executed.

In simpler terms, the event loop has phases, and each phase has a FIFO queue of callbacks to execute.
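The FIFO behaviour can be seen in a small sketch. It uses `setImmediate`, which queues callbacks for one phase of the loop. The function name `runFifoDemo` and the labels are made up for this illustration.

```javascript
// Queues several callbacks for the same phase and records the order in which they run.
function runFifoDemo() {
  return new Promise((resolve) => {
    const order = [];
    setImmediate(() => order.push("callback 1"));
    setImmediate(() => order.push("callback 2"));
    setImmediate(() => order.push("callback 3"));
    setImmediate(() => resolve(order)); // queued last, so it runs last
    // Synchronous code always finishes before any queued callback starts.
    order.push("synchronous code");
  });
}

runFifoDemo().then((order) => console.log(order.join(" -> ")));
// prints: synchronous code -> callback 1 -> callback 2 -> callback 3
```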

Hence the health endpoint request was sitting in the callback queue, waiting to be executed by the main thread of Node.js, but unfortunately that thread was busy traversing all the records.
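The effect is easy to reproduce in isolation. The sketch below is not our production code: `blockFor` is a hypothetical busy-wait standing in for the record traversal, and the timer callback stands in for the queued health check request.

```javascript
// Busy-wait that keeps the main thread occupied, standing in for a
// synchronous traversal of a large set of records.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread in the meantime
  }
}

// Schedules a callback to run as soon as possible, then blocks the thread.
// Resolves with how many milliseconds late the callback actually ran.
function measureCallbackDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => resolve(Date.now() - scheduledAt), 0);
    blockFor(blockMs); // the callback above cannot run until this returns
  });
}

measureCallbackDelay(200).then((delayMs) => {
  // delayMs is at least 200 even though the timer was set to 0 ms
  console.log(`callback ran ${delayMs} ms after it was scheduled`);
});
```

If the blocking work takes longer than the load balancer's threshold, the health check misses its deadline in exactly the same way.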

Because of this limitation of Node.js, we decided to move that logic to a different service with Go as the stack. That in itself is not a good approach: you should not need a separate service for a piece of the application’s logic, because the existing stack should be able to absorb such changes in a fast-moving agile environment.

Not only this: if we kept adding instances or containers to serve the increased load, the CPU or memory utilization of each container would never exceed 70% because of the single-threaded event loop of Node.js. Spinning up more containers while utilization sits below 70–80% is not a good use of resources, and it increases cost on both the infrastructure and the maintainability side.

Hence, after this detailed analysis, we realized that Node.js is a perfect fit only for use cases where we want to redirect requests or have it act as a reverse proxy in front of user requests, but not for application logic, especially when you are dealing with scale and serving millions of requests per second.

The whole idea is to use a programming language for what it is meant for. If we introduce threads or fork child processes, we are changing the nature of the language itself. We all know how complex it is to handle concurrency, locks, and synchronization once threads and processes are involved. On the other hand, a process is quite costly and executes only one task at a time. It also adds the complexity of managing the child processes, much as we manage database connections by implementing some kind of connection pool along with its TTL (time to live), and we know how painful that is.
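To give a feel for what introducing a thread involves, here is a minimal sketch using Node's built-in `worker_threads` module. It is an illustration only and not something we ran in our product. The helper `sumInWorker`, the `amount` field, and the summing operation are hypothetical stand-ins for the per-record logic.

```javascript
const { Worker } = require("node:worker_threads");

// Code that runs inside the worker thread. It is passed as a string
// (eval: true) only to keep this sketch in a single file.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  let total = 0;
  for (const record of workerData.records) {
    total += record.amount; // stand-in for the per-record operation
  }
  parentPort.postMessage(total);
`;

// Hypothetical helper: processes the records off the main thread and
// resolves with the result, so the event loop stays free in the meantime.
function sumInWorker(records) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { records },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

const records = Array.from({ length: 1000 }, (_, i) => ({ amount: i }));
sumInWorker(records).then((total) => console.log("total:", total));
// prints: total: 499500
```

Even this small version needs message passing, error handling, and exit handling. Pooling and reusing workers would add the management overhead described above.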

We do have libraries like “worker-farm” that can solve this problem to some extent. But a language like Java is far more mature at solving multithreading problems, so it is a good and fair decision to go with it instead of reinventing the wheel from core engineering principles, which adds cost and resources for a growing startup or even a stable company. Needless to say, we as engineers really like to take this on as a challenge, and it satisfies our itch to solve challenging problems, but unfortunately it does not add value to the organization. That said, as mentioned in this article, Node.js really shines when we use it as a reverse proxy, and it also saves cost for the company because it does not have to hire backend developers separately.

Happy to hear any comments or views on this!

Written by saurabh bhatia