Node.js, lots of ways to block your event-loop (and how to avoid it)

Vincent Vallet
Aug 3, 2020

In this article, we will see some ways to quickly block or slow down the Event-loop of Node.js. If you are not familiar with the concept of “Event-loop” in Node.js, I recommend you read some articles first about this subject. The most important thing to remember is: the Event-loop is single-threaded, so if you block it or slow it down then this will impact your entire application.

Event loop quick overview

I will not go into a long explanation of the Event-loop, since many have done it before, and better than me. To follow this article you only have to remember two things:

  • the Event-loop is the heart of Node.js, it can be seen as an abstraction of how Node.js executes code and runs your application
  • it must run without interruption and without slowing down; otherwise, your users will quickly become frustrated

The Event-loop can be schematized as follows (thanks to Bert Belder for the diagram):

Event-loop overview

Event loop limitations & dangers

As previously said it’s crucial to have a running event loop and keep its latency as low as possible. The latency is basically the average time separating two successive iterations of your Event-loop.

A potential point of failure in a Node.js application can come from two factors:

  1. The Event-loop is single-threaded

Also known as the “one instruction to block them all” factor. Indeed in Node.js you can block every request just because one of them had a blocking instruction. A good review of your code should always start with a distinction between blocking and non-blocking code.

Non-blocking code is made of simple instructions that return almost immediately, while blocking code is a long operation that keeps the thread busy.
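As a minimal sketch of that distinction (the two helpers below are hypothetical and only there for illustration), the first function returns right away while the second keeps the single thread busy for the whole loop:

```javascript
// Hypothetical helpers, for illustration only.

// Non-blocking: a handful of simple instructions that return immediately.
function add(a, b) {
  return a + b;
}

// Blocking: a long synchronous computation. While this loop runs, the
// Event-loop cannot process any other callback, timer or request.
function sumUpTo(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}

console.log(add(1, 2));

const start = Date.now();
const total = sumUpTo(1e8);
console.log(`sumUpTo(1e8) = ${total}, took ${Date.now() - start} ms`);
```

The exact duration depends on your machine, but every millisecond spent inside sumUpTo is a millisecond during which no other request is served.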

Observation 1: do not confuse blocking code with an infinite loop. Blocking code is generally a long operation (more than a few milliseconds).

Observation 2: try to differentiate long operations and operations that will slow down or block the Event-loop. Some long operations can be handled asynchronously without disturbing your app (like database access).

Observation 3: long response time of your application does not necessarily mean you have a blocking task (it can be related to long DB access, external API calls, etc).

Difference between long operations, blocking code, etc.

  2. Thread pool limit

Node.js always tries to process a blocking operation with async APIs or with a thread pool. In this manner, some blocking operations become non-blocking from your application’s point of view. As much as possible it uses async APIs, which are lighter and more powerful, and it falls back to the thread pool only when there is no other choice. Why? Because a thread has a bigger footprint on your system and consumes more resources.

There are a few cases where Node.js has to use the thread pool:

  • all fs (File system) operations, except fs.FSWatcher()
  • some functions from Crypto lib
  • almost all Zlib functions
  • dns.lookup(), dns.lookupService()

This thread pool has a size limit: by default, Node.js has access to only 4 threads, so only 4 of these operations can run in parallel.

This value can be customized with the environment variable UV_THREADPOOL_SIZE.

UV_THREADPOOL_SIZE=16 node index.js

In any case, every operation that uses the thread pool behind the scenes is a potential performance bottleneck.
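A rough way to observe this limit is to start more thread-pool tasks than there are threads and look at when each one finishes. The sketch below assumes crypto.pbkdf2() as the thread-pool task; the helper name runHashes, the number of tasks and the iteration count are arbitrary choices of mine:

```javascript
const crypto = require('crypto');

// Hypothetical helper: starts `count` password hashes at once and reports
// how long after the start each one completed.
function runHashes(count, iterations, onDone) {
  const start = Date.now();
  const finishedAt = [];
  for (let i = 0; i < count; i++) {
    // The asynchronous pbkdf2 is executed on a thread of the thread pool.
    crypto.pbkdf2('secret', 'salt', iterations, 64, 'sha512', (err) => {
      if (err) throw err;
      finishedAt.push(Date.now() - start);
      if (finishedAt.length === count) onDone(finishedAt);
    });
  }
}

// 5 tasks for a default pool of 4 threads.
runHashes(5, 200000, (times) => {
  console.log('completion times (ms):', times);
});
```

With the default pool size and enough CPU cores, the first four hashes tend to finish at about the same time and the fifth one noticeably later, because it had to wait for a free thread. Timings vary from one machine to another.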

How to slow down the event loop

CPU-intensive operations: crypto

The Node.js crypto lib is known to have a lot of functions that use a lot of CPU. In a real case, it means you can quickly slow down your application. The problem becomes critical when this lib is used in every incoming request. It will:

  • slow down all individual requests (and generate user frustration)
  • force you to run more instances to compensate for the increase in CPU consumption

Take a handler that generates a token on each request, which is probably not useful.

We prefer to generate it only once and then reuse it. In this manner, you don’t slow down the event loop for each new request.
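Here is a minimal sketch of both versions. The token derivation (pbkdf2Sync with fixed inputs) and the handler names are assumptions for illustration, not the code that was benchmarked:

```javascript
const crypto = require('crypto');

// Hypothetical, deliberately expensive token derivation.
function generateToken() {
  return crypto
    .pbkdf2Sync('secret', 'salt', 100000, 64, 'sha512')
    .toString('hex');
}

// Before: the CPU-heavy work runs on every request.
function handleRequestSlow() {
  return { token: generateToken() };
}

// After: the token is computed once at startup and reused.
const cachedToken = generateToken();

function handleRequestFast() {
  return { token: cachedToken };
}

console.log(handleRequestFast().token === handleRequestSlow().token);
```

This only works when the token does not depend on the request, which is the case in this simple example.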

Of course, it’s a simple example, but here is the difference in terms of performance: before the change we served 195 requests in 10 seconds, after it 39,434. No possible comparison!

In a real case, it means you will decrease the number of instances you need to serve the same amount of requests and/or you can use smaller servers to do the same work.

JSON.parse / JSON.stringify

Another interesting point is the famous JSON parser. We commonly use the JSON.stringify and JSON.parse functions, but these two methods run synchronously and have a complexity of O(n), where n is the size of your JSON input.

Let’s see the difference when we use JSON.stringify with a small JSON file (~0.4 KB) and a large JSON file (~9 MB).

With the large file we serve 252 requests in 10 seconds; with the small one, about 75k. The solution can be to work with small files only or to load large files only once.
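The "only once" idea can be sketched like this. The payload and the function names are hypothetical, and the approach assumes the data does not change between requests:

```javascript
// Hypothetical payload standing in for a large JSON document.
const bigObject = {
  items: Array.from({ length: 200000 }, (_, i) => ({ id: i, name: `item-${i}` })),
};

// Before: an O(n) serialization on every request.
function respondSlow() {
  return JSON.stringify(bigObject);
}

// After: serialize once and reuse the string.
const cachedBody = JSON.stringify(bigObject);

function respondFast() {
  return cachedBody;
}

const t0 = process.hrtime.bigint();
respondSlow();
const t1 = process.hrtime.bigint();
respondFast();
const t2 = process.hrtime.bigint();

console.log(`stringify on each call: ${Number(t1 - t0) / 1e6} ms`);
console.log(`cached string:          ${Number(t2 - t1) / 1e6} ms`);
```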

If you really need to work with large JSON objects you should take a look at these solutions:

Read a file instead of memory

As said before, each time you read a file you will potentially create a performance bottleneck, especially if you read a file each time a request is processed. Sometimes this operation is hidden inside a dependency and it’s hard to detect.

I will talk about a concrete example we encountered in one of our projects at Voodoo. We use the MaxMind database to extract the user’s country from the IP address. To do that we simply use an existing npm module. Basically it uses readFile from Node.js core (fs module) under the hood. It’s an asynchronous operation, so it should be a piece of cake, right?

But for every new incoming request, we read the DB file (remember we have a limited number of threads for this). So in a high traffic API, it tends to slow down the Event-loop.

Solution: store all the DB in memory during server startup.
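A minimal sketch of the idea, with a tiny JSON file standing in for the database. The file, its content and the function names are hypothetical; this is not the real MaxMind format nor the npm module we used:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for the database file.
const dbPath = path.join(os.tmpdir(), `fake-geo-db-${process.pid}.json`);
fs.writeFileSync(dbPath, JSON.stringify({ '203.0.113.7': 'FR' }));
process.on('exit', () => {
  try { fs.unlinkSync(dbPath); } catch {}
});

// Before: every lookup goes through the thread pool to read the file.
async function lookupFromDisk(ip) {
  const db = JSON.parse(await fs.promises.readFile(dbPath, 'utf8'));
  return db[ip];
}

// After: the file is read once at startup and lookups are served from memory.
const dbInMemory = JSON.parse(fs.readFileSync(dbPath, 'utf8'));

function lookupFromMemory(ip) {
  return dbInMemory[ip];
}

console.log(lookupFromMemory('203.0.113.7'));
lookupFromDisk('203.0.113.7').then((country) => console.log(country));
```

The synchronous read is acceptable here because it happens once, during startup, before the server accepts any request.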

The following chart should speak for itself concerning the performance gain.

Average latency after the deployment

Vulnerable regexp

A vulnerable regular expression is one on which your regular expression engine might take exponential time.

Most of the time your regexp complexity will be O(n) (where n is the length of your input), but in some cases it can be polynomial, such as O(n^2), or even exponential, and it can lead to ReDoS (regular expression denial of service).

Let’s see a simple regexp to check if an email address is valid.

[a-z]+@[a-z]+([a-z\.]+\.)+[a-z]+

Now we can measure the execution time with a simple email address and with a fake email.

Vulnerable regexp can block the Event-loop

If you add some dots at the end of the input it will quickly block your app. In this simple example, we go from 0.05s to 8.4s. And you can add a few more dots to completely block your Node.js instance.
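You can reproduce the effect with a few lines. This is a sketch: the helper timeMatch and the two inputs are mine, and the number of trailing dots is kept small on purpose so the script still finishes quickly:

```javascript
// The regexp from this article.
const emailRegexp = /[a-z]+@[a-z]+([a-z\.]+\.)+[a-z]+/;

// Hypothetical helper: runs the regexp and returns the result with the
// elapsed time in milliseconds.
function timeMatch(input) {
  const start = process.hrtime.bigint();
  const matched = emailRegexp.test(input);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, ms };
}

const valid = 'foo@bar.com';
// The trailing dots can be split between the nested quantifiers in many
// ways, and the engine tries them all before giving up.
const malicious = 'foo@bar' + '.'.repeat(25);

console.log(timeMatch(valid));
console.log(timeMatch(malicious));
```

Increase the repeat count a few dots at a time and watch the duration grow; do not try large values on a machine you care about.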

To avoid it you can check your regexp with some tools like safe-regex, or you can use solutions that will handle regexp for you like validator.js.

How to block the event loop

Programmatic errors

Of course, the easiest way to block your application is to insert an infinite loop. It seems obvious to detect and to avoid, but it still happens, especially when you work a lot with modules or with events.

Sometimes this kind of behavior is created faster than you might think, even by good programmers. Let’s look at an example with a date and a while loop.
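One common shape of this bug, sketched under my own assumptions (the helper name and the deadline are hypothetical): a loop waits for a flag that only a timer callback can set. The callback can never run while the loop is spinning, so without the deadline this loop would never end.

```javascript
// Hypothetical helper. It spins until `fired` becomes true, but the timer
// callback that sets it can only run once the loop hands control back to
// the Event-loop. The `maxMs` deadline exists only so this demo terminates.
function waitForTimer(maxMs) {
  let fired = false;
  setTimeout(() => {
    fired = true;
  }, 0);

  const deadline = Date.now() + maxMs;
  while (!fired && Date.now() < deadline) {
    // busy-wait: nothing else can run in the meantime
  }
  return fired;
}

console.log(waitForTimer(100)); // false: the timer never got a chance to fire
```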

Still not convinced? What about process.nextTick()?

process.nextTick() & infinite loop

process.nextTick() will invoke a callback at the end of the current operation, before the next event loop tick starts.

process.nextTick() in the Event-loop

It can be used in some cases, but the problem is:

  • it will prevent the event loop from continuing its cycle until your callback is finished
  • it allows you to block every I/O by making recursive process.nextTick() calls. It’s not technically an infinite loop but it will produce the same effect, like a bad recursive function without termination condition.
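The starvation can be observed with a bounded version of that recursion. This is a sketch: the helper starve and its counter are hypothetical, and the counter is only there so the script terminates.

```javascript
// Hypothetical helper: schedules a 0 ms timer, then chains `ticks`
// process.nextTick() calls and records what ran first.
function starve(ticks, onDone) {
  const order = [];

  setTimeout(() => {
    order.push('timer');
    onDone(order);
  }, 0);

  let remaining = ticks;
  function tick() {
    if (remaining-- > 0) {
      // Re-queue ourselves: the nextTick queue is drained completely
      // before the Event-loop is allowed to move on to timers or I/O.
      process.nextTick(tick);
    } else {
      order.push('ticks done');
    }
  }
  process.nextTick(tick);
}

starve(100000, (order) => console.log(order)); // [ 'ticks done', 'timer' ]
```

The timer was due immediately, yet it only fires once the whole chain is finished. Remove the counter and it never fires at all.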

Recursion has something to do with infinity

Sync operations

It is no surprise: synchronous operations in Node.js are a bad practice. If you have read this whole article it should be obvious to you! Every time you use them, you will block your entire application until the operation is finished. Node.js will not be able to use the thread pool or async APIs and the event-loop activity will be suspended.
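As an illustration, here is the same work done both ways with the crypto lib. The function names and parameters are hypothetical; pbkdf2Sync and pbkdf2 are the synchronous and asynchronous variants of the same Node.js function:

```javascript
const crypto = require('crypto');
const util = require('util');

const pbkdf2Async = util.promisify(crypto.pbkdf2);

// Synchronous: the Event-loop is blocked for the whole computation.
function hashSync(password) {
  return crypto
    .pbkdf2Sync(password, 'salt', 100000, 64, 'sha512')
    .toString('hex');
}

// Asynchronous: the computation runs on the thread pool and the
// Event-loop stays free to serve other requests in the meantime.
async function hashAsync(password) {
  const key = await pbkdf2Async(password, 'salt', 100000, 64, 'sha512');
  return key.toString('hex');
}

const syncResult = hashSync('my-password');
hashAsync('my-password').then((asyncResult) => {
  console.log(asyncResult === syncResult);
});
```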

How to create an infinite event loop (your program will never exit)

Let’s say you want to create a simple program that needs to exit after a simple task is finished, like a worker or a simple script. Those programs are supposed to stop in any case and very quickly. But you can create a situation where the Event-loop will never exit. Do you remember the first diagram? There is a ref, which is a simple counter of all pending tasks in the Event-loop. If this ref is greater than 0, the program will not exit and Node.js will keep checking every pending task. Each time a task finishes, the ref is decremented. So your program can only exit once all the tasks are finished, that is, once the ref is equal to 0.

setInterval

Timers are the best example! If you introduce a simple setInterval inside a script and don’t clear the timer, it will run forever, and so will your program.

To avoid this, you can:

  • clear all your timers as soon as they are no longer needed
  • call process.exit() explicitly once the work is done (process.abort() and process.kill() also stop the process, but more abruptly)
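As a sketch of the first option (the helper names are hypothetical): the first function clears its own interval once it has done its job. The second one uses unref(), a method available on Node.js timers that tells the Event-loop not to stay alive just for this timer; it is not mentioned above, so take it as an extra suggestion.

```javascript
// Hypothetical helper: polls a fixed number of times, then clears the
// interval so it no longer keeps the process alive.
function startPolling(onTick, maxTicks) {
  let ticks = 0;
  const timer = setInterval(() => {
    onTick(++ticks);
    if (ticks >= maxTicks) clearInterval(timer);
  }, 10);
  return timer;
}

// Hypothetical helper: a background timer that must never prevent exit.
function startHeartbeat(onBeat) {
  const timer = setInterval(onBeat, 1000);
  timer.unref();
  return timer;
}

startPolling((n) => console.log(`poll #${n}`), 3);
startHeartbeat(() => console.log('still alive'));
// The script exits after the third poll, even though the heartbeat
// interval was never cleared.
```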

Event listeners (no problem)

An event listener can be seen as a background task that will go on forever until you clean it. We might assume it will increment the ref counter of the Event-loop and so create a kind of infinite loop. But it’s not the case, even if you forget to remove your handlers.

Even if you don’t block the Event-loop with an EventEmitter it’s always a best practice to clean your listeners. You can use removeListener or removeAllListeners methods.
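A short sketch with a plain EventEmitter (the event name and the handler are hypothetical):

```javascript
const { EventEmitter } = require('events');

const emitter = new EventEmitter();

function onData(value) {
  console.log('received', value);
}

emitter.on('data', onData);
emitter.emit('data', 42);

// The listener alone would not keep the process alive, but removing it
// once it is no longer needed avoids leaks and surprises.
emitter.removeListener('data', onData);
console.log(emitter.listenerCount('data')); // 0
```

The script exits on its own: a registered listener is not a pending task for the Event-loop.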

Monitoring

Modules

Some tools can help you to inspect the Event-loop state and to visualize its behavior:

  • wtfnode is a simple module that generates a “dump” of the Event-loop: https://www.npmjs.com/package/wtfnode
  • you can use the internal methods process._getActiveRequests() and process._getActiveHandles() directly, which will give you the raw data about tasks inside your Event-loop.
  • clinicjs can also provide some valuable data
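A small sketch of the second option. The helper snapshot is hypothetical. The two underscore-prefixed methods are undocumented internals, so their output may change between versions; process.getActiveResourcesInfo() is a documented alternative that only exists on recent Node.js versions, which is why it is guarded here.

```javascript
// Hypothetical helper: summarises what is currently pending in the process.
function snapshot() {
  return {
    // Undocumented internals: may change between Node.js versions.
    handles: process._getActiveHandles().length,
    requests: process._getActiveRequests().length,
    // Documented alternative on recent versions; guarded for older ones.
    resources:
      typeof process.getActiveResourcesInfo === 'function'
        ? process.getActiveResourcesInfo()
        : [],
  };
}

const timer = setTimeout(() => {}, 1000);
console.log(snapshot());
clearTimeout(timer);
```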

APM

Some APM solutions provide information about Event-loop and its latency. It can be useful to detect an instance in a bad state.
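If you want a rough idea of this latency without an APM, one simple approximation is to schedule a timer and measure how late it fires. This is only a sketch: measureLag and blockFor are hypothetical helpers, and real tools use more precise techniques.

```javascript
// Hypothetical helper: schedules a timer and reports how late it fired.
function measureLag(delayMs, onLag) {
  const start = process.hrtime.bigint();
  setTimeout(() => {
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    onLag(Math.max(0, elapsedMs - delayMs));
  }, delayMs);
}

// Hypothetical helper: simulates a blocking task.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait
  }
}

measureLag(10, (lag) => console.log(`event loop lag: ${lag.toFixed(1)} ms`));
blockFor(100); // the timer was due after 10 ms but cannot fire before this ends
```

On an idle Event-loop the reported lag stays close to zero; here it is roughly the time the blocking task overran the timer.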

Some of them display information about Garbage Collector which is another key concept to better understand Node.js and to debug your application. If you want to learn more about it you can read my article about GC.
