Node.js Behind the Scenes

Daniel Wagener
Nov 4

These are my notes for Jonas Schmedtmann’s Node.js Bootcamp on Udemy: https://www.udemy.com/course/nodejs-express-mongodb-bootcamp/

Notice: I went through this course using an Ubuntu terminal.

Dependencies

The Node runtime has several dependencies. One is Google’s V8 engine, which compiles Javascript into machine code. Another is libuv (“unicorn velociraptor library,” pronounced “lihb-U-V”), which gives Node access to the computer’s file system, operating system, networking, and more. libuv also handles the event loop and thread pool (more on those later). V8 was written in both C++ and Javascript, and libuv was written in C++. Node.js gives us the ability to use these libraries’ functions by writing pure Javascript.

Processes and Threads

Node.js applications run on a single thread. We can think of a thread as a series of instructions. When a Node.js app initializes, modules are required, the top-level code is executed, and then callbacks are registered. Then, the event loop starts. The event loop performs most of the app’s actions, but some tasks are too heavy for it. These may include file system APIs, encryption, DNS lookups, or compression. In those cases, the event loop can offload the task to libuv’s thread pool, which provides four additional threads by default (configurable up to 128).

The Node.js Event Loop

Any code inside callback functions runs inside the event loop. Things like HTTP requests and file reads emit events when they are finished, and the event loop receives these events and runs the necessary callback functions.

When we start our application, the event loop begins running immediately. Each phase of the event loop has its own callback queue. Some of these phases are more important than others, so we’ll only discuss a few for now. The first of these handles expired timer callbacks (e.g. setTimeout()). The next phase handles I/O polling, i.e. looking for I/O processes that need to be executed and putting them in the phase’s callback queue. Since I/O encompasses networking and file access, about 99% of the code we write is handled by this phase. The next phase handles setImmediate() callbacks, which are special timers that execute after the I/O polling phase. The final phase (for our purposes) handles close callbacks, e.g. when a web server shuts down.

Two other queues follow their own pattern: the process.nextTick() queue and the other microtasks queue (which includes promises). Should they have anything in their callback queues, they’ll execute after the completion of any of the four phases described above. Just to note: the process.nextTick() queue is similar to the setImmediate() queue, with the exception that it’s not limited to running only after the I/O polling phase. It’s a pretty advanced use case and not usually necessary.

When the event loop has run through all four phases, it needs to check whether to run the loop again or simply exit the program. It does so by checking whether any I/O processes or timers are still running in the background. If there are, the event loop continues on to its next tick (i.e. loops back around to the beginning and runs again).

Don’t Block the Thread!

Were we working with PHP, we’d have multiple threads at our disposal and wouldn’t have to worry so much about blocking. However, because we only have one thread in Node.js, we as developers must take care not to block it. Here are some guidelines:

  • Don’t use the sync versions of fs, crypto, or zlib module functions inside callback functions.
  • Don’t perform complex calculations (e.g. loops within loops).
  • Be careful with JSON.stringify() and JSON.parse() on large objects.
  • Don’t use complex regular expressions (e.g. nested quantifiers).

Event-Driven Architecture

Node has objects called event emitters. They emit events when a file is read, when a timer finishes, or when a network request is made, among other things. We as developers set up event listeners that fire off callback functions.

In this case, the server is the event emitter. The on() method registers a listener on our server, here for the request event, which triggers the callback function when it fires. server has access to the on() method because http.Server inherits from Node’s EventEmitter class. This is an example of the observer pattern, in which a listener constantly waits for events from an emitter. The alternative to this pattern would be functions calling other functions, which would get messy quickly since we’d have functions from the fs module calling functions from the http module, and so on.

We can create our own event emitters by creating an instance of EventEmitter and emitting an event we name ourselves:

We can also pass arguments to the emitter:

In a real-world scenario, it’s best to have our emitter extend the EventEmitter we import:

Streams

When we watch YouTube or Netflix, the video file is sent to us piece by piece so we can start watching without downloading the whole thing. Node also uses streams. In fact, streams are instances of the EventEmitter class, so they can emit and listen to events. There are a few types of streams, but the most important are readable and writable streams.

Readable streams include HTTP requests and fs reads. Important events associated with readable streams include data, emitted when a piece of data has been read, and end, emitted when there is no more data left to read. Writable streams, on the other hand, include HTTP responses and fs writes.

A couple of less-common stream types are duplex streams and transform streams. Duplex streams are both readable and writable at the same time, essentially a communication channel between client and server that stays open in both directions. One example is a web socket from the net module. Transform streams are like duplex streams, but they can also modify the data as it passes through. A good example is Gzip compression with the zlib module.

A standard file read without streams might look like this:

Node has to load the entire file into memory (as the data variable) before sending it, which is a problem if the file is super huge. To improve page performance, we can instead stream the file:

We use the fs module to create a readable stream, then listen for its data event and write each chunk to the response (itself a writable stream) with res.write(). Finally, we listen for the end event so we can call res.end().

Usually, our readable stream can read the file much faster than the response stream can send it over the network, and the response stream gets overwhelmed. This issue is known as “back pressure.” To fix it, we use the readable stream’s pipe() method, which “pipes” the output of a readable stream directly into a writable stream, matching the read speed to the write speed automatically:

Thankfully, this best solution is also the easiest to write.

Requiring Modules

Node.js uses the require() syntax of the CommonJS module system. Front-end libraries like React use the ES module system with the keywords import and export, but at the time of writing those only work natively in the browser. When we require a module, Node first looks to its core modules (e.g. http). If the module is not a core module but its path begins with ./ or ../, it’s a developer module and Node will look in our own file system for it. If neither of these cases applies, Node will look in the node_modules folder, where packages installed from npm live. This process is called “resolving and loading.”

Next, Node wraps the module code:

This step keeps the module’s variables in a private scope while giving us access to useful variables like __dirname. A quick rundown of these variables:

  • exports: a reference to module.exports, used to export objects from the module
  • require: a function to require modules
  • module: a reference to the current module
  • __filename: absolute path of the current module’s file
  • __dirname: directory name of the current module

The next step is simply Node executing the module’s code. After that, the require() function returns module.exports. Finally, the module is cached, should we need to require the same module again.
