Child Processes: Multitasking in NodeJS

Deep Dive in Child Processes, Spawn, Exec, ExecFile, Fork, IPC

Manik Mudholkar
17 min read · Jan 7, 2024

This article is the fifth article of my Advanced NodeJS for Senior Engineers series. In this article, I'm going to explain what child processes are, why they exist, how they work in detail, and how to get the best performance using them. The official documentation is available at child processes.
You can find the other articles of the Advanced NodeJS for Senior Engineers series below:

Post Series Roadmap

* The V8 JavaScript Engine
* Async IO in NodeJS
* Event Loop in NodeJS
* Worker Threads: Multitasking in NodeJS
* Child Processes: Multitasking in NodeJS (This Article)
* Clustering and PM2: Multitasking in NodeJS
* Debunking Common NodeJS Misconceptions
Table of Contents

* What exactly Child Processes are then?
* Why was it even needed?
* Running External Programs
* Improved Isolation
* Improved Scalability
* Improved Robustness
* Child Process Creation
* Using .spawn() for process creation
* Using .fork() for process creation
* Using exec() for process creation
* Using execFile() for process creation
* Synchronous Process Creation
* When to use what?
* Abort/Stop/Kill the child process
* Handling I/O between the child and parent processes
* Cascading the streams together
* Security with command execution
* Child process to run independently of its parent process
* Make spawn use the shell syntax & inherit standard IO of parent

What exactly Child Processes are then?

When you run a NodeJS application, it'll have its own process just like any other application you run, be it VS Code, VLC Player, etc. The properties of this process are available on the global object's process variable, which we can access in our Node app code.

NodeJS is single threaded in nature, but there are cases where we need multiple processes, especially for running synchronous, CPU-intensive tasks in isolation. This is where child processes come into play. The node:child_process module allows us to create sub-processes and establish a communication channel, known as Inter-Process Communication (IPC), between the main process and the child process.

In addition to handling lengthy tasks, this module has the capability to interact with the operating system and execute shell commands. To put it simply, it enables us to run not just JavaScript, but also programs written in other languages like Python or PHP, or external tools like Git.

Why was it even needed?

You might be wondering why we need child processes when we already have worker threads for handling CPU-intensive tasks. After all, worker threads have their own heap, V8 instance, and event loop. However, there are certain cases where having separate sub-processes is more desirable than threads within the same process. Let me explain why:

Running external programs

Child processes allow you to run external programs or scripts as separate processes. This is particularly useful when you need to interact with other executables.

Improved Isolation

Unlike worker threads, child processes provide a separate instance of the entire Node.js runtime. Each child process has its own memory space and communicates with the main process through IPC (Inter-Process Communication). This level of isolation is beneficial for tasks that may have resource conflicts or dependencies that need to be separated.

Improved Scalability

Child processes distribute tasks among multiple processes, enabling you to leverage the power of multi-core systems. This allows you to handle more concurrent requests and improve the overall scalability of your application.

Improved Robustness

If a child process crashes for any reason, it will not bring down your main process along with it. This ensures that your application remains stable and resilient even in the face of failures.

So, while worker threads are great for certain scenarios, child processes offer distinct advantages in terms of running external programs, providing isolation, enhancing scalability, and ensuring robustness.

Child Process Creation

The child_process module enables us to access Operating System functionalities by running any system command inside a, well, child process. These child processes can be created both synchronously and asynchronously.

const { spawn, fork, exec, execFile } = require('child_process');

child_process.spawn(), child_process.fork(), child_process.exec(), and child_process.execFile() are the methods which support the creation of sub-processes asynchronously.

Each of these methods returns a ChildProcess instance. These objects implement the Node.js EventEmitter API, allowing the parent process to register listener functions that are called when certain events occur during the life cycle of the child process, such as the following:

  • The 'disconnect' event is emitted after calling the subprocess.disconnect() method in the parent process or process.disconnect() in the child process.
  • The 'error' event is emitted if the process could not be spawned or killed, or if sending a message to the child process failed.
  • The 'close' event is emitted when the stdio streams of a child process get closed. This is distinct from the 'exit' event, since multiple processes might share the same stdio streams. The 'close' event will always emit after 'exit' was already emitted, or 'error' if the child failed to spawn.
  • The 'exit' event is emitted after the child process ends.
  • The 'message' event is the most important one. It's emitted when the child process uses the process.send() function to send messages. This is how parent/child processes can communicate with each other.
  • The 'spawn' event is emitted once the child process has spawned successfully. If the child process does not spawn successfully, the 'spawn' event is not emitted and the 'error' event is emitted instead.

Using .spawn() for process creation

The .spawn() method can be used to create a child process. We pass the command we want to run, the arguments we want to pass to that command as an array of strings, and lastly an options object where we have the ability to override a few of the settings with which the process gets created, such as env (environment variables), shell (whether to run the command inside a shell), detached (whether the child process should continue running after the parent exits), signal (which can be used for aborting the child process), and many more. You can check these options in the official docs of spawn.

What makes the .spawn() method different from other process creation methods is that it spawns an external application in a new process and returns a streaming interface for I/O. Because of this, it's great for handling applications that produce large amounts of data or for working with data as it's read in. Stream-based I/O offers the following advantages:

  • Low memory footprint.
  • Automatically handle back-pressure.
  • Lazily produce or consume data in buffered chunks.
  • Evented and non-blocking.
  • Buffers allow you to work around the V8 heap memory limit.

Every child process also gets the three standard stdio streams, which we can access using child.stdout and child.stderr, which are readable streams, and child.stdin, which is a writable stream. These streams are event emitters, so we can listen to different events on the stdio streams attached to every child process. For child.stdout and child.stderr, we can listen to the data event, which will carry the output of the command or any error encountered while executing it.

Here's an example of running ls -lh /usr, capturing stdout, stderr, and the exit code. Try this example on a Linux/Unix system:

const { spawn } = require('node:child_process');
const ls = spawn('ls', ['-lh', '/usr']);

ls.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`);
});

ls.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`);
});

ls.on('close', (code) => {
  console.log(`child process exited with code ${code}`);
});

Let's take it up a notch with a more complex example. Here we will try to run ps | grep bash. The ps command returns the ongoing processes, and grep is a useful command to search for a matching pattern; here we will search for 'bash'. One process will be spawned for ps, whose output stream (ps.stdout) we will write into the input stream of grep via grep.stdin.write. Once ps finishes, it'll emit 'close', at which point we end the input stream of grep and the grep command will execute. The code below goes inside index.js:

const { spawn } = require('node:child_process')
const ps = spawn('ps')
const grep = spawn('grep', ['bash'])

ps.stdout.on('data', (data) => {
  grep.stdin.write(data)
})

ps.stderr.on('data', (data) => {
  console.error(`ps stderr: ${data}`)
})

ps.on('close', (code) => {
  if (code !== 0) {
    console.log(`ps process exited with code ${code}`)
  }
  grep.stdin.end()
})

grep.stdout.on('data', (data) => {
  console.log(data.toString())
})

grep.stderr.on('data', (data) => {
  console.error(`grep stderr: ${data}`)
})

grep.on('close', (code) => {
  if (code !== 0) {
    console.log(`grep process exited with code ${code}`)
  }
})

Let's see one more example, where spawn will fail to execute the command.

const { spawn } = require('node:child_process')
const subprocess = spawn('bad_command')

subprocess.on('error', (err) => {
  console.error('Failed to start subprocess.')
})

When running on Windows, .bat and .cmd files can be invoked using child_process.spawn() with the shell option set, with child_process.exec(), or by spawning cmd.exe and passing the .bat or .cmd file as an argument (which is what the shell option and child_process.exec() do). In any case, if the script filename contains spaces it needs to be quoted.

Using .fork() for process creation

.fork() is particularly useful when executing a Node.js script in a new process and you want an IPC communication channel between the two processes. The child_process.fork() method is a special case of child_process.spawn() used specifically to spawn new Node.js processes. Like child_process.spawn(), a ChildProcess object is returned. The returned ChildProcess will have an additional communication channel built in that allows messages to be passed back and forth between the parent and child.

The fork method will open an IPC channel allowing message passing between Node processes:

  • On the child process, process.on('message') and process.send('message to parent') can be used to receive and send data
  • On the parent process, child.on('message') and child.send('message to child') are used

Let's take a look at a simple example. In index.js:

const { fork } = require('child_process');

const forked = fork('child_program.js');

forked.on('message', (msg) => {
  console.log('Message from child', msg);
});

forked.send('hello world');

in child_program.js

process.on('message', (msg) => {
  console.log('Message from parent:', msg);
});

let counter = 0;

setInterval(() => {
  process.send({ counter: counter++ });
}, 1000);

To pass down messages from the parent to the child, we can execute the send function on the forked object itself, and then, in the child script, we can listen to the message event on the global process object.

When executing the index.js file above, it'll first send down 'hello world' to be printed by the forked child process, and then the forked child process will send an incremented counter value every second to be printed by the parent process.

Let's take an example of something more practical. The example below forks two children that each handle connections with "normal" or "special" priority:

in index.js

const { fork } = require('node:child_process');
const normal = fork('child_program.js', ['normal']);
const special = fork('child_program.js', ['special']);

// Open up the server and send sockets to child. Use pauseOnConnect to prevent
// the sockets from being read before they are sent to the child process.
const server = require('node:net').createServer({ pauseOnConnect: true });
server.on('connection', (socket) => {

  // If this is special priority...
  if (socket.remoteAddress === '74.125.127.100') {
    special.send('socket', socket);
    return;
  }
  // This is normal priority.
  normal.send('socket', socket);
});
server.listen(1337);

in child_program.js

process.on('message', (m, socket) => {
  if (m === 'socket') {
    if (socket) {
      // Check that the client socket exists.
      // It is possible for the socket to be closed between the time it is
      // sent and the time it is received in the child process.
      socket.end(`Request handled with ${process.argv[2]} priority`);
    }
  }
});

In the above example, the socket is passed to the respective child process according to remoteAddress: if it's the special remoteAddress, it goes to the special subprocess, otherwise to the normal subprocess. Do not use .maxConnections on a socket that has been passed to a subprocess; the parent cannot track when the socket is destroyed. Any 'message' handlers in the subprocess should verify that socket exists, as the connection may have been closed during the time it takes to send it to the child.

Using exec() for process creation

The exec function is a good choice if you need to use the shell syntax and if the size of the data expected from the command is small. It buffers the command’s generated output and passes the whole output value to a callback function (instead of using streams, which is what spawn does).

exec spawns a shell, then executes the command within that shell. If a callback function is provided, it is called with the arguments (error, stdout, stderr). On success, error will be null. On error, error will be an instance of Error, and the error.code property will be the exit code of the process. The stdout and stderr arguments passed to the callback will contain the stdout and stderr output of the child process.

Let's take a simple example where we read index.js with the cat command and count the lines of the result with wc -l, i.e. the lines of code.

const { exec } = require('node:child_process')
exec('cat index.js | wc -l', (error, stdout, stderr) => {
  if (error) {
    console.error(`exec error: ${error}`)
    return
  }
  console.log(`stdout: ${stdout}`)
  console.error(`stderr: ${stderr}`)
})

An interesting twist we can add to exec is providing a few settings from the options object. For example, we can use the cwd option to make the command run in a different working directory.

Since the exec function uses a shell to execute the command, we can use the shell syntax directly here making use of the shell pipe feature.

Using execFile() for process creation

If you need to execute a file without using a shell, the execFile function is what you need. It behaves exactly like the exec function, but does not use a shell, which makes it a bit more efficient.

const { execFile } = require('node:child_process');
const child = execFile('node', ['--version'], (error, stdout, stderr) => {
  if (error) {
    throw error;
  }
  console.log(stdout);
});

On Windows, some files cannot be executed on their own, like .bat or .cmd files. Those files cannot be executed with execFile and either exec or spawn with shell set to true is required to execute them.

Synchronous Process Creation

The .spawnSync, .execSync, and .execFileSync methods are synchronous and will block the Node.js event loop, pausing execution of any additional code until the spawned process exits.

Blocking calls like these are mostly useful for simplifying general-purpose scripting tasks and for simplifying the loading/processing of application configuration at startup.

When to use what?

  • Use .spawn() when you want the command's output as a stream, especially for large or continuous output.
  • Use .exec() for short shell commands whose output is small enough to buffer, or when you need shell syntax such as pipes.
  • Use .execFile() to run an executable directly without a shell; it's safer and slightly more efficient.
  • Use .fork() to run another Node.js script with a built-in IPC channel for message passing.

Abort/Stop/Kill the child process

There are a few ways to terminate the child process.

  • By calling .kill() on the ChildProcess object.
  • By the timeout option in the options object: we set the maximum amount of time in milliseconds the process is allowed to run (Default: undefined).
  • By using a signal: if the signal option is enabled, calling .abort() on the corresponding AbortController is similar to calling .kill() on the child process, except the error passed to the callback will be an AbortError.
const { spawn } = require('node:child_process');
const controller = new AbortController();
const { signal } = controller;

const grep = spawn('grep', ['ssh'], { signal });
grep.on('error', (err) => {
  // This will be called with err being an AbortError if the controller aborts
});
controller.abort(); // Stops the child process

Handling I/O between the child and parent processes

The stdio option is responsible for determining the destination of input/output from a child process. It can be assigned either an array or a string. The string values serve as convenient shortcuts that will automatically translate into commonly used array configurations.

By default, stdio is configured as
stdio: 'pipe'
which is a shorthand for the following array values:
stdio: ['pipe', 'pipe', 'pipe']

This implies that the ChildProcess object will have streams (child.stdio[0], child.stdio[1], child.stdio[2]) that provide access to file descriptors 0–2.

If we want to direct the I/O elsewhere, we have the option to assign a file descriptor. On the other hand, if we wish to completely discard it, we can simply use 'ignore'.

Let's illustrate this with an example. Say we want to spawn a child process where we ignore FD 0 (stdin), since we won't be providing any input to the child process, but we want to capture the output, i.e. FD 1 (stdout) and FD 2 (stderr), in separate log files. It would look somewhat like this:

let fs = require('fs')
let cp = require('child_process')

let outFd = fs.openSync('./outputlogs', 'a')
let errFd = fs.openSync('./errorslogs', 'a')
let child = cp.spawn('ls', [], {
  stdio: ['ignore', outFd, errFd]
})

Cascading the streams together

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

Let's write a program in such a way that one process's output is fed to the next process, and so on. The cat command should read the data from a file; this data should be fed to the sort command's input, which outputs the lines sorted; this in turn should be fed to the uniq command's input, which removes duplicate lines.

in filesToBeChecked.txt

LOL
LMAO
ROLF
LOL
GTG

in index.js

let cp = require('child_process')
let cat = cp.spawn('cat', ['filesToBeChecked.txt'])
let sort = cp.spawn('sort')
let uniq = cp.spawn('uniq')
cat.stdout.pipe(sort.stdin)
sort.stdout.pipe(uniq.stdin)
uniq.stdout.pipe(process.stdout)

Here output of each command becomes input for next command.

Security with command execution

We need to be careful when we allow our child processes to access the shell. It's important to be aware that utilizing shell syntax can pose a security risk, especially when dealing with dynamic input from external sources. This leaves room for potential command injection attacks, where a user can exploit shell syntax characters such as ';' and '$' to execute malicious commands. For instance, they could input a command like command + '; rm -rf ~' to delete important files.

Let's take an example (DO NOT PERFORM THIS ON YOUR SYSTEM).

Let's say you have a process where you take user input for a command and run that command with exec. So your command looks somewhat like this:
cp.exec('some hardcoded command ' + req.query.userInput);
Now suppose a malicious user provides "; rm -rf /;" as input.
In case you haven't figured it out yet, this input means: "initiate a fresh command (;), forcefully and thoroughly delete all files and directories at the root of the file system (rm -rf /), and conclude the command (;) in case anything comes after it."

If you're looking to run an application without requiring shell facilities, it's actually safer (and a tad quicker) to utilize execFile instead.
cp.execFile('some hardcoded command', [req.query.userInput]);

Here this malicious injection attack would fail since it’s not run in a shell and the external application likely wouldn’t understand the argument and would raise an error.

Child process to run independently of its parent process

A few things to keep in mind:
- By default, the parent will wait for the detached child to exit.
- There are a few things which tie the parent process to the child: the reference (ref) to the child process held by the parent, and the communication channel formed between parent and child.

To make the child run independently, we have a few things to work with:

  • If we want the child to keep running even after the parent exits, we can use the options.detached setting.
    On Windows, setting options.detached to true makes it possible for the child process to continue running after the parent exits. Once enabled, it cannot be disabled again.
    On non-Windows platforms, if options.detached is set to true, the child process will be made the leader of a new process group and session. Child processes may continue running after the parent exits regardless of whether they are detached or not.
  • The reference to the child process in the parent's event loop keeps the parent from exiting. To remove this reference, we can call .unref() on that ChildProcess. (Similarly, we can add the reference back by calling .ref().)
  • options.stdio represents the channel between parent and child; it is used to configure the pipes that are established between the parent and child process. Setting this option to 'ignore' instructs Node to ignore this communication channel. Check the official docs for more info.

Here's an example of a long-running process that, by being detached and also ignoring its parent's stdio file descriptors, survives the parent's termination:

const { spawn } = require('node:child_process');

const subprocess = spawn(process.argv[0], ['child_program.js'], {
  detached: true,
  stdio: 'ignore',
});

subprocess.unref();

Let's take an example of something more complex. options.stdio allows us to define what each of the streams should be.
If we want to pass a pipe as the input stream, a file descriptor as the output stream, and the current main process's error stream as the error stream, then this option would look like ['pipe', fd, process.stderr].
If we want to ignore all std streams, we just need to pass 'ignore' like we did in the previous example; passing 'ignore' is equivalent to passing ['ignore', 'ignore', 'ignore']. Besides 'ignore', there are other options like 'pipe', 'inherit', 'overlapped', 'ipc', null, and undefined. Read more in the official docs.

Let's illustrate this by passing the file descriptor of a file as the output stream to the child process, so the child will be able to write its output to the given file.
in index.js

const fs = require('node:fs')
const { spawn } = require('node:child_process')
const out = fs.openSync('./out.log', 'a')

const subprocess = spawn('node', ['child_program.js'], {
  detached: true,
  stdio: ['ignore', out, process.stderr]
})

subprocess.unref()

in child_program.js

const { spawn } = require('node:child_process')
const ls = spawn('ls', ['-lh', '/usr'])

ls.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`)
})

ls.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`)
})

ls.on('close', (code) => {
  console.log(`child process exited with code ${code}`)
})

Just for illustration, the above example can be written with fork, which will produce the same result.

const fs = require('node:fs')
const { fork } = require('node:child_process')
const out = fs.openSync('./out.log', 'a')

const subprocess = fork('child_program.js', [], {
  detached: true,
  stdio: ['ipc', out, process.stderr]
})

subprocess.unref()

Make spawn use the shell syntax & inherit standard IO of parent

We can make the spawned child process inherit the standard IO objects of its parent if we want to, but also, more importantly, we can make the spawn function use the shell syntax as well.

Let's take the example below.

in child_program.js

const { spawn } = require('node:child_process')
const ls = spawn('ls', ['-lh', '/usr'])

ls.stdout.on('data', (data) => {
  console.log(`stdout: ${data}`)
})

ls.stderr.on('data', (data) => {
  console.error(`stderr: ${data}`)
})

ls.on('close', (code) => {
  console.log(`child process exited with code ${code}`)
})

without using stdio: 'inherit' & shell: true

in index.js

const { spawn } = require('node:child_process')
const ps = spawn('node child_program.js', {})

It errored out because spawn is not able to understand the shell syntax.

Let's try adding the shell option.

in index.js

const { spawn } = require('node:child_process')
const ps = spawn('node child_program.js', {
  shell: true
})

Now a few things can be said: spawn is now able to understand the shell syntax and it is running child_program, but we are not able to see the output, because the terminal/console we are looking at is connected to the main process's standard IO streams, not the subprocess's. So, in order to make the child process output its result on the main process's terminal, we need to share the main process's IO streams with the child process. We can do this by using the stdio: 'inherit' option.

Let's try adding the stdio: 'inherit' option.

const { spawn } = require('node:child_process')
const ps = spawn('node child_program.js', {
  stdio: 'inherit',
  shell: true
})

Because of the stdio: 'inherit' option, when we execute the code, the child process inherits the main process's stdin, stdout, and stderr. This causes the child process's data event handlers to be triggered on the main process.stdout stream, making the script output the result right away.

Because of the shell: true option above, we were able to use the shell syntax in the passed command, just like we did with exec. But with this code, we still get the advantage of the streaming of data that the spawn function gives us.

