A Brief History of Node Streams pt.1

Jessica Quynh
Jan 25, 2017 · 8 min read

Introduction

From spew streams to suck streams, Streams are a little-understood interface used in almost every internal module of Node.js and across thousands of NPM packages.

How exactly have streams come to exist? How do they vary from version to version of Node? This post takes a look at what streams are and what they do, providing some examples along the way.

UNIX Background

The Streams interface in Node.js is an analogous implementation of the pipe interface found on UNIX systems.

So, what does that mean?

We can think of a pipeline as the movement of information between two points in space: data output from one process becomes the input of another process.

This diagram describes the nature of a pipe in UNIX systems:

The Linux Programming Interface, ©2010 Michael Kerrisk. While this diagram depicts a pipeline to be unidirectional, in Node.js, Streams can also be bidirectional.

stdin and stdout are standard streams in computer programming. They are simply communication channels: the former denotes input and the latter, output.

In a common UNIX bash shell, you might write a command like this to list the files in the working directory:

$ ls
# To be displayed in the shell output:
file1.js file2.txt file3.txt

In this instance, what you, the user, type is considered stdin, and the list of files being displayed is stdout.

You might decide to redirect that data to a file.

# This creates a new text file which will contain the list of files.
$ ls >> this_directory_files.txt

Or pipe it through a filter first:

# This will list only files ending with a ".js" extension.
$ ls | grep '\.js$' >> this_directory_files.txt

This is an example of how pipes can be used to manipulate data as it moves from one endpoint to another in a concise manner.

The way UNIX pipes transfer data from process to process is exactly how Streams are used in Node.js. That is why they are so often used by developers and within the Node internal codebase itself.

Since core modules like fs, http, net, and zlib all implement streams, it is easy to imagine the enormous potential use-cases for transferring and manipulating data using Node.js.

What makes streams so powerful, though, is their affinity for one another. That means a stream constructed in one module will easily link to another stream from a completely different module.
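For instance, here is a minimal sketch of that idea using only core modules (the file names are hypothetical): a file stream from fs pipes into a compression stream from zlib, which pipes right back into fs.

// Three streams, two different core modules, linked with pipe.
var fileSystem = require('fs');
var zlib = require('zlib');

fileSystem.createReadStream('./input.txt')
  .pipe(zlib.createGzip())
  .pipe(fileSystem.createWriteStream('./input.txt.gz'));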

Node.js is built with the UNIX philosophy in mind. Should you be unfamiliar with it, one of the most important takeaways is this:

Do One Thing and Do It Well

In following this principle, lightweight binaries and modules are created that each succeed at executing one simple task. Through the connective properties of pipes (and, analogously, streams), these modules can link up to create complex systems that execute complicated tasks.

In Node.js, this congruency fosters an entire ecosystem of streaming on a global community scale. An example of this is found in build tools like Gulp, where developers often build and share plugins to introduce custom data manipulation into the application build!
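As a rough sketch of that idea (assuming gulp and the gulp-uglify plugin are installed; the paths are illustrative), a build task is simply streams piped together:

var gulp = require('gulp');
var uglify = require('gulp-uglify');

gulp.task('scripts', function () {
  return gulp.src('src/*.js')    // a Readable stream of source files
    .pipe(uglify())              // a shared plugin: a Transform stream
    .pipe(gulp.dest('build'));   // a Writable destination
});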

Node Streams

To fully understand how Node.js streams are related to UNIX’s pipes, consider Node’s process.stdin and process.stdout. These are the most direct implementation of the UNIX standard streams.

                Streams Hierarchy in Node.js
=================================================================
                      EventEmitter
                           |
                   Stream (base class)
                      /         \
               Readable         Writable
                   |                |
           process.stdin      process.stdout
             (~ stdin)          (~ stdout)
                  ...              ...

Due to this inheritance, process.stdin is said to be similar to stdin, while process.stdout is similar to stdout.
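A quick way to see this inheritance for yourself (a small illustration, run in any Node script or REPL):

var stream = require('stream');

// process.stdin inherits from stream.Readable through the hierarchy above.
console.log(process.stdin instanceof stream.Readable); // true

// process.stdout exposes the Writable interface, e.g. its write() method.
process.stdout.write('hello from a Writable stream\n');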

Here’s an example of how you can use Node.js’ mock of stdin and stdout to write a JavaScript bash shell script.

First, make a file that will contain your JavaScript commands.

$ touch your_script.js

Go ahead and make that file executable:

$ chmod u+x your_script.js

This will be the code to put in the file. The first line is a Node.js shebang; it tells bash to interpret the following code using Node.js.

#!/usr/bin/env node

process.stdin.setEncoding('utf8');

process.stdin.on('readable', () => {
  var chunk = process.stdin.read();
  if (chunk !== null) {
    process.stdout.write(`${chunk}`);
  }
});

To run the script, write this into your command line:

$ ls | ./your_script.js

Voila! You have your first Node.js bash script.

Though this example is a verbose way to implement what is already native to the UNIX shell, I hope it might inspire ideas on how you could write bash scripts in JavaScript.

You could take this one step further and refactor the above code to look like this:

#!/usr/bin/env node

process.stdin.pipe(process.stdout);

And this is just one small dimension of the streams Node.js provides to help expand the potential and power of JavaScript. In any case, should the built-in modules not suit your needs, you can extend the stream API and build your own custom stream, as sketched below.
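Here is a minimal sketch of such a custom stream, using the simplified Transform constructor (available since Node v4); it upper-cases whatever flows through it:

var stream = require('stream');

// A Transform stream is both Readable and Writable: data written in
// is transformed and can then be read (or piped) out.
var upperCaser = new stream.Transform({
  transform: function (chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  }
});

process.stdin.pipe(upperCaser).pipe(process.stdout);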

Following the UNIX philosophy, Streams are meant to be easy to use and to require little knowledge of their underlying patterns and structures. But in order to really understand streams, it is important to ask these questions:

  • What happens when the source is a large file that contains hundreds of thousands of bits of data?
  • What ensures the integrity of the data?
  • What would happen if a process’ resources were used up before the entire file was sent?

Chunking & Buffering

Streams use internal tools and patterns to break data into manageable pieces and send them from one process to another. This process is known as chunking.

Chunking abstracts complex, larger globs of data into smaller parts, which are easier to transfer. The way chunking works is inherent in how streams receive data, and that workload is shifted onto buffers.

Node.js’ Buffer class is designed after a generic byte buffer. In the simplest terms, Node’s implementation of buffers converts data into a fixed-length array of integers (determined by the encoding you’ve set, where utf8 is the default). These integers represent bytes, and each buffer is connected to raw memory allocated outside the V8 heap.
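A small illustration of this idea:

// The same four-character string, viewed as its underlying bytes.
var buf = Buffer.from('node', 'utf8');

console.log(buf);            // <Buffer 6e 6f 64 65>
console.log(buf.length);     // 4 (one byte per character here)
console.log(buf[0]);         // 110, the integer value of the byte for 'n'
console.log(buf.toString()); // 'node' (decoded as utf8, the default)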

Transferring binary data instead of strings ensures safe transportation, API universality, and speed. In instances where memory resources are used up, a system called back-pressure is invoked.

Back-pressure

Back-pressure describes the facilitation of the flow of data and, more precisely, the method streams use to handle an influx of data for which there is no room left.


Below I’ve provided a visual example:

If we take a look at our friend Pacman, we see he is trying to consume a bunch of white orbs. Say, though, that he becomes too full and can no longer digest any more.

In this example, as a form of back-pressure, Pacman will signal to the system, “stop the flow of orbs!” so he has time to empty his stomach, and once he has room, begin to eat again.

When back-pressure is triggered, the stream has time to process all the data it has recently accepted, which is held in its buffers.

Once the buffers are drained, the stream will resume accepting more incoming data.

In this instance, the caution tape is the back-pressure system and Pacman is the consumer. The source is where the orbs are being generated.
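Here is a rough sketch of what that signalling looks like in code. pipe automates all of this for you, but written out by hand (with hypothetical file names), the Pacman dance is roughly:

var fileSystem = require('fs');

var input = fileSystem.createReadStream('./input.txt');
var output = fileSystem.createWriteStream('./output.txt');

input.on('data', function (chunk) {
  // write() returns false when the writable's buffers are full
  var hasRoom = output.write(chunk);
  if (!hasRoom) {
    input.pause();                    // "stop the flow of orbs!"
    output.once('drain', function () {
      input.resume();                 // buffers emptied: eat again
    });
  }
});

input.on('end', function () {
  output.end();
});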

In earlier implementations of Node.js, back-pressure was automated by a utility function named pump.

var fileSystem = require('fs');
var util = require('util');

var inputFile = fileSystem.createReadStream('./input.txt');
var outputFile = fileSystem.createWriteStream('./output.txt');

util.pump(inputFile, outputFile);

This small, simple interface handled a lot of things. Pump attached callbacks to these native streams that were called when there was an error, or when the queue was busy.

Node.js has evolved to the point where most streams in core have been unified. That means, as a developer, the interface is even easier to understand, implement, and reuse, all of which continues to promote the UNIX philosophy.

var fileSystem = require('fs');

var inputFile = fileSystem.createReadStream('./input.txt');
var outputFile = fileSystem.createWriteStream('./output.txt');

inputFile.pipe(outputFile);

Finally, let’s take a look at how all of this fits together. In UNIX, communication from process to process is delegated through the kernel using signal codes. Node.js replicates this communication with the use of EventEmitters.

EventEmitters

Streams are built from EventEmitters. If you are familiar with jQuery events or the browser’s EventTarget, you will find EventEmitters easy to understand.

An event is just what its name suggests and can be understood in the traditional sense. Events come in two parts: a listener and an emitter.

Any time a rule, action, or parameter is fulfilled, an EventEmitter will say to the rest of the program, “hey! this happened!” However, the question is: if there is no one there to listen, does the EventEmitter exist? For this (practical and philosophical) reason, a listener really matters.

An event listener has a function attached to it. Every time an event is triggered, the function will execute.
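A tiny sketch of the pattern (the event name here is made up):

var EventEmitter = require('events').EventEmitter;

var emitter = new EventEmitter();

// The listener: a function attached to an event name.
emitter.on('orbEaten', function (count) {
  console.log(count + ' orbs eaten so far!');
});

// The emitter side announcing "hey! this happened!"
emitter.emit('orbEaten', 42);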

A stream consists of multiple events triggered in succession. When chunks are sent, they are delivered as event payloads. This spreads data handling out over time: instead of overwhelming one process all at once, events allow a slow trickle of data from one stream to another.

event: data
event: data
[ a process is busy ]
event: pause
[ wait until the buffer is drained ]
event: resume
event: data
event: data = null
event: end
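You can watch this trickle yourself by attaching listeners to a readable stream’s events (a sketch with a hypothetical file name):

var fileSystem = require('fs');

var input = fileSystem.createReadStream('./input.txt');

input.on('data', function (chunk) {
  // each 'data' event carries a chunk of the file as its payload
  console.log('received ' + chunk.length + ' bytes');
});

input.on('end', function () {
  console.log('no more data');
});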

Conclusion

So hopefully you have a better understanding of what streams are! Maybe you’re checking out tutorials across the web, but you might notice discrepancies between guides from different years. One might call an event that another doesn’t, yet the results are close to identical.

Or maybe you’ve read terms like streams1, streams2, or streams3 being thrown around.

Are these external packages? Which one is better to use? So many questions! But fret not! The reason for all these monikers is that Node.js is constantly evolving!

Each iteration of streams tends to be drastically different from the last, or implements a cool new feature. Learning these names and what they refer to will help lead the way in troubleshooting your project and understanding the best practices for each iteration.

In part two, we’ll take a look at the different stream versions throughout the years and how they vary.

Thanks for reading :)


Written by Jessica Quynh

Along with being a student at McGill & a software developer, Jessica holds a deep love for literature and poetry. She believes in elegant code and prose alike.

Node.js Collection

Community-curated content for the millions of Node.js users.
