Streams

Justin Michot
Published in Frontend Weekly
5 min read · Jan 31, 2018

NodeJS

The internet is based on the concept of transferring data (or information) from point A to point B. Everything that you do online is, at some point, reliant on data being sent from one computer to another. This data is transferred via streams.

Although the complexity of streams is mostly handled "under the hood" from a programmer's perspective, it is worth knowing more about them. In almost any situation in which data is being transferred from one place to another, the programmer has the ability to customize several aspects of the stream. Let's take a look at what streams are, and I'll illustrate a couple of examples in which streams are specifically used in NodeJS.

Without Streams

Without streams, large data transportation is very inefficient. That is, if you have a large file, the entirety of that file must be read into memory before it can be transported. And then, upon arrival, the entirety of that file must be held in memory before any of the data can be written or accessed. Consider the example below in NodeJS:

The code above does not use streams. When dealing with small amounts of data, the point is moot. However, let's dive deeper into what streams are, why they are so powerful, and what can be accomplished by using them.

Intro to Streams

Streams transport data chunk by chunk. This data is typically read from one file and/or written to another. The concept of reading and writing is central to the transport of data, and transporting data via streams — reading and writing chunk by chunk from one place to another — is prevalent online.

Streams allow for the efficient use of memory space. For instance, suppose one large file is ultimately being sent from point A to point C, but needs to first pass through point B for encoding. Streams allow point B to receive data from point A chunk by chunk, encode one chunk at a time, and send each chunk on to point C without ever having to store the entirety of the file in memory at point B. This concept encapsulates the power of streams. Point B may not have enough memory space to store the entire file at one time, but via streams the entire file can still be encoded at point B and sent to point C as long as point B's memory can hold one chunk. The size of this chunk can vary and can be manually adjusted by the programmer.

There are four types of streams:

  1. Readable (“source”)
  2. Writable (“destination”)
  3. Duplex (“readable & writable”)
  4. Transform (modifies the data midstream)

The Concept of Source and Destination

Readable and writable streams are the foundation for understanding streams. A readable stream can be thought of as the source stream: where the data being transported comes from. A writable stream, in turn, can be thought of as the destination stream: where the data being transported is going. Consider the following example in NodeJS:

The createReadStream method from the File System (fs) module streams data, chunk by chunk, from filePathA, and the pipe method passes that data along to the stream created by createWriteStream, which writes it, chunk by chunk, to filePathB. It is possible to stream data using only createWriteStream (given that you are creating data on the fly). It is also possible to use only a readable stream (given that you are not transporting the data to any other destination). However, the code above would not work if you replaced createReadStream with createWriteStream and/or replaced createWriteStream with createReadStream. Hence, it may be helpful to think of streams in terms of readable/source streams and writable/destination streams.

Duplex Streams

A duplex stream is one stream with two channels. One channel sends data and the other receives data. Each channel has its own buffer, so both channels can transport data simultaneously. Consider this pattern: a.pipe(b).pipe(a). This type of pattern is made possible via a duplex stream: one connection between two computers wherein each computer can send and receive data from the other. The two channels allow each application to act as the source and/or the destination within the same stream. Duplex streams are by far the most common type of stream online. Network sockets use duplex streams to send and receive data. Think about online gaming and how multiple devices are sending and receiving data with each other simultaneously. Duplex streams also connect almost all network devices via TCP (Transmission Control Protocol) and IP (Internet Protocol).

Transform Streams

A transform stream is a stream that modifies the data being transported midstream. Two modules in NodeJS that can be used to illustrate this transformation are the zlib and crypto modules. Consider the example below:

This example is reading from point A and writing to point B, but the data is compressed (createGzip) and encrypted (createCipher) along the way. Then the compressed, encrypted data is streamed from point B to point C, and the data is deciphered and decompressed along the way. Hopefully, this clearly illustrates the concept of a transform stream. Please note that transform streams here are conceptual. That is, convention describes the above example as a transform stream, but nowhere in the code are you required to declare a transform stream. The same methods that can be used in isolation (createReadStream/createWriteStream) can be used to implement the source and destination for a transform stream. Overall, understanding the power of streams and the role that streams play when transporting data is important from a programming perspective.
