Udit Tyagi
Aug 22 · 5 min read

Introduction

Buffer is a class exposed as a property of Node’s global object, and it is used heavily in Node to deal with streams of binary data. As it is globally available, there is no need to require it in our code.

A buffer is a chunk of memory allocated outside of the V8 heap. V8 is the default JavaScript engine that powers Node and Google Chrome. In Node, Buffer is implemented on top of a JavaScript typed array (it is a subclass of Uint8Array), but that does not mean the memory allocated to a buffer is inside the V8 heap. It is still explicitly allocated outside of it.

So we can think of a buffer as an array-like, lower-level data structure that represents a sequence of binary data, with one major difference: unlike arrays, once a buffer is allocated, it cannot be resized.
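As a minimal sketch of the fixed-size behavior described above (the variable names are only illustrative), the snippet below checks that a buffer is a Uint8Array and that writing past its end does not make it grow:

```javascript
// A Buffer is a Uint8Array subclass with a fixed length.
const buf = Buffer.alloc(4);

const isTypedArray = buf instanceof Uint8Array; // true

// Writing past the end is silently ignored; the buffer does not grow.
buf[10] = 0xff;
const lengthAfterWrite = buf.length; // still 4

// To get a "bigger" buffer, we allocate a new one and copy the old bytes in.
const bigger = Buffer.concat([buf, Buffer.alloc(4)]);

console.log(isTypedArray, lengthAfterWrite, bigger.length);
```

Buffer.concat is just one way to build the larger buffer; the point is that the original buffer keeps its size.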


Buffer and Character Encoding

Whenever the data is stored in or extracted out of a buffer instance, it is simply the binary data.

Buffer without specifying encoding

When we create a buffer from a string (we will discuss creating buffers in a second) and log it, we get a preview of its bytes as a hexadecimal sequence, such as <Buffer 4e 6f 64 65 2e 6a 73> for 'Node.js'. That is because we have not specified a character encoding to decode the bytes with.
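The original screenshot is not available, so here is a small sketch of what such a preview looks like. It uses util.inspect to capture the same text that console.log would print for the buffer:

```javascript
const util = require('util');

const buf = Buffer.from('Node.js');

// util.inspect returns the same preview that console.log(buf) prints.
const preview = util.inspect(buf); // '<Buffer 4e 6f 64 65 2e 6a 73>'

// Each element of the buffer is just a byte (a number from 0 to 255).
const firstByte = buf[0]; // 78, i.e. 0x4e, the code for 'N'

console.log(preview, firstByte);
```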

So whenever we have a buffer, we need a character encoding to read its data back as text. For example, when we read content from a file or a socket, we read it as a buffer, so if we do not specify a character encoding, we get back a Buffer object instead of a string.
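To illustrate the file case, here is a hedged sketch using fs.readFileSync. The temporary directory and the file name greeting.txt are made up for the demonstration:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created only for this demonstration.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'buffer-demo-'));
const file = path.join(dir, 'greeting.txt');
fs.writeFileSync(file, 'Node.js');

const raw = fs.readFileSync(file);          // no encoding: we get a Buffer
const text = fs.readFileSync(file, 'utf8'); // encoding given: we get a string

console.log(Buffer.isBuffer(raw), typeof text);

// Clean up the scratch directory.
fs.rmSync(dir, { recursive: true, force: true });
```

Decoding the raw buffer with raw.toString('utf8') gives the same string that the second call returns.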

Buffer with different character encoding

When different character encodings are applied to the same buffer, we get different results. This is how the character encoding changes the way we see our data. If no argument is given to the toString() method, it uses 'utf8' encoding by default.
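A short sketch of this, decoding the same bytes with three of the encodings Node supports:

```javascript
const buf = Buffer.from('Node.js');

// The same seven bytes, rendered through different encodings.
const asUtf8 = buf.toString();           // 'Node.js' ('utf8' is the default)
const asHex = buf.toString('hex');       // '4e6f64652e6a73'
const asBase64 = buf.toString('base64'); // 'Tm9kZS5qcw=='

console.log(asUtf8, asHex, asBase64);
```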

Different types of supported character encodings in Node.js are:

  • 'ascii' — for 7-bit ASCII data only
  • 'utf8' — multibyte encoded Unicode characters. Many web pages and other document formats use UTF-8.
  • 'utf16le' — 2 or 4 bytes, little endian-encoded Unicode characters
  • 'ucs2' — alias of 'utf16le'
  • 'base64' — Base64 encoding
  • 'latin1' — a way of encoding the Buffer into a one-byte encoded string
  • 'binary' — alias for 'latin1'
  • 'hex' — encode each byte as two hexadecimal characters
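To make the list a little more concrete, the sketch below encodes one non-ASCII character with a few of these encodings and compares the resulting bytes. The sample character is an arbitrary choice:

```javascript
const text = '\u00e9'; // 'é', a single character outside 7-bit ASCII

const utf8Bytes = Buffer.from(text, 'utf8');     // 2 bytes: c3 a9
const utf16Bytes = Buffer.from(text, 'utf16le'); // 2 bytes: e9 00
const latin1Bytes = Buffer.from(text, 'latin1'); // 1 byte:  e9

// 'ucs2' is an alias, so it produces the same bytes as 'utf16le'.
const aliasMatches = Buffer.from(text, 'ucs2').equals(utf16Bytes);

console.log(utf8Bytes, utf16Bytes, latin1Bytes, aliasMatches);
```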

Creating Buffer

The three most commonly used ways to create buffers are:

  1. Buffer.from()
  2. Buffer.alloc()
  3. Buffer.allocUnsafe()

Buffer.from()

Buffer.from is used to create a buffer from an array, a string, or another buffer.

Buffer.from('Node.js') outputs <Buffer 4e 6f 64 65 2e 6a 73>
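Here is a brief sketch of the three input types mentioned above. Note that passing an existing buffer to Buffer.from produces a copy, so changing the copy leaves the original untouched:

```javascript
const fromString = Buffer.from('Node.js');
const fromArray = Buffer.from([0x4e, 0x6f, 0x64, 0x65]); // the bytes for 'Node'
const fromBuffer = Buffer.from(fromString);              // a copy, not a view

// Changing the copy does not affect the buffer it was created from.
fromBuffer[0] = 0x6e; // 'n'

console.log(fromArray.toString());  // 'Node'
console.log(fromBuffer.toString()); // 'node.js'
console.log(fromString.toString()); // 'Node.js'
```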

Buffer.alloc()

Buffer.alloc takes a size (an integer number of bytes) as an argument and returns a new initialized buffer of that size (i.e., every byte is filled, with zeros by default).

Buffer.alloc(8) outputs <Buffer 00 00 00 00 00 00 00 00>

Here we have an 8-byte buffer, and every byte is prefilled with 0.
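Buffer.alloc also accepts an optional second argument, a fill value, which the text above does not cover. A small sketch of both forms:

```javascript
const zeros = Buffer.alloc(8);         // zero-filled by default
const ones = Buffer.alloc(4, 1);       // every byte set to 0x01
const letters = Buffer.alloc(4, 'ab'); // the string's bytes repeat: 61 62 61 62

console.log(zeros, ones, letters);
```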

Buffer.allocUnsafe()

Buffer.allocUnsafe takes a size as an argument and returns a new buffer that is not initialized. That means it can contain old, possibly sensitive data left over in memory, so it must be used with care. As no initialization is involved while creating the buffer, this method is faster than Buffer.alloc().

Buffer.allocUnsafe(8) might output <Buffer d0 ce ed 02 00 00 00 00>

We can see that there is some leftover data in our buffer, which comes directly from memory. To avoid exposing sensitive information, we need to overwrite the buffer’s contents, and we can do that with the fill() method.

Buffer.allocUnsafe(8).fill() outputs <Buffer 00 00 00 00 00 00 00 00>
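The sketch below shows two ways of making an uninitialized buffer safe to use: filling it explicitly, or overwriting every byte immediately after allocation. The value 0xdeadbeef is just sample data:

```javascript
// The contents of this buffer are unpredictable until we overwrite them.
const unsafe = Buffer.allocUnsafe(8);

// Option 1: fill it explicitly (0 is passed here to make the intent clear).
unsafe.fill(0);
const allZero = unsafe.every((byte) => byte === 0); // true

// Option 2: overwrite every byte right away, so no old data can leak.
const written = Buffer.allocUnsafe(4);
written.writeUInt32BE(0xdeadbeef, 0);

console.log(allZero, written.toString('hex')); // true 'deadbeef'
```

The second option is where allocUnsafe tends to make sense: when the whole buffer is about to be written anyway, the zero-fill done by Buffer.alloc would be wasted work.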


Difference Between String/Array and Buffer

Both strings and buffers have a length property, but it does not measure the same thing.

For a string that contains non-ASCII characters, the two lengths give different answers. That is because a string’s length counts UTF-16 code units, while a buffer’s length counts the actual number of bytes used to represent the string in the chosen encoding (UTF-8 by default).
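The original code sample is not available, so here is a small sketch of the difference. The sample strings are arbitrary:

```javascript
const text = 'caf\u00e9'; // 'café'
const buf = Buffer.from(text); // encoded as UTF-8 by default

const stringLength = text.length; // 4 UTF-16 code units
const bufferLength = buf.length;  // 5 bytes ('é' takes 2 bytes in UTF-8)

// Buffer.byteLength gives the byte count without creating a buffer.
const byteLength = Buffer.byteLength(text); // 5

// An emoji is 2 UTF-16 code units in a string but 4 bytes in UTF-8.
const emoji = '\u{1F600}';
const emojiStringLength = emoji.length;              // 2
const emojiBufferLength = Buffer.from(emoji).length; // 4

console.log(stringLength, bufferLength, byteLength);
console.log(emojiStringLength, emojiBufferLength);
```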

Just like arrays and strings, for buffer, we can use operations like slice, indexOf, and many others. But there are some differences when we use these methods on buffer.

Let’s say we apply a slice operation to an array: arr.slice([begin[, end]]). Slicing an array gives us a new array containing the elements of the original array from the begin index up to, but not including, the end index. After the slice, the two arrays are separate objects (i.e., they do not share memory, so replacing an element in either array does not affect the other one).

But this is not the case with a buffer. When we apply a slice operation to a buffer, e.g., buf.slice([start[, end]]), it also returns a new buffer, but the new buffer references the same memory as the original (just offset and cropped by the start and end indices). That means any change we make in either buffer is reflected in the other one.
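A minimal sketch contrasting the two behaviors follows. It uses buf.slice to match the text; the current Node.js documentation marks buf.slice as deprecated and recommends buf.subarray, which returns the same kind of shared view:

```javascript
// Array: slice returns an independent array.
const arr = [1, 2, 3, 4];
const arrSlice = arr.slice(0, 2);
arrSlice[0] = 99;
console.log(arr[0]); // 1 (the original array is unchanged)

// Buffer: slice returns a view over the same memory.
const buf = Buffer.from([1, 2, 3, 4]);
const view = buf.slice(0, 2);
view[0] = 99;
console.log(buf[0]); // 99 (the change shows up in the original)

// For an independent copy, create a new buffer from the view.
const copy = Buffer.from(view);
copy[1] = 42;
console.log(buf[1]); // 2 (the copy does not share memory)
```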


The End

Buffers are very useful when we need to read things like an image from a TCP stream, a compressed file, or any other form of binary data. Buffers are heavily used in streams in Node, so it is good to have a basic understanding of them.


Better Programming

Advice for programmers.

Thanks to Zack Shapiro

Written by Udit Tyagi, Software Developer at K12 Techno Services
