Streaming IO in Go

In Go, input and output operations are achieved using primitives that model data as streams of bytes that can be read from or written to. To do this, the Go io package provides interfaces io.Reader and io.Writer, for data input and output operations respectively, as shown in the figure below:

Go comes with many APIs that support streaming IO from resources such as in-memory structures, files, and network connections, to name a few. This writeup focuses on creating Go programs that stream data through the io.Reader and io.Writer interfaces, using both custom implementations and those from the standard library.

The io.Reader

A reader, represented by interface io.Reader, reads data from some source into a transfer buffer where it can be streamed and consumed, as illustrated below:

For a type to function as a reader, it must implement method Read(p []byte) from interface io.Reader (shown below):

type Reader interface {
	Read(p []byte) (n int, err error)
}

Implementation of the Read() method should return the number of bytes read or an error if one occurred. If the source has exhausted its content, Read should return io.EOF.

Reading Rules (added)

Following some feedback on Reddit, I have added this section on reading rules, which may be helpful. The behavior of a reader depends on its implementation; however, there are a few rules from the io.Reader documentation that you should be aware of when consuming directly from a reader:

  1. Read() will read up to len(p) bytes into p, when possible.
  2. After a Read() call, n may be less than len(p).
  3. Upon error, Read() may still return n bytes in buffer p. For instance, reading from a TCP socket that is abruptly closed. Depending on your use, you may choose to keep the bytes in p or retry.
  4. When a Read() exhausts available data, a reader may return a non-zero n and err=io.EOF. However, depending on implementation, a reader may choose to return a non-zero n and err = nil at the end of stream. In that case, any subsequent reads must return n=0, err=io.EOF.
  5. Lastly, a call to Read() that returns n=0 and err=nil does not mean EOF as the next call to Read() may return more data.

As you can see, properly reading a stream directly from a reader can be tricky. Fortunately, readers from the standard library follow sensible approaches that make it easy to stream. Nevertheless, before using a reader, consult its documentation.
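The rules above boil down to one habit: use the n bytes a Read() returned before looking at err. Here is a minimal sketch of such a loop. The helper names drain and demo are hypothetical, and iotest.DataErrReader (from testing/iotest) is used only to force a reader that returns its final bytes together with io.EOF:

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// drain (a hypothetical helper) reads r to the end through a small buffer.
// It consumes the n bytes returned by each Read before inspecting err.
func drain(r io.Reader) (string, error) {
	var sb strings.Builder
	p := make([]byte, 4)
	for {
		n, err := r.Read(p)
		sb.Write(p[:n]) // rules 2 and 3: only p[:n] is valid, even when err != nil
		if err == io.EOF {
			return sb.String(), nil // rule 4: EOF may arrive with the last bytes
		}
		if err != nil {
			return sb.String(), err
		}
		// rule 5: n == 0 with err == nil is not EOF, so keep reading
	}
}

// demo wraps a string reader so that the last chunk and io.EOF arrive together.
func demo() (string, error) {
	return drain(iotest.DataErrReader(strings.NewReader("streaming")))
}

func main() {
	s, err := demo()
	fmt.Printf("%q %v\n", s, err)
}
```

A loop that checked err first and stopped on io.EOF would silently drop the final chunk with a reader like this one.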

Streaming data from readers

Streaming data directly from a reader is easy. Method Read is designed to be called within a loop where, with each iteration, it reads a chunk of data from the source and places it into buffer p. This loop will continue until the method returns an io.EOF error.

The following is a simple example that uses a string reader, created with strings.NewReader(string), to stream byte values from a string source:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/simple_reader.go

The source code above creates a 4-byte long transfer buffer p with make([]byte,4). The buffer is purposefully kept smaller than the length of the string source. This is to demonstrate how to properly stream chunks of data from a source that is larger than the buffer.

Update: someone on Reddit pointed out that the previous code has a bug: it never handles the case where err is non-nil but is not io.EOF. The following fixes the code.

Updated simple_reader.go with fixed error-handler
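Since the listing is only linked here, the following is a sketch of what a loop with that error handling could look like. It is not the linked file; the function name readChunks and the sample string are my own choices for illustration:

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readChunks (illustrative name) streams src through a buffer of the given
// size and returns every chunk that was read.
func readChunks(src string, size int) ([]string, error) {
	reader := strings.NewReader(src)
	p := make([]byte, size)
	var chunks []string
	for {
		n, err := reader.Read(p)
		if n > 0 {
			chunks = append(chunks, string(p[:n]))
		}
		if err == io.EOF {
			return chunks, nil // normal end of stream
		}
		if err != nil {
			return chunks, err // any other error is reported to the caller
		}
	}
}

func main() {
	chunks, err := readChunks("Clear is better than clever", 4)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	for _, c := range chunks {
		fmt.Printf("%q\n", c)
	}
}
```

With a 4-byte buffer and a 27-byte source, the loop yields six full chunks followed by a final 3-byte chunk.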

Implementing a custom io.Reader

The previous section uses an existing IO reader implementation from the standard library. Now, let's see how to write our own implementation. The following is a trivial implementation of an io.Reader which filters out non-alphabetic characters from its stream.

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/alpha_reader.go

When the program is executed, it prints:

$> go run alpha_reader.go
HelloItsamwhereisthesun
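For readers without access to the linked file, here is one way such a reader might be written. This is a sketch rather than the linked implementation: the helper names isAlpha and filter are hypothetical, and the input string is an assumption chosen to be consistent with the output shown above.

```go
package main

import (
	"fmt"
	"io"
)

// alphaReader yields only the ASCII letters of src.
type alphaReader struct {
	src string // data to filter
	cur int    // index of the next unread byte in src
}

func isAlpha(b byte) bool {
	return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
}

// Read copies up to len(p) letters into p, skipping every other byte.
func (a *alphaReader) Read(p []byte) (int, error) {
	n := 0
	for n < len(p) && a.cur < len(a.src) {
		if b := a.src[a.cur]; isAlpha(b) {
			p[n] = b
			n++
		}
		a.cur++
	}
	if n == 0 && a.cur >= len(a.src) {
		return 0, io.EOF // source exhausted and nothing left to deliver
	}
	return n, nil
}

// filter drains an alphaReader over s through a 4-byte buffer.
func filter(s string) string {
	r := &alphaReader{src: s}
	p := make([]byte, 4)
	var out []byte
	for {
		n, err := r.Read(p)
		out = append(out, p[:n]...)
		if err != nil { // this reader only ever returns io.EOF
			return string(out)
		}
	}
}

func main() {
	fmt.Println(filter("Hello! It's 9am, where is the sun?"))
}
```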

Chaining Readers

The standard library has many readers already implemented. It is a common idiom to use a reader as the source of another reader. This chaining of readers allows one reader to reuse logic from another as is done in the following source snippet which updates the alphaReader to accept an io.Reader as its source. This reduces the complexity of the code by pushing stream housekeeping concerns to the root reader.

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/alpha_reader2.go
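To make the idea concrete, here is a rough sketch of a chained reader, assuming Go 1.16 or later for io.ReadAll. The wrapper delegates all position and EOF bookkeeping to the reader it wraps and only filters the bytes it gets back. The helper filterString is hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// alphaReader filters the bytes produced by another reader.
type alphaReader struct {
	reader io.Reader
}

// Read fills p from the wrapped reader, then keeps only ASCII letters,
// compacting them to the front of p. The error from the wrapped reader
// (including io.EOF) is passed through unchanged.
func (a *alphaReader) Read(p []byte) (int, error) {
	n, err := a.reader.Read(p)
	kept := 0
	for _, b := range p[:n] {
		if (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') {
			p[kept] = b
			kept++
		}
	}
	return kept, err
}

// filterString chains an alphaReader on top of a strings.Reader.
func filterString(s string) (string, error) {
	r := &alphaReader{reader: strings.NewReader(s)}
	out, err := io.ReadAll(r)
	return string(out), err
}

func main() {
	s, err := filterString("Hello! It's 9am, where is the sun?")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(s)
}
```

Note that this Read can return 0 bytes with a nil error when a chunk contains no letters, which callers must tolerate as described in the reading rules.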

Another advantage of this approach is that alphaReader is now capable of reading from any reader implementation. For instance, the following snippet shows how alphaReader could be combined with an os.File source to filter out non-alphabetic characters from a file:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/alpha_reader3.go
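As a self-contained illustration of the file case, the sketch below repeats a compact alphaReader and points it at an *os.File. The helpers filterFile and demo, the temporary file, and its content are all assumptions made so the example can run anywhere (os.CreateTemp and io.ReadAll need Go 1.16 or later).

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// alphaReader keeps only the ASCII letters produced by the wrapped reader.
type alphaReader struct{ reader io.Reader }

func (a *alphaReader) Read(p []byte) (int, error) {
	n, err := a.reader.Read(p)
	kept := 0
	for _, b := range p[:n] {
		if (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z') {
			p[kept] = b
			kept++
		}
	}
	return kept, err
}

// filterFile opens the file at path and returns only its letters.
func filterFile(path string) (string, error) {
	file, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer file.Close()
	out, err := io.ReadAll(&alphaReader{reader: file})
	return string(out), err
}

// demo writes a throwaway sample file and filters it.
func demo() (string, error) {
	tmp, err := os.CreateTemp("", "alpha-*.txt")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString("Go 1.x: simple, reliable!"); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}
	return filterFile(tmp.Name())
}

func main() {
	s, err := demo()
	if err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Println(s)
}
```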

The io.Writer

A writer, represented by interface io.Writer, streams data from a buffer and writes it to a target resource as illustrated below:

All stream writers must implement method Write(p []byte) from interface io.Writer (shown below). The method is designed to read data from buffer p and write it to a specified target resource.

type Writer interface {
	Write(p []byte) (n int, err error)
}

Implementation of the Write() method should return the number of bytes written or an error if any occurred.

Using writers

The standard library comes with many pre-implemented io.Writer types. Working with writers directly is simple as shown in the following code snippet which uses type bytes.Buffer as an io.Writer to write data into a memory buffer.

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/using_writer.go
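A minimal sketch of that pattern follows. The helper names writeAll and demo and the sample strings are illustrative. The function accepts any io.Writer, and a *bytes.Buffer is simply the writer passed in here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// writeAll writes each line, followed by a newline, to w.
func writeAll(w io.Writer, lines []string) error {
	for _, line := range lines {
		data := []byte(line + "\n")
		n, err := w.Write(data)
		if err != nil {
			return err
		}
		if n != len(data) {
			return io.ErrShortWrite // fewer bytes accepted than requested
		}
	}
	return nil
}

// demo collects the lines in an in-memory buffer.
func demo() (string, error) {
	var buf bytes.Buffer // the zero value is ready to use
	err := writeAll(&buf, []string{"Errors are values", "Don't panic"})
	return buf.String(), err
}

func main() {
	s, err := demo()
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Print(s)
}
```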

Implementing a custom io.Writer

The code in this section shows how to implement a custom io.Writer called chanWriter which writes its content to a Go channel as a sequence of bytes.

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/chan_writer.go

To use the writer, function main() simply calls method writer.Write() in a separate goroutine. Because chanWriter also implements interface io.Closer, method writer.Close() is called to close the channel once writing is done, which lets the code reading from the channel finish instead of deadlocking.
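The description above could be sketched roughly as follows. This is my own approximation rather than the linked file: the constructor newChanWriter, the helper collect, and the use of an unbuffered channel are assumptions.

```go
package main

import (
	"fmt"
	"io"
)

// chanWriter writes its content to a channel, one byte at a time.
type chanWriter struct {
	ch chan byte
}

// Compile-time check that chanWriter satisfies io.Writer and io.Closer.
var _ io.WriteCloser = (*chanWriter)(nil)

func newChanWriter() *chanWriter {
	return &chanWriter{ch: make(chan byte)}
}

// Write sends each byte of p on the channel, blocking until it is received.
func (w *chanWriter) Write(p []byte) (int, error) {
	for _, b := range p {
		w.ch <- b
	}
	return len(p), nil
}

// Close closes the channel so that a range over it terminates.
func (w *chanWriter) Close() error {
	close(w.ch)
	return nil
}

// collect writes parts from a separate goroutine and gathers the bytes
// received on the channel.
func collect(parts ...string) string {
	w := newChanWriter()
	go func() {
		defer w.Close() // without this, the range below would block forever
		for _, part := range parts {
			w.Write([]byte(part))
		}
	}()
	var out []byte
	for b := range w.ch {
		out = append(out, b)
	}
	return string(out)
}

func main() {
	fmt.Println(collect("Stream ", "me!"))
}
```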

Useful types and packages for IO

As mentioned, the Go standard library comes with many useful functions and other types that make it easy to work with streaming IO.

os.File

Type os.File represents a file on the local system. It implements both io.Reader and io.Writer and, therefore, can be used in any streaming IO context. For instance, the following example shows how to write successive string slices directly to a file:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/file_write.go

Conversely, type os.File can be used as a reader to stream the content of a file from the local file system. For instance, the following source snippet reads a file and prints its content:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/file_read.go
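Both directions can be sketched in one small program. The helper names (writeLines, readFile, demo), the file name proverbs.txt, and the temporary directory are assumptions for illustration; os.MkdirTemp needs Go 1.16 or later.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeLines uses *os.File as an io.Writer.
func writeLines(path string, lines []string) error {
	file, err := os.Create(path)
	if err != nil {
		return err
	}
	for _, line := range lines {
		if _, err := file.Write([]byte(line + "\n")); err != nil {
			file.Close()
			return err
		}
	}
	return file.Close() // Close can report a delayed write error
}

// readFile uses *os.File as an io.Reader, streaming through a small buffer.
func readFile(path string) (string, error) {
	file, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer file.Close()
	var sb strings.Builder
	p := make([]byte, 8)
	for {
		n, err := file.Read(p)
		sb.Write(p[:n])
		if err == io.EOF {
			return sb.String(), nil
		}
		if err != nil {
			return sb.String(), err
		}
	}
}

// demo writes a file in a temporary directory and reads it back.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "streamio")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "proverbs.txt")
	if err := writeLines(path, []string{"Cgo is not Go", "Errors are values"}); err != nil {
		return "", err
	}
	return readFile(path)
}

func main() {
	s, err := demo()
	if err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Print(s)
}
```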

Standard output, input, and error

The os package exposes three variables, os.Stdout, os.Stdin, and os.Stderr, that are of type *os.File to represent file handles for the OS’s standard output, input, and error respectively. For instance, the following source snippet prints directly to standard output:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/stdout_write.go
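A short sketch of writing to os.Stdout through its Write method, with the byte count reported on os.Stderr so the two streams stay separate. The function name printLines and the sample lines are illustrative.

```go
package main

import (
	"fmt"
	"os"
)

// printLines writes each line to standard output and returns the total
// number of bytes written.
func printLines(lines []string) (int, error) {
	total := 0
	for _, line := range lines {
		n, err := os.Stdout.Write([]byte(line + "\n"))
		total += n
		if err != nil {
			return total, err
		}
	}
	return total, nil
}

func main() {
	n, err := printLines([]string{"Don't panic", "Errors are values"})
	if err != nil {
		fmt.Fprintln(os.Stderr, "write failed:", err)
		os.Exit(1)
	}
	fmt.Fprintln(os.Stderr, n, "bytes written")
}
```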

io.Copy()

Function io.Copy() makes it easy to stream data from a source reader to a target writer. It abstracts away the for-loop pattern we’ve seen so far and properly handles io.EOF and byte counts.

The following shows a simplified version of a previous program, which copies the content of the in-memory reader proverbs to the writer file:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/io_copy.go

Similarly, we can rewrite a previous program that reads from a file and prints to standard output using the io.Copy() function as shown below:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/io_copy2.go
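Here is a sketch that exercises io.Copy() in both directions, first reader to file and then file to standard output. The function names roundTrip and demo, the temporary file, and the sample text are assumptions; os.CreateTemp needs Go 1.16 or later.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// roundTrip copies text into a temporary file, then copies the file to dst.
// It returns the byte counts reported by the two io.Copy calls.
func roundTrip(text string, dst io.Writer) (written, read int64, err error) {
	file, err := os.CreateTemp("", "proverbs-*.txt")
	if err != nil {
		return 0, 0, err
	}
	defer os.Remove(file.Name())
	defer file.Close()

	// Reader to file: io.Copy loops until the source reports io.EOF.
	if written, err = io.Copy(file, strings.NewReader(text)); err != nil {
		return written, 0, err
	}
	// Rewind so the same handle can serve as the source reader.
	if _, err = file.Seek(0, io.SeekStart); err != nil {
		return written, 0, err
	}
	// File to dst.
	read, err = io.Copy(dst, file)
	return written, read, err
}

func demo() (int64, int64, error) {
	return roundTrip("Channels orchestrate; mutexes serialize\n", os.Stdout)
}

func main() {
	w, r, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "copy failed:", err)
		os.Exit(1)
	}
	fmt.Fprintf(os.Stderr, "wrote %d bytes, read back %d bytes\n", w, r)
}
```

Note that io.Copy() returns a nil error on a clean end of stream, so callers never see io.EOF from it.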

io.WriteString()

This function provides the convenience of writing a string value into a specified writer:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/write_str.go
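A brief sketch of io.WriteString() against an in-memory writer; the function name render and the greeting text are illustrative. According to the io package documentation, the function calls the writer's own WriteString method when it has one, and otherwise calls Write exactly once.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// render writes two strings to a buffer without manual []byte conversions.
func render(name string) (string, error) {
	var buf bytes.Buffer
	if _, err := io.WriteString(&buf, "Hello, "); err != nil {
		return "", err
	}
	if _, err := io.WriteString(&buf, name+"\n"); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	s, err := render("gopher")
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Print(s)
}
```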

Pipe writers and readers

Types io.PipeWriter and io.PipeReader model IO operations as in-memory pipes. Data is written to the pipe’s writer end and read from the pipe’s reader end, using separate goroutines. The following creates the pipe reader/writer pair with io.Pipe(), which is then used to copy data from a buffer proverbs to os.Stdout:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/write_str.go
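A minimal sketch of the pipe pattern, assuming Go 1.16 or later for io.ReadAll; the function name throughPipe is hypothetical. The writer runs in its own goroutine because writes to a pipe block until the other end reads them.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// throughPipe sends text into the write end of a pipe from one goroutine
// and collects it from the read end in the calling goroutine.
func throughPipe(text string) (string, error) {
	pr, pw := io.Pipe()
	go func() {
		_, err := io.Copy(pw, strings.NewReader(text))
		// With a nil err this behaves like Close: the reader sees io.EOF.
		pw.CloseWithError(err)
	}()
	out, err := io.ReadAll(pr)
	return string(out), err
}

func main() {
	s, err := throughPipe("Don't communicate by sharing memory\n")
	if err != nil {
		fmt.Println("pipe failed:", err)
		return
	}
	fmt.Print(s)
}
```

If the write end were never closed, io.ReadAll would wait forever for more data.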

Buffered IO

Go supports buffered IO via package bufio, which makes it easy to work with textual content. For instance, the following program reads the content of a file line by line, using '\n' as the delimiter:

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/bufread.go
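The line-by-line pattern might look like the sketch below, which uses bufio.Reader.ReadString. To stay self-contained it reads from a string instead of a file; the helper names readLines and linesOf are hypothetical, and any io.Reader, including an *os.File, could be passed to readLines.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLines returns the lines of r, split on '\n', without the delimiter.
func readLines(r io.Reader) ([]string, error) {
	br := bufio.NewReader(r)
	var lines []string
	for {
		line, err := br.ReadString('\n')
		if line != "" {
			// A final line without a trailing '\n' arrives together with io.EOF.
			lines = append(lines, strings.TrimSuffix(line, "\n"))
		}
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

// linesOf applies readLines to an in-memory string.
func linesOf(text string) ([]string, error) {
	return readLines(strings.NewReader(text))
}

func main() {
	lines, err := linesOf("Don't panic\nErrors are values\nCgo is not Go")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	for i, line := range lines {
		fmt.Printf("%d: %s\n", i+1, line)
	}
}
```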

Util package

Package ioutil, a sub-package of io, offers several convenience functions for IO. For instance, the following uses function ReadFile to load the content of a file into a []byte. (As of Go 1.16, ioutil.ReadFile is deprecated in favor of the equivalent os.ReadFile.)

https://github.com/vladimirvivien/learning-go/blob/master/tutorial/io/io_util.go
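A small sketch of loading a whole file in one call. ioutil.ReadFile still works, but since Go 1.16 it is documented as a thin wrapper around os.ReadFile, which has the same signature, so the sketch calls the latter. The function name demo, the temporary file, and its content are assumptions for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// demo writes a small temporary file and loads it back with a single call.
func demo() (string, error) {
	tmp, err := os.CreateTemp("", "note-*.txt")
	if err != nil {
		return "", err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString("Don't panic\n"); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}

	// ReadFile opens the file, reads it to the end, and closes it.
	data, err := os.ReadFile(tmp.Name())
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	s, err := demo()
	if err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Print(s)
}
```

This convenience trades streaming for simplicity: the entire file is held in memory, so it suits small files better than large ones.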

Conclusion

This writeup shows how to use the io.Reader and io.Writer interfaces to implement streaming IO in your program. After reading this writeup, you should be able to understand how to create programs that use the io package to stream data for IO. There are plenty of examples, and the writeup shows you how to create your own io.Reader and io.Writer types for custom functionality.

This is an introductory discussion and barely scratches the surface of the breadth and scope of Go packages that support streaming IO. We did not go into file IO, buffered IO, network IO, or formatted IO, for instance (saved for future writeups). I hope this gives you an idea of what is possible with Go’s streaming IO idioms.

Also, don’t forget to check out my book on Go, titled Learning Go Programming, from Packt Publishing.