Measuring progress and estimating time remaining of long-running io.Reader and io.Writer operations in Go

Mat Ryer
Published in Machine Box
6 min read · Jan 2, 2018

When we use helpers like io.Copy and ioutil.ReadAll, for example to read an http.Response body or upload a file, the call blocks until the operation is complete, even if that takes minutes or hours. We have no way to track its progress or estimate how much time remains until completion.

tl;dr: This article is all leading to the progress package, which you are free to use in your own projects — https://github.com/machinebox/progress

Since io.Reader and io.Writer are interfaces, we can wrap them and intercept the Read and Write methods, capturing how many bytes have actually passed through them. With a bit of simple mathematics we can calculate the percentage complete. With a little more mathemagic, we can even estimate how much time is left, assuming the stream is relatively consistent.

Wrapping the Reader

A new Reader type just needs to wrap another io.Reader and call its Read method, recording the number of bytes read before returning. The counter is updated by the goroutine doing the reading and inspected by whichever goroutine reports progress, so it has to be safe for concurrent use; we can use atomic.AddInt64 to safely increase it.

// Reader counts the bytes read through it.
type Reader struct {
	r io.Reader
	n int64
}

// NewReader makes a new Reader that counts the bytes
// read through it.
func NewReader(r io.Reader) *Reader {
	return &Reader{
		r: r,
	}
}

// Read reads from the wrapped reader and adds the number
// of bytes read to the counter.
func (r *Reader) Read(p []byte) (n int, err error) {
	n, err = r.r.Read(p)
	atomic.AddInt64(&r.n, int64(n))
	return
}

// N gets the number of bytes that have been read
// so far.
func (r *Reader) N() int64 {
	return atomic.LoadInt64(&r.n)
}

See if you can write the Writer counterpart on your own; it’s very similar.

Since the N method returns (safely via atomic.LoadInt64) the number of bytes read, we can call this from another goroutine at any time to find out what has happened so far.

Getting the total number of bytes

In order to calculate a percentage, we need to know what the 100% value is — how many bytes are we expecting to read?

For uploading files, we can usually ask the operating system for the file size:

info, err := os.Stat(filename)
if err != nil {
	// errors.Wrap comes from github.com/pkg/errors
	return errors.Wrap(err, "cannot get file info")
}
size := info.Size()

In an HTTP context, you can get the Content-Length header value with some code like this:

contentLengthHeader := resp.Header.Get("Content-Length")
size, err := strconv.ParseInt(contentLengthHeader, 10, 64)
if err != nil {
	return err
}

If the Content-Length header is missing (which can happen, for example with chunked responses), there is no total to compare against, so we cannot compute a percentage or estimate how long is left.
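The missing-header case is easier to handle if "unknown" is treated as a normal outcome and not as an error. The sketch below assumes a hypothetical helper named parseContentLength (not part of any package) that reports whether a usable size was found. The standard library also exposes the parsed value as the ContentLength field on http.Response, which is -1 when the length is unknown.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
)

// parseContentLength is a hypothetical helper. It turns a Content-Length
// header value into a byte count and reports false when the value is
// missing, malformed, or negative, meaning the total size is unknown.
func parseContentLength(v string) (int64, bool) {
	if v == "" {
		return 0, false
	}
	size, err := strconv.ParseInt(v, 10, 64)
	if err != nil || size < 0 {
		return 0, false
	}
	return size, true
}

func main() {
	h := http.Header{}
	h.Set("Content-Length", "1024")
	if size, ok := parseContentLength(h.Get("Content-Length")); ok {
		fmt.Println("expecting", size, "bytes")
	}

	// No header at all: we can still count bytes, but there is
	// no total to compare them against.
	if _, ok := parseContentLength(http.Header{}.Get("Content-Length")); !ok {
		fmt.Println("size unknown")
	}
}
```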

For other situations, you’ll need to figure out how you can find out the total number of bytes.

Calculate the percentage

Now we can calculate the percentage of bytes that have already been processed:

func percent(n, size float64) float64 {
	if n == 0 {
		return 0
	}
	if n >= size {
		return 100
	}
	return 100.0 / (size / n)
}

We need to convert our values to float64 so that we don’t round the numbers to integers early in the process. We can still round off the result if we just need integer level precision.
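To see why the conversion matters, here is a small worked example. It reproduces the percent function from above so that it runs on its own; the values 1 and 3 are arbitrary and chosen only to make the truncation obvious.

```go
package main

import "fmt"

// percent is the function from the article, repeated here
// so the example is self-contained.
func percent(n, size float64) float64 {
	if n == 0 {
		return 0
	}
	if n >= size {
		return 100
	}
	return 100.0 / (size / n)
}

func main() {
	var n, size int64 = 1, 3

	// Integer division truncates to zero before the
	// multiplication has a chance to help.
	fmt.Println(n / size * 100) // 0

	// Converting to float64 first keeps the fraction;
	// rounding happens only when formatting for display.
	p := percent(float64(n), float64(size))
	fmt.Printf("%.1f%%\n", p) // 33.3%
}
```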

Estimating time remaining

A very simple way to get a time estimate is to work out the average time per byte so far (the time taken to read X bytes, divided by X) and multiply it by the number of remaining bytes.

For example, if it has taken ten seconds to complete 50% of the operation, we can assume that it is going to take another ten seconds to finish the task: twenty seconds in total.

It isn’t entirely precise, but over time it does settle into a useful countdown.

This is the code that makes it work, but don’t worry if you don’t understand it — read below for details on our package that does it all for you.

// in the beginning...
started := time.Now()

// each time we want to check (skip this while n is still 0)...
ratio := float64(n) / float64(size) // convert first to avoid integer division
past := float64(time.Since(started))
total := time.Duration(past / ratio)
estimated := started.Add(total)
duration := time.Until(estimated)

  • ratio: float of the proportion of size that has already been completed
  • past: duration of how long it has been since we started
  • total: the expected total duration, based on the ratio and how long it has been so far
  • estimated: time when we expect it to be finished
  • duration: duration from now until the expected completion time

Meet the progress package

We ❤ open-source, so of course we’ve wrapped all this into a nice package for you to use.

It also supports io.EOF and other errors so you know what’s going on with the operation.

Ticker helper

We have also added a helper which gives you a Go channel on which progress is periodically reported. You can start a new goroutine and print out progress, or update it somewhere else depending on your use case.

ctx := context.Background()

// get a reader and the total expected number of bytes
s := `Now that's what I call progress`
size := int64(len(s)) // byte counts are int64 throughout
r := progress.NewReader(strings.NewReader(s))

// Start a goroutine printing progress
go func() {
	progressChan := progress.NewTicker(ctx, r, size, 1*time.Second)
	// range over the channel itself, not over a value received from it
	for p := range progressChan {
		fmt.Printf("\r%v remaining...",
			p.Remaining().Round(time.Second))
	}
	fmt.Println("\rdownload is completed")
}()

// use the Reader as normal (dest is any io.Writer)
if _, err := io.Copy(dest, r); err != nil {
	log.Fatalln(err)
}

The channel periodically returns a Progress struct that has the following methods to help you figure out what’s going on:

  • Percent: the percentage of the operation that is complete
  • Estimated: the time.Time when the operation is expected to finish
  • Remaining: a time.Duration of how much time is remaining

The channel is closed when the operation has finished, or when it is cancelled by the context.

Check out the documentation for a detailed up-to-date index of the API.

What next?

Please try it, ask questions, report issues, submit improvement PRs.

What is Machine Box?

Machine Box puts state of the art machine learning capabilities into Docker containers so developers like you can easily incorporate natural language processing, facial detection, object recognition, etc. into your own apps very quickly.

The boxes are built for scale, so when your app really takes off just add more boxes horizontally, to infinity and beyond. Oh, and it’s way cheaper than any of the cloud services (and they might be better)… and your data doesn’t leave your infrastructure.

Have a play and let us know what you think.

Mat Ryer
Machine Box

Founder at MachineBox.io — Gopher, developer, speaker, author — BitBar app https://getbitbar.com — Author of Go Programming Blueprints