Improving the colStats Tool to Process Files Concurrently

Powerful Command-Line Applications in Go — by Ricardo Gerardi (55 / 127)

The Pragmatic Programmers


As the tracer output shows, the tool processes files sequentially. This is inefficient when there are several files to process. By changing the program to process files concurrently, you can take advantage of multiprocessor machines and use more CPUs, which generally makes the program run faster. The program spends less time waiting for resources and more time processing files.

One of the main benefits of Go is its concurrency model. Go includes concurrency primitives that let you add concurrency to your programs in a more intuitive way. By using goroutines and channels, you can modify the current colStats tool to process several files concurrently by making changes to the run function only. The other functions remain unchanged.

First, add the sync package, which provides synchronization types such as the WaitGroup, to the imports section:

performance/colStats.v2/main.go

import (
	"flag"
	"fmt"
	"io"
	"os"
	"sync"
)

You’ll update the run function to process files concurrently by creating a new goroutine for each file you need to process. But first you’ll need to create some channels to communicate between the goroutines. You’ll use three…
