Concurrency == Golang

In this article, I explain a few concurrency design patterns in Go by solving the Wordcount problem in a MapReduce fashion.

Developers at Google built Golang in an attempt to solve concurrency issues in a scalable manner across distributed setups.

The Wordcount problem asks for the count of each word across multiple documents. It can be solved directly by keeping a counter for each word and incrementing it on every occurrence. This solution, however, does not scale to millions of files (sending every file’s content from each node to the master is not feasible).
Hence we apply the MapReduce paradigm to solve this.

Here the Map part collects the word count of each file, and the Reduce part merges these per-file counts across all the files to generate the final Wordcount (shuffle is implicitly handled).

Example:

File1:
the author is really really awesome

The result from the mapper:

File1:
the     --> 1
author  --> 1
is      --> 1
awesome --> 1
really  --> 2

The result from the reducer (Wordcount), summing the mapper results of three files (File1 and two further files whose contents are not shown here):

the     --> 1 + 0 + 1 = 2
author  --> 1 + 0 + 1 = 2
is      --> 1 + 1 + 1 = 3
awesome --> 1 + 1 + 0 = 2
really  --> 2 + 0 + 0 = 2
jg      --> 0 + 1 + 1 = 2
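The two steps can be sketched in a few lines of Go. This is only an illustration under assumptions: the names Mapper and Reducer and the second document are made up here and are not taken from the original mapper.go and reducer.go, and documents are plain strings rather than files.

```go
package main

import (
	"fmt"
	"strings"
)

// Mapper counts how often each word occurs in a single document.
func Mapper(text string) map[string]int {
	counts := make(map[string]int)
	for _, word := range strings.Fields(text) {
		counts[word]++
	}
	return counts
}

// Reducer merges the per-document counts into one overall word count.
func Reducer(partials []map[string]int) map[string]int {
	total := make(map[string]int)
	for _, partial := range partials {
		for word, n := range partial {
			total[word] += n
		}
	}
	return total
}

func main() {
	// The first document is File1 from the example; the second is hypothetical.
	docs := []string{
		"the author is really really awesome",
		"go is awesome",
	}
	var partials []map[string]int
	for _, doc := range docs {
		partials = append(partials, Mapper(doc)) // one document after another
	}
	fmt.Println(Reducer(partials))
}
```

Everything here runs sequentially in one goroutine, so no synchronization is needed yet.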

We’ll start with the simplest implementation (without exploiting Go’s concurrency features).

src/
    main.go
    reducer/
        reducer.go
    mapper/
        mapper.go
    utility/
        utility.go

utility.go contains some helper functions to parse the file/directory structures and extract the words.

The obvious way to optimize would be to compute the Map for each file concurrently and reduce at once. We could run the Mapper for each file in parallel (concurrently) and append the results to a shared list from concurrent goroutines.

We need a mutex to avoid race conditions when appending to the list.
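A minimal sketch of this mutex-guarded append, assuming in-memory documents instead of files. The function name MapConcurrently is hypothetical, and a sync.WaitGroup is used here only so that the caller can wait for all goroutines to finish.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Mapper counts how often each word occurs in a single document.
func Mapper(text string) map[string]int {
	counts := make(map[string]int)
	for _, word := range strings.Fields(text) {
		counts[word]++
	}
	return counts
}

// MapConcurrently runs one Mapper goroutine per document and collects
// the results in a shared slice.
func MapConcurrently(docs []string) []map[string]int {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		partials []map[string]int
	)
	for _, doc := range docs {
		wg.Add(1)
		go func(text string) {
			defer wg.Done()
			counts := Mapper(text) // computed without holding the lock
			mu.Lock()
			partials = append(partials, counts) // shared slice: guard the append
			mu.Unlock()
		}(doc)
	}
	wg.Wait() // block until every goroutine has appended its result
	return partials
}

// Reducer merges the per-document counts into one overall word count.
func Reducer(partials []map[string]int) map[string]int {
	total := make(map[string]int)
	for _, partial := range partials {
		for word, n := range partial {
			total[word] += n
		}
	}
	return total
}

func main() {
	docs := []string{"the author is really really awesome", "go is awesome"}
	fmt.Println(Reducer(MapConcurrently(docs)))
}
```

The order of the entries in the slice depends on scheduling, which does not matter here because the Reducer only sums the counts.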

So essentially we run the Mapper function for multiple files in parallel (assuming multiple cores), but each transaction (appending the result) is guarded by a mutex (since the same memory is operated on by different goroutines).
This brings us to NO GAIN in run time :(

Channels come to our rescue!

A channel is a typed conduit through which you can send and receive values. By default, send and receive operations block until the other end is ready (hence synchronization without explicit locks).
So, we could use a channel to send the results computed by the Mapper and process them in the Reducer at the receiving end.

We need the number of files explicitly to know when to stop receiving.
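One way to sketch this, assuming in-memory documents and a hypothetical CountWords function that plays the role of the Reducer on the receiving side:

```go
package main

import (
	"fmt"
	"strings"
)

// Mapper counts how often each word occurs in a single document.
func Mapper(text string) map[string]int {
	counts := make(map[string]int)
	for _, word := range strings.Fields(text) {
		counts[word]++
	}
	return counts
}

// CountWords starts one Mapper goroutine per document and reduces the
// results as they arrive on the channel.
func CountWords(docs []string) map[string]int {
	results := make(chan map[string]int) // unbuffered: a send waits for a receive

	for _, doc := range docs {
		go func(text string) {
			results <- Mapper(text)
		}(doc)
	}

	total := make(map[string]int)
	// Receive exactly len(docs) results; one more receive would block forever.
	for i := 0; i < len(docs); i++ {
		for word, n := range <-results {
			total[word] += n
		}
	}
	return total
}

func main() {
	docs := []string{"the author is really really awesome", "go is awesome"}
	fmt.Println(CountWords(docs))
}
```

Only the receiving goroutine touches the total map, so no mutex is involved.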

Now the Mappers run concurrently and each result is sent to the Reducer as soon as it is ready. This approach, however, requires the Reducer to know the total number of files.
This can be resolved by closing the channel once all the send operations are done and iterating over the channel with range at the receiver’s end.

To monitor completion at the sender’s end, a sync.WaitGroup can be used.
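A sketch of this close-and-range pattern, again with in-memory documents and a hypothetical CountWords function. The separate closer goroutine is one common arrangement and an assumption here, not necessarily how the original code is organized.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Mapper counts how often each word occurs in a single document.
func Mapper(text string) map[string]int {
	counts := make(map[string]int)
	for _, word := range strings.Fields(text) {
		counts[word]++
	}
	return counts
}

// CountWords reduces Mapper results without knowing how many to expect.
func CountWords(docs []string) map[string]int {
	results := make(chan map[string]int)
	var wg sync.WaitGroup

	for _, doc := range docs {
		wg.Add(1)
		go func(text string) {
			defer wg.Done()
			results <- Mapper(text)
		}(doc)
	}

	// Close the channel once every Mapper has sent its result.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := make(map[string]int)
	// The range loop ends when the channel is closed and drained.
	for partial := range results {
		for word, n := range partial {
			total[word] += n
		}
	}
	return total
}

func main() {
	docs := []string{"the author is really really awesome", "go is awesome"}
	fmt.Println(CountWords(docs))
}
```

Waiting in a separate goroutine matters with an unbuffered channel: calling wg.Wait() before the receive loop starts would deadlock, because no Mapper could complete its send.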

Observe that the final reduced result is sent on a channel.

Clear: WaitGroup is used to ensure all the Mapper results are sent to the channel.
Not Clear: Why is the result obtained from a channel? Why can’t we use a plain map (passed by reference)?

Reason: Firstly, maps are not safe for concurrent use, i.e. the program can crash with a fatal error when a read and a write take place simultaneously. Secondly, since the Reducer runs in a separate goroutine, the value read in main may not be the final version of the map, because the Reducer may not have finished processing yet. Both these issues are solved with a channel (main.go receives from the channel and reducer.go sends the final version of the map).
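The hand-off can be sketched as follows. The channel names partials and final and the function CountWords are assumptions made for this illustration; the point is that the map is only shared after the Reducer has finished writing to it.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Mapper counts how often each word occurs in a single document.
func Mapper(text string) map[string]int {
	counts := make(map[string]int)
	for _, word := range strings.Fields(text) {
		counts[word]++
	}
	return counts
}

// Reducer drains partials until it is closed, then sends the finished
// map exactly once on final.
func Reducer(partials <-chan map[string]int, final chan<- map[string]int) {
	total := make(map[string]int)
	for partial := range partials {
		for word, n := range partial {
			total[word] += n
		}
	}
	final <- total // nobody else sees the map before this point
}

// CountWords wires Mappers and the Reducer together.
func CountWords(docs []string) map[string]int {
	partials := make(chan map[string]int)
	final := make(chan map[string]int)
	var wg sync.WaitGroup

	go Reducer(partials, final) // the Reducer runs in its own goroutine

	for _, doc := range docs {
		wg.Add(1)
		go func(text string) {
			defer wg.Done()
			partials <- Mapper(text)
		}(doc)
	}

	wg.Wait()       // all Mapper results have been received by the Reducer
	close(partials) // lets the Reducer's range loop finish
	return <-final  // blocks until the Reducer sends the completed map
}

func main() {
	docs := []string{"the author is really really awesome", "go is awesome"}
	fmt.Println(CountWords(docs))
}
```

Because the receive from final blocks until the Reducer sends, main never reads a half-built map and never reads while the Reducer is still writing.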

This is my analysis and explanation of some of the crucial features of Golang and its support for concurrency. Golang is synonymous with concurrency and is widely used in scalable and distributed systems (even Docker and Kubernetes are written in Go).

Hope people discover the beauty of Golang and contribute to the same!

Written by Jayaganesh Kalyanasundaram

The Startup