FROM THE ARCHIVES OF PRAGPUB MAGAZINE JUNE 2012
The Beauty of Concurrency in Go: Beyond “Hello World”
By Alexander Demin
It’s good to learn a new language every so often, but you have to get beyond “Hello, World.”
It’s good for you to learn a new programming language from time to time. This is true even if the language doesn’t take off or is ancient. Tackling old problems in a new language pushes you to rethink your current views, approaches, and habits.
I love trying new stuff, especially programming languages. But after implementing “Hello, world!” or the Fibonacci sequence in a new language, you usually feel almost nothing and get no taste of the language whatsoever. You could implement the Sieve of Eratosthenes to explore data structures a little, and maybe performance. But I wanted something real, something I could maybe even reuse afterwards. So some time ago I invented a problem for myself that helps me get the feel of a language in just a few hundred lines of code.
The problem involves several very important elements of a language: strings, file and network I/O, and, of course, concurrency. The problem is a TCP/IP proxy (or you could call it a network debugger). The idea is this: you have a TCP/IP listener (single- or multi-threaded) accepting connections on a given port. When it receives an incoming connection, it has to connect to another host and pass the data through in both directions between the caller and the remote host. Additionally, the proxy can log the traffic in various formats to help in analyzing the data.
I stopped counting the occasions when I needed this kind of tool. Any time network programming is involved, such a tool is essential. I have implemented it many times in my life in different languages: C, C++, Perl, PHP. The two latest implementations were in Python and Erlang. It represents the kind of real problem I was looking for.
We can specify more concrete requirements. The application must serve multiple connections simultaneously. For each connection, it needs to log data in three ways: a hexadecimal dump presenting the data sequentially in both directions, and two binary logs holding the incoming and outgoing data streams in separate files.
We’re going to implement the program in this article, and the language we’re going to use is Go. The Go authors claim that they designed the language with concurrency and multi-threading in its bloodstream. I intend to take them at their word.
If I developed such an application in boostified C++, I would probably go for the main listener thread, plus threads for each connection. Hence, an individual connection would be fully served (I/O and logging) by a single thread.
Here are the threads (goroutines, in Go terms) I’ll use to serve each connection in the Go implementation:
- a bi-directional hex dumper thread
- two threads logging incoming and outgoing streams in the binary form
- two threads passing through data from the local host to the remote and vice versa
In total: 5 threads.
Again, five threads are serving each individual connection. I implemented all these threads not for the sake of multi-threading per se, but because Go encourages multi-threading, while C++ discourages it (even with the new C++11 standard’s steroids). Multi-threading in Go is natural and simple. My implementation of the TCP/IP proxy in Go doesn’t use mutexes or condition variables. Synchronization is elegantly managed by Go’s channels.
Okay, here’s the source, with explanations. If you are not familiar with Go, the commentary should help. My intention was to focus not just on the functionality of the program, but also on the Go language itself.
Let’s Go
In lines 2–11 we declare the packages we are going to use. Notably, if a package is imported but not used, Go treats this as an error and forces you to remove the unused declaration (remember the last time you gave up and didn’t bother to clean up the list of STL includes in your C++ project?).
In lines 12–16 we declare global variables presenting the command line flags. Further down we will see how to parse them.
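As a minimal sketch of flag handling with the standard flag package (not the article’s listing): the flag names below are taken from the commands shown later in this article, while the defaults, help strings, and the helper parseFlags are assumptions made for illustration. The article declares the flags as global variables; the sketch uses a private FlagSet instead so that it can be called with any argument list.

```go
package main

import (
	"flag"
	"fmt"
)

// parseFlags is a hypothetical helper. Defaults and help strings are
// assumptions; only the flag names come from the article's commands.
func parseFlags(args []string) (host, port, listenPort string, err error) {
	fs := flag.NewFlagSet("gotcpspy", flag.ContinueOnError)
	h := fs.String("host", "", "target host or address")
	p := fs.String("port", "0", "target port")
	l := fs.String("listen_port", "0", "listen port")
	// Parse accepts both "-name value" and "-name=value".
	if err = fs.Parse(args); err != nil {
		return
	}
	return *h, *p, *l, nil
}

func main() {
	args := []string{"-host", "example.org", "-port=110", "-listen_port", "8080"}
	host, port, listenPort, err := parseFlags(args)
	if err != nil {
		fmt.Println("bad flags:", err)
		return
	}
	fmt.Printf("Start listening on port %s and forwarding data to %s:%s\n",
		listenPort, host, port)
}
```

With package-level flags, the equivalent is to declare them with flag.String at the top level and call flag.Parse() at the start of main.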
In lines 17–20 we see the syntax of variadic function arguments in Go.
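For readers new to Go, here is a small sketch of the variadic syntax. The function names formatMessage and die are hypothetical and are not claimed to match the article’s listing.

```go
package main

import (
	"fmt"
	"os"
)

// formatMessage takes a format string followed by any number of arguments.
// Inside the function, v is a slice; "v..." passes its elements on as
// individual arguments to another variadic function.
func formatMessage(format string, v ...interface{}) string {
	return fmt.Sprintf(format+"\n", v...)
}

// die is a hypothetical helper that reports a fatal error and exits.
func die(format string, v ...interface{}) {
	os.Stderr.WriteString(formatMessage(format, v...))
	os.Exit(1)
}

func main() {
	fmt.Print(formatMessage("listening on port %d, forwarding to %s", 8080, "example.org"))
}
```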
In lines 21–28 there are two functions launching the hex dump and the binary loggers. The only difference is in the log name.
In lines 29–43 the real Go fun begins. The function logger_loop creates a log file and then spins in an infinite loop (lines 35–42). In line 36 the code waits for a message from the channel data. There is an interesting trick in line 34: the defer statement lets us register a call that is guaranteed to be executed when the function returns (similar to finally in Java). If empty data is received, the function exits.
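The pattern described here can be sketched as follows. This is an illustration under assumptions, not the article’s code: the names loggerLoop and roundTrip, the done channel, and the temporary file are all invented for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// loggerLoop creates the file at path and writes every chunk received on
// data. An empty chunk is the stop signal. The done channel is an
// assumption added so that the caller knows when the file has been closed.
func loggerLoop(path string, data <-chan []byte, done chan<- error) {
	f, err := os.Create(path)
	if err != nil {
		done <- err
		return
	}
	// The deferred function runs on every return path below.
	defer func() {
		f.Close()
		done <- err
	}()
	for {
		b := <-data // blocks until a message arrives
		if len(b) == 0 {
			return
		}
		if _, err = f.Write(b); err != nil {
			return
		}
	}
}

// roundTrip is a hypothetical driver: it logs the chunks to a temporary
// file and returns what ended up on disk.
func roundTrip(chunks ...string) (string, error) {
	dir, err := os.MkdirTemp("", "logsketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "binary.log")

	// Buffered so that sends never block, even if the logger stops early.
	data := make(chan []byte, len(chunks)+1)
	done := make(chan error, 1)
	go loggerLoop(path, data, done)
	for _, c := range chunks {
		data <- []byte(c)
	}
	data <- nil // empty message: ask the logger to exit
	if err := <-done; err != nil {
		return "", err
	}
	b, err := os.ReadFile(path)
	return string(b), err
}

func main() {
	s, err := roundTrip("USER test\r\n", "PASS none\r\n")
	fmt.Printf("%q %v\n", s, err)
}
```

Because only the logger goroutine touches the file, no mutex is needed; the channel is the single point of synchronization.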
In lines 55–80 there is a function that reads data from the source socket from, writes it to the logs, and sends it to the destination socket to. For each connection there are two instances of the pass_through function copying data between the local and remote sockets in opposite directions. When an I/O error occurs, it is treated as a disconnect. Finally, in line 79 this function sends an acknowledgment back to the main thread, signaling its termination.
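A one-directional copy loop of this kind might look like the sketch below. It is an assumption-laden illustration rather than the article’s function: passThrough, relay, the single log channel, and the byte count sent as the acknowledgment are all hypothetical, and net.Pipe stands in for real sockets so the example runs without a network.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// passThrough copies data from src to dst, handing a copy of every chunk
// to logCh. Any I/O error is treated as a disconnect. When the loop ends,
// both connections are closed and the byte count is sent on ack.
func passThrough(src, dst net.Conn, logCh chan<- []byte, ack chan<- int) {
	buf := make([]byte, 4096)
	total := 0
	for {
		n, err := src.Read(buf)
		if n > 0 {
			// Copy the chunk: buf is reused by the next Read.
			chunk := append([]byte(nil), buf[:n]...)
			logCh <- chunk
			if _, werr := dst.Write(chunk); werr != nil {
				break
			}
			total += n
		}
		if err != nil {
			break
		}
	}
	src.Close()
	dst.Close()
	ack <- total
}

// relay is a hypothetical driver that pushes msg through passThrough
// using in-memory pipes and reports what arrived and what was logged.
func relay(msg string) (received, logged string, n int) {
	client, proxyIn := net.Pipe()
	proxyOut, server := net.Pipe()
	logCh := make(chan []byte, 16) // large enough for this small demo
	ack := make(chan int)
	go passThrough(proxyIn, proxyOut, logCh, ack)
	go func() {
		client.Write([]byte(msg))
		client.Close() // the proxy sees this as a disconnect
	}()
	b, _ := io.ReadAll(server) // returns once the proxy closes its side
	n = <-ack                  // wait for the copier to finish
	close(logCh)
	for c := range logCh {
		logged += string(c)
	}
	return string(b), logged, n
}

func main() {
	received, logged, n := relay("USER test\r\n")
	fmt.Printf("%q %q %d\n", received, logged, n)
}
```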
In lines 81–107 there is a function processing the entire connection. It connects to the remote socket (line 82), measures the duration of the connection (lines 88, 101–103), launches the loggers (lines 93–95) and finally launches two data-transferring threads (lines 97–98). The pass_through functions run as long as both peers are active. In lines 99–100 we wait for acknowledgments from the data-transferring threads. In lines 104–106 we terminate the loggers.
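The coordination described here (start the workers, wait for two acknowledgments, then stop the loggers) can be sketched with plain channels. Everything in the example is hypothetical: worker stands in for a pass_through goroutine, a single in-memory logger replaces the three file loggers, and strings replace socket data.

```go
package main

import (
	"fmt"
	"time"
)

// worker stands in for one data-transferring goroutine: it reports each
// chunk to the logger and acknowledges when it has finished.
func worker(name string, chunks []string, logCh chan<- string, ack chan<- bool) {
	for _, c := range chunks {
		logCh <- name + " " + c
	}
	ack <- true
}

// logger collects messages until it receives an empty one, then hands
// the collected lines back on done.
func logger(logCh <-chan string, done chan<- []string) {
	var lines []string
	for {
		m := <-logCh
		if m == "" {
			done <- lines
			return
		}
		lines = append(lines, m)
	}
}

// processConnection launches the goroutines, waits for both workers,
// stops the logger, and reports how long the "connection" lasted.
func processConnection(outgoing, incoming []string) ([]string, time.Duration) {
	started := time.Now()
	logCh := make(chan string)
	done := make(chan []string)
	ack := make(chan bool)

	go logger(logCh, done)
	go worker(">", outgoing, logCh, ack)
	go worker("<", incoming, logCh, ack)

	<-ack // first worker finished
	<-ack // second worker finished
	logCh <- "" // empty message terminates the logger
	return <-done, time.Since(started)
}

func main() {
	lines, d := processConnection(
		[]string{"USER test", "PASS none"},
		[]string{"+OK", "-ERR"},
	)
	fmt.Println(len(lines), "lines logged in", d)
}
```

The order of lines from the two workers is not deterministic, but the stop message is sent only after both acknowledgments arrive, so no log message can be lost.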
In lines 108–132 is the main function running the TCP/IP listener. In line 109 we ask the Go runtime to use all physically available CPUs.
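A listener of this shape can be sketched as below. The names serve, echo, and probe are hypothetical, and an echo handler stands in for the proxy’s connection handler so that the example is self-contained. Note that runtime.NumCPU reports logical CPUs, and that since Go 1.5 GOMAXPROCS defaults to that number, so on current Go versions the explicit call is no longer required.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"runtime"
)

// serve accepts connections until the listener is closed and hands each
// one to its own goroutine.
func serve(ln net.Listener, handle func(net.Conn)) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go handle(conn)
	}
}

// echo is a stand-in handler that sends back whatever it receives.
func echo(c net.Conn) {
	defer c.Close()
	io.Copy(c, c)
}

// probe is a hypothetical check: it starts a listener on a free loopback
// port, sends msg through it, and returns the reply.
func probe(msg string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go serve(ln, echo)

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer c.Close()
	if _, err := c.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	_, err = io.ReadFull(c, buf)
	return string(buf), err
}

func main() {
	// Explicit in 2012; the default since Go 1.5.
	runtime.GOMAXPROCS(runtime.NumCPU())
	reply, err := probe("ping")
	fmt.Println(reply, err)
}
```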
That’s it, just 132 lines. Please note that we used only the standard library, which comes out of the box.
Now we are ready to run:
go run gotcpspy.go -host pop.yandex.ru -port 110 -local_port 8080
It should print:
Start listening on port 8080 and forwarding data to pop.yandex.ru:110
Then you can run in another window:
telnet localhost 8080
and enter, for instance, USER test [ENTER] and PASS none [ENTER]. Three log files will be created (the timestamp, of course, will be different in your case).
Bi-directional hex dump log log-2012.04.20-19.55.17-0001-192.168.1.41-49544-213.180.204.37-110.log:
Binary log of outgoing data log-binary-2012.04.20-19.55.17-0001-192.168.1.41-49544.log:
USER test
PASS none
Binary log of incoming data log-binary-2012.04.20-19.55.17-0001-213.180.204.37-110.log:
+OK POP Ya! v1.0.0na@26 HtjJitcPRuQ1
+OK password, please.
-ERR [AUTH] login failure or POP3 disabled, try later. sc=HtjJitcPRuQ1
It seems to work, so let’s try to measure the performance by downloading a bigger binary file directly and then via our proxy.
Downloading directly (file size is about 72MB):
time wget http://www.erlang.org/download/otp_src_R15B01.tar.gz
...
Saving to: `otp_src_R15B01.tar.gz'
...
real 1m2.819s
Now let’s start the proxy and then download through it:
go run gotcpspy.go -host=www.erlang.org -port=80 -listen_port=8080
Downloading:
time wget http://localhost:8080/download/otp_src_R15B01.tar.gz
...
Saving to: `otp_src_R15B01.tar.gz.1'
...
real 0m56.209s
Let’s compare the results.
diff otp_src_R15B01.tar.gz otp_src_R15B01.tar.gz.1
The files match, which means the program works correctly.
Now the performance. I repeated the experiment a few times on my MacBook Air. Surprisingly, downloading via the proxy was even a bit faster for me than downloading directly. In the example above: 1m2.819s (directly) vs 0m56.209s (via proxy). The only explanation I can imagine is that wget is single-threaded, and it multiplexes incoming and outgoing streams in one thread. In turn, the proxy processes the streams in individual threads, and perhaps this causes a tiny speedup. But the difference is small, almost negligible, and maybe on another computer or network it would disappear completely. The main observation is that downloading via the proxy doesn’t slow things down, despite the additional overhead of creating quite massive logs.
In summary, I’d like you to look at this program from the angle of simplicity and clarity. I’ve pointed it out above but I’d like to underline it again: I started using threads in this application gradually. The nature of the problem gently pushed me to identify concurrent activities in processing a connection, and then the ease and safety of the concurrency mechanisms in Go finished the job. Eventually I used concurrency without thinking about the trade-off between efficiency and complexity (and difficulty of debugging).
Agreed, sometimes a problem simply needs to thrash bits and bytes, and the linear efficiency of the code is the only thing you care about. But more and more you encounter problems where the capability of concurrent, multi-threaded processing becomes the key factor, and for this kind of application, Go will shine.
I hope this serves as a representative example showing off the ease and even beauty of concurrency in Go.
About Alexander Demin
Alexander Demin is a software engineer and Ph.D. in Computer Science. Constantly exploring new technologies, he believes that something amazing is always out there.