Go concurrency considered harmful

Go has been gaining a ton of popularity as of late. I’ve been using Go at work, building container management software. Before that, I was writing Erlang as part of DC/OS. Go is an incredibly useful, productive language and runtime. It has almost entirely replaced my usage of Python and Java. Its “systems-like” quality makes solving a certain class of problems a breeze, problems for which I’d normally have to resort to C++, a language that has fallen out of favour in certain circles due to its history, complexity, and lack of safety.

After writing Go in anger for more than a year, at two different jobs, I find myself frequently frustrated with it. I’ve found myself attracted to other languages, like JavaScript with Flow, and runtimes like Chrome or Node. While neither Erlang nor Elixir has become anywhere near as popular as Go, they embody some incredibly good programming concepts that I miss. Recently, I read a set of Go mantras that left me somewhat frustrated.

Concurrency At Work: The Map

The Map is an excellent summarization of concurrency problems. At the heart of nearly every program I’ve worked on, there’s some state-tracking system. In my current one, there’s a map from task ID to a Container struct. Queries come from various sources, in various forms: we expose an HTTP API to enumerate these containers, another RPC API to manipulate and modify them, and lastly the containers themselves are backed by a set of real Unix processes that need maintenance and monitoring. Manipulation of this data structure needs to be concurrent, and safe.

Unfortunately, trying to explain this problem in detail isn’t a particularly fruitful exercise, as we would quickly get into the weeds of business semantics. The system uses locks for safety, rather than channels, because locks are easier and simpler. Unfortunately, locks don’t provide the safety and clean concurrency boundaries that channels do. Because of this, we’ve seen problems with synchronization around this data structure: deadlocks, accidental sharing (stashing mutable pointers in values), and a wrapper API that overcomplicates things. Fortunately, all of it can be summed up as a map, but with business-specific methods and reactive APIs.

One of the few data structures that Go offers is the map. Unfortunately, few data structures in Go can be used concurrently without locking. Given how easy it is to introduce concurrency into a Go program, the lack of simple concurrency-control mechanisms is often problematic. People end up writing their own synchronization around access to maps, slices, and other general-purpose data structures.

Sharing by Communicating is a Lie

The Go mantra “Share by Communicating” may allow you to write correct code, but in practice it’s challenging. I implemented a map with the following interface to see how it felt:

type ConcurrentDict interface {
	// SetKey sets key to val, whether or not the key already exists.
	SetKey(key, val string)
	// CasVal performs an atomic compare-and-swap for the given key.
	// setOnNotExists makes the swap succeed when the key is absent.
	CasVal(key, oldVal, newVal string, setOnNotExists bool) bool
	// ReadKey returns the value associated with a key, and whether it exists in the map.
	ReadKey(key string) (string, bool)
	// DeleteKey removes a key from the map. If the key does not exist, the operation is a no-op.
	DeleteKey(key string)
}

The implementation with channels clocks in at 118 lines. It’s based on a model where each map is owned by its own goroutine, and access to that map is serialized via channels that provide a request/response mechanism.

Unfortunately, Go has no way to express “relationships” between goroutines. You therefore have to use Go’s context package to maintain a relationship between the spawning goroutine and the child goroutine. If you don’t, you can very easily leak goroutines.

Leakin’ Goroutines.

Unfortunately, you have to manually trigger cleanup alongside the context, because relying on a child context alone will not work. This example fails non-deterministically:

func TestBreak(t *testing.T) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	d := NewChanDict(ctx)
	d.ReadKey("foo")
	/* Simulate upstream request failure */
	cancel()
	d.ReadKey("foo")
}

In this example, there is a race between the time the context is cancelled and the time the request is fulfilled. If the run() loop selects context.Done() prior to servicing the ReadKey, and exits, the second ReadKey will block forever, since the parent cannot preempt the worker. The straightforward fix is to have ReadKey return a value along with an error, so callers can check whether the map has been cancelled. This introduces a lot more unnecessary error checking. So, the solution is to use a finalizer. Finalizers themselves are not particularly easy to test, nor are they considered idiomatic Go.

The worst part is that there is no guarantee of concurrency safety if we allow value types that are passed by reference. In our example, we use string as the value type, which is immutable. If we instead used slices, a user could accidentally mutate a slice after fetching it from the map, violating the serialized-concurrent-access invariant. This is confusing, and it can lead to a violation of Go’s concurrency model without any indication of what went wrong. Go’s answer to this is the race detector. The race detector comes at a great memory and performance penalty, so most users only enable it in testing, not in production.

On the other hand, traditional mutexes significantly reduce the amount of code required. An implementation using mutexes clocks in at only 51 lines, less than half the channel implementation. I sit here writing this and ask myself: why am I doing this to myself? Why don’t I just use locks? They look so much more reasonable. Alternatively, if Go actually wants us to “Share by Communicating”, it should give us reasonable constructs for it.

But SyncMap?

Recently, Go introduced syncmap, a native implementation of a concurrent map. Although this alleviates some of the problems discussed, it’s too little, too late. For one, the only type it knows about is interface{}, requiring you to shed speed and type safety to use it. On top of this, it still doesn’t prevent the accidental storage of mutable references. While I’m no enemy of generated code, the lack of a real solution is frustrating.

My Erlang Dictionary

Erlang also “Shares by Communicating”, but it provides far better ergonomics. The two feel surprisingly similar at first; one could almost swap “Erlang process” for “goroutine”. But Erlang concurrency is based on the actor model, and it has a few attributes that are valuable to understand before approaching concurrent programming in Erlang:

  1. Data is always passed by value, never by reference: every time two Erlang processes communicate, the data is copied from one process to the other. This is in part because Erlang does not have pointers.
  2. Processes have unique identifiers: each process in Erlang gets a pid, which uniquely identifies the process, during its lifetime, on the node where it is running. Erlang also has mechanisms called monitors and links that allow for explicit relationships between processes. These let the developer say that one process depends on another, and that their fates should be shared. There’s much more nuance, which is covered in “Learn You Some Erlang”.
  3. Each process has its own private heap: this means there is no way to concurrently manipulate data without communicating.

The Erlang implementation of the dictionary clocks in at 131 lines. The concepts we’re implementing map very well to Erlang’s gen_server, so we’re required to implement the full set of gen_server callbacks to take best advantage of it. This also gives you neat features like Erlang’s Observer, which lets you view the relationships between processes, hot-code reloading, and state introspection.

The guts of the code really lie in these two sections:

This is the public API declaration. It states which functions are exported, the types involved, and their type signatures.

%% API
-export([
    set_key/3,
    read_key/2,
    delete_key/2,
    cas_val/5
]).

-type my_dict() :: pid().
-type key() :: term().
-type value() :: term().

-export_type([my_dict/0, key/0, value/0]).

-type state() :: #{}.

%%%===================================================================
%%% API
%%%===================================================================
-spec(set_key(MyDict :: my_dict(), Key :: key(), Value :: value()) -> ok).
set_key(MyDict, Key, Value) ->
    gen_server:call(MyDict, {set_key, Key, Value}).

-spec(read_key(MyDict :: my_dict(), Key :: key()) -> {ok, value()} | not_found).
read_key(MyDict, Key) ->
    gen_server:call(MyDict, {read_key, Key}).

-spec(delete_key(MyDict :: my_dict(), Key :: key()) -> ok).
delete_key(MyDict, Key) ->
    gen_server:call(MyDict, {delete_key, Key}).

-spec(cas_val(MyDict :: my_dict(), Key :: key(), OldVal :: value(), NewVal :: value(), SetOnNotExists :: boolean()) -> boolean()).
cas_val(MyDict, Key, OldVal, NewVal, SetOnNotExists) ->
    gen_server:call(MyDict, {cas_val, Key, OldVal, NewVal, SetOnNotExists}).

This is the actual logic that implements our state transitions. For those who have used toolkits like Redux, you’ll feel right at home. It is based on the concept that the application takes State₀ × Action → State₁ & Reply.

%%%===================================================================
%%% Internal functions
%%%===================================================================

-spec(handle_set_key(Key :: key(), Value :: value(), State0 :: state()) -> {reply, ok, state()}).
handle_set_key(Key, Value, State0) ->
    {reply, ok, maps:put(Key, Value, State0)}.

-spec(handle_read_key(Key :: key(), State0 :: state()) -> {reply, {ok, value()} | not_found, state()}).
handle_read_key(Key, State0) ->
    case maps:find(Key, State0) of
        {ok, Value} ->
            {reply, {ok, Value}, State0};
        error ->
            {reply, not_found, State0}
    end.

-spec(handle_delete_key(Key :: key(), State0 :: state()) -> {reply, ok, state()}).
handle_delete_key(Key, State0) ->
    {reply, ok, maps:remove(Key, State0)}.

-spec(handle_cas_val(Key :: key(), OldVal :: value(), NewVal :: value(), SetOnNotExists :: boolean(), State0 :: state()) -> {reply, boolean(), state()}).
handle_cas_val(Key, OldVal, NewVal, SetOnNotExists, State0) ->
    case maps:find(Key, State0) of
        {ok, OldVal} ->
            {reply, true, maps:put(Key, NewVal, State0)};
        error when SetOnNotExists == true ->
            {reply, true, maps:put(Key, NewVal, State0)};
        _ ->
            {reply, false, State0}
    end.

Does it Leak? Where’s the GC?

There’s a test included that attempts to force a leak by creating a new dictionary, filling it with random data, and checking whether the VM heap is still inflated after the test. It passes, and yet nowhere in the code do we see any signs of garbage collection or explicit frees. This is the power of Erlang links: because each test runs in its own process, and the dictionary is automatically linked to that process, when the test finishes, or dies, the dictionary dies too.

In Erlang, processes typically live under supervision trees. You can think of supervision trees kind of like init systems such as upstart or systemd. The dictionary can be owned by a top-level supervisor. One interesting capability of supervisors is that they can have child supervisors, which in turn own their own processes.

Supervisors have restart policies. For example, the bottom supervisor has the policy one_for_one, which means that when one of its processes dies, it’s considered independent of all the others and is restarted alone. The top supervisor, on the other hand, is one_for_all: if the dictionary terminates, or the child supervisor fails, it’ll take down and restart all of its children.

Where does this take us?

Go is a super neat language that has found an ever-growing number of programmers adopting it. Its approachable type system and concurrency semantics make it easy to be immediately productive. Unfortunately, as the language and its usage have evolved, these semantics have fallen short in real-world scenarios.

In my perfect world, we would adopt Erlang/OTP. Unfortunately, pragmatism suggests that this future is unlikely. Instead, as we develop Go 2.0, we should try to take lessons from languages that came before it. I intend to write two additional articles on what we can learn from Erlang’s error handling semantics, operability, and type system. If there is enough interest, I’ll turn them into experience reports.

Sharing by communicating is an awesome idea. I would dare to say it’s the right approach, but it’s hard to do without help. The things that would have helped me here are a type system that prevents mutable types from being passed as values over a channel, ensuring that my code is concurrency- and type-safe. Another nicety would be goroutine IDs and termination (preemption) semantics, so that a parent could preempt a set of goroutines rather than having to invent hacky synchronization points.