Actors and the Process Calculi: A Comparison
Elixir and Go have been gaining popularity in recent years. Both offer a breath of fresh air when it comes to writing concurrent programs, as they offer language-level support for concurrency, instead of relying on library support or low-level system calls. But the two languages are built on different paradigms, and people sometimes get confused by the differences, or wonder which is “better”. In reality, they are different models built to solve different problems. I will not be comparing the languages themselves; only using them as a vehicle to discuss the underlying models.
Elixir is just the venerable Erlang with a shiny new coat of ruby-red paint. Erlang was created by Joe Armstrong, Robert Virding, and Mike Williams in 1986 for the express purpose of enabling engineers to write distributed, fault-tolerant programs in the telecom space. Its underlying model is the Actor Model¹, invented by Carl Hewitt in 1973. The actor model is built entirely out of actors², which one can think of as nodes in a network, each carrying out their own computations. Actors all have a unique ID and, most importantly, share no data. There are just three primitive constructs for communication: actors can receive messages, initiating computation; they can send messages to other actors whose ID they know; and they can create new actors. That’s it. But as usual, less is more, and these three primitives are flexible enough to describe almost any kind of distributed system.
Go’s concurrency model is based on Communicating Sequential Processes, created by Tony Hoare in 1978. Hoare was looking for a way to model general concurrent systems and their interactions, and built off of earlier work by Robin Milner on the Calculus of Communicating Systems (CCS). Since that time, there have been many other variants of that core idea, creating a family of process calculi. The fundamental primitives in process calculi is the ability to create processes; create channels³; and send and receive messages on channels.
Note that in both cases, the computational model within actors or processes is irrelevant; it could be imperative (as in Go), functional (as in Erlang), or even object-oriented. We’re only concerned with the interactions between actors/processes.
The different design decisions in each model reflect the differing needs and constraints of concurrent and distributed systems.
In distributed computation, any node may die at any time. In particular we can’t guarantee that any message we send to another node is received. Even if we have health checks, the node may die between the last health check and the current send. Similarly, it is infeasible to wait for acknowledgement of messages. The other node may die, or the ack may be lost to the ether. So it would be unwise to build blocking sends into the core model. We can write blocking sends ourselves, but that is up to the programmer.
In a process calculi, there is no failure model. We assume that all processes are always healthy. This is a reasonable assumption when all processes take place inside a single machine. If one process dies, it is highly likely that the entire machine isn’t working. Since we assume processes are alive, we can use blocking sends. A blocking send from process P1 to P2 causes P1 to block until P2 receives the message. This enables easy synchronization between two processes.
Moreover, in concurrent systems, we can ensure that the message arrives within a bounded time. In the actor model, a message may take an unbounded amount of time to reach its destination (if it ever does). Anyone who has seen a latency spike in their application understands this is a good assumption to build into our model.
Another difference between process calculi and actors is the sending mechanism. With actors, we send the message directly to another actor, whereas processes send messages on channels, not actors.
Why is this distinction important? In an actor model, there could not possibly be channels, because they would break the “rules”. If we had some secondary construct that didn’t have to obey the actor laws (in particular, if we could create a primitive channel that would always receive messages), then we would no longer have a good model of distributed computers. And if we allow the channel to also lose messages, then in effect what we have is just another actor anyway. So we can just simulate a “channel” by creating a channel actor that listens for messages messages from one set of actors, and then forwards that message to a member of another set.
But channels are a nice abstraction if they are feasible, because they allow us to send without caring who receives a message. Process P1 can send on a channel on which both P2 and P3 are receiving a message. Whether P2 or P3 ends up receiving the message is non-deterministic, and P1 shouldn’t care. This is a feature, not a design flaw; non-determinism is the essence of concurrency (as opposed to parallelism). Symmetrically, a process P can listen on a set of channels C1, C2, … Cn at once; this is called a select statement, or choice. P will receive a message on whichever channel has a message first. Again, this is non-deterministic.
Both process calculi and actors have the ability to create new processes/actors. This enables a dynamic topology. This is essential in a distributed environment where nodes may fail and we may have to bring up replacements, or scale to meet increasing demands. In concurrency, it is useful for scaling to the number of cores, for example, or creating processes based on user-input or external events.
To sum it up in once sentence, the actor model is physically-based, whereas the process calculus is algebraic. Actors were built to model the real-world constraints of physical systems operating over large distances, based on the physical laws as we understand them (spooky action-at-a-distance not withstanding). Process calculi are built to allow nice algebraic laws and easy analysis, under the assumption that the model is sufficiently accurate on a single computer.
Note every applications wants or needs these abstractions, but I can’t wait to see more languages adopt either one as a library or — better yet — a core part of the language. Mostly, I think it’s just fun to write concurrent program. One day we will all break free from the constraints of Von Neumann and his serial machine!
1: What’s remarkable is that the creators of Erlang were unaware of the actor model. This might seem like lack of due diligence, but keep in mind the web didn’t even exist at that time, much less search engines! I think the independent discovery of essentially the same idea speaks volumes as to its significance. Great ideas are discovered, not invented.
2: Erlang/Elixir actually uses the term process instead of actor, but I will use “actor” to prevent confusion between the two different models
3: Channels are usually constrained to only transmit messages of a specified type, while actors are often dynamically-typed (untyped). This distinctions seems, as far as I can tell, immaterial, and in principle either model can admit dynamic or static-typing.