Being and Time
Exploring state in Elixir
State is the root of all evil…. While state in programming languages is undesirable, in the real world state abounds. (Joe Armstrong, in “Why OO Sucks”)
Yet, even when writing Elixir, we must interact with the real world.
Functional languages are functional. It’s a paradigm I’ve come to love, but when I first started surveying functional languages (Elixir, Erlang, Haskell, Clojure), I wondered how anyone got anything done in a language without state at its core.
What is “state”? It’s the contents of a data structure at a point in time. Conceptually, it’s simple.
But time moves on, and state wants to change. In object-oriented languages the relationship between state and the methods that manipulate that state can quickly become mysterious and complex. The result is code that is difficult to trust.
Functional languages like Elixir push the handling of state to the edges, rather than making it a central concern. It’s important, to be sure, but it becomes a necessary evil and gets special attention and handling. In Elixir, state becomes many moments in time, each replacing the last.
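To make "many moments in time, each replacing the last" concrete, here is a minimal sketch using a plain map. The field names (name, stack, cards) are my assumption about what a player might look like, not something taken from any particular codebase.

```elixir
# A player at one moment in time (illustrative field names).
player_t0 = %{name: "Alice", stack: 0, cards: []}

# "Updating" builds a new map; the earlier value is left untouched.
player_t1 = %{player_t0 | stack: 100}

IO.inspect(player_t0.stack) # => 0
IO.inspect(player_t1.stack) # => 100
```

Both values exist side by side. Nothing was changed in place; the program simply chooses which moment to carry forward.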
The purpose of this article is to explore two ways of handling state using the power that OTP brings to the Elixir ecosystem. First I’m going to demonstrate how we can maintain the state of a poker player using the Elixir GenServer module. Then, I’ll contrast that against doing the same using Elixir’s purpose-built Agent abstraction.
As the name makes explicit, a GenServer is “generic”: a common building block. GenServers can be used to maintain state, sure, but they can also execute long-running tasks, monitor other processes, and supervise other processes and restart them when they fail. The GenServer is one important pillar of the OTP architecture. Elixir provides an abstraction atop the Erlang implementation of gen_server with its own GenServer module.
There’s a GitHub repository with the code for a basic poker player GenServer here, if you want to see the complete implementation. In that code, there are two important sections: the public API calls and the internal server callbacks that manage the state of the GenPlayer implementation.
The take_seat/1 function is the entry point, called when GenPlayer starts. It takes the player’s name and triggers the init/1 callback, which creates the map that represents the poker player. The initial state is also set: the player has a name, zero chips, and no cards.
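The start-up path might look roughly like the sketch below. This is not the repository's code: the module name SeatSketch and the map keys are assumptions chosen to match the description (a name, zero chips, no cards).

```elixir
defmodule SeatSketch do
  use GenServer

  # Public API: start a player process for the given name.
  def take_seat(name) do
    GenServer.start_link(__MODULE__, name)
  end

  # Server callback: GenServer invokes this with the argument passed
  # to start_link/2, and the returned map becomes the initial state.
  @impl true
  def init(name) do
    {:ok, %{name: name, stack: 0, cards: []}}
  end
end

{:ok, player} = SeatSketch.take_seat("Alice")

# :sys.get_state/1 is a debugging aid, used here only to peek at the state.
IO.inspect(:sys.get_state(player))
```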
Looking at the code for the player, you’ll notice the name/1, stack/1, and cards/1 functions that give external callers information about the player. Those functions make synchronous calls to server-side handle_call functions, which return the requested values and pass the player’s state along unchanged.
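A hedged sketch of those read-only functions follows. The module name ReadSketch and the single guarded handle_call/3 clause are my own shorthand; the real code may well define one clause per function.

```elixir
defmodule ReadSketch do
  use GenServer

  def take_seat(name), do: GenServer.start_link(__MODULE__, name)

  # Public API: each reader is a synchronous call.
  def name(player), do: GenServer.call(player, :name)
  def stack(player), do: GenServer.call(player, :stack)
  def cards(player), do: GenServer.call(player, :cards)

  @impl true
  def init(name), do: {:ok, %{name: name, stack: 0, cards: []}}

  # Reply with the requested field. The third tuple element is the
  # state carried forward, which here is the same map we were given.
  @impl true
  def handle_call(key, _from, state) when key in [:name, :stack, :cards] do
    {:reply, Map.fetch!(state, key), state}
  end
end

{:ok, player} = ReadSketch.take_seat("Alice")
IO.inspect(ReadSketch.name(player))  # => "Alice"
IO.inspect(ReadSketch.stack(player)) # => 0
```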
More interesting is the deal_cards/3 function.
This function casts to a handle_cast/2 callback responsible for making sure the player gets her cards. Imagine you’re sitting at a poker table and the dealer is flipping a card to each player, not really caring that the player received the card. By the time the player has received the card, the dealer is on to the next player. This function accepts the cards and sets the player’s state to show that her hand now contains them.
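Here is a sketch of that fire-and-forget deal. I am assuming the three arguments of deal_cards/3 are the player and two cards, and representing cards as short strings; both are guesses for illustration.

```elixir
defmodule DealSketch do
  use GenServer

  def take_seat(name), do: GenServer.start_link(__MODULE__, name)

  # Asynchronous: returns :ok right away, like the dealer moving on.
  def deal_cards(player, card1, card2) do
    GenServer.cast(player, {:deal_cards, [card1, card2]})
  end

  def cards(player), do: GenServer.call(player, :cards)

  @impl true
  def init(name), do: {:ok, %{name: name, stack: 0, cards: []}}

  # No reply is sent; the new state holds the dealt cards.
  @impl true
  def handle_cast({:deal_cards, cards}, state) do
    {:noreply, %{state | cards: cards}}
  end

  @impl true
  def handle_call(:cards, _from, state), do: {:reply, state.cards, state}
end

{:ok, player} = DealSketch.take_seat("Alice")
:ok = DealSketch.deal_cards(player, "As", "Kd")

# Messages from one sender to one receiver are handled in order,
# so this call is processed after the cast above.
IO.inspect(DealSketch.cards(player)) # => ["As", "Kd"]
```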
Similarly, the buy_chips/2 and bet/2 functions update the state of the player’s stack by adding and subtracting chips, respectively. The bet/2 function adds a little sugar by returning the player’s chip amount after the bet is made — this may be useful to some AI routine making decisions about that player’s next action or to some other GenServer responsible for displaying the player’s new chip count on a GUI.
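The stack functions could be sketched like this. The message shapes are hypothetical, and the sketch skips any check that the player can afford the bet.

```elixir
defmodule StackSketch do
  use GenServer

  def take_seat(name), do: GenServer.start_link(__MODULE__, name)

  # Buying chips needs no answer, so a cast is enough.
  def buy_chips(player, amount), do: GenServer.cast(player, {:buy_chips, amount})

  # Betting replies with the chips left after the bet.
  def bet(player, amount), do: GenServer.call(player, {:bet, amount})

  @impl true
  def init(name), do: {:ok, %{name: name, stack: 0, cards: []}}

  @impl true
  def handle_cast({:buy_chips, amount}, state) do
    {:noreply, %{state | stack: state.stack + amount}}
  end

  @impl true
  def handle_call({:bet, amount}, _from, state) do
    new_stack = state.stack - amount
    {:reply, new_stack, %{state | stack: new_stack}}
  end
end

{:ok, player} = StackSketch.take_seat("Alice")
:ok = StackSketch.buy_chips(player, 100)
remaining = StackSketch.bet(player, 25)
IO.inspect(remaining) # => 75
```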
The core principle to remember is that the GenServer holds the player’s state at all times, and all changes to that state go through a clearly defined public API which has an explicit set of server callbacks. And when a function is executed, the player gets an entirely new state based on the prior state. The data structure representing the player is never mutated in-place. Time moves on, and the state moves with it.
Take a moment and look over the lib/gen_player.ex code. There’s a lot of formal API/callback stuff going on. But try looking at the underlying GenServer module implementation sometime, to get an idea of how the Elixir team has abstracted away a lot of the underlying Erlang complexity. For this, we should be grateful. But they’ve given us even more….
While GenServers are generic, Agents are quite specific. Their job is to hold state, and make it available to other processes. The AgentPlayer code that represents our poker player has the exact same public API as the GenPlayer implementation. The test suites differ only in that one uses GenPlayer and the other uses AgentPlayer.
The take_seat/1 function doesn’t have to call back into an init/1 function; it just accepts the call and sets the player’s state. Similarly, the other functions like deal_cards/3 and bet/2 set the state without having to implement explicit callbacks.
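For comparison, an Agent-based player might be sketched as follows. As before, the module name and the state shape are assumptions, not the actual AgentPlayer source.

```elixir
defmodule AgentSketch do
  # No init/1 callback: the anonymous function's return value
  # becomes the initial state.
  def take_seat(name) do
    Agent.start_link(fn -> %{name: name, stack: 0, cards: []} end)
  end

  def name(player), do: Agent.get(player, fn state -> state.name end)
  def cards(player), do: Agent.get(player, fn state -> state.cards end)

  # The state change is passed in as a function rather than
  # written as a separate server callback.
  def deal_cards(player, card1, card2) do
    Agent.cast(player, fn state -> %{state | cards: [card1, card2]} end)
  end
end

{:ok, player} = AgentSketch.take_seat("Alice")
:ok = AgentSketch.deal_cards(player, "As", "Kd")
IO.inspect(AgentSketch.cards(player)) # => ["As", "Kd"]
```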
Under the Agent abstraction, the Elixir team has hidden the complexity just as they hid the complexity of Erlang under the Elixir GenServer module.
The total number of lines of code in the Agent version is about half of the GenServer version. Reducing the number of lines of code, in itself, is not the goal though. The goal is to reduce the cognitive load on the programmer and anyone else who comes along and has to reason about the code. Reducing the amount of code helps significantly by reducing the logic that needs to be understood.
If you look closely, you’ll see GenPlayer and AgentPlayer aren’t precisely equivalent. An earlier version of AgentPlayer.update_cards used Agent.update. One of the EMPEX organizers, Cameron Price, suggested that code may not be doing what I thought it was doing. Sure enough, looking at the underlying code, update wraps GenServer.call. I really wanted a GenServer.cast.
This may not be a big deal in this example, but imagine simulating millions of poker games with tens of millions of hands dealt. Using call, rather than cast, will be a bottleneck. A switch from Agent.update to Agent.cast took care of that issue.
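The difference is easy to observe with a deliberately slow state change. In this sketch the 200 ms sleep is artificial, there only to make the blocking visible; the measured numbers will vary by machine.

```elixir
{:ok, agent} = Agent.start_link(fn -> [] end)

# A slow "deal" that ignores the old hand and returns a new one.
slow_deal = fn _old_cards ->
  Process.sleep(200)
  ["As", "Kd"]
end

# Agent.update waits until the agent has run the function.
{update_us, :ok} = :timer.tc(fn -> Agent.update(agent, slow_deal) end)

# Agent.cast returns as soon as the message is sent.
{cast_us, :ok} = :timer.tc(fn -> Agent.cast(agent, slow_deal) end)

IO.puts("update took about #{div(update_us, 1000)} ms")
IO.puts("cast took about #{div(cast_us, 1000)} ms")

# The cast still takes effect; this get queues up behind it.
cards = Agent.get(agent, fn state -> state end)
```

The caller of update pays for the whole state change, while the caller of cast pays only for sending a message.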
There’s a more interesting difference between GenPlayer and AgentPlayer: the function setting the player’s stack size based on bets made and chips bought. Both types of player have a function called adjust_stack/2. For GenPlayer, the decision on whether to use call or cast is made by the server callback. In the AgentPlayer, there is no place to make that choice — the adjust_stack function calls Agent.get_and_update, which beneath the abstraction uses GenServer.call.
Again, maybe that isn’t a big deal in this example, but it does cause AgentPlayer to block where GenPlayer does not. The solution to this is to create a separate function to handle the buying of chips with a cast, if we really want to replicate the GenPlayer behavior. Since this is instructive, I left the synchronous get_and_update as-is.
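If we did want to replicate the GenPlayer behavior, the split could look something like this sketch. The function bodies are my own guess at a reasonable shape, not code from the repository.

```elixir
defmodule SplitStackSketch do
  def take_seat(name) do
    Agent.start_link(fn -> %{name: name, stack: 0, cards: []} end)
  end

  # Asynchronous: the caller does not wait for the chips to land.
  def buy_chips(player, amount) do
    Agent.cast(player, fn state -> %{state | stack: state.stack + amount} end)
  end

  # Synchronous: the caller wants the stack size after the bet.
  def bet(player, amount) do
    Agent.get_and_update(player, fn state ->
      new_stack = state.stack - amount
      # First element is the reply, second is the new state.
      {new_stack, %{state | stack: new_stack}}
    end)
  end
end

{:ok, player} = SplitStackSketch.take_seat("Alice")
:ok = SplitStackSketch.buy_chips(player, 100)
remaining = SplitStackSketch.bet(player, 40)
IO.inspect(remaining) # => 60
```

The cost is the one mentioned above: the stack arithmetic now lives in two places instead of one shared function.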
I chose Agent.update because I wanted to update the state of AgentPlayer — it sounded like the right choice. And I used Agent.get_and_update for both purchasing chips and betting because I’m conditioned to avoid logic repetition. This is a great example of the importance of understanding the underlying implementation of abstractions and, in some cases, the tradeoffs that come with using those abstractions.
For a thorough tutorial on how to integrate an Agent into a larger OTP application, you should follow the “Mix and OTP” user guides. You start with a simple GenServer and, over the course of several lessons, build up to a fully supervised umbrella application using Agents to resiliently store data into named lists of key/value pairs.
If you need a bucket in which to hold the state of some entity in your system and have it be accessible from many different processes, consider using the Agent module. There are other differences between Agent and GenServer that make the latter more useful, and we will explore those in future articles diving into the GenServer module and other OTP topics.
A big thanks to Cameron Price, David Antaramian, and Desmond Bowe for feedback on the drafts of this article, and especially for the pointers and suggestions that forced me to look at the underlying implementations of the GenServer and Agent modules.