Status Machina: Writing arguably better code when you have a field called status (or type)

Don Abrams
9 min read · Sep 6, 2016


TLDR; Any object with a status field is a state machine. By making state machines explicit in code you can eliminate bad states and certain classes of errors, all the while making programs easier for humans to understand. There are open questions around making state machines explicit in code, in both language design and frameworks. JS Promises are a dead end.

Complicated magic

Anywhere you have a field named status, you have a state machine. State machines are everywhere. But what are they? How do I know one when I code one? What’s NOT a state machine? Most importantly, when I code one, how do I make it explicit to future developers? What are the common mistakes and trade-offs when creating a state machine? Are there languages or libraries that make my code easier to write, change, and understand?

What’s a state machine?

For the purposes of this writing:

state machine : An object or process with a status. This status may change based on some input.

Here are some examples:

Bare bones state machine in Java
Bare bones state machine in JavaScript
Bare bones state machine in Haskell
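For reference, here is a minimal sketch of what a bare-bones version might look like in JavaScript. The Post class and its method names are my assumptions, chosen to match the publish() and unpublish() calls discussed in this post; treat it as an illustration rather than the definitive example.

```javascript
// Hypothetical Post class: the names are illustrative, not from a real library.
class Post {
  constructor() {
    this.status = "draft"; // every post starts out as a draft
  }

  publish() {
    this.status = "published";
  }

  unpublish() {
    this.status = "unpublished";
  }
}

const post = new Post();
post.publish();
console.log(post.status); // "published"

// Nothing in this class stops a draft from being unpublished.
const draft = new Post();
draft.unpublish();
console.log(draft.status); // "unpublished"
```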

Here’s another JavaScript example: a redux reducer. Redux uses a relatively explicit state machine pattern.

A redux state machine pattern
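A rough sketch of the reducer shape, assuming hypothetical PUBLISH and UNPUBLISH action types. It is plain JavaScript and does not need the redux package to run; a reducer is just a function from (state, action) to the next state.

```javascript
// Hypothetical action types; a real app would define its own.
const PUBLISH = "PUBLISH";
const UNPUBLISH = "UNPUBLISH";

const initialPostState = { status: "draft" };

// Takes the current state and an action, returns the next state.
// The incoming state object is never mutated.
function postReducer(state = initialPostState, action) {
  switch (action.type) {
    case PUBLISH:
      return { ...state, status: "published" };
    case UNPUBLISH:
      return { ...state, status: "unpublished" };
    default:
      return state; // unknown actions leave the state untouched
  }
}

let current = postReducer(undefined, { type: "INIT" }); // "INIT" is a made-up type
current = postReducer(current, { type: PUBLISH });
console.log(current.status); // "published"
```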

Transitions

A transition is a state machine’s reaction to input. Really, it means you changed the status, probably because someone clicked something or a clock ticked.

In all of these examples, calling publish() transitions the state to published and calling unpublish() transitions the state to unpublished. However, there’s a possible problem with this: it makes no sense to unpublish() a draft, yet nothing stops us from doing it. What’s the worst that can happen?

Only you can prevent forest fires, and bad states!

It’s almost always a good idea to be explicit about flow, and state machines are no exception. In every example above, every state could move to published or unpublished. If this was not desired, every program above has a bug. Did you see it and question it? If not, why not? Was there a way to make it more explicit?

State machine diagrams

If we had drawn this up for the post examples above, we would have seen the transition from draft to unpublished at a glance. Is this more explicit? How can this relative clarity be transferred to code, an inherently linear medium?

state machine diagram for examples above
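One way to carry a diagram into code is to write the arrows down as data and refuse anything that is not in the table. This is a sketch under one assumption of mine: that drafts should only be publishable, so the draft to unpublished arrow is deliberately left out. The names TRANSITIONS and transition are hypothetical.

```javascript
// Each key is a state, each nested key is an input, each value is the next state.
// Assumption: there is no draft -> unpublished arrow.
const TRANSITIONS = {
  draft: { publish: "published" },
  published: { unpublish: "unpublished" },
  unpublished: { publish: "published" },
};

// Own-property check, so inherited names like "toString" never count as arrows.
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Returns the next state, or throws if the table has no such arrow.
function transition(state, input) {
  const row = has(TRANSITIONS, state) ? TRANSITIONS[state] : {};
  if (!has(row, input)) {
    throw new Error(`No "${input}" transition from "${state}"`);
  }
  return row[input];
}

console.log(transition("draft", "publish")); // "published"
```

Calling transition("draft", "unpublish") throws instead of quietly producing a bad state, and the whole machine can be read in one place.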

In JavaScript, the most famous state machine is a promise. Promises have three states (pending, resolved/fulfilled, and rejected) and two transitions (resolve/fulfill() and reject()). Here’s a great illustration of the states and transitions of a JavaScript promise.

JS Promise state machine from https://blog.codecentric.de/en/2015/03/cancelable-async-operations-promises-javascript/

Every promise starts out Pending. Then it may either be fulfill()-ed or reject()-ed. Once Fulfilled or Rejected, it never transitions again. Notice the arrows for the transitions are one way.
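The one-way arrows can be observed with the standard Promise constructor. In this small sketch (the variable name settledOnce is mine), only the first call inside the executor has any effect; the later calls are silently ignored because the promise has already left Pending.

```javascript
const settledOnce = new Promise((resolve, reject) => {
  resolve("first"); // pending -> fulfilled
  reject(new Error("too late")); // ignored: the promise is already fulfilled
  resolve("second"); // ignored as well
});

settledOnce.then(
  (value) => console.log("fulfilled with", value), // logs: fulfilled with first
  (error) => console.log("rejected with", error.message) // never runs
);
```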

Acceptable and unacceptable states

The double circles represent “acceptable” states. The diagram above is wrong; since a promise may stay “Pending” forever, “Pending” is indeed an acceptable state and should be double circled. So, what is an unacceptable state?

Practically, an acceptable state needs no cleanup; the machine can just be terminated. Unacceptable states need cleanup via another transition; maybe a DB connection needs to be closed, a request needs to be cancelled, or an animation needs to complete. Not recognizing unacceptable states is a huge source of bugs. This is hard, since 95% of states in everyday code are acceptable. Rather than merely circling unacceptable states, it would be better if we emphasized them by triple circling with red sharpies, yellow caution tape, and helicopters with spotlights.
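To make that concrete, here is a sketch of a made-up connection machine in which each state is flagged as acceptable or not. The CONNECTION table and the cleanupPath helper are hypothetical; the helper does a breadth-first search for the shortest list of inputs that gets the machine to a state where it is safe to walk away.

```javascript
// Hypothetical machine. "open" is unacceptable: it holds a resource, so the
// machine has to go through "close" before it can simply be dropped.
const CONNECTION = {
  states: {
    idle: { acceptable: true, on: { open: "open" } },
    open: { acceptable: false, on: { close: "closed" } },
    closed: { acceptable: true, on: {} },
  },
};

// Returns the inputs needed to reach the nearest acceptable state from `state`,
// or null if no acceptable state can be reached.
function cleanupPath(machine, state) {
  const queue = [{ state, path: [] }];
  const seen = new Set([state]);
  while (queue.length > 0) {
    const { state: current, path } = queue.shift();
    if (machine.states[current].acceptable) return path;
    for (const [input, next] of Object.entries(machine.states[current].on)) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push({ state: next, path: [...path, input] });
      }
    }
  }
  return null;
}

const fromOpen = cleanupPath(CONNECTION, "open"); // ["close"]
const fromIdle = cleanupPath(CONNECTION, "idle"); // [] (nothing to clean up)
console.log(fromOpen, fromIdle);
```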

You know what happens when you assume…

There are certain assumptions underlying a state machine.

  1. An object is in one and only one state. An object is never “between” states; a state change is atomic.
  2. You know what states the machine can be in (read: I can enumerate what statuses the object can have).
  3. Methods and functions may change their behavior (or are not valid) based on what state the machine is in.

Sadly, these assumptions are often not true of “implicit” state machines.

It’s usually hard to know what states are valid and what states the object is in. Often an object will have several flags such as isPublished, isDraft, or isUnpublished and each method that sets those flags is expected to unset the other relevant ones. We make mistakes and don’t always update all the flags, especially as the number of states increases. This puts an object into a “bad state”. Also, there are unanswered design decisions: Can a post be both a draft and unpublished? Are they the same state? The design of states is important to understand and reason about a program and should be more explicit than a smattering of isDraft == true && isPublished == false.
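A quick sketch of why the flags hurt, assuming for the sake of the example that draft, published, and unpublished are three distinct, mutually exclusive states (which is exactly the design question raised above). Three booleans can be combined eight ways, but only three of those combinations mean anything.

```javascript
const flagNames = ["isDraft", "isPublished", "isUnpublished"];

// Enumerate every combination of the three flags: 2^3 = 8 of them.
const combos = [];
for (let bits = 0; bits < 2 ** flagNames.length; bits++) {
  const combo = {};
  flagNames.forEach((name, i) => {
    combo[name] = Boolean(bits & (1 << i));
  });
  combos.push(combo);
}

// Only combinations with exactly one flag set describe a real state.
const valid = combos.filter(
  (combo) => flagNames.filter((name) => combo[name]).length === 1
);
console.log(combos.length, valid.length); // 8 3

// A single status field cannot express the other five combinations at all.
const STATUSES = Object.freeze(["draft", "published", "unpublished"]);

function setStatus(post, status) {
  if (!STATUSES.includes(status)) {
    throw new Error(`Unknown status: ${status}`);
  }
  return { ...post, status };
}
```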

Checkpoint

OK, so far we should agree that:

  • While writing or reviewing code, thinking the word “status” should trigger an overwhelming response to yell STATUS MACHINA as loud as you can (causing your PM to look up at you, assume it’s some anime series, and go back to work).
  • Having a bunch of isX flags is likely a bad idea if and only if there are invalid combinations. There’s likely a state machine “hiding” in the code and invalid states in your future.

Implicit state machine problems

If having a maintainable code base is important to you, then the following is likely true.

  1. When a machine is in a certain state, I know what methods can be called on its associated object and what methods have different behavior depending on the state.
  2. I know where the code that transitions the state machine lives and how it’s triggered.

For an object with a given status (read: with a state machine in a given state), only certain actions and transitions are “allowed”. To really know whether a method or function only acts on certain states, you usually have to read that method. Especially if your state isn’t explicit (ex: isApproved && publishDate < now() && !isUnpublished), knowing WHEN a method does something can involve some real mental gymnastics. It’s probably best to simplify these conditionals to “does this object have a state machine with this state?”
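As a sketch of that simplification, the scattered flag checks can be folded into one function that names the state. The statusOf helper and the state names, "scheduled" in particular, are my assumptions; the only thing taken from the text above is that "published" means isApproved && publishDate < now && !isUnpublished.

```javascript
// Derives a single status from the implicit flags, in one place.
// `now` is a parameter so callers (and tests) can pin the clock.
function statusOf(post, now = new Date()) {
  if (post.isUnpublished) return "unpublished";
  if (!post.isApproved) return "draft";
  if (post.publishDate < now) return "published";
  return "scheduled"; // approved, but the publish date has not passed yet
}

const examplePost = {
  isApproved: true,
  isUnpublished: false,
  publishDate: new Date("2016-09-01T00:00:00Z"),
};

// Callers now ask one question instead of re-deriving the condition.
if (statusOf(examplePost, new Date("2016-09-06T00:00:00Z")) === "published") {
  console.log("show the post");
}
```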

After a method is executed on an object, does the status of the object change? Figuring that out is hard and requires inspection of the method implementation. I see this as a responsibility in addition to whatever else the method does. Likely, transitioning the state machine and doing whatever the method does at the same time is a violation of the single responsibility principle. Plus, if you have two things that happen when a post is published, which one has the responsibility of transitioning or setting the published state? In what order? Does order matter? Is there a hidden unacceptable state? These are all things we have to consider, preferably at design time, and should explicitly know when we read good code.

Where do we put methods that operate on state machines?

In the examples above, we put them on the object itself or in a method that returned a new object. What’s the likelihood of accidentally calling post.publish() on an already published post? We’ve all had a double submit bug. At best it’s idempotent, at worst you accidentally double your CDN expenses or halve your capacity.

Say you move those methods off the object and into a Publisher class or library. This avoids the previous problem. If the method returns a brand new object with a new state, you can even call this method “functional,” though I’d just call it immutable. But what did you lose? You no longer know what methods are acting on a Post. Every method that wants to publish a post needs to know what methods to call when it does so. Also, you no longer know if some other method is already transitioning the state: in a multi-threaded environment, two methods may both be setting the status, causing a race condition.
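A minimal sketch of the "immutable" flavor, assuming a hypothetical Publisher object with a single publish function. It never touches the post it is given, and publishing an already published post is a no-op, so a double submit does no extra work.

```javascript
// Hypothetical Publisher: returns a new post instead of mutating the old one.
const Publisher = {
  publish(post) {
    if (post.status === "published") {
      return post; // already published: hand back the same object
    }
    return Object.freeze({ ...post, status: "published" });
  },
};

const draftPost = Object.freeze({ title: "Status Machina", status: "draft" });
const publishedPost = Publisher.publish(draftPost);

console.log(draftPost.status); // "draft"
console.log(publishedPost.status); // "published"
console.log(Publisher.publish(publishedPost) === publishedPost); // true
```

Note what the sketch does not solve: nothing on the post tells you that Publisher exists, which is the trade-off described above.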

Additionally, when methods have the responsibility for transitioning a state, adding a new state is likely to break all the existing code. Look at any discussion on cancelable promises to see how having an implicit state machine made it nearly impossible to add a new cancel state/transition. ES2017’s async/await will have the same problems. If, instead of promises, JS had language support to define state machines, attach to transitions, and nest state machines, adding a cancel transition/state would be much easier (and progress events would be orthogonal and also simpler to implement).

What makes a state machine explicit in code

When a state machine is explicit in code the following is true:

  1. An object is in one and only one state
  2. I know what statuses the object can have (aka I know what states are in the object’s state machine)
  3. I know exactly which methods are allowed on what states
  4. I can assume an object is never “between” states; a state change is atomic
  5. I know what methods will always change a state
  6. I know what methods may change a state and on what basis

In real world systems, it’s useful for methods to “piggyback” on state changes; i.e. when a post is publish()-ed or unpublish()-ed, the cache is invalidated and timelines are updated. This is one of the useful parts of aspect-oriented programming (AOP) in Java (though it’s often abused because, without the proper IDE plugins, important side effects are often missed or hard to reason about). In JavaScript land, a Promise’s then() method is used to compose two state machines together, making success linear and failure handle-able at different breakpoints.
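Here is a sketch of piggybacking without AOP: a tiny machine that owns the status and notifies listeners after every transition. createMachine, onTransition, and send are hypothetical names of my own, and the cache invalidation is faked with an array so the example stays self-contained.

```javascript
// The machine is the only code allowed to change the state.
function createMachine(initial, transitions) {
  let state = initial;
  const listeners = [];
  return {
    get state() {
      return state;
    },
    onTransition(listener) {
      listeners.push(listener);
    },
    send(input) {
      const row = transitions[state] || {};
      if (!Object.prototype.hasOwnProperty.call(row, input)) {
        throw new Error(`No "${input}" transition from "${state}"`);
      }
      const from = state;
      state = row[input];
      // Side effects piggyback here, after the state has changed.
      listeners.forEach((listener) => listener({ from, to: state, input }));
      return state;
    },
  };
}

const postMachine = createMachine("draft", {
  draft: { publish: "published" },
  published: { unpublish: "unpublished" },
  unpublished: { publish: "published" },
});

const invalidations = []; // stand-in for a real cache
postMachine.onTransition(({ from, to }) => invalidations.push(`${from}->${to}`));

postMachine.send("publish");
console.log(postMachine.state); // "published"
console.log(invalidations.length); // 1
```

The side effects are registered in one visible place instead of being buried inside publish() and unpublish().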

Languages, frameworks, and libraries that support explicit state machines

I find it surprisingly difficult to define explicit state machines in most languages.

Erlang is a language that is pretty much designed around state machines. Each process is its own state machine, has its own heap, and has a mailbox. A “method” call is a message sent to another process (which may or may not send a response). Timeouts are built into the language! Erlang is designed to be as error tolerant as possible by designing for failure and minimizing bad states as a result. (Thanks for reminding me of this, Scott Messinger)

For an OOP language, Ruby has the best designed state machine library I’ve seen, aasm. It’s sadly not easy to port over to other OOP languages as it relies heavily on proxies to work (dynamically creating methods on the fly when users “call” them).

To get the same rigor without proxies, code in dynamic languages tends to pass status atoms (aka global constants) around and use large switch statements. I recently ran into a decent library named redux-machine that allows for explicit state machines in redux, a JavaScript framework. Redux itself is one big state machine, though without redux-machine it doesn’t have a way of ignoring actions without custom code in each action handler, and it pushes this responsibility off to a dispatcher layer (where functions leveraging redux-thunk and redux-saga are forced to derive the status or store it themselves).
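To show the idea of ignoring actions based on status, here is a hand-rolled sketch in plain JavaScript. It is not redux-machine’s API, just the general shape as I understand it: one small reducer per status, so an action that makes no sense in the current status falls through untouched. All names are hypothetical.

```javascript
// One reducer per status. Each one only knows the actions valid in that status.
const reducersByStatus = {
  draft: (state, action) =>
    action.type === "PUBLISH" ? { ...state, status: "published" } : state,
  published: (state, action) =>
    action.type === "UNPUBLISH" ? { ...state, status: "unpublished" } : state,
  unpublished: (state, action) =>
    action.type === "PUBLISH" ? { ...state, status: "published" } : state,
};

// Delegates to the reducer for the current status; anything else is ignored.
function postMachineReducer(state = { status: "draft" }, action) {
  const reducer = reducersByStatus[state.status];
  return reducer ? reducer(state, action) : state;
}

const aDraft = postMachineReducer(undefined, { type: "INIT" });
const stillDraft = postMachineReducer(aDraft, { type: "UNPUBLISH" });
console.log(stillDraft === aDraft); // true: UNPUBLISH is ignored for drafts
```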

Typed functional languages often use immutable data structures, union types, and pattern matching to make state machines “explicit,” though the level of explicitness is debatable, especially if pattern matching isn’t forced to be exhaustive.

Nesting state machines

Often, an object in one state has another state machine embedded in it. Consider a page loading. Maybe, before it’s loaded, a user clicks on another page. Was there a page loading state? Did the user action trigger a transition? Do we need to undo or redo a transition animation? Does that animation have its own state machine?

Properly subdividing nested state machines is all about doing whatever keeps it simplest for the next person. Assume that any method that operates on the object will need to understand EVERY possible state.

I should probably write some explicit examples of nested state machines, but I’ll leave that to a later post (if people request it).

Other things you should know

All regular expressions are state machines. See this article for more information. I didn’t bother mentioning this above, because taking characters as input one at a time isn’t part of the normal web programmer’s day.
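A small sketch of the correspondence, using a pattern I picked for illustration: /^ab*c$/ written out by hand as a table of states. The DFA object and the matches function are hypothetical names; the point is that each character is one input and each input is one transition.

```javascript
// Hand-written machine for /^ab*c$/: an "a", then any number of "b", then a "c".
const DFA = {
  start: "start",
  accepting: new Set(["done"]),
  transitions: {
    start: { a: "sawA" },
    sawA: { b: "sawA", c: "done" },
    done: {},
  },
};

// Feeds the input one character at a time; a missing arrow means no match.
function matches(dfa, input) {
  let state = dfa.start;
  for (const ch of input) {
    const row = dfa.transitions[state];
    if (!Object.prototype.hasOwnProperty.call(row, ch)) return false;
    state = row[ch];
  }
  return dfa.accepting.has(state);
}

console.log(matches(DFA, "abbbc")); // true
console.log(matches(DFA, "abca")); // false
```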

Questions

Are there better techniques for making state machines explicit than the approaches taken by Erlang, redux/redux-machine, or aasm? My gut says yes. Will it need language support and what would that look like? Oi, I don’t know. The closest thing I could find was STRIPS but I didn’t immediately see a way to transfer that to a more traditional programming language.

Nothing we do will make code with state machines provably correct (https://www.youtube.com/watch?v=dWdy_AngDp0 is a must-watch, though super heady). All this stuff is designed to decrease complexity for humans. Would succeeding at building an explicit state machine really decrease the complexity or is procedural thinking more “human” and inherently less complex? Is PHP a good language or is this a bad example?
