Implementing Finite State Machine as Python Language Native Construct

By Django Viewflow. Published in The Startup. 8 min read. Jan 29, 2021

The story about how I worked on a library of 1000 lines of code for 10 years.

In this article, I’m going to share my development experience with the Django-FSM project. I will describe the process of building a pythonic API, the pitfalls I discovered along the way, and the generic finite state machine library that came out of it.

I have never been connected to any other project for so long. When I started developing Django-FSM, I could not have expected that such a small piece of code would be developed and rewritten from scratch several times over the years. Each time, I discovered new features and possibilities and became a better software developer.

Introduction

A formal finite state machine (FSM) specification explains and constrains a program’s behavior. State machines support writing self-describing code. They help to detect logical errors and, if something goes wrong, provide exact exception messages.

To represent a finite state machine, we need to specify a set of states of our software and a set of transitions from one state to another. If there is no transition arc between the states, the state change is not possible.
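As a minimal sketch of that definition (the Status enum, the TRANSITIONS table, and can_change are illustrative names of my own, not part of Django-FSM), the states and the arcs between them can be written down as plain data:

```python
from enum import Enum


class Status(Enum):
    OPEN = "open"
    ASSIGNED = "assigned"
    CLOSED = "closed"


# Transition arcs: each state maps to the set of states reachable from it.
TRANSITIONS = {
    Status.OPEN: {Status.ASSIGNED, Status.CLOSED},
    Status.ASSIGNED: {Status.CLOSED},
    Status.CLOSED: set(),
}


def can_change(source: Status, target: Status) -> bool:
    """Return True only if an arc source -> target is defined."""
    return target in TRANSITIONS[source]
```

With this table, can_change(Status.CLOSED, Status.OPEN) is False simply because no such arc was declared.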

As with many things in computer science, there is some confusion in the definition and usage of the term FSM. There are two kinds of state machine API in common use; both are far from the academic view of an FSM.

The first is designed to handle a uniform stream of incoming events and is usually used to parse incoming data or handle user input. The state here is private and never directly changed by an external caller.

parser = ParserFSM()
for char in '2*(1+3)':
    parser.next_step(char)
parser.calc() == 8
StackOverflow
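To make the event-stream style concrete, here is a small self-contained sketch. IntegerRecognizer and recognize are hypothetical names chosen for illustration; this is a much simpler machine than an expression parser, but it has the same shape: a private state and a single next_step entry point.

```python
class IntegerRecognizer:
    """Accepts strings such as '42', '-7' or '+13', one character at a time."""

    def __init__(self):
        self._state = "start"  # private: callers only feed events

    def next_step(self, char: str) -> None:
        if self._state == "start" and char in "+-":
            self._state = "sign"
        elif self._state in ("start", "sign", "digits") and char.isdigit():
            self._state = "digits"
        else:
            self._state = "error"  # no arc for this input: stay in error forever

    def accepted(self) -> bool:
        return self._state == "digits"


def recognize(text: str) -> bool:
    """Feed every character of text to a fresh machine and report the verdict."""
    machine = IntegerRecognizer()
    for char in text:
        machine.next_step(char)
    return machine.accepted()
```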

The other approach commonly called an FSM is a restriction on the possible changes of a state variable’s value, with no generic input.

Stateless 3.0 — A State Machine library for .NET Core

Such state machines do not have a single next_step method. Each event has custom data and can produce different side effects. This approach is used in systems where you can bind a specific state machine method to a separate event handler or HTTP endpoint. It is very common in workflow automation software.

Django-FSM

Over ten years ago, I made the first 70-line commit to the Django-FSM repository. The goal of the project was to provide a non-homogeneous API and to be as pythonic as possible.

The general idea was to think of FSM as a way to write `pre-` and `post-` conditions structurally for a class method.

from django.db import models
from django.utils.translation import gettext_lazy as _
from django_fsm import FSMField, transition


class STATUS(models.TextChoices):
    OPEN = 'open', _('Open')
    ASSIGNED = 'assigned', _('Assigned')
    CLOSED = 'closed', _('Closed')
    DEFERRED = 'deferred', _('Deferred')


class Ticket(models.Model):
    text = models.TextField()
    status = FSMField(
        choices=STATUS.choices,
        default=STATUS.OPEN,
        protected=True
    )

    # Pre-condition: status is OPEN; post-condition: status is ASSIGNED.
    @transition(source=STATUS.OPEN, target=STATUS.ASSIGNED)
    def assign(self, user):
        self.assignee = user

Take a look at this Django model class. The states of the class are restricted by the STATUS enumeration. The @transition decorator on the Ticket.assign instance method specifies an allowed state change. The status field can only receive a limited set of values. If a Ticket instance has been assigned, there is no way to return it to the Open state unless you explicitly define such a transition method.

>>> ticket = Ticket(text='sample post')
>>> ticket.assign(request.user)
>>> ticket.status == 'assigned'
True
>>> ticket.assign(request.user)
TransitionNotAllowed: No transition from `assigned` state

You can think about finite state machine specification as a sort of Dynamic Typing. If you call the wrong method on the wrong object, you will get a runtime exception.
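The pre- and post-condition idea does not depend on Django. The following is a minimal pure-Python sketch of it, not the actual Django-FSM implementation: the transition decorator and the plain Ticket class here are simplified stand-ins written for illustration.

```python
import functools


class TransitionNotAllowed(Exception):
    pass


def transition(source, target):
    """Toy decorator: check the source state, run the method, set the target."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if self.status != source:  # pre-condition
                raise TransitionNotAllowed(
                    f"No transition from `{self.status}` state"
                )
            result = method(self, *args, **kwargs)
            self.status = target  # post-condition
            return result
        return wrapper
    return decorator


class Ticket:
    def __init__(self, text):
        self.text = text
        self.status = "open"
        self.assignee = None

    @transition(source="open", target="assigned")
    def assign(self, user):
        self.assignee = user
```

Calling assign on a ticket that is already assigned raises TransitionNotAllowed at runtime, which is the "dynamic typing" flavor described above.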

I chose Django models not only because model state constraints were something I wanted for my own working projects. Django already has a sophisticated metaclass construction protocol for class fields. This built-in functionality made it possible to hide the whole finite state machine logic inside the FSMField class.

So, the initial implementation of a finite state machine is based on Django metaclass construction, Python classes, and decorators. Can it be considered close to a native extension of Python and Django? To answer this question, it is necessary to examine how the class-based FSM interacts with other framework and language features.

Multiple state fields

We have a state field as an attribute of a model class. Could we have two state fields in the same class? This is the first problem. There is no connection between the state field and the transition definition. The initial implementation seems to ignore an important motto of the Zen of Python: explicit is better than implicit. Let’s extend the transition decorator arguments and link the field explicitly.

class MyModel(models.Model):
    parent_state = FSMField()
    sub_state = FSMField()

    @transition(field=parent_state, source='in_process')
    @transition(field=sub_state, source='start', target='end')
    def do_it(self):
        pass

The problem seems to be solved, doesn’t it?

Class inheritance

Class inheritance support opens up a bunch of new questions about FSM design. A common reason for FSM inheritance is to have some base flow and to extend and modify that flow in child classes.

We must carefully manage each field’s local list of transitions so that adding a new subclass does not change the transitions of its parent classes. However, this is not the last problem.

The local scope allows us to refer to the state field inside the same class’s method decorators. How can we access the field from a subclass declaration block?

class Base(models.Model):
    state = FSMField()

    class Meta:
        abstract = True


class Child(Base):
    @transition(field='state', source='start', target='end')
    def new_transition(self):
        pass

The Django model metaclass hides the real instance of the field under the private Base._meta.fields. Inside the Child class declaration, we can’t just refer to the field as Base.state.

Fortunately, Django has a solution for this. We can use the string name of the field as a reference and, on the class_prepared signal, go through the transition methods and link each of them to the real field instance.

Extend and override a transition

Can you spot a problem here?

class Base(models.Model):
    @transition(field='state', source='new', target='approve')
    def approve(self, user):
        self.approver = user


class Child(Base):
    # The whole transition has to be declared again, and the body copied.
    @transition(field='state', source='new', target='approve')
    def approve(self, user):
        self.approver = user
        self.approved_at = timezone.now()

When we use a method decorator, we lose access to the original function. We cannot override the method, wrap it in the transition decorator again, and at the same time call super().approve(user) inside it: that would perform the transition twice.

This seems to be a missing part of the Python language itself! Method decoration and super() calls in class inheritance don’t fit together very well.
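The double transition is easy to reproduce without Django. In this sketch the transition decorator is a toy of my own (it is not Django-FSM's), extended with a hypothetical performed list so that every state change leaves a trace:

```python
import functools


def transition(source, target):
    """Toy decorator: checks the source, runs the method, then sets the target."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if self.state != source:
                raise RuntimeError(f"No transition from `{self.state}` state")
            result = method(self, *args, **kwargs)
            self.state = target
            self.performed.append((source, target))  # record every state change
            return result
        return wrapper
    return decorator


class Base:
    def __init__(self):
        self.state = "new"
        self.performed = []
        self.approver = None

    @transition(source="new", target="approved")
    def approve(self, user):
        self.approver = user


class Child(Base):
    @transition(source="new", target="approved")
    def approve(self, user):
        # super().approve is the *decorated* parent method, so the parent's
        # transition runs here, and then this method's own wrapper runs it again.
        super().approve(user)
        self.approved = True
```

After a single Child().approve(...) call, performed contains two entries: one recorded by the parent's wrapper and one by the child's.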

Transition method chaining

Is it possible to call one transition method from another?

Unfortunately, not with the original Django-FSM implementation. The state is a pre-condition of the method, and the state change only happens when the method completes. This means that the state is still equal to new in the body of the send method, so no transition with source="sending" can be performed.

To solve the problem mentioned above, the state must be changed before the method executes. If the method terminates with an exception, the state must roll back to the original state.

class Email(models.Model):
    state = FSMField()

    @transition(field=state, source='new', target='sending')
    def send(self):
        try:
            self.send_mail()
        except Exception:
            self.failed()
            raise
        else:
            self.complete()

    @transition(field=state, source='sending', target='complete')
    def complete(self):
        pass

    @transition(field=state, source='sending', target='error')
    def failed(self):
        pass
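The "change first, roll back on failure" rule can be sketched in a few lines of plain Python. This is an illustration under my own assumptions, not the library code: the Email class is a simplified stand-in with a hypothetical fail flag, and the failed transition is left out to keep the rollback visible.

```python
import functools


class TransitionNotAllowed(Exception):
    pass


def transition(source, target):
    """Toy decorator: switch the state *before* the body, roll back on error."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            if self.state != source:
                raise TransitionNotAllowed(
                    f"No transition from `{self.state}` state"
                )
            self.state = target  # nested transitions now see the new state
            try:
                return method(self, *args, **kwargs)
            except Exception:
                self.state = source  # restore the original state
                raise
        return wrapper
    return decorator


class Email:
    def __init__(self, fail=False):
        self.state = "new"
        self.fail = fail

    def send_mail(self):
        if self.fail:
            raise ConnectionError("mail server is not reachable")

    @transition(source="new", target="sending")
    def send(self):
        self.send_mail()
        self.complete()  # allowed: the state is already "sending"

    @transition(source="sending", target="complete")
    def complete(self):
        pass
```

A successful send() ends in the complete state; a failing one propagates the exception and leaves the object in new.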

Interaction with Django ORM

If the state field is protected, it cannot be assigned directly.

class Ticket(models.Model):
    state = FSMField(protected=True)

>>> ticket.state = 'DONE'
AttributeError: Direct state modification is not allowed

However, every Django model has the refresh_from_db method, which makes an SQL query and updates the state of the model instance. Internally, Django performs setattr(self, name, value). That conflicts with FSMField, and there is no way to override this behavior for the field.

Many other unpleasant things happen on the border between Django and Django-FSM. Support for other field types, such as ForeignKey and IntegerField, has never been implemented well. There was an issue with deferred field values, which was resolved with additional code. Users repeatedly raised the question of whether Django-FSM should call the .save method itself. Moreover, the django-fsm-log package by Gizmag had to resort to some tricks to implement generic logging of state changes when the main model had not yet been saved to the database and had no pk available.
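The shape of this conflict can be shown without Django. In the sketch below, ProtectedState is a hypothetical descriptor that rejects assignment, and refresh is a stand-in for any generic reload routine that copies values with setattr; neither is Django or Django-FSM code.

```python
class ProtectedState:
    """Data descriptor that rejects direct assignment."""

    def __set_name__(self, owner, name):
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.storage_name, "new")

    def __set__(self, instance, value):
        raise AttributeError("Direct state modification is not allowed")


class Ticket:
    state = ProtectedState()


def refresh(instance, row):
    """Generic reload: copies every loaded column onto the instance."""
    for name, value in row.items():
        setattr(instance, name, value)  # also hits the protected descriptor
```

The reload routine has no way to know that state is special, so refresh(Ticket(), {"state": "done"}) fails with the same AttributeError as a direct assignment.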

New approach

It is possible to write hacks around the pitfalls identified above. However, can we avoid the pitfalls by design and keep the FSM library code small?

It is time to rethink the initial design critically. The interaction with Django model metaclass construction seems to be the most problematic part and should be removed.

Single responsibility principle

The “Fat Models” design can serve Django development very well for a while. However, if things get complicated, it’s always worth splitting the code.

Let’s avoid coupling to Django by moving the FSM definition into a separate class.

class TicketFlow(object):
    status = State(STATUS, default=STATUS.NEW)

    def __init__(self, ticket):
        self.ticket = ticket

    @status.transition(source=STATUS.NEW, target=STATUS.DONE)
    def assign(self):
        pass

Remember the “Explicit is better than implicit” principle mentioned earlier? It helped me arrive at an even better design. The transition decorator is now implemented as a method of the state field. Surprisingly, this eliminates almost all of the additional tricks around metaclass construction.

Since this is not a Django model, there is no need to use string references to access the field in a child class. Transitions can be explicitly defined with the same TicketFlow.status.transition() decorator.

Composition with Django Models

“Composition over inheritance” is probably the most influential trend in API design in recent years. So, let’s allow wrapping and using any other object as storage for the FSM state.

class TicketFlow(object):
    status = fsm.State(STATUS, default=STATUS.NEW)

    def __init__(self, ticket):
        self.ticket = ticket

    @status.setter()
    def _set_status(self, value):
        self.ticket.status = value

    @status.getter()
    def _get_status(self):
        return self.ticket.status

That gives seamless Django (and any other ORM) integration.
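To show why this needs no metaclass support, here is a minimal sketch of how such a state object could be put together. It is my own simplified illustration, not the viewflow.fsm source: this State class takes no arguments, and the Ticket class is a plain-Python stand-in for a model instance.

```python
import functools


class TransitionNotAllowed(Exception):
    pass


class State:
    """Hypothetical state object: storage is delegated to user callbacks."""

    def __init__(self):
        self._getter = None
        self._setter = None

    def getter(self):
        def register(func):
            self._getter = func
            return func
        return register

    def setter(self):
        def register(func):
            self._setter = func
            return func
        return register

    def transition(self, source, target):
        def decorator(method):
            @functools.wraps(method)
            def wrapper(flow, *args, **kwargs):
                current = self._getter(flow)  # read through the callback
                if current != source:
                    raise TransitionNotAllowed(
                        f"No transition from `{current}` state"
                    )
                result = method(flow, *args, **kwargs)
                self._setter(flow, target)  # write through the callback
                return result
            return wrapper
        return decorator


class Ticket:
    """Stand-in for any storage object, for example a model instance."""

    def __init__(self):
        self.status = "new"


class TicketFlow:
    status = State()

    def __init__(self, ticket):
        self.ticket = ticket

    @status.setter()
    def _set_status(self, value):
        self.ticket.status = value

    @status.getter()
    def _get_status(self):
        return self.ticket.status

    @status.transition(source="new", target="done")
    def close(self):
        pass
```

The flow class never stores the state itself; it only knows how to read and write it on the wrapped object.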

Transition chaining and class inheritance

Slightly advanced Python kung fu is needed to implement access to an undecorated but bound class method.

class TicketFlow(object):
    status = fsm.State(STATUS, default=STATUS.NEW)

    @status.transition(source=STATUS.DONE, target=STATUS.HIDDEN)
    def hide(self):
        print('base hide')


class SupportTicket(TicketFlow):
    @TicketFlow.status.super()
    def hide(self):
        # any additional code here
        super().hide.original()

The additional super() decorator has been implemented to avoid transition data duplication for an overridden method.
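One way to get an undecorated but bound method is to make the decorator return a descriptor instead of a plain function. The sketch below is an assumption about how this could be done, not the viewflow.fsm source; Transition, BoundTransition, and the inherit helper (playing the role of the super() decorator) are hypothetical names.

```python
class TransitionNotAllowed(Exception):
    pass


class BoundTransition:
    """What attribute access on an instance returns: a callable with .original."""

    def __init__(self, descriptor, instance):
        self._descriptor = descriptor
        self._instance = instance

    def original(self, *args, **kwargs):
        # Call the undecorated method body without touching the state.
        return self._descriptor.func(self._instance, *args, **kwargs)

    def __call__(self, *args, **kwargs):
        desc, obj = self._descriptor, self._instance
        if obj.state != desc.source:
            raise TransitionNotAllowed(f"No transition from `{obj.state}` state")
        result = desc.func(obj, *args, **kwargs)
        obj.state = desc.target
        return result


class Transition:
    """Non-data descriptor holding the function and its transition data."""

    def __init__(self, func, source, target):
        self.func, self.source, self.target = func, source, target

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class access exposes the transition data
        return BoundTransition(self, instance)


def transition(source, target):
    return lambda func: Transition(func, source, target)


def inherit(parent):
    """Reuse the parent's source and target for an overriding method."""
    return lambda func: Transition(func, parent.source, parent.target)


class TicketFlow:
    def __init__(self):
        self.state = "done"
        self.calls = []

    @transition(source="done", target="hidden")
    def hide(self):
        self.calls.append("base hide")


class SupportTicket(TicketFlow):
    @inherit(TicketFlow.hide)
    def hide(self):
        self.calls.append("child hide")
        super().hide.original()  # parent body only, no second transition
```

Here the state changes exactly once per hide() call, and the source and target are declared only in the base class.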

Easy logging and model.save()

The way to settle a debate about what a library should do is to make the behavior configurable. Logging and saving the model are actions performed after each transition completes, so the library should provide a hook that lets users implement them.

@stage.on_success()
def _get_report_stage(self, descriptor, source, target, **kwargs):
    with transaction.atomic():
        self.review.save()
        ReviewChangeLog.objects.create(
            review=self.review,
            source=source.value,
            target=target.value
        )
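The hook mechanism itself is small. As a rough sketch (my own simplified State class with a hypothetical hook signature, storing the state directly on the flow object rather than through getter and setter callbacks), registered callbacks are simply called after every successful transition:

```python
import functools


class State:
    """Hypothetical state object that notifies hooks after each transition."""

    def __init__(self):
        self._success_hooks = []

    def on_success(self):
        def register(func):
            self._success_hooks.append(func)
            return func
        return register

    def transition(self, source, target):
        def decorator(method):
            @functools.wraps(method)
            def wrapper(flow, *args, **kwargs):
                if flow.state != source:
                    raise ValueError(f"No transition from `{flow.state}` state")
                result = method(flow, *args, **kwargs)
                flow.state = target
                # Hooks run only when the method body finished without error.
                for hook in self._success_hooks:
                    hook(flow, source=source, target=target)
                return result
            return wrapper
        return decorator


class ReviewFlow:
    stage = State()

    def __init__(self):
        self.state = "new"
        self.log = []

    @stage.transition(source="new", target="approved")
    def approve(self):
        pass

    @stage.on_success()
    def _log_change(self, source, target):
        # A real flow could save the model and write a log row here.
        self.log.append((source, target))
```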

Conclusion

Blindly applying general software development principles can lead to complex and unmaintainable code. However, when a real problem arises, it is worth trying to solve it using known best practices.

I implemented a whole bunch of fresh ideas in the new package and solved many issues to make finite state machines feel like a native Python language construct. The new FSM should play well with various features of Python and other libraries.

Do I think it’s over, and that after 10 years I have finally managed to write a small, 1000-line library well? I hope not. Let’s see what the next years reveal and which new approaches to software design gain popularity.

The alpha release of the new FSM library is available as the viewflow.fsm package on PyPI:

pip install django-viewflow --pre

The documentation is available at https://docs-next.viewflow.io/fsm/index.html

Happy hacking!
