State- vs event-based web architectures
This article is the first in a series and covers a high-level motivation for event-based persistence and business logic.
Web development — current state
Most web development projects start with a minimal set of functional requirements, and a lot of pressure to build something that satisfies them. State-based MVC frameworks like Rails, Spring MVC, and ASP.NET MVC were developed to streamline this delivery process. Lean development methodologies stress getting working software into the real world as quickly as possible.
However, the shift in web development from data-driven web sites to behavior-driven web apps, and the additional requirements that this shift brings (see below for a breakdown), means that the near-ubiquitous state-based MVC approach is often a poor fit.
A typical initial development phase — let’s use a product catalog developed in Rails as an example — will focus on defining and persisting data models, building a presentation layer to display this data to the user, and writing business logic that determines how the data is updated. Let’s presume that your team has deployed a web application that minimally satisfies the MVP functional requirements — so far so good. I now want to unpack a typical trajectory as the MVP hits the real world.
Requirements from the real world
- You get a tricky bug in production, and realize you don’t have visibility when things go wrong. You add in logging, with a persistence mechanism for those logs, and a way to view them.
- In discussions with the design and product teams, you realize that you don’t have visibility on how users interact with the application, and how those interactions support the broader product goals. You add in an analytics library with event tracking to get the necessary insights.
- A support ticket gets opened for a piece of data — say a product name — that was changed but shouldn’t have been. You realize that you don’t have visibility on how the data got into that state. You add an audit log, with a persistence mechanism, and a way to view that history.
- When building out the reporting functionality, you realize that you need the ability to pull reports for historical data, but the application only stores the current state. You implement a data warehousing solution to persist a full data history, and a way to generate reports from this history.
- Support requests start coming in because certain UI elements don’t update when the underlying model changes. You implement a messaging system so that components can reload their state and caches can be flushed when data is changed.
- You hire a data scientist, and need to supply them with training data for a recommendation model. You realize that the data you are persisting is not rich enough to train an ML model. You add extra code to persist richer interaction data and make this available to the data scientist.
By this stage you have hit six key requirements that the state-based approach did not intrinsically support, and you have had to bolt on extra logic, persistence mechanisms, and presentation mechanisms to cover them.
Building on events
The key insight here is that real-world requirements for web applications are generally better accommodated by building on top of a high-quality event history, rather than on top of the persisted current state of the data. All six of the examples above are either events / event histories themselves, or can be directly derived from an event history. With an appropriate design the application state can always be derived from the event history (event sourcing is an example of this).
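To make the event-sourcing idea concrete, here is a minimal sketch in TypeScript. The event and entity names (`ProductCreated`, `ProductRenamed`, `replay`) are hypothetical, invented for this product-catalog example rather than taken from any particular framework. The point is that current state is just a pure fold over the event history — and the same history also directly answers the audit-log and historical-reporting requirements above.

```typescript
// Hypothetical domain events for the product catalog, as a discriminated union.
type ProductEvent =
  | { type: "ProductCreated"; id: string; name: string }
  | { type: "ProductRenamed"; id: string; name: string };

interface Product { id: string; name: string }

// Current state is derived by replaying (folding over) the event history.
function replay(events: ProductEvent[]): Map<string, Product> {
  const state = new Map<string, Product>();
  for (const e of events) {
    switch (e.type) {
      case "ProductCreated":
        state.set(e.id, { id: e.id, name: e.name });
        break;
      case "ProductRenamed":
        state.set(e.id, { ...state.get(e.id)!, name: e.name });
        break;
    }
  }
  return state;
}

const history: ProductEvent[] = [
  { type: "ProductCreated", id: "p1", name: "Widget" },
  { type: "ProductRenamed", id: "p1", name: "Deluxe Widget" },
];

// The current name is recoverable from the history — and so is the fact
// that it used to be "Widget", which is exactly what the audit-log and
// support scenarios above need.
console.log(replay(history).get("p1")?.name); // "Deluxe Widget"
```

Note that nothing here overwrites data: the rename is recorded as a new event, so the history answers both “what is the name now?” and “how did it get that way?”.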
Focusing on one high-quality event history (rather than a data model surrounded by multiple secondary event persistence mechanisms) results in less code, fewer things going wrong, and higher-quality data for all data consumers — end users, developers, support agents, marketers, data scientists, and so on.
If it’s better why isn’t everyone doing it?
At a pragmatic level, state-based MVC frameworks like Rails facilitate the rapid delivery of MVP products, and this makes them attractive at the inception stage of a project. From there teams can become ‘boiling frogs’ — incrementally adding complexity without assessing whether the fundamental approach is appropriate to solve the real-world problem.
In terms of current trends, reactive / event-based web programming — from the event loop in NodeJS through the reactive manifesto to frameworks like Spring WebFlux and Akka — is increasingly popular, but web applications built on these principles often have state-based persistence and business logic (this is a typical example). My point is that truly reactive web applications should have event-based persistence and business logic across all application layers. This means that events are stored, rather than state, and business logic maps input events to output events, rather than mapping commands to state changes. This can be seen as a natural result of combining domain events with functional reactive programming.
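The phrase “business logic maps input events to output events” can be sketched as a pure function. The event names and the fraud-check rule below are hypothetical, chosen only to illustrate the shape of such logic: no state is mutated, the function simply decides which new events follow from an incoming one.

```typescript
// Hypothetical input and output events for an order flow.
type InputEvent = { type: "OrderPlaced"; orderId: string; total: number };

type OutputEvent =
  | { type: "PaymentRequested"; orderId: string; amount: number }
  | { type: "FraudCheckRequested"; orderId: string };

// Business logic as a pure mapping from input events to output events:
// every placed order requests payment, and (as an illustrative rule)
// large orders additionally trigger a fraud check.
function decide(event: InputEvent): OutputEvent[] {
  const out: OutputEvent[] = [
    { type: "PaymentRequested", orderId: event.orderId, amount: event.total },
  ];
  if (event.total > 1000) {
    out.push({ type: "FraudCheckRequested", orderId: event.orderId });
  }
  return out;
}

console.log(decide({ type: "OrderPlaced", orderId: "o1", total: 1500 }).length); // 2
```

Because `decide` neither reads nor writes mutable state, it composes naturally with the event store: output events are appended to the same history that downstream consumers (projections, caches, analytics) react to.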
At a more philosophical level, developers still approach web applications from a command-and-control perspective — the application needs to be told what to do, and coded to do it. As applications become ever more complex, and machine learning plays a bigger role in these applications, an autonomous agent approach will become more appropriate — tell the application what is going on, and have it generate responses that optimize some measure of value. This has parallels with agile team dynamics — moving from telling teams what to do, to giving them appropriate information and allowing them to find their own solutions.
Making this concrete
In upcoming articles in this series I will cover implementing this approach in a web architecture.