The State Of In-Application State: What No One Is Talking About
Application state is not discussed very often and can be taken for granted. First of all, nearly all applications have state, which is more or less just the current known information about a domain entity.
For example, a shopping cart that has had things added to it over time has a current state after each addition: the new item plus everything that came before. So state is really a function of what happens to an entity over time. We usually find that state in the database, but wherever it is kept, it most assuredly lives somewhere.
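To make "state is a function of what happens over time" concrete, here is a minimal sketch in plain Python. The event names (`added`, `removed`) and the tuple shape are my own assumptions for illustration, not anything prescribed by a framework.

```python
from functools import reduce

def apply_event(cart, event):
    """Return the new cart state after one event (hypothetical event shape)."""
    kind, item = event
    if kind == "added":
        return cart + [item]
    if kind == "removed":
        remaining = list(cart)
        if item in remaining:
            remaining.remove(item)
        return remaining
    return cart  # unknown events leave the state unchanged

def current_state(events):
    """Fold the full event history into the current state."""
    return reduce(apply_event, events, [])

history = [("added", "book"), ("added", "pen"), ("removed", "book"), ("added", "lamp")]
print(current_state(history))  # ['pen', 'lamp']
```

The current cart is never stored on its own here; it falls out of replaying everything that happened to it.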
I’m reminded of the good old days of monolithic development, when we had the ability to store in-memory state when needed. In the world of microservices, however, stateless design sometimes reigns supreme due to its perceived simplicity. But the lack of in-application state soon rears its ugly head in many cases.
Where The Wild Things Are
Why does this happen? The problem is contention for a single entity, which is functionally a singleton: an object that may occur only once in a system. Singletons are considered an anti-pattern for certain application services, but for finer-grained entity design a singleton is great and maps very nicely to real-life things, such as “User Sean” in our activity tracking system. Under popular technical designs, however, our entity is actually replicated across a database cluster. Since state lives only in the database, the data must be read before any decision or action can take place.
This is where things get really bad. In a load-balanced application environment, concurrent or tightly spaced requests to the same entity may be handled by different application instances. State therefore becomes a somewhat murky thing, and separate processes may end up duplicating work. See this ugly picture I drew as an example.
Another Fine Mess You’ve Gotten Me Into, Distributed Programming
Now, with the above design, let’s consider a single use case. Suppose we have an activity tracking system for users with wearable devices, mobile phones, and laptops, and we wish to issue a congratulatory notification the first time a user tracks activity on any device.
Going further, let’s suppose that the wearable device on the wrist and the mobile phone both transmit activity data at the very same time. In the image above, two load-balanced application nodes service the incoming activity, and both read the database before either has written (seeing there is no activity).
Therefore, each node deems it the first time and issues a congratulation. A third node reads after the other two have written, and therefore properly does not issue one, but it’s already too late. We now have a broken domain because two or more congratulations are issued, not only confusing the user but also making a mess of the system.
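The interleaving described above can be replayed deterministically. This is a toy simulation under my own assumptions: `Database` and `StatelessNode` are hypothetical stand-ins, and the read and write steps are split apart so the unlucky ordering can be forced by hand.

```python
class Database:
    """Stand-in for the shared database: recorded activity and sent notifications."""
    def __init__(self):
        self.activity = []
        self.notifications = []

class StatelessNode:
    """A stateless app node: it must read the database before deciding anything."""
    def __init__(self, name, db):
        self.name = name
        self.db = db
        self.saw_no_activity = None

    def read(self, user):
        # Step 1: check whether the user has any activity yet.
        self.saw_no_activity = not any(u == user for u, _ in self.db.activity)

    def write(self, user, device):
        # Step 2: act on what was read, which may already be stale.
        self.db.activity.append((user, device))
        if self.saw_no_activity:
            self.db.notifications.append((user, "congratulations", self.name))

db = Database()
n1, n2, n3 = (StatelessNode(name, db) for name in ("N1", "N2", "N3"))

# Both reads land before either write, so both nodes believe they are first.
n1.read("sean")
n2.read("sean")
n1.write("sean", "wearable")
n2.write("sean", "phone")

# The third node reads after the writes and correctly stays quiet.
n3.read("sean")
n3.write("sean", "laptop")

print(len(db.notifications))  # 2
```

Nothing in either node is wrong on its own; the bug lives entirely in the gap between the read and the write.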
And Down The Rabbit Hole We Go
When we have a stateless design, the best we can do is patchwork. In my travels, I’ve witnessed attempts to correct the above scenario by returning only distinct records from the database and living with the mess. I think I threw up in my mouth a little bit.
In that case, the only true fix was to push application logic into the database, in this case Apache Cassandra. This means that for every use case you allow nodes to duplicate work, but for each type of thing, such as our congratulation notification, there must be a keyed collection that allows only a single write.
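For a feel of what such a "single write per key" collection does, here is an in-memory sketch. `WriteOnceTable` and the key layout are hypothetical; in Cassandra itself, one way to get comparable behavior is a conditional insert (`INSERT ... IF NOT EXISTS`), though I am not claiming that is exactly what was used in the case above.

```python
import threading

class WriteOnceTable:
    """In-memory stand-in for a keyed collection that accepts one write per key."""
    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_absent(self, key, value):
        """Store value only if key is new; return True if this call won."""
        with self._lock:
            if key in self._rows:
                return False
            self._rows[key] = value
            return True

table = WriteOnceTable()
sent = []

def handle_first_activity(node, user):
    # Every node still does the work of deciding; only one write is accepted.
    if table.insert_if_absent((user, "first-activity-congrats"), node):
        sent.append((user, node))

threads = [
    threading.Thread(target=handle_first_activity, args=(node, "sean"))
    for node in ("N1", "N2", "N3")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(sent))  # 1
```

Note where the rule "only congratulate once" now lives: in the storage layer, one key scheme per use case, which is exactly the leak being complained about.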
Now we have leaked our business logic into our database, and trust me, it only grows more hideous from there. Say hello (again) to the world of triggers and stored procedures. Oh joy!
Salvation Lies In Understanding
Now that the problem is understood, I think it’s time we discuss the solution: Akka.
Akka is an actor-based, message-driven toolkit for the JVM, providing native supervision of your actor hierarchy, enabling it to automatically self-heal from failures, dropped messages, timeouts, etc. Akka contains 12 OSS and commercial modules for making distributed systems that just work. One of these modules is Akka Cluster, which provides distributed in-application state in a robust and mature fashion.
Akka, in general, treats the database as a system of record that may be replicated; however, at runtime there will only ever be a single instance of a particular entity in the cluster at any given time, providing ordered access to the entity, such as our active user recording activity.
This is accomplished by modeling entities as persistent actors (using the Akka Persistence module) that are distributed, or sharded, uniformly across the cluster. Together with Akka Cluster Sharding, this takes care of all the details of managing an entity or any other singleton: instantiating the user actor upon first use, looking up its reference in the cluster, passivating it in times of non-use to save memory, and handling its resilience by moving it to another healthy node when required.
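The lifecycle just described (route by id, create on first use, passivate when idle) can be sketched in a few lines. This is a toy model in plain Python, not Akka code: `ShardedRegistry`, `UserEntity`, and the CRC-based routing are all assumptions made for illustration.

```python
import zlib

class UserEntity:
    """Hypothetical in-memory entity for one user."""
    def __init__(self, entity_id):
        self.entity_id = entity_id
        self.activity_count = 0

class ShardedRegistry:
    """Toy stand-in for a sharded cluster: at most one live entity per id."""
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.live = {node: {} for node in self.nodes}

    def node_for(self, entity_id):
        # Stable hash, so every caller routes the same id to the same node.
        return self.nodes[zlib.crc32(entity_id.encode("utf-8")) % len(self.nodes)]

    def entity(self, entity_id):
        """Look up the entity, instantiating it on first use."""
        home = self.live[self.node_for(entity_id)]
        if entity_id not in home:
            home[entity_id] = UserEntity(entity_id)
        return home[entity_id]

    def passivate(self, entity_id):
        """Drop an idle entity from memory; it is recreated on next use."""
        self.live[self.node_for(entity_id)].pop(entity_id, None)

registry = ShardedRegistry(["N1", "N2", "N3"])
first = registry.entity("user-sean")
second = registry.entity("user-sean")
print(first is second)  # True
```

In this toy, passivation simply forgets the entity; a persistent actor would recover its state from the system of record when it is next needed.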
If a node containing a certain persistent actor goes down for any reason, the actor will be brought up elsewhere in the cluster, re-instantiated seamlessly. It is even possible to keep failover duplicates running if using Lightbend’s Enterprise Suite’s Split Brain Resolver feature for Akka Cluster.
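How can an actor come back "seamlessly" on another node? A common approach, and the one I am assuming for this sketch, is to replay the entity's persisted events on startup. `Journal` and `PersistentUser` are hypothetical names in plain Python, standing in for the system of record and the persistent actor.

```python
class Journal:
    """Stand-in for the system of record: an append-only event log per entity."""
    def __init__(self):
        self._events = {}

    def append(self, entity_id, event):
        self._events.setdefault(entity_id, []).append(event)

    def replay(self, entity_id):
        return list(self._events.get(entity_id, []))

class PersistentUser:
    """Entity whose in-memory state is rebuilt from the journal on creation."""
    def __init__(self, entity_id, journal):
        self.entity_id = entity_id
        self.journal = journal
        self.activity_count = 0
        self.congratulated = False
        for event in journal.replay(entity_id):
            self._apply(event)

    def _apply(self, event):
        if event == "activity-recorded":
            self.activity_count += 1
        elif event == "congratulated":
            self.congratulated = True

    def _persist(self, event):
        # Write to the journal first, then update in-memory state.
        self.journal.append(self.entity_id, event)
        self._apply(event)

    def record_activity(self):
        self._persist("activity-recorded")
        if not self.congratulated:
            self._persist("congratulated")

journal = Journal()
on_node_1 = PersistentUser("user-sean", journal)
on_node_1.record_activity()
on_node_1.record_activity()

# Node 1 goes down; the entity is re-created elsewhere from the same journal.
on_node_2 = PersistentUser("user-sean", journal)
print(on_node_2.activity_count, on_node_2.congratulated)  # 2 True
```

The new instance knows the congratulation already went out, so recovering on another node does not reopen the duplicate-notification problem.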
Let’s take a look at what the user wearable scenario would look like with a stateful Akka design.
In the Akka stateful example above, there is a load balancer as before, but here each of the balanced application nodes is contained in an Akka cluster. Also inside the Akka cluster are persistent, stateful actors, in this case the Sean User Actor, awaiting activity data.
This actor has yet to receive any data, and therefore the first activity received triggers the congratulatory message generation. The same situation occurs where 4 separate requests come in at virtually the same point in time, but since the actor is bounded by its own mailbox and handles one message at a time, Node 1 (N1) wins. Now N1 is able to generate the message and save it to the database.
The subsequent messages wait in the mailbox. They are processed in order, only after the first message is fully processed and the actor’s state has been updated to record that the congratulations message was sent, or something to that effect. The other 3 requests therefore generate no further congratulations. Now how about that!
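The mailbox behavior is easy to imitate with a queue and a single worker thread. To be clear, this is a plain-Python toy and not the Akka API: `UserActor`, `tell`, and `stop` are names I picked for the sketch.

```python
import queue
import threading

class UserActor:
    """Toy actor: one mailbox, one worker thread, one message at a time."""
    def __init__(self, name):
        self.name = name
        self.mailbox = queue.Queue()
        self.congratulated = False
        self.activity = []
        self.notifications = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def tell(self, message):
        """Enqueue a message; callers never touch the actor's state directly."""
        self.mailbox.put(message)

    def _run(self):
        while True:
            message = self.mailbox.get()
            if message is None:  # stop signal
                break
            self._handle(message)

    def _handle(self, message):
        # Only the worker thread runs this, so no locks are needed on state.
        self.activity.append(message)
        if not self.congratulated:
            self.notifications.append(f"Congratulations, {self.name}!")
            self.congratulated = True

    def stop(self):
        self.mailbox.put(None)
        self._worker.join()

actor = UserActor("Sean")

# Four requests arrive at virtually the same time via different nodes.
senders = [
    threading.Thread(target=actor.tell, args=(f"activity via N{i}",))
    for i in range(1, 5)
]
for s in senders:
    s.start()
for s in senders:
    s.join()
actor.stop()

print(len(actor.activity), actor.notifications)  # 4 ['Congratulations, Sean!']
```

Which request "wins" depends on arrival order, but exactly one congratulation is produced no matter how the four senders interleave, because the decision and the state update happen in one place, in sequence.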
What To Take Away From This
We have only touched upon how helpful Akka Persistence can be with regard to in-memory domain state, but the possibilities are endless. Since any single Akka node can hold millions of actors and a cluster billions, you can model almost anything in real time, and because of super fast cluster communication, these interactions can happen in microseconds.
The main message, I guess, is this: concentrate on solving your business problems and let Akka do the heavy lifting.