Aggregates and Sagas are Processes

The 3rd stop on my fact-driven journey

  1. Business Rules
  2. (Domain) Commands
  3. Aggregates and Sagas are Processes
  4. The Distributed Execution of Processes

Going to ExploreDDD Denver 2018 and presenting my thoughts about events, commands and long-running services has really been a cool personal success for me. :-) After my talk literally nothing (nil, zero) happened! Neither was I asked a single question, nor did I get a single mention on Twitter! :-) Nick (you are an awesome combination of Martin Fowler and Eminem) made me smile, though … because he caught me with a worried and bad face and observed - I was going on the stage to the music of Michael Jackson: Bad!

Michael Jackson in Vienna, 1988 (Zoran Veselinovic — CC BY-SA 2.0)

Now I could of course think such obvious silence surrounding me is a really, really bad thing. But I believe it mostly just meant that I caused quite a few people to think with me, and digest. While I cleared my lectern, people slowly started to come forward and ask very good questions, e.g. whether this very last consideration of mine was “command sourcing”. But no, it is not! And then later, step by step … quite a bit of positive feedback popped up on that really cool conference feedback wall. On a personal level I was approached by many more people - not only after my talk, but in particular also all through the second day. Many said something along the following lines: “Martin, your talk keeps my brain spinning and exploring several of my assumptions. Thank you so much!” Furthermore I got quite a bit of nuanced feedback which allowed me to understand “my problem” even better and make progress. And then one person said: “Martin, be careful, you sit on a powder keg … could explode at any time!” I unfortunately do not remember the exact metaphor anymore, I believe it wasn’t a powder keg, but sounded way cooler! :-) Anyway, I’m not afraid… such an explosion won’t kill me, after all. I loved all of your statements and inputs and thoughts, so let me just say: thank you! You are awesome! And all of you contributed to my happiness!

Now I am sitting in an Austrian Airlines plane from Chicago to Vienna … upgraded for a very small amount of money to a “premium economy” class, which is really awesome, too :-) … and started to write another blog post. While being shot back to a quite narrowly bounded Alps inhabitant’s context at a speed of more than 1100 km/h currently :-o … I will try to express: aggregates are not only “working class”, processing commands and delivering outcomes, but can be modeled more like responsible process owners. A domain model can be compared to a set of interrelated processes, something I often hear Mathias Verraes saying. And because sagas are in my mind processes, too, this brings me to the conclusion that aggregates, sagas and long-running business processes are identical for all practical purposes, just sometimes leaning more to the working side of an equation, sometimes more to the managing side. This in turn could lead to a simplified programming paradigm … and maybe even to a sort of fact-driven architectural pattern that could e.g. turn out to be very well suited for a world of interacting serverless functions going to sleep after every single invocation.

The command handling unit of consistency

This is how Alberto Brandolini’s EventStorming sticky notes depict what happens to an aggregate. I think it represents very well how we developers think about a domain method invocation and it translates particularly well to the idea of an event-sourced aggregate. Because this is how such a method could in principle look, e.g. in Java:

public List<Event> process(ClearDishWasherCommand command) {
    // Check business rules and invariants here, then report what actually happened.
    return List.of(new DishWasherClearedEvent(...));
}

I associate the method parameter with a command object representing intent and being fed into a unit of immediate consistency known as aggregate. This allows me to safely check for business rules and constraints applicable to the operation, observe true local invariants of the aggregate and make some business decisions. I return a list of event objects representing the relevant change that actually happened as a result of the command execution and that is now meant to be stored in an event store.
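To make the “unit of immediate consistency” a bit more tangible, here is a minimal sketch of how such an event-sourced aggregate could be rebuilt from its history before it decides. It assumes Java 17 or newer. The `dishWasherId` field, the `replay` and `apply` methods and the “nothing to clear yet” rule are assumptions made for illustration only, they are not taken from any particular framework.

```java
import java.util.List;

// Hypothetical message types; the single field is an assumption of this sketch.
interface Event {}
record ClearDishWasherCommand(String dishWasherId) {}
record DishWasherFinishedEvent(String dishWasherId) implements Event {}
record DishWasherClearedEvent(String dishWasherId) implements Event {}

// An event-sourced aggregate: its state is derived only from past events.
class DishWasher {
    private boolean finished = false; // true while clean dishes wait inside

    // Rebuild the current state by replaying the stored event history.
    static DishWasher replay(List<Event> history) {
        DishWasher dishWasher = new DishWasher();
        history.forEach(dishWasher::apply);
        return dishWasher;
    }

    // State transitions live here, not in the command handler.
    void apply(Event event) {
        if (event instanceof DishWasherFinishedEvent) {
            finished = true;
        } else if (event instanceof DishWasherClearedEvent) {
            finished = false;
        }
    }

    // Decide: check the local rule, then report what happened as events.
    List<Event> process(ClearDishWasherCommand command) {
        if (!finished) {
            throw new IllegalStateException("Nothing to clear yet");
        }
        return List.of(new DishWasherClearedEvent(command.dishWasherId()));
    }
}
```

In this sketch the command handler only decides, while `apply` is the only place that changes state. The returned events would be appended to the event store and applied afterwards.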

“My problem”? Intent.

My problem is that a command could trigger a long-running execution of further tasks, in other words it could trigger further commands which are then asynchronously delegated to be carried out by somebody else, other aggregates. Likewise, the outcome of listening to events can be further events, which is something that is easier to accept, because depending on your specific paradigms and environments, it already may happen in sagas or processes. I will do a detailed analysis of this problem and explain a possible solution in the rest of this post.

In other words and illustrated with stickies: the orange sticky here saying “Outcome” could alternatively also be depicted as a blue one: the outcome would then be the intent that another party should do something, that we delegate some work. The difference between wanting something to happen in the future and having achieved something in the past is a fundamental difference when building software. And of course, this is not just a problem of colors, but has real consequences for how we think about domain models and for the way we design and build.

Events do not carry intent — of course not!

Events do not carry intent. It is not only suggested by the colors of sticky notes (blue intent is different from orange outcome, hence orange is not intent, right?), it is suggested by many, many explanations out there about the nature of commands as representing “intent to change” on one hand and domain events as representing (caused) domain relevant change on the other hand. On top, this thinking is also supported by the fundamental ideas of event-driven architectures and the nature of pub/sub protocols, as often used to inform about events. We associate this kind of information exchange with a publisher who just informs, but does not care about the existence or number of subscribers at all. In other words: the publisher does not associate intent with such events. This thinking is also hardened a bit by the ideas of event sourcing, because aggregates almost always model some real life aspect going through a lifecycle of events, so the only thing we need to reconstruct the current real life “state of affairs” are all events in the sense of already “achieved” outcomes. Last but not least: we label domain events in the past tense. With past tense our natural language describes things that happened in the real world surrounding us, but it doesn’t describe so well that our brain came to the conclusion to want something. Because even though that particular fact truly happened in the past, too, we then typically use imperative mood to transport our intent to the outside world, in particular when we want it to be carried out by somebody else (only exception probably being when my wife notes that “the dishwasher finished”). So again: by labeling a domain event in the past tense, we suggest that an event does not carry intent, but should “only” describe a result of intent execution.

I am absolutely fine with all that.

I observed however that this notion is actually not in line with Greg Young’s definition of domain events as being “something that has happened in the past”. Because? When I say “stand up!”, and you don’t, even though my intent was not carried out by you, it was expressed by me! And that is “something that has happened in the past”. Is this just nitpicking? I don’t believe so.

Why is intent a problem and why is it relevant?

As I said above, the problem is that the outcome of executing an intent can be further intent. For example, when my wife uses a more explicit language and asks me “Could you please clear out the dishwasher!”, I could be tempted to turn to one of my (lovely!) daughters and ask her: “Dear, could you please help me to sort the cutlery parts into their box!” In other words, I would now decide to delegate some of the work to somebody else.

To delegate to other components is of course something that we use in programming all the time. We have a component which is supposed to do something, but, maybe just under certain conditions, it could decide to delegate some work to some other component. The simplest case could be a class which is delegating to another one or a function which uses another one. In DDD, however, we currently do not have a clear-cut, explicit notion of asynchronous delegation to other aggregates, and that will remain so as long as we say that a domain event, in the sense of being something without intent, is the only outcome of processing a command.

Because I, being an aggregate of cells of my own, can now not directly delegate to my daughter aggregate, even though I want to, but I’m required to say: “I cleared the plates and cups!” This would of course be inconvenient for my wife, because her original request to clear out the dishwasher is not yet fully executed. Eventually she would therefore decide to turn to our daughter and ask her to “please sort the cutlery parts into their box!” In software terms, this would probably be an overemphasis of an orchestration paradigm, because my wife now is actually more in a role of a saga or process manager reacting to my domain event and telling another aggregate what to do next in order to achieve her original intent. But she would really prefer that I responsibly take care of that part of the operation, even if that means that I outsource part of the work to my daughter. In fact she’d appreciate it even more, if I supported a choreography of events by autonomously reacting to a “dishwasher finished” beep, because then she would not even have to issue any command anymore. My daughter on the other hand prefers to listen to a well-defined command only instead of supporting a choreography of events by directly listening to my “plates and cups cleared” event. While that would be a possibility, I actually also prefer that we two are in a sort of customer and supplier relationship: I specify for her how exactly she can help me and she carries it out nicely. When she tells me she “sorted the cutlery parts into their box”, I will close the overall “case” I am really responsible for myself. If I don’t hear the event, I’d eventually need to follow up after some time. Fair enough for me: I orchestrate that part of all tasks and the overall workflow! Because otherwise my daughter would need to listen to all sorts of (grumpy?) events I emit and we would eventually end up in a mess of implicit communication and all sorts of hidden expectations.

And I say no to that! :-) Being an Austrian, I have read my Paul Watzlawick. Among many other smart findings, the Austrian-American family therapist, communication theorist, psychologist and philosopher suggested making a distinction between symmetrical and complementary interchange. According to him, symmetrical interchange is an interaction based on equal power between communicators, whereas complementary interchange is an interaction based on differences in power.

At some point as a developer I increasingly felt a power asymmetry between saga or process managers and aggregates - the latter often being understood as workers only obeying commands. I also more and more sensed an asymmetry between the ability to communicate domain events on one hand, but the inability to communicate intent, hence something like a domain command, on the other hand, and I started to look for solutions. In parallel I also started to think about how one could use event sourcing as a persistence mechanism for sagas and did not find much that was really useful to me.

Our little family “as is”

Here is how our family currently works: when the dishwasher is finished, my wife Kathi listens to that event and dispatches two commands: “Clear plates and cups” is directed to me, “Sort cutlery” is directed to my daughter Fabia. When both commands are executed and the corresponding events are raised, Kathi is in a position to conclude that the dishwasher is cleared. One can clearly see that Kathi is in the position of a saga or process manager here. And I can tell you she is not really happy about it at all. Now, it is often said that while aggregates process commands and raise events, for sagas and long-running business processes it is (mainly) the other way around. One can see this pattern easily when looking at the picture.

“Factsourcing” sagas and processes?

But even though that is pretty clear, to the best of my knowledge nobody has suggested so far to use event sourcing for sagas and processes “the other way around”: by processing incoming events (or other facts), we come to the conclusion that we want to issue e.g. commands. I happily stand corrected if this has been suggested before, so that I can give more proper credit here. This is how the code for the “event”-sourced saga method processing an incoming event might look:

public List<Fact> process(DishWasherFinishedEvent event) {
    return List.of(
        new DishWasherClearanceStartedEvent(...),
        new SortCutleryCommand(...),
        new ClearPlatesAndCupsCommand(...)
    );
}

The persistent long-running saga or process, which is reconstructed in memory just like every other event-sourced aggregate, processes incoming events and stores its own private state information as events. But it may also return commands which need to be processed as next steps. I presented the idea at ExploreDDD 2018 and simply suggest considering it and investigating it further, ideally in a collaborative manner. More experiments need to be done; the very first ones seem promising. This is not really astonishing to me, because technically a long-running saga or process treats both events and commands in exactly the same way: they are really just semantically very different. The underlying event store (or should we better call it fact store when using such an architectural pattern?) does not change at all either. But while the event represents the “real life” historical fact that a new dishwasher clearance process was accepted, the two commands represent the historical fact that the saga or process created intent “in the past”.
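As a rough illustration of “events and commands treated in exactly the same way”, the following sketch gives both a common supertype and appends them to one stream. It assumes Java 17 or newer. The marker interfaces `Event` and `Command`, the in-memory `FactStore` and the `dishWasherId` field are all hypothetical and only serve the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker types: a Fact is anything the process records about itself.
interface Fact {}
interface Event extends Fact {}   // something was achieved
interface Command extends Fact {} // intent was created

record DishWasherFinishedEvent(String dishWasherId) implements Event {}
record DishWasherClearanceStartedEvent(String dishWasherId) implements Event {}
record SortCutleryCommand(String dishWasherId) implements Command {}
record ClearPlatesAndCupsCommand(String dishWasherId) implements Command {}

// A minimal in-memory "fact store": one append-only stream, no special cases.
class FactStore {
    private final List<Fact> stream = new ArrayList<>();

    void append(List<Fact> facts) {
        stream.addAll(facts);
    }

    List<Fact> read() {
        return List.copyOf(stream);
    }
}

// The saga decides on an incoming event and returns events and commands alike.
class DishWasherClearanceSaga {
    List<Fact> process(DishWasherFinishedEvent event) {
        String id = event.dishWasherId();
        return List.of(
            new DishWasherClearanceStartedEvent(id),
            new SortCutleryCommand(id),
            new ClearPlatesAndCupsCommand(id));
    }
}
```

The store in this sketch never inspects whether a fact is an event or a command. Only the code that later reads the stream needs to care about the semantic difference.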

I kind of sense that this approach could open up a universe of further possibilities, a few of which I started to list in the overview about my fact-driven journey. Just a very first one is that the command, just as every normal “event”, can now be routed to the appropriate command handler and will be processed there. This could happen of course as a side effect of storing the historical fact that intent was created in the fact store.

Note however that it could also happen in a more classical way. Domain commands do not necessarily “have to” be locally persistent. But because I can now semantically clearly distinguish domain commands and events going out of a saga or long-running process which smells just like an eventsourced aggregate smells, I can also enable surrounding infrastructure which does not persist both kinds of messages (or even all kinds of other messages), but chooses to deliver them via different channels appropriate to the use case.

Domain code with which an aggregate expresses intent as domain commands and outcomes as domain events can be the same, regardless of how the infrastructure chooses to fulfill non-functional requirements.

(Note: this is a clarification update I added when publishing the follow up post about the distributed execution of nested tasks on Sep 22nd 2018)
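One way to picture infrastructure that delivers both kinds of messages “via different channels” is a small router sitting behind the domain code. The sketch below is an assumption-laden illustration for Java 17 or newer: `FactRouter`, its single command channel and its list of event subscribers are hypothetical names, not part of any existing library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

interface Fact {}
interface Event extends Fact {}
interface Command extends Fact {}

record SortCutleryCommand(String dishWasherId) implements Command {}
record DishWasherClearanceStartedEvent(String dishWasherId) implements Event {}

// Hypothetical infrastructure component; the domain code never sees it.
class FactRouter {
    private final Consumer<Command> commandChannel; // exactly one addressed handler
    private final List<Consumer<Event>> subscribers = new ArrayList<>(); // anyone interested

    FactRouter(Consumer<Command> commandChannel) {
        this.commandChannel = commandChannel;
    }

    void subscribe(Consumer<Event> subscriber) {
        subscribers.add(subscriber);
    }

    // Commands go to their handler, events are published to all subscribers.
    void route(List<Fact> facts) {
        for (Fact fact : facts) {
            if (fact instanceof Command command) {
                commandChannel.accept(command);
            } else if (fact instanceof Event event) {
                subscribers.forEach(subscriber -> subscriber.accept(event));
            }
        }
    }
}
```

Whether the router runs as a side effect of persisting facts or replaces persistence for some message kinds is an infrastructure decision. The aggregate returning the facts stays the same in both cases.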

For both cases, there is really no special magic involved here. Also, please note that I am not overly eager to coin terms like fact sourcing, but I just look for a language which describes best what I need to speak about here. I hope that we, as a community, will eventually get that stuff consistent. We would however really need to collaborate more closely. I am all open.

Our little family as it “should be”

This is how my wife would like to organise our work. She has the possibility to tell me to “please clear the dishwasher”. But actually, this is really just a very last resort for her. On a regular basis she would really prefer that I listen to a “Dishwasher finished” event myself, which I promised to do most of the time. I would however, whenever my daughter is available and when she does not have more important obligations I need to support, really like her to help me a bit and “sort the cutlery” into the box. When she is done, I will become aware of her “cutlery sorted” event, and when on top of that I myself have finished my work (to clear cups and plates) I will raise a “Dishwasher cleared” event. My wife Kathi may listen to it, but she may also decide to trust that I finish my work once I got going. My own work (to clear cups and plates) is for similar reasons not explicitly shown in this diagram anymore, because it’s now really a bit more my private concern. Nobody needs to command me to get started. I eventually will decide to do that myself. :-)

Now while it’s clear that Kathi still looks like a lilac saga or process manager and my daughter Fabia really like a yellow aggregate, it’s a bit unclear what exactly I represent in the picture above. Because: to my wife Kathi I look more like the yellow command handling aggregate, while I look more like a lilac command dispatching saga or process to my daughter. This is a bit of a life pattern for me … the book I loved most as a small kid and read over and over and over again was the Little I-am-me written by Mira Lobe. She was a Jewish Austrian writer born in Görlitz, survived the Nazis because she was smart enough to go to Tel Aviv early enough, but returned to Vienna in 1951 and died here in 1995. She wrote more than 100 children’s books. Some of her books were translated into English and other languages, so some of you might even know her. The “Little I-am-me” is an unknown, beautiful, colorful animal, which wants to find out who it is, really. :-) At the end of the book, it doesn’t need to find out anymore. It simply knows: it’s unique.

Aggregates and sagas are processes

I sense that aggregates and sagas, when treated as described in this blog post, are really just about the same thing: processes. From the point at which they are created until they eventually finish, they walk through a life cycle of facts. The term we should continue to use for both flavours is in my mind: aggregate. Because this is what an aggregate is really all about: it’s a unit of local consistency. Both flavours of aggregates, the ones leaning more to a managing style and the ones leaning more to a working style, need just that: local consistency to make sure that they really base their decisions on their own local knowledge. This is how some of the code for “fact-driven” aggregate methods processing an incoming fact and generating outgoing facts might look for the “should be” scenario:

public List<Fact> process(DishWasherFinishedEvent event) {
    return List.of(
        new DishWasherClearanceAcceptedEvent(...),
        new SortCutleryCommand(...),
        new ClearPlatesAndCupsCommand(...)
    );
}

public List<Fact> process(CutlerySortedEvent event) {
    if (cupsAndPlatesAlreadyCleared()) {
        return List.of(new DishWasherClearedEvent(...));
    } else {
        return List.of(new DishWasherCutleryFinishedEvent(...));
    }
}
...

As the exact sequence and time of messages coming in is never predictable, aggregates always make their decisions based on what they currently know. Work carried out and milestones achieved, recorded as genuine events, as well as potentially also domain commands, are such local facts; they just always need to be treated according to their exact, well-defined semantics. It might therefore depend on timing, outages and many other things how processes move on exactly, but this is really just life: the moment a historical fact was created, be it relevant for everybody, or yet just intent in somebody’s mind, it is a historical fact that cannot be changed anymore. But by the moment others become aware of those facts, new facts could have been created already, which influence whether and how somebody’s intent can actually be carried out.
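The point about unpredictable arrival order can be sketched as follows: the aggregate records each intermediate milestone as its own fact and only concludes once both are known, whichever arrives first. This is a minimal sketch assuming Java 17 or newer. `PlatesAndCupsClearedEvent`, `DishWasherPlatesAndCupsFinishedEvent` and the `run` helper are hypothetical additions for illustration, mirroring the `CutlerySortedEvent` handling shown above.

```java
import java.util.ArrayList;
import java.util.List;

interface Fact {}
record CutlerySortedEvent() implements Fact {}
record PlatesAndCupsClearedEvent() implements Fact {}              // hypothetical
record DishWasherCutleryFinishedEvent() implements Fact {}
record DishWasherPlatesAndCupsFinishedEvent() implements Fact {}   // hypothetical
record DishWasherClearedEvent() implements Fact {}

class DishWasherClearance {
    private boolean cutleryDone;
    private boolean platesAndCupsDone;

    // Decide based only on what this aggregate currently knows.
    List<Fact> process(Fact incoming) {
        if (incoming instanceof CutlerySortedEvent) {
            if (platesAndCupsDone) return List.of(new DishWasherClearedEvent());
            return List.of(new DishWasherCutleryFinishedEvent());
        }
        if (incoming instanceof PlatesAndCupsClearedEvent) {
            if (cutleryDone) return List.of(new DishWasherClearedEvent());
            return List.of(new DishWasherPlatesAndCupsFinishedEvent());
        }
        return List.of();
    }

    // Local knowledge changes only through the aggregate's own recorded facts.
    void apply(Fact fact) {
        if (fact instanceof DishWasherCutleryFinishedEvent) cutleryDone = true;
        if (fact instanceof DishWasherPlatesAndCupsFinishedEvent) platesAndCupsDone = true;
    }

    // Helper for experiments: feed facts one by one and collect what gets recorded.
    static List<Fact> run(List<Fact> incoming) {
        DishWasherClearance clearance = new DishWasherClearance();
        List<Fact> recorded = new ArrayList<>();
        for (Fact fact : incoming) {
            List<Fact> outcome = clearance.process(fact);
            outcome.forEach(clearance::apply);
            recorded.addAll(outcome);
        }
        return recorded;
    }
}
```

Feeding the two incoming events in either order records a different intermediate fact first, but in both cases the stream ends with `DishWasherClearedEvent`.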

Variants of aggregate behaviour

When fact-driven aggregates carry out work for others (because they are commanded to do so), they record that as a historical fact, a domain event. When they carry out work mostly due to their own interest or style of operation they react to an event and they also record the work carried out as a domain event. When they delegate work to others, they indicate the creation of their intent, in other words their command, as a new fact. An interesting option to carry this intent forward could be to technically treat commands just like events and store the intent in a “fact store”. Other options of course could be that the infrastructure takes care of commands in a more transient fashion. Furthermore, when fact-driven aggregates are informed that others carried out the delegated work (because they react to an event emitted by them) they also record what they do: they could e.g. conclude that they achieved a more coarse-grained task they were working on and record that as a domain event. When they are informed that others did not carry out delegated work or not exactly as needed (because they receive a fact notification indicating that) then they indicate what to do next and e.g. come up with new commands.
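Two of these variants, reacting to delegated work being done and reacting to it not being done, could look roughly like this. It is only a sketch assuming Java 17 or newer. `CutlerySortingDeclinedEvent`, the `assignee` field and the decision to take the task back by commanding oneself are invented for illustration; a real process might just as well pick another assignee or wait.

```java
import java.util.List;

interface Fact {}
interface Event extends Fact {}
interface Command extends Fact {}

record SortCutleryCommand(String assignee) implements Command {}
record CutlerySortedEvent(String assignee) implements Event {}
record CutlerySortingDeclinedEvent(String assignee) implements Event {} // hypothetical
record DishWasherCutleryFinishedEvent() implements Event {}

class DishWasherClearance {
    private final String owner; // who is ultimately responsible for the clearance

    DishWasherClearance(String owner) {
        this.owner = owner;
    }

    // Delegated work was carried out: record the coarser-grained progress.
    List<Fact> process(CutlerySortedEvent event) {
        return List.of(new DishWasherCutleryFinishedEvent());
    }

    // Delegated work was not carried out: decide what to do next.
    // In this sketch the owner takes the task back by commanding itself.
    List<Fact> process(CutlerySortingDeclinedEvent event) {
        return List.of(new SortCutleryCommand(owner));
    }
}
```

In both handlers the aggregate records a new fact of its own: an event when something was achieved, a command when new intent was created.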

What’s next?

As I have pointed out in my last post, I am tempted to call such commands domain commands, because they are a true and full-fledged part of a domain model, potentially even recorded together with domain events. As I noted in the overview to my fact-driven journey, it is to be expected that in a full-fledged “fact-driven” domain model more fact types than commands and events “want” to be handled and potentially persisted. By systematically recording incoming facts via causation ids used on new facts and by systematically persisting queries, reports and other conceivable facts one could enable a kind of “total” idempotency ... of course we will face trade-offs here between non-functional requirements such as extreme robustness (as often being the primary concern in my contexts) and extreme efficiency.

Anyway, I deliberately didn’t care about idempotency yet in examples above.
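For readers who wonder what “recording incoming facts via causation ids” could mean in code, here is one possible sketch, assuming Java 17 or newer. The `Fact` envelope with `id`, `causationId` and a plain `type` string, and the `IdempotentProcess` class, are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical envelope: every recorded fact remembers which incoming fact caused it.
record Fact(UUID id, UUID causationId, String type) {}

class IdempotentProcess {
    private final List<Fact> recorded = new ArrayList<>();

    // Returns the newly recorded facts, or nothing if the incoming fact was handled before.
    List<Fact> process(Fact incoming) {
        boolean alreadyHandled = recorded.stream()
            .anyMatch(fact -> incoming.id().equals(fact.causationId()));
        if (alreadyHandled) {
            return List.of();
        }
        List<Fact> outcome = List.of(
            new Fact(UUID.randomUUID(), incoming.id(), "DishWasherClearanceAccepted"),
            new Fact(UUID.randomUUID(), incoming.id(), "SortCutlery"));
        recorded.addAll(outcome);
        return outcome;
    }
}
```

Delivering the same incoming fact twice then records its consequences only once. Note that this simple check only works when every handled fact leaves at least one recorded fact behind, which is one of the trade-offs such a design would have to address.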

Thank you Eric Evans and Pat Helland

Eric Evans and Pat Helland are true geniuses for me, and I say this even though I normally really avoid using the term “genius” by all means … because: I genuinely believe we are all just “human” problem solvers, and tend to find ourselves in very different spots of life and places on this planet and have to figure out very different problems. It’s the problems we are exposed to that largely shape our abilities as problem solvers … and it’s therefore always just “together” that we can come closer to what we really need. I want to acclaim Eric Evans, because he is the creator of Domain-Driven Design. And this is stuff for me which is simply here to stay! And I want to acclaim Pat Helland, of course because of his thoughts on our “life beyond distributed transactions”. On top of that I believe that he has influenced my thinking for a long time and I cannot judge anymore in how many ways.

A small, but important thing … even when I sound assertive sometimes, it’s a subconscious technique I use to provoke your associations and counter arguments. I never claim to understand, but always try to understand better than before. I share thoughts and feelings, and even when I attempt to “define” things, it’s an invitation: please help me, as a software community, to eventually get this stuff consistent! :-) If you want to keep in touch, you can follow me on Twitter. It’s also the simplest way in case you need to exchange a direct message with me. Please also help me to give proper credit to people who might have contributed in some way or published something without me realising. It’s not always my strongest side to know why I know.


Thank you for reading and spending some of your precious time with me. For me it’s the most precious thing you can share.