Conquest of Distributed Systems (part 2) — Orchestration with Actor Model

Serge Semenov
6 min readNov 18, 2018

--

St. Cloud Symphony Orchestra (photo by Kaitlin Keane)

As a Star Wars fan, Jim was imagining Darth Vader conducting an orchestra when he overheard his software developer colleague mentioning Actor Model and Orchestration. After a short introduction to the topic, they start debating as Jim really wants to try out Event Sourcing and CQRS instead. But the time is not on their side, nor the capacity to re-write the existing system as Event Souring shifts the programming paradigm.

Recap of part 1. A software development team (starring Jim) is implementing a workflow where an end-user makes an instant purchase of an item in a web store. That translates to making a payment first, then reserving the item in a warehouse, and if the item is out of stock they simply need to void the payment. Stake holders are pushing harder to deliver features faster as they need to show the product off at an upcoming conference.

Venture to Better Design

Jim and his team try hard to buy some extra time to address problems of distributed systems, which cause quite frequent distractions from daily development activities. They explain to their manager that problems they have in production are not bugs, but merely an oversight of fundamental principles of distributed computing, and the team cannot add more features without creating even more disruptions to daily work. To their success, only Jim and his colleague are granted a few days to design a more robust solution and show results that work.

Going back the drawing board, Jim and his co-worker decided to turn their simple unreliable piece of code into a more complex fault-tolerant solution using an Actor Model framework.

Unreliable RCP-style implementation from part 1

The general idea of an Actor is to handle concurrent operations by processing incoming messages one-by-one from other Actors. An Actor can create other actors, where the only way of communication between them is message passing (event-driven design). An actor can also have a persistent state — essentially a property bag that gets loaded and saved on every message processing. There are plenty of Actor Model frameworks, but most prominent ones in .NET world are Akka.NET, ASF Reliable Actors, and Project Orleans. Few programming languages natively implement the Actor Model — take Erlang functional programming language for example. However we will focus on Object-Oriented Design instead to easily represent a persistent state.

Another dilemma they are facing is a choice between the Orchestration Model and Choreography Model. Jim resists but eventually agrees to go in favor of Orchestration because it can better represent a business workflow, and that would require fewer changes in the existing system. He understands that in a purely reactive system (Event Sourcing for example) you no longer see the workflow, what makes development and maintenance harder.

Coding Time

This is not going to be joyful in particular but is crucial for Jim to make his discovery later on. They pick an actor model framework, and start implementing the business workflow step by step — every single Git commit constitutes an incremental progression.

You may skip this implementation part and go to the very end of this part of the story if you are familiar with actor model frameworks. For the clarity’s sake, we are going to use a pseudo framework with a self-descriptive language.

Step #1

The PurchaseItemCommand class represents an external request (message) to perform an instant purchase. The OrderPlacementCoodrinator is the Actor that is responsible for Orchestration between Payment and Warehouse services. At this point, it receives the command and initializes a persisted value — a randomly-generated transactionId that will carry through the lifetime of an Actor instance.

Step 1 — The actor draft

Step #2

Now we define the contract for Payment service by introducing the CreditCommand, which is sent to an Actor of Payment service to credit the end-user’s account towards the purchase.

Step 2 — Send payment request

We use locally-generated transactionId as the ExternalTransactionId, so in case of failure we can re-try and send another command with the same ID, then the receiver (Payment service Actor) can de-duplicate incoming messages in a system with the at-least-once delivery guarantee. Ideally the actor action must be committed as ACID transaction. Otherwise, the Actor may end up in an inconsistent state — e.g. state is saved, but next command (message) is never sent. Pat Helland described good ideas in his ‘Life Beyond Distributed Transactions’ article.

Step #3

Then we define the second part of the contract for the Payment service by introducing the CreditResponse, which tells us if payment succeeded or not. Our Actor must accept that new type of message to continue the workflow.

Step 3 — Receive payment response

Step #4

At this point let us handle the first ‘sad path’ where the end-user does not have enough funds on the account to perform the payment. In such a case, we want to reply back to the requestor of the Instant Purchase with an error. To do so, we can persist the reference to the requestor (another Actor) in our orchestrating Actor when we receive the very first request to perform an Instant Purchase, so we can reply later when we receive the error response from the Payment service.

Step 4 — Fail order on payment error

The Terminate method hints the end of Actor’s lifetime to Actor Model framework so that it can recycle resources — there is no further action defined in the business workflow.

Step #5

In case of successful payment, we go ahead and reserve the item being purchased in the warehouse. We also need to persist item’s unique ID and quantity in the Actor to send the reservation message.

Step 5 — Reserve item on a successful payment

Step #6

If the item is still available in the warehouse and has been successfully reserved, we can reply to the original Instant Purchase requestor with success of order placement.

Step 6 — Complete order on item reservation

Step #7

Well, if the item (or the quantity) is not available anymore, we must void previously made payment.

Step 7 — Refund on failed item reservation

Such reverse action is called ‘compensating’, and this implementation style is called ‘Saga Pattern’ which is widely used in distributed systems where atomic operations on multiple objects (serializability) cannot be achieved.

Step #8

Finally, when the payment successfully voided, we can reply to the original requestor of Instant Purchase with the item reservation error.

Step 8 — Fail order placement after the refund

Phew! This 150-line monster code must solve the problem. Otherwise, it is hard to justify one hour of coding. Regardless, the complexity of using an Actor Model framework is very high even for orchestrating just two services with eight commands and events in total. And this is only the top of an iceberg — any framework is designed to hide the complexity of underlying implementation. Think about it.

Is this the best Jim and his team can do? Will they trade the productivity for a more robust solution as stakeholders are getting more impatient?

Keep reading to the Part 3 — Actor Model Hidden in Plain Sight

--

--

Serge Semenov

‘I believe in giving every developer a superpower of creating microservices without using any framework’ — https://dasync.io 🦸‍♂️