Distributed Orchestration with Camunda BPM, Part 1

Simon Zambrovski
Oct 21, 2019 · 9 min read

Automation of business processes is the core domain and one of the purposes of process engines. Camunda BPM, a very popular, lightweight open-source process engine, is no exception. It executes BPMN 2.x directly and provides very high coverage of the BPMN 2.x standard.

In the last few years, Microservices have become one of the most important trends in the modern software industry. I’m not going to talk about pros and cons, chances and pitfalls. I just observe a shift in my customers’ projects towards thinking in separated contexts instead of monoliths. My effort usually goes into creating a good-enough architecture for managing those contexts. In particular, Microservices (and reactive systems) gave a huge push to an old but important approach, Domain Driven Design (DDD). Independent of distribution, it fosters the creation of Bounded Contexts and dealing with the integration between them. As a BPM craftsman, I focus on automation of business processes executed in such environments.

For me, the essence of the change driven by those trends is about two features: Autonomy and Agility. From the point of view of BPM, autonomy in particular creates additional requirements for Process Modeling and Process Implementation. Autonomy itself has no strict requirement on distribution, so we can discuss this topic even without distribution, but in the context of Business Processes implemented along Bounded Contexts.

Two terms in Business Process Management are directly related to autonomy: Orchestration and Choreography. Classically, Orchestration denotes a guided execution inside one closed/bounded context conducted by some entity. Choreography, in contrast, is about multiple closed/bounded contexts exchanging messages, each knowing which message to react to and how.

Orchestration is well understood inside a single Bounded Context. The topic was born in the 1970s and has had enough time to evolve, both in academia and industry. Standard languages like BPMN 2.x define operational semantics for orchestration, and modern tools like Camunda BPM implement it very well.

On the other hand, the idea of Choreography is clearly defined in distributed contexts. Having independent Bounded Contexts implemented as self-contained systems and managed by different teams doesn’t allow tight coupling between them. In order to achieve loose coupling, messages are used to define the API between those systems. Messaging systems provide asynchronous delivery, fostering location transparency and independence of availability.

Unfortunately, there are only a few techniques to implement choreographies in industry. A popular one is to follow Event Driven Architecture (EDA) by using a common messaging infrastructure / event bus, such as Apache Kafka or RabbitMQ. Bernd Ruecker from Camunda pointed out the weaknesses of this solution regarding complexity, scaling and evolution in his post. In short, it works in small scenarios but creates high coupling between single systems / components. Bernd proposes to use orchestration instead of choreography to tame the chaos, and I generally agree with him.

At the same time, just replacing choreography with orchestration doesn’t solve all problems. I would like to share my thoughts about this, some patterns and some implementation experiences.

Decomposition patterns in BPMN

In short, the DDD term “Bounded Context” denotes a scope with its own responsibility and a common language, which is primarily used to decompose the Domain in order to cope with limitations of size, complexity and integrity. BPM uses Decomposition of business processes for the same reasons: to make the parts easier to manage, maintain and change.

By decomposition I mean a process modeling approach: the transformation of a process model into another process model which is equivalent in terms of operational semantics (= meaning).

Example Process

Here is a small example process used to demonstrate some features.

The process Order Management is started for every order created and orchestrates several Business Capabilities with the goal of fulfilling the order. It starts with checking the inventory, continues with payment and finally triggers the delivery.

Call Activities

The simplest decomposition pattern in BPMN is the usage of call activities:

In this simple example, the Order Management process delegates the execution of Inventory Check, Payment and Delivery to three processes. The execution of Order Management stops at each call activity until the called process terminates and then continues. From the point of view of scoping and visibility, it is irrelevant for Order Management how those processes are implemented, as they are considered black boxes.

In Camunda BPM the call activity also defines an API between the two processes by providing variable in- and out-mapping. The in-mapping maps the variable scope of the caller process to the variable scope of the callee by passing, renaming or creating new variables; the out-mapping defines the inverse mapping for the result of the execution.
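
To make the idea of the mapping as an API more tangible, here is a minimal sketch in plain Kotlin. It is a toy model and does not use the Camunda engine; `VariableMapping`, `applyMapping`, `callWithMapping` and all variable names are assumptions made for illustration. In a real project the mapping is declared on the call activity in the process model.

```kotlin
// Toy model of call-activity variable mapping; it does not use the Camunda engine.
// All names (VariableMapping, callWithMapping, the variable names) are hypothetical.

/** One mapping entry: copy `source` from one scope into `target` in the other scope. */
data class VariableMapping(val source: String, val target: String)

/** Copies only the mapped variables and renames them; unmapped variables stay hidden. */
fun applyMapping(from: Map<String, Any?>, mappings: List<VariableMapping>): Map<String, Any?> =
    mappings.filter { it.source in from }.associate { it.target to from[it.source] }

/**
 * Simulates a call activity: the in-mapping builds the callee scope, the callee runs,
 * and the out-mapping copies selected results back into the caller scope.
 */
fun callWithMapping(
    callerVariables: Map<String, Any?>,
    inMappings: List<VariableMapping>,
    outMappings: List<VariableMapping>,
    callee: (Map<String, Any?>) -> Map<String, Any?>
): Map<String, Any?> {
    val calleeInput = applyMapping(callerVariables, inMappings)
    val calleeResult = callee(calleeInput)
    return callerVariables + applyMapping(calleeResult, outMappings)
}

// Stand-in for the Payment process: it only knows its own variable names.
val paymentProcess: (Map<String, Any?>) -> Map<String, Any?> = { vars ->
    vars + ("paid" to ((vars["amount"] as Int) > 0)) + ("internalTraceId" to "t-1")
}

val orderVariables = mapOf<String, Any?>(
    "orderId" to "o-42",
    "orderTotal" to 100,
    "customerNote" to "fragile"
)

val afterPayment = callWithMapping(
    callerVariables = orderVariables,
    inMappings = listOf(VariableMapping("orderTotal", "amount")),
    outMappings = listOf(VariableMapping("paid", "paymentSucceeded")),
    callee = paymentProcess
)
```

With these mappings the caller ends up with `paymentSucceeded` next to its own variables, while `amount` and `internalTraceId` never leave the scope of the callee.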

In addition to the variable mapping, the call activity may have boundary events attached, influencing the execution and the possible return paths. Here is an example, in which payment and inventory processes may terminate differently:

Messages

Another popular approach to delegate the execution is the usage of BPMN messages. In contrast to call activities, the execution of the caller process (Order Management) is not blocked during the execution of the called process. This creates real concurrent execution on the conceptual level, which may result in concurrent execution on the technical level. If we aim to provide the same semantics with messages as with call activities, throw / catch message event pairs must be used. Here is an example:

Since the Order Management delegates the entire logic to other processes it only serves as a pure orchestrator, used to translate / mediate between different contexts. The process model becomes more interesting, if the exceptional responses are included:

Compare this to the call activity example. There is no intermediate catch-error event which can be mapped to boundary error event, so a message must be used instead to indicate that “no goods are available”.

Implementing decomposition patterns

Using Call Activities in the same engine

Camunda BPM provides first-class support for implementing call activities. All you have to do is deploy both process models in the same engine and specify the in-/out-mapping in the process model of the caller. If the “all” mapping is ok, you are done, but it is a better idea to limit the meaning of process variables to one process model. If your decomposition follows the boundaries of the business context / business capability, you should define a mapping in a translating entity to enforce decoupling of the process models and contexts. Call activities are handled completely inside the Camunda engine and no custom code is required. Even the exceptional paths introduced by a BPMN error thrown by the called process or by a timeout don’t need developer intervention but can be defined in the model.

An important detail is that the API (remember that it consists of the variable mappings and boundary events) is completely defined by the caller process. Using Vernon’s DDD bounded context relation types, I would classify this relationship as Conformist. If the variable mapping is provided, it establishes an Anti-Corruption Layer (ACL) between the processes.

In fact, for the execution of the called process it generally makes no difference whether it has been started by another process. Thanks to the Camunda API, you can find this out by analyzing the execution graph.

Using Message correlation inside one engine

Things get a little more complicated if we use BPMN messages for communication between two processes deployed in the same engine. Camunda BPM supports Message Catch events as first-class elements which work without additional code: the engine creates a message subscription and correlates against it when the message arrives, so catching messages comes for free. Sending messages (Message Throw events), on the other hand, works only through the Camunda messaging API. The RuntimeService provides a number of methods for correlating messages with a running instance (waiting in a message subscription) or for starting a new process instance by message.

The API definition between the caller process and called process is based on names of messages. Since they must match, the relation type can be classified as Partnership.

At the same time, variables can be set (and usually are set) during message correlation, which means that the internal state of the called process is changed by the caller process. This feature allows smooth integration between processes running in the same bounded context, but should be used thoughtfully across context boundaries. The pattern violates the encapsulation principle, and in the case of simulated call activities it does so in both directions. This can be avoided by creating an Anti-Corruption Layer (ACL), which consists of the following rules:

  • Correlation must not set (global) process variables
  • Correlation may set local process variables
  • A special ACL listener attached directly to the Message Catch event element translates between those local variables and variables of the receiving process

This solution leads to a better decoupling of process contexts, but at the cost of higher technical complexity.
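
The translation step of such an ACL can be sketched in plain Kotlin, independent of the engine. This is a minimal sketch under assumptions: the variable names (`inventory.available`, `goodsReserved` and so on) and the `InventoryReply` type are hypothetical. In an engine-based implementation these functions would be called from the listener attached to the Message Catch event, which reads the local variables and writes the translated ones into the receiving process.

```kotlin
// Sketch of the translation step of the ACL as plain Kotlin, independent of the engine.
// The variable names and the InventoryReply type are hypothetical.

/** Payload as the sending process names it; assumed to arrive as local variables. */
data class InventoryReply(val available: Boolean, val reservedUntil: String?)

/** Reads only the variables that belong to the message contract; anything else is ignored. */
fun readReply(localVariables: Map<String, Any?>): InventoryReply =
    InventoryReply(
        available = localVariables["inventory.available"] as? Boolean
            ?: error("message contract violated: 'inventory.available' is missing"),
        reservedUntil = localVariables["inventory.reservedUntil"] as? String
    )

/** Translates the foreign payload into the vocabulary of the receiving process. */
fun toOrderVariables(reply: InventoryReply): Map<String, Any?> = mapOf(
    "goodsReserved" to reply.available,
    "reservationExpiry" to reply.reservedUntil
)

// Example: the sender also passed a variable that is not part of the contract.
val receivedLocals = mapOf<String, Any?>(
    "inventory.available" to true,
    "inventory.reservedUntil" to "2019-10-22",
    "inventory.warehouseId" to "w-7"
)

val translated = toOrderVariables(readReply(receivedLocals))
```

The receiving process only ever sees `goodsReserved` and `reservationExpiry`; the names used by the sender, and anything outside the contract such as `inventory.warehouseId`, stop at the boundary.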

Experiencing concurrency problems inside one engine

In addition, the implementation of such processes in the engine is not trivial regarding the concurrency and timing. Imagine the following process model snippet taken from the previous example using message correlations for process decomposition:

The message throw event of the Order Management would have a delegate implementation (simplified and written in Kotlin):
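
A delegate of this kind could look roughly like the following sketch. It is a reconstruction under assumptions and not necessarily the code used in the project: it relies on the message correlation builder of the Camunda `RuntimeService`, and the class name, the message name `checkInventoryRequested` and the variable `orderId` are hypothetical.

```kotlin
import org.camunda.bpm.engine.delegate.DelegateExecution
import org.camunda.bpm.engine.delegate.JavaDelegate

// Sketch only; requires the Camunda engine on the classpath.
// Class name, message name and variable name are hypothetical.
class SendCheckInventoryMessage : JavaDelegate {

    override fun execute(execution: DelegateExecution) {
        execution.processEngineServices.runtimeService
            // Assumed to match the message start event of the Inventory process.
            .createMessageCorrelation("checkInventoryRequested")
            // Shared business key, so the reply can be correlated back to this order.
            .processInstanceBusinessKey(execution.processBusinessKey)
            // Sets a variable on the receiving process instance.
            .setVariable("orderId", execution.getVariable("orderId"))
            // Correlation happens inside the current transaction.
            .correlate()
    }
}
```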

This code will NEVER work if there is no async on the start event of the Inventory Process or on the Check Inventory service task. The call of the correlation API causes the entire Inventory Check process to run in one transaction BEFORE returning, and that process fails to correlate the resulting message, since Order Management has not yet had the chance to create a message subscription. So there is no concurrency issue here, but the process is not executable.

This code will POTENTIALLY fail if you just insert the async attribute on the start event, because by doing so you create a race condition between the two processes and assume that the Order Management process is faster than Check Inventory. In this case a concurrency issue exists, and how it is resolved depends on many factors in the Camunda BPM implementation, such as the Job Executor.

There is a solution that works in Camunda BPM thanks to the way the parallel gateway is implemented. Here is how to model it:

This time, the async before on the Message Throw event forces the engine to send the message in the next transaction, while the message subscriptions are created in the current one. Since we receive messages in a branch of a parallel execution, we need to join and then decide on the outcome. As you can see, the simple message exchange from the Call Activity example ends up polluted in both the BPMN model and the technical implementation.
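
The difference between the first variant and the parallel gateway variant can be replayed in a toy model. The sketch below is plain Kotlin and not the Camunda engine: `ToyEngine` and its methods are hypothetical, a deferred job stands for "async before", and the job executor, together with the race condition it brings, is deliberately left out.

```kotlin
// Toy model of transaction ordering; it is not the Camunda engine.
// ToyEngine, its methods and the message name are hypothetical.

class ToyEngine {
    private val subscriptions = mutableSetOf<String>()  // committed message subscriptions
    private val jobs = mutableListOf<() -> Unit>()      // work deferred to a later transaction
    val log = mutableListOf<String>()

    fun subscribe(message: String) {
        subscriptions += message
        log += "subscribed to $message"
    }

    /** Delivers a message; throws if nobody waits for it. */
    fun correlate(message: String) {
        check(subscriptions.remove(message)) { "no subscription for $message" }
        log += "correlated $message"
    }

    fun defer(job: () -> Unit) {
        jobs += job
    }

    /** Runs the deferred work; each job stands for one later transaction. */
    fun runJobs() {
        while (jobs.isNotEmpty()) jobs.removeAt(0).invoke()
    }
}

/** Inventory process without wait states: it replies within the transaction that started it. */
fun inventoryProcess(engine: ToyEngine) = engine.correlate("inventoryChecked")

/** Throw event followed by catch event, nothing asynchronous. */
fun sequentialOrderProcess(engine: ToyEngine) {
    inventoryProcess(engine)              // the throw runs the whole Inventory process inline
    engine.subscribe("inventoryChecked")  // never reached: the reply came too early
}

/** Parallel branches: the subscription is created now, the throw is deferred. */
fun parallelOrderProcess(engine: ToyEngine) {
    engine.defer { inventoryProcess(engine) }  // "async before" on the throw event
    engine.subscribe("inventoryChecked")       // created in the current transaction
    engine.runJobs()                           // later transaction: throw, run Inventory, reply
}

val sequentialOutcome = runCatching { sequentialOrderProcess(ToyEngine()) }

val parallelEngine = ToyEngine().also { parallelOrderProcess(it) }
```

In the sequential variant the reply has nowhere to go and the call fails; in the parallel variant the subscription already exists when the deferred throw runs, so the reply is correlated.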

Conclusion

In this article, I shared my thoughts on the usage of business processes in the context of Microservices or simply Bounded Contexts. I see two major BPMN patterns by which decomposition of a business process into smaller parts can be achieved:

  • using call activities
  • using message correlations

Implementing those patterns, even on a single engine or a homogeneous cluster (all nodes run the same software and share the same database, which is fine for making the system fail-safe), leads to some conceptual and implementation challenges:

  • Using call activities establishes a “conformist” relation between the orchestrating process and the called process. This is especially a problem if you want to avoid a business process monolith and have a large orchestration process calling many small processes.
  • Message correlation inside Camunda BPM is (intentionally) built on an approach that violates the encapsulation principle. This feature is very helpful in a monolithic, coherent context or just inside one bounded context, but is a limitation when applied to communication between different contexts. It can be avoided at the cost of higher technical complexity. In addition, it is not easy to implement because of concurrency issues, whose resolution pollutes the process model with implementation details.

Neither result is surprising. BPMN is not designed to express or address issues related to distribution or concurrent execution. At the same time, Camunda BPM is a process engine historically designed to be a central orchestrator (the “shared engine” operation model is still available and is used by customers). It does not address issues from distributed systems, and Microservices in particular, but it can still be used successfully in these environments if certain guidelines are applied.

Holisticon Consultants

Written by Simon Zambrovski

Senior IT-Consultant, BPM-Craftsman, Architect, Developer, Scrum Master, Writer, Coach