Big BPM is coming

Javier Antoniucci
Apr 22, 2020 · 6 min read

Beyond simple request-response, many use cases require complex state management, such as reacting to asynchronous events or communicating with unreliable external systems. The usual approach to implementing them is a hodgepodge of stateless services, databases, batch/cron jobs and queuing systems. This hurts developer productivity, because most of the code is plumbing that buries the real business logic under a mountain of low-level, low-value technical detail. Such systems also tend to have availability problems, since it is hard to keep every part running.

These use cases have traditionally been addressed by Business Process Management (BPM) products, which have had varying degrees of success but have struggled to reach acceptable levels of maintainability and scalability.

Cadence is an open source project from Uber that offers a fault-oblivious, stateful programming model that hides most of the complexity of building scalable distributed applications. In essence, Cadence provides a durable virtual memory that is not tied to a specific process and preserves the complete application state, including function stacks and local variables, across hardware and software failures. This lets you write code using the full power of a programming language like Go or Java, while Cadence takes care of the durability, availability and scalability of the application.

Cadence consists of a programming framework (the client library) and an execution model provided by a managed service (the backend). The framework lets developers write and coordinate tasks in familiar languages (Go and Java are available now, and Python and C# are expected soon through a proxy currently in development).

The backend is stateless and relies on a persistent store. Cassandra and MySQL are currently supported, and an adapter can be added for any other database that provides multi-row, single-shard transactions. The service can be deployed in different ways. At Uber, clusters of hundreds of hosts are shared by hundreds of applications.

Workflow

With Cadence, all of the logic can be encapsulated in a simple, durable function that directly implements the business logic. Because the function is stateful, the developer does not need any additional system to ensure durability and fault tolerance. The main restriction is that the workflow code must be deterministic: it must produce exactly the same result for the same input, even if it is executed several times. This rules out any external API call from the workflow code, since external calls can fail intermittently or change their output at any time. That is why all communication with the external world must go through Activities. For the same reason, the workflow code must use the Cadence APIs to get the current time, to sleep and to create new threads.
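The determinism requirement is easier to see with a toy model. The following plain-Python sketch does not use the Cadence client; `ReplayContext`, `execute_activity` and `flaky_quote` are hypothetical names invented for this illustration. It assumes a simplified scheme in which activity results are recorded in a history the first time they run, so that re-executing the same workflow code reads the recorded results instead of calling the outside world again.

```python
import random
from typing import Any, Callable, List


class ReplayContext:
    """Records activity results on first execution and replays them afterwards."""

    def __init__(self, history: List[Any]) -> None:
        self.history = history
        self._next = 0
        self.activity_calls = 0  # real invocations made during this run

    def execute_activity(self, fn: Callable[[], Any]) -> Any:
        if self._next < len(self.history):
            result = self.history[self._next]  # replay: no real call
        else:
            result = fn()  # first execution: call and record
            self.activity_calls += 1
            self.history.append(result)
        self._next += 1
        return result


def flaky_quote() -> int:
    # Stand-in for a non-deterministic external call.
    return random.randint(1, 1_000_000)


def workflow(ctx: ReplayContext) -> int:
    # The workflow itself never touches the outside world directly.
    a = ctx.execute_activity(flaky_quote)
    b = ctx.execute_activity(flaky_quote)
    return a + b


history: List[Any] = []
first = workflow(ReplayContext(history))

replay_ctx = ReplayContext(history)
second = workflow(replay_ctx)  # same code, same history

print(first == second, replay_ctx.activity_calls)  # prints "True 0"
```

If `workflow` called `flaky_quote` directly, the second run would return a different sum, and the recorded history would no longer match what the code does.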

Below is an example of a workflow that implements the subscription management use case:

[Image: workflow code for the subscription management use case]

This code directly implements the business logic. If any of the invoked operations (also known as activities) takes a long time, the code does not change. It is fine for the workflow to block on chargeMonthlyFee for a day if the payment processing service is down for that long. In the same way, sleeping for 30 days inside the loop is a normal operation in workflow code.

Cadence also supports child workflows, which can be used to compose reusable behaviors. It has virtually no scalability limit on the number of open workflow instances. Even with hundreds of millions of customers, the code above does not change.
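To make the shape of such a workflow concrete, here is a minimal plain-Python sketch of a subscription loop. It is not Cadence code (the available clients target Go and Java): `charge_monthly_fee`, `send_email` and `sleep_days` are hypothetical stand-ins for activities and for the workflow timer, injected as parameters so that the business logic stays free of plumbing.

```python
from typing import Callable, List


def subscription_workflow(
    customer_id: str,
    periods: int,
    charge_monthly_fee: Callable[[str], None],
    send_email: Callable[[str, str], None],
    sleep_days: Callable[[int], None],
) -> None:
    """Business logic only: welcome the customer, then bill once per period."""
    send_email(customer_id, "welcome")
    for _ in range(periods):
        sleep_days(30)  # in a real workflow this would be a durable timer
        charge_monthly_fee(customer_id)
    send_email(customer_id, "subscription ended")


def run_demo() -> List[str]:
    # Record what happens instead of really sleeping or charging.
    log: List[str] = []
    subscription_workflow(
        "customer-1",
        periods=2,
        charge_monthly_fee=lambda c: log.append(f"charge {c}"),
        send_email=lambda c, kind: log.append(f"email {c}: {kind}"),
        sleep_days=lambda d: log.append(f"sleep {d}d"),
    )
    return log


for entry in run_demo():
    print(entry)
```

The function reads like the business rule itself. How long each call takes, or how many customers run it at once, is left to whatever executes it.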

Activities

Activities are invoked asynchronously through task lists. A task list is essentially a queue used to store an activity task until it is picked up by an available Worker. The Worker processes an Activity by invoking its implementation function. When the function returns the result, the worker informs the Cadence service, which in turn notifies the Workflow of the completion.
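The flow described above can be sketched with a standard-library queue. This is only a single-process illustration, not the Cadence API: `schedule_activity`, `worker_poll_once`, the `ACTIVITIES` registry and the `results` dictionary (standing in for the service recording a completion) are all assumptions made for the example.

```python
import queue
from typing import Any, Callable, Dict

# The task list: a queue holding activity tasks until a worker picks them up.
task_list: queue.Queue = queue.Queue()

# Stand-in for the service recording completions for the workflow to read.
results: Dict[str, Any] = {}

# Activity implementations registered with the worker (hypothetical).
ACTIVITIES: Dict[str, Callable[..., Any]] = {
    "greet": lambda name: f"Hello, {name}!",
}


def schedule_activity(task_id: str, name: str, *args: Any) -> None:
    """Called on behalf of the workflow: enqueue the task and return at once."""
    task_list.put((task_id, name, args))


def worker_poll_once() -> bool:
    """Pick up one task, run its implementation and report the result."""
    try:
        task_id, name, args = task_list.get_nowait()
    except queue.Empty:
        return False
    results[task_id] = ACTIVITIES[name](*args)
    return True


schedule_activity("task-1", "greet", "Ada")
print(results)  # the task is queued, nothing has run yet
worker_poll_once()
print(results["task-1"])
```

Scheduling and execution are decoupled: the workflow side only enqueues, and any available worker can pick the task up later.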

Cadence lets you configure separate policies for each Activity, covering timeouts, retries, long-running execution, cancellation, routing and more.
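As an illustration of what a per-activity retry policy involves, here is a small plain-Python sketch with exponential backoff. The `RetryPolicy` fields and the `run_with_retries` helper are hypothetical and are not taken from the Cadence client; the `sleep` function is injected so the example runs instantly.

```python
from dataclasses import dataclass
from typing import Callable, List, TypeVar

T = TypeVar("T")


@dataclass
class RetryPolicy:
    initial_interval: float = 1.0  # seconds before the first retry
    backoff_coefficient: float = 2.0  # multiplier applied after each failure
    maximum_attempts: int = 3  # total attempts, including the first


def run_with_retries(
    activity: Callable[[], T],
    policy: RetryPolicy,
    sleep: Callable[[float], None],
) -> T:
    """Run the activity, retrying with growing delays until attempts run out."""
    delay = policy.initial_interval
    for attempt in range(1, policy.maximum_attempts + 1):
        try:
            return activity()
        except Exception:
            if attempt == policy.maximum_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= policy.backoff_coefficient
    raise RuntimeError("maximum_attempts must be at least 1")


calls = {"n": 0}


def flaky_charge() -> str:
    # Fails twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("payment service unavailable")
    return "charged"


delays: List[float] = []
result = run_with_retries(flaky_charge, RetryPolicy(maximum_attempts=5), delays.append)
print(result, delays)  # prints "charged [1.0, 2.0]"
```

Keeping the policy as data, separate from the activity body, is what allows it to be tuned per activity without touching the business logic.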

Event handling

Cadence supports event aggregation and correlation. It is not intended to replace solutions such as Apache Flink or Apache Spark, but in certain scenarios it is a better fit: for example, when all the events to be aggregated apply to a business entity with a clear identifier, and some action must be taken once a certain condition is met.
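A toy version of this pattern, in plain Python and without the Cadence client, might look as follows. The `LoyaltyAggregator` class, its threshold and the "reward" action are invented for illustration. In Cadence, the per-entity state would presumably live in a workflow instance keyed by the entity's identifier rather than in an in-memory dictionary.

```python
from collections import defaultdict
from typing import DefaultDict, List


class LoyaltyAggregator:
    """Accumulates purchase events per customer and acts once a threshold is hit."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.totals: DefaultDict[str, int] = defaultdict(int)
        self.rewarded: List[str] = []

    def on_event(self, customer_id: str, amount: int) -> None:
        # Correlate the event with its entity through the customer id.
        self.totals[customer_id] += amount
        reached = self.totals[customer_id] >= self.threshold
        if reached and customer_id not in self.rewarded:
            self.rewarded.append(customer_id)  # the "action", taken only once


agg = LoyaltyAggregator(threshold=100)
for customer, amount in [("alice", 60), ("bob", 30), ("alice", 50), ("alice", 10)]:
    agg.on_event(customer, amount)

print(agg.rewarded)  # prints "['alice']"
```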

Many business processes involve human participants. The standard Cadence pattern for implementing an external interaction is to execute an activity that creates a human task in an external system. The task can be an email with a form, a record in an external database or a mobile application notification. When a user changes the status of the task, a signal is sent to the corresponding Workflow.
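The human-task pattern can be sketched in a few lines of plain Python. Nothing here is Cadence API: `ApprovalWorkflow`, `external_tasks` and `user_completes_task` are hypothetical names, and the "external system" is just a dictionary.

```python
from typing import Dict, Optional

# Hypothetical external task system (for example a ticketing tool or an inbox).
external_tasks: Dict[str, str] = {}


class ApprovalWorkflow:
    """Waits for a human decision that arrives later as a signal."""

    def __init__(self, workflow_id: str) -> None:
        self.workflow_id = workflow_id
        self.decision: Optional[str] = None

    def start(self) -> None:
        # The "activity": create the human task in the external system.
        external_tasks[self.workflow_id] = "open"

    def signal(self, decision: str) -> None:
        # Delivered when the user changes the status of the task.
        self.decision = decision

    @property
    def status(self) -> str:
        if self.decision is None:
            return "waiting for human"
        return f"completed: {self.decision}"


def user_completes_task(
    workflows: Dict[str, ApprovalWorkflow], workflow_id: str, decision: str
) -> None:
    """The external system closes the task and signals the matching workflow."""
    external_tasks[workflow_id] = "done"
    workflows[workflow_id].signal(decision)


workflows = {"order-42": ApprovalWorkflow("order-42")}
workflows["order-42"].start()
print(workflows["order-42"].status)  # prints "waiting for human"
user_completes_task(workflows, "order-42", "approved")
print(workflows["order-42"].status)  # prints "completed: approved"
```

The workflow identifier doubles as the correlation key, which is how the signal finds the right waiting instance.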

Synchronous Query

How does it work?

Just as activity tasks are delivered through task lists, a Decision task is created whenever a workflow needs to handle an external event. A Decision task list is used to deliver it to a workflow worker that is able to run it.

Although Cadence’s task lists are queues, they differ from commonly used queue technologies. The main difference is that they do not require explicit registration and are created on demand. The number of task lists is not limited. A common use case is to have one task list per worker process and use it to deliver Activity tasks to that process. Another is to have one task list per pool of workers.
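The "created on demand" behavior is easy to mimic with a `defaultdict`. The sketch below is plain Python, not the Cadence client; `add_task`, `poll` and the task list names are made up for the example. It shows both routing styles: a task list owned by one worker process and a task list shared by a pool.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, Optional

# Task lists spring into existence the first time a name is used.
task_lists: DefaultDict[str, Deque[str]] = defaultdict(deque)


def add_task(task_list: str, task: str) -> None:
    task_lists[task_list].append(task)  # no registration step needed


def poll(task_list: str) -> Optional[str]:
    pending = task_lists[task_list]
    return pending.popleft() if pending else None


# One task list per worker process: route work to a specific host.
add_task("host-a", "process file downloaded on host-a")

# One task list per pool: any worker in the pool may take the task.
add_task("image-workers", "resize picture 1")
add_task("image-workers", "resize picture 2")

print(poll("host-a"))
print(poll("image-workers"))
print(poll("host-b"))  # prints "None": the list exists now but is empty
```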


A typical Cadence-based application consists of the Cadence service, workflow workers, activity workers and external clients. Both types of workers, as well as external clients, are roles, and they can be colocated in a single process if necessary.

The Cadence service is composed of Front End (FE) services that implement the API, the History service (HS), which manages queues, handles events, and stores and mutates workflow state, and the Matching service (MS), which is responsible for dispatching tasks.

Moving forward…

curl -O https://raw.githubusercontent.com/uber/cadence/master/docker/docker-compose.yml && docker-compose up

I encourage you to share your questions and thoughts in the comments of this post.

gft-engineering

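GFT is driving the digital transformation of the world’s leading companies. On here, our tech communities from all around the globe share their tips, tricks & insights with other developers.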
