Asynchronous workflow pattern

The asynchronous workflow pattern, also known as the publish-subscribe pattern, is an architecture pattern typically used to perform resource-intensive and time-consuming tasks asynchronously. To separate the request from the task itself, the sender puts messages on a queue, and another service picks them up and does the work.
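As a minimal sketch of this separation, the snippet below uses Python's in-process `queue.Queue` as a stand-in for a real message queue. The names `send`, `receiver` and `task_queue` are assumptions made for illustration; they do not come from any particular queuing product.

```python
import queue
import threading

task_queue = queue.Queue()  # stand-in for a real message queue
results = []

def send(message):
    """Sender side: enqueue the message and return immediately."""
    task_queue.put(message)

def receiver():
    """Receiver side: pick up messages until the stop signal (None) arrives."""
    while True:
        message = task_queue.get()
        if message is None:
            break
        # Placeholder for the resource-intensive task.
        results.append(f"processed {message}")

worker = threading.Thread(target=receiver)
worker.start()

send("video-1")
send("video-2")
task_queue.put(None)  # tell the receiver to stop
worker.join()

print(results)
```

The sender never waits for the task to finish; it only knows about the queue, not about the service that eventually does the work.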

This pattern can be seen as a subset of the CQRS (Command Query Responsibility Segregation) pattern. CQRS defines a clear separation between a command model and a query model [MF-CQRS], whereas the asynchronous workflow pattern only defines a command model and does not prescribe how the result of a command is read.

Taking a queue, a sender and a receiver as a basis, we can distinguish between the following variations:

  • Many-to-one (also called fan-in). There are multiple senders but only one receiver.
  • One-to-many (also called fan-out). There is one sender, and one or more receivers.
  • Many-to-many. There are multiple senders and multiple receivers working through the same queue.

Figure: Multiple publishers send messages to the topic (queue), with multiple subscribers reading from the same topic. (Image taken from the AWS website.)
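The variations above can be sketched with one shared in-process queue, assuming each message is handled by exactly one receiver. The function `run_many_to_many` and its parameters are hypothetical names chosen for this example.

```python
import queue
import threading

def run_many_to_many(num_senders=3, num_receivers=2, per_sender=4):
    """Run several senders and receivers against one shared queue."""
    shared = queue.Queue()
    handled = []                # (receiver_id, message) pairs
    lock = threading.Lock()     # protects the shared result list

    def sender(sender_id):
        for n in range(per_sender):
            shared.put((sender_id, n))

    def receiver(receiver_id):
        while True:
            message = shared.get()
            if message is None:     # stop signal
                break
            with lock:
                handled.append((receiver_id, message))

    receivers = [threading.Thread(target=receiver, args=(i,))
                 for i in range(num_receivers)]
    senders = [threading.Thread(target=sender, args=(i,))
               for i in range(num_senders)]
    for thread in receivers + senders:
        thread.start()
    for thread in senders:
        thread.join()
    for _ in receivers:
        shared.put(None)            # one stop signal per receiver
    for thread in receivers:
        thread.join()
    return handled

handled = run_many_to_many()
print(len(handled))
```

Calling it with `num_receivers=1` gives the many-to-one case, and `num_senders=1` gives the one-to-many case. Which receiver handles which message depends on thread scheduling, so only the set of handled messages is predictable.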

When to use and why is this useful?

We can use this pattern in any scenario where certain tasks need to be done asynchronously. Some examples:

  • Video or image service: A user uploads a video to a video platform service (like YouTube). The original video gets stored somewhere and a message is put on the processing queue. A backend service picks up the message and optimizes the video. When done, the video is made available for the user to view.
  • Webshop: A user finds an item to buy and places an order. A record for the order is saved and a message is put on the order queue. The order service picks up the message from the queue for further processing.

This separation is useful for several reasons:

  • We get loosely coupled services. The senders become less dependent on the receivers. In the webshop example, this means in practice that we can replace any processing service at the backend (e.g. fulfillment or payment) independently of the sender in the front-end layer, and vice versa.
  • We get a responsive user experience. Take the video service as an example. When a user uploads a video, we can immediately show a message that the video is being processed and report back as soon as the video has been processed.
  • We’re able to parallelize work with multiple receivers. This means that we can scale our solution, provided that the messages on the queue can be processed independently of each other. In the example of the video service, we can spin up multiple processes in parallel, each taking care of one uploaded video.
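The responsive user experience from the video example can be sketched as follows. This is an illustration under assumptions: `upload`, `optimizer` and the `status` dictionary are hypothetical, and an in-process queue stands in for the processing queue.

```python
import queue
import threading

status = {}                       # video_id -> "processing" or "ready"
processing_queue = queue.Queue()  # stand-in for the processing queue

def upload(video_id):
    """Hypothetical upload handler: record status, enqueue, return at once."""
    status[video_id] = "processing"
    processing_queue.put(video_id)
    return status[video_id]

def optimizer():
    """Hypothetical backend service that optimizes uploaded videos."""
    while True:
        video_id = processing_queue.get()
        if video_id is None:      # stop signal
            break
        # The expensive optimization would happen here.
        status[video_id] = "ready"

# The user gets a reply before any processing has taken place.
immediate_reply = upload("holiday.mp4")

worker = threading.Thread(target=optimizer)
worker.start()
processing_queue.put(None)
worker.join()

final_status = status["holiday.mp4"]
print(immediate_reply, final_status)
```

The upload handler only records the request and returns, so the time the user waits does not depend on how long the optimization takes.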

This pattern is not a good fit if you need requests and responses to be synchronous. It is also of little use if you only perform reads (e.g. a movie database).

Additional challenges to take into account

Implementing the asynchronous workflow pattern can make the overall architecture more complex. The following cases in particular need attention.

Service failures

What if one of the tasks that is picked up by a receiver fails for some reason? If we take the example of a video platform, someone might upload a video that is larger than you anticipated, or maybe an unexpected error occurs while trying to encode the video in a different format. Or maybe the hardware the process is running on dies unexpectedly. In all these cases you need to think about how to deal with such errors. Fortunately, we can use the Node failure pattern for failure detection and recovery when subscribers fail to perform tasks.
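One common way to deal with failing tasks is to retry a limited number of times and then park the message on a separate dead-letter list. The sketch below illustrates that idea only; it is not claimed to be what the Node failure pattern prescribes, and `encode`, `drain` and `MAX_ATTEMPTS` are assumptions made for this example.

```python
import queue

MAX_ATTEMPTS = 3  # assumed retry limit

def encode(video):
    """Hypothetical task that fails for videos flagged as corrupt."""
    if video.get("corrupt"):
        raise ValueError(f"cannot encode {video['id']}")
    return f"encoded {video['id']}"

def drain(tasks, dead_letters):
    """Process all tasks, retrying failures up to MAX_ATTEMPTS times."""
    done = []
    while not tasks.empty():
        video, attempts = tasks.get()
        try:
            done.append(encode(video))
        except ValueError:
            if attempts + 1 < MAX_ATTEMPTS:
                # Put the message back for another attempt.
                tasks.put((video, attempts + 1))
            else:
                # Give up and keep the message for later inspection.
                dead_letters.append(video)
    return done

tasks = queue.Queue()
tasks.put(({"id": "a"}, 0))
tasks.put(({"id": "b", "corrupt": True}, 0))

dead_letters = []
done = drain(tasks, dead_letters)
print(done, dead_letters)
```

Keeping failed messages instead of dropping them means the upload is not lost and the cause of the failure can be investigated afterwards.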

Inefficient resource use

If we have a fixed number of receivers waiting for tasks to pick up, there can be periods of low message volume during which the receivers sit idle. On the other hand, we might get such a surge of messages that the available receivers cannot keep up. In both cases, we can make use of the Automatic Scaling pattern to scale the number of receivers automatically based on the workload.
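A scaling decision of this kind can be as simple as deriving the number of receivers from the backlog on the queue. The function below is a rough sketch with assumed names and limits (`desired_receivers`, `per_receiver`, `minimum`, `maximum`); real autoscalers typically use more signals than queue depth alone.

```python
import math

def desired_receivers(backlog, per_receiver, minimum=1, maximum=10):
    """Return how many receivers to run for the current backlog.

    backlog: number of messages waiting on the queue
    per_receiver: messages one receiver can handle per scaling interval
    """
    if per_receiver <= 0:
        raise ValueError("per_receiver must be positive")
    needed = math.ceil(backlog / per_receiver)
    # Clamp between the configured lower and upper bounds.
    return max(minimum, min(maximum, needed))

print(desired_receivers(23, 5))
```

The lower bound keeps at least one receiver available during quiet periods, and the upper bound caps the cost during a surge.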

Managed message queuing services in the cloud

We can use managed message queuing services from a cloud provider to apply this architectural pattern. A managed solution has the advantage that the provider takes care of operating the queue infrastructure for us. Another benefit is that such a service integrates more easily with other services in the same cloud.

What’s up next?

In one of the next posts in this series, we’ll discuss how to deal with the problems described above when using the asynchronous workflow pattern. We’ll also compare the managed queuing services of the major cloud providers (GCP, AWS, Azure and Alibaba Cloud) in more depth, with concrete implementations.

Stay tuned. More will follow soon.