Azure Durable Functions Part 2

Dan Cokely
7 min read · Mar 15, 2019


“empty arranged theater” by Radek Grzybowski on Unsplash

This article is an extension of a previous article where I looked at the basics of Durable Functions, which are themselves an extension of Azure Functions and part of Microsoft's serverless offerings. We'll build on those basics and assume some basic Azure Functions knowledge (documentation here). The Durable Functions documentation can be found here.

The Problem

Azure Functions allow developers to easily spin up new functionality without needing to worry about provisioning infrastructure or building out large, complex frameworks. Using triggers and bindings such as HTTP, storage, Service Bus, and many more, you can quickly integrate Functions into existing workflows or build out new ones with minimal integration and coding. This is great when Functions operate as one-off processes, or in small, isolated, or loosely coupled scenarios.

However, what happens when you have multiple Functions chained together, where each depends on the output of the previous one? What about when work across multiple Functions needs to behave like a single transaction, where if one Function fails, subsequent Functions process differently, or previous work has to be undone to keep the system consistent? In these cases, you need state that is maintained between steps (Functions), and some form of orchestration. This need for persistent process state and orchestration isn't restricted to failure scenarios; most complex systems have more than one forward path even for simple workflows. Think of how Functions could mirror business workflows, with each Function performing one step in the process. There is also the case of long-running or non-deterministic jobs, where the Functions could potentially run for hours. Keeping state in all of these scenarios requires some kind of persistent store, shared context, complex messaging capability, or all of the above.

The need to handle coordination, retrying, logging, concurrency, and so on falls to the developer when using standard Azure Functions. Often some kind of persistent store such as Cosmos, SQL, or even Blob Storage will fill the role of storing a document or record for the jobs being processed, along with any interim state. This works, but it requires you to provide access to all of the participating Functions, as well as handle concurrent access, rollback, schemas, and so on. You also need to manage calling the Functions in the correct order, via a queue or similar mechanism.

Why Durable Functions are useful

The key to Durable Functions is the Orchestrator, a special type of Function that is, as the name suggests, the orchestrator for a workflow. The Orchestrator Function is responsible for coordinating the Activity Functions (the individual worker Functions, each performing a single task) and applying a particular pattern for distributing the work (fan-out, chaining, async start-and-wait, etc.). The Orchestrator also handles errors and retries. Through the Orchestrator you can easily pass state information to each Activity Function, as well as get state and progress information back, without manually passing messages on a queue or using Cosmos or similar. The Orchestrator also has the ability to rebuild its execution state (more on this later).

The last Function type used in the Durable Functions framework is the Client Function, which can be triggered however you like and contains a binding used to start the Orchestrator Function.
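To make the roles concrete, here's a minimal skeleton of the three Function types, using the Durable Functions 1.x bindings that ship with the Functions v2 tooling used later in this article. The function names here are just placeholders:

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;

public static class DurableSkeleton
{
    // Client Function: any trigger works; here an HTTP request starts the Orchestrator.
    [FunctionName("Skeleton_HttpStart")]
    public static async Task<HttpResponseMessage> HttpStart(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestMessage req,
        [OrchestrationClient] DurableOrchestrationClient starter)
    {
        string instanceId = await starter.StartNewAsync("Skeleton_Orchestrator", null);
        return starter.CreateCheckStatusResponse(req, instanceId);
    }

    // Orchestrator Function: coordinates the Activity Functions and applies the workflow pattern.
    [FunctionName("Skeleton_Orchestrator")]
    public static async Task<string> RunOrchestrator(
        [OrchestrationTrigger] DurableOrchestrationContext context)
    {
        // Chaining pattern: the output of one Activity feeds the next step.
        string result = await context.CallActivityAsync<string>("Skeleton_Activity", "input");
        return result;
    }

    // Activity Function: performs one individual unit of work.
    [FunctionName("Skeleton_Activity")]
    public static string RunActivity([ActivityTrigger] string input)
    {
        return $"processed: {input}";
    }
}
```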

Demo

In order to better see the uses of Durable Functions, I’ll walk through a demo to show the following functionality:

  • An HTTP triggered Client Function to kick off an Orchestrator
  • An Orchestrator using Function chaining to call three Activity Functions (we cheat a little on the chaining part because we pass the same object into and out of each Function, but it still counts)
  • The ability to pass state between the Orchestrator and Activity Functions without needing to explicitly wire up a Service Bus or Cosmos
  • How the Orchestrator replays its execution when long-running or async calls to Activity Functions occur

You can skip to the Code Walkthrough section if you don't need to see the project setup.

Create a new project in VS (I'm using VS 2017 Community 15.9.4):

  • Create a new Azure Functions project
  • Select Azure Functions V2 (.NET Core) and start with an Empty project
  • Once the new project is scaffolded, right-click the project and select Add -> New Azure Function
  • Choose Azure Function and give it an appropriate name
  • Set the type to Durable Functions Orchestration

Code Walkthrough

Now that we have a working Durable Functions project, I'm going to add the following class with all of the Functions required for the demo.
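The class looks roughly like the sketch below. The Activity names ProcessPart1–3, the chaining of the shared JobState record, and the “somethingDeterministic” value all come from the walkthrough that follows; the orchestrator and client names and the exact signatures are illustrative assumptions based on the Durable Functions 1.x API (JobState itself is sketched just below):

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;

public static class JobOrchestration
{
    // Client Function: an HTTP trigger that kicks off the Orchestrator.
    [FunctionName("JobOrchestration_HttpStart")]
    public static async Task<HttpResponseMessage> HttpStart(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestMessage req,
        [OrchestrationClient] DurableOrchestrationClient starter)
    {
        string instanceId = await starter.StartNewAsync("JobOrchestration_Orchestrator", null);
        return starter.CreateCheckStatusResponse(req, instanceId);
    }

    // Orchestrator: creates the JobState record, then chains the three Activity Functions,
    // passing the same object into and out of each call.
    [FunctionName("JobOrchestration_Orchestrator")]
    public static async Task<JobState> RunOrchestrator(
        [OrchestrationTrigger] DurableOrchestrationContext context)
    {
        var job = new JobState(); // initial values: null, null, and false

        job = await context.CallActivityAsync<JobState>("ProcessPart1", job);
        job = await context.CallActivityAsync<JobState>("ProcessPart2", job);
        job = await context.CallActivityAsync<JobState>("ProcessPart3", job);

        return job;
    }

    // Each Activity Function modifies one small piece of the shared job record.
    [FunctionName("ProcessPart1")]
    public static JobState ProcessPart1([ActivityTrigger] JobState job)
    {
        job.RequiresEmail = true;
        return job;
    }

    [FunctionName("ProcessPart2")]
    public static JobState ProcessPart2([ActivityTrigger] JobState job)
    {
        job.BlobPath = "somethingDeterministic";
        return job;
    }

    [FunctionName("ProcessPart3")]
    public static JobState ProcessPart3([ActivityTrigger] JobState job)
    {
        // The walkthrough doesn't show which field this step fills in; Notes is a placeholder.
        job.Notes = "processed by part 3";
        return job;
    }
}
```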

I’ll skip over the basics of triggering the Orchestrator via the Client (available here if you want a more detailed look), and instead focus on how history and state are persisted throughout the Orchestrator lifetime. In order to illustrate this, I have created a workflow that consists of an Orchestrator calling three separate Functions that will each perform some basic task, with the overall state being held in our fake job record, JobState.cs.
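The walkthrough implies JobState is a simple POCO holding two strings and a bool (hence the initial values null, null, and false). RequiresEmail and BlobPath come from the debugging discussion below; the third field is a hypothetical placeholder:

```csharp
// JobState.cs - the fake job record passed between the Orchestrator and the Activity Functions.
public class JobState
{
    // Set to true by ProcessPart1.
    public bool RequiresEmail { get; set; }

    // Set to "somethingDeterministic" by ProcessPart2.
    public string BlobPath { get; set; }

    // Placeholder for the field ProcessPart3 populates; the article doesn't name it.
    public string Notes { get; set; }
}
```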

So to recap: the Orchestrator will create an instance of JobState, then call the three individual Activity Functions, with each one adding some data to the JobState. This lets us see both how the Durable Functions framework handles replay and state, and how to easily pass data between otherwise isolated async processes without explicitly wiring anything up.

The best way to illustrate the flow and replay functionality is through debugging. We'll start with the Orchestrator, which creates an instance of JobState with the initial values (null, null, and false).

The next step is to call the first Function, “ProcessPart1”. Notice how the Function accepts a JobState input from the Orchestrator via an ActivityTrigger. Each Function will only modify a small part of the job record.

The Function ProcessPart1 simply sets the RequiresEmail flag in our job to true.

Notice that when ProcessPart1 returns, control goes back to the Orchestrator, but it does not continue immediately from the last call. Instead, it starts over and replays the Orchestrator Function from the beginning, checking a history table to see if there is already an output value for each call; if a value exists, it loads that data instead of calling the replayed Function again.

Starting with the initialization of job, we can see it holds the default values from its creation.

The Orchestrator will now move on to the call to ProcessPart1, and notice that it continues execution past the call without hitting a breakpoint inside the ProcessPart1 Function. However, if we hover over the job record, we see that it holds the RequiresEmail flag that was set by ProcessPart1, without ProcessPart1 needing to execute again. This is the Durable Functions framework at work: it stores an execution history, allowing the Orchestrator to essentially “sleep” between invocations of long-running async jobs, then wake up, start from the top, reload its context, and move on to the next step in the process. To make this work, the framework employs a pattern known as Event Sourcing, where the state of a process is captured as a series of individual pieces of data appended into a replayable set of steps, so the object can be recreated at any point in the process (this is paraphrased and simplified, but there are some great in-depth explanations, like here). The Orchestrator is checking a history table that is maintained by the Durable Functions framework, transparent to you.
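One practical consequence of replay: any side effect written directly in the Orchestrator body, such as logging, runs again on every replay. The orchestration context exposes an IsReplaying flag you can use to guard those. A small sketch, revisiting the orchestrator from the earlier sketch (same class and usings, plus Microsoft.Extensions.Logging for ILogger):

```csharp
[FunctionName("JobOrchestration_Orchestrator")]
public static async Task<JobState> RunOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext context,
    ILogger log)
{
    var job = new JobState();

    // Orchestrator code re-runs from the top on every replay, so guard side effects
    // that should only happen on the first pass.
    if (!context.IsReplaying)
    {
        log.LogInformation("Starting a new job orchestration.");
    }

    job = await context.CallActivityAsync<JobState>("ProcessPart1", job);
    job = await context.CallActivityAsync<JobState>("ProcessPart2", job);
    job = await context.CallActivityAsync<JobState>("ProcessPart3", job);

    return job;
}
```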

The Orchestrator will now call into ProcessPart2. Similar to ProcessPart1, this Function modifies the job record, this time by setting the BlobPath to the string “somethingDeterministic”. This is an important point: because the Orchestrator relies on storing the output of each invocation as it processes, the calls should be deterministic, since the framework assumes each call would return the same response when it replays.
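To make the determinism rule concrete, here's a deliberately small, hypothetical orchestrator (assuming the same usings as the earlier sketches, plus System): the current time should come from the context, and anything genuinely non-deterministic should be pushed into an Activity Function so its result gets recorded in history. GenerateId here is a made-up Activity, not part of the demo:

```csharp
[FunctionName("DeterminismExample")]
public static async Task<string> RunDeterminismExample(
    [OrchestrationTrigger] DurableOrchestrationContext context)
{
    // Avoid DateTime.UtcNow or Guid.NewGuid() here: they return different values
    // on every replay, which no longer matches the recorded history.

    // Replay-safe current time provided by the framework:
    DateTime timestamp = context.CurrentUtcDateTime;

    // Truly non-deterministic work (random values, outbound calls, etc.) belongs in an
    // Activity Function; its result is stored in history and replayed as-is.
    Guid id = await context.CallActivityAsync<Guid>("GenerateId", null);

    return $"{timestamp:O}-{id}";
}
```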

This process of the Orchestrator calling an Activity Function and then rebuilding its progress continues as it calls ProcessPart3, now loading the data from the calls to ProcessPart1 and ProcessPart2 out of history.

And that's it! The execution has completed, and we have our final job object fully populated by the Activity Functions.

Wrap Up

One final piece to note is that we passed the JobState job into the Activity Functions and also returned JobState back to the Orchestrator. This helps show how easily shared state can move back and forth between the various Functions and the Orchestrator, but there is no requirement that the input and output types match.
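For example, a hypothetical Activity could accept the whole JobState but hand back just a string (a sketch only, not part of the demo):

```csharp
[FunctionName("SendEmail")]
public static string SendEmail([ActivityTrigger] JobState job)
{
    // Takes the full job record as input but returns only a confirmation string.
    return job.RequiresEmail ? $"email queued for {job.BlobPath}" : "no email needed";
}

// Called from the Orchestrator as:
// string confirmation = await context.CallActivityAsync<string>("SendEmail", job);
```

Thanks for reading!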
