A new look at Azure Durable Functions

Zone · May 5, 2020 · 7 min read

Zone’s principal architect, Andy Butland examines how the durable functions framework has matured over the past couple of years…

A couple of years ago I had the opportunity to speak, write and work with what was then a new serverless Azure technology, Durable Functions, built on top of the Azure Functions framework for the purposes of handling long-running or multi-stage tasks. As part of investigating the technology I built a couple of sample applications, using both durable functions and the standard functions framework by itself, and compared and contrasted the results.

What’s obvious is that with the durable functions approach, much of the “plumbing code” is abstracted away from you as a developer, leaving you free to concentrate on the business logic of the workflow you are looking to implement.

You can see from a “before and after” component diagram that a lot of the necessary intermediate queue and storage components are no longer an upfront concern, and we can rely more on the framework itself for the tasks of marshalling data into the functions as parameters and out again as results.

Component diagram for a multi-stage workflow implemented in a) discrete Azure Functions and b) Durable Functions.

You also get the pleasant benefit of being able to apply the single-responsibility principle to your functions, stringing together what can be a complex workflow from a series of individual components, each focused on its own specific task.

If you’re interested in reading more on this comparison between standard Azure Functions and the newer durable functions approach, please see my previous article.

A new problem to solve

What brought me back to the technology was a feature of a web application where a user-uploaded file of records was validated and applied as updates within a single request, which for larger files was leading to time-outs. While we could potentially treat the symptoms here with increased timeouts, and perhaps internal batching up of the updates, it was clear the main issue was the attempt to do all of this within one synchronous request/response cycle. A better approach, at least once we’ve detected the uploaded file is greater than a particular size, would be to switch to asynchronous processing.

Instead we’d implement the web application functionality like this:

  • User uploads the file.
  • The file contents are validated and the number of records counted.
  • If the number is less than a cut-off value, proceed with the existing synchronous upload.
  • If it’s greater though, we save the contents of the file to blob storage, send a trigger message to a queue and redirect the user to a “status pending” screen.

The workflow itself needs to:

  • Be triggered by the queue message where it can read an identifier for the process.
  • Load the data from blob storage based on that identifier.
  • Divide the data into appropriately sized chunks and initiate a “fan-out” operation, where several parallel operations run to update each chunk of records.
  • When all are complete, aggregate (“fan-in”) the results and write a summary of the results to a file in blob storage.

Finally back in the web application we have:

  • The “status pending” screen doing a client-side poll to a web application API endpoint that checks for the presence of the results summary blob file (a sketch of this endpoint follows the list).
  • If it’s not found, we stay on the page, and poll again in a few seconds.
  • If it is found, the summary results are read and the user is redirected to a confirmation screen.
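
As a rough sketch of how that polling endpoint might look, assuming an ASP.NET Core web application with a BlobServiceClient registered for injection (the controller, route and blob names here are illustrative, not taken from the real implementation):

```csharp
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/upload-status")]
public class UploadStatusController : ControllerBase
{
    private readonly BlobContainerClient _resultsContainer;

    public UploadStatusController(BlobServiceClient blobServiceClient)
    {
        _resultsContainer = blobServiceClient.GetBlobContainerClient("results");
    }

    [HttpGet("{id}")]
    public async Task<IActionResult> Get(string id)
    {
        // The workflow writes a summary blob keyed by the process identifier.
        var blob = _resultsContainer.GetBlobClient($"{id}.txt");

        var exists = await blob.ExistsAsync();
        if (!exists.Value)
        {
            return NoContent(); // Not ready yet: the client polls again shortly.
        }

        return Ok(); // Summary available: the client redirects to the confirmation screen.
    }
}
```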

These data flows are illustrated in the following diagram:

Data flow for asynchronous file upload processing.

Given the solution was already hosted in Azure, we had access to the necessary queue and blob storage facilities, and for the asynchronous processing itself, we had a reason to turn again to durable functions. In the rest of this article I’ll share a few details and code samples of the aspects I found “new and improved” since my last chance to work with the technology.

The “leaky abstraction” is plugged

When I previously worked with the technology, this abstraction could leak: if the data passed between functions grew beyond what a queue message could hold, it was down to you as the developer to serialise, compress or store it elsewhere. What I found really nice this time round was that the abstraction of the storage components really does hold, with the framework seamlessly using compression and switching from queues to other forms of storage as the data size requires. This means there’s no need for manual serialisation and compression, and you can truly forget about the plumbing parts of the workflow.

The general flow of a durable orchestration workflow is:

  • A trigger function (in our case initiated by a queue message).
  • This calls an orchestration function, which is responsible for organising the workflow.
  • Which it does by calling one or more activity functions that carry out the individual tasks.

In the code samples below, you can see an extract of the implementation, where an orchestration is triggered, passing a strongly typed parameter.
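
As a minimal sketch of what that looks like (the function and type names here, such as UploadProcessInput and ProcessUploadOrchestration, are illustrative rather than taken from the real implementation):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;

// Hypothetical types for the workflow's inputs and results.
public class UploadProcessInput { public string UploadId { get; set; } }
public class RecordUpdate { public string Id { get; set; } }
public class ChunkResult { public int Updated { get; set; } }

public class ProcessUploadFunctions
{
    // Trigger function: initiated by a queue message, starting the
    // orchestration with a strongly typed input.
    [FunctionName("ProcessUploadTrigger")]
    public async Task Trigger(
        [QueueTrigger("upload-requests")] UploadProcessInput input,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        var instanceId = await starter.StartNewAsync("ProcessUploadOrchestration", input);
        log.LogInformation($"Started orchestration with ID '{instanceId}'.");
    }

    // Orchestration function: loads the data, fans out over chunks of
    // records, then fans back in to write the summary.
    [FunctionName("ProcessUploadOrchestration")]
    public async Task Orchestrate(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        var input = context.GetInput<UploadProcessInput>();

        var records = await context
            .CallActivityAsync<List<RecordUpdate>>("LoadRecords", input);

        // Fan-out: one parallel activity call per chunk of records.
        var tasks = SplitIntoChunks(records, 100)
            .Select(chunk => context.CallActivityAsync<ChunkResult>("UpdateRecords", chunk))
            .ToList();

        // Fan-in: wait for all chunks to complete, then write the results summary.
        var results = await Task.WhenAll(tasks);
        await context.CallActivityAsync("WriteSummary", results);
    }

    private static IEnumerable<List<RecordUpdate>> SplitIntoChunks(
        List<RecordUpdate> records, int size)
    {
        for (var i = 0; i < records.Count; i += size)
        {
            yield return records.GetRange(i, Math.Min(size, records.Count - i));
        }
    }
}
```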

Dependency injection

When I last worked with Azure Functions there was no built-in support for dependency injection. To get around this, I’d use a kind of “poor man’s DI”, where you instantiate components in the function, and then pass them in to a separate class via its constructor, where the majority of the work happened. This was useful to support testing to an extent, as you could validate the class behaviour using mocked dependencies, but it wasn’t particularly elegant.

As of Azure Functions 2.0 though, the dependency injection pattern is now fully supported.

The registration of dependencies is carried out in a special class decorated such that it will run on startup. Here you can see we’re registering a named HttpClient as well as a service for accessing blob storage, using environment variables for configuration.
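
A sketch of such a startup class follows; the IBlobService abstraction (defined later in this article) and the configuration setting names are assumptions for illustration:

```csharp
using System;
using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection;

[assembly: FunctionsStartup(typeof(MyFunctionApp.Startup))]

namespace MyFunctionApp
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            // A named HttpClient, with its base address taken from an environment variable.
            builder.Services.AddHttpClient("RecordsApi", client =>
            {
                client.BaseAddress = new Uri(
                    Environment.GetEnvironmentVariable("RecordsApiBaseUrl"));
            });

            // A hypothetical service wrapping access to blob storage (an
            // implementation around the storage SDK is assumed).
            builder.Services.AddSingleton<IBlobService>(_ =>
                new BlobService(Environment.GetEnvironmentVariable("StorageConnectionString")));
        }
    }
}
```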

Then in the activity functions themselves, we can use constructor injection to get a concrete instance at run-time. First for the HttpClient:
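
Sketched again with illustrative names, the named client is resolved via an injected IHttpClientFactory:

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public class UpdateRecordsFunction
{
    private readonly HttpClient _httpClient;

    public UpdateRecordsFunction(IHttpClientFactory httpClientFactory)
    {
        // Resolve the named client registered in Startup.
        _httpClient = httpClientFactory.CreateClient("RecordsApi");
    }

    [FunctionName("UpdateRecords")]
    public async Task<ChunkResult> Run(
        [ActivityTrigger] IDurableActivityContext context)
    {
        var chunk = context.GetInput<List<RecordUpdate>>();

        // Post each record update to the external API (illustrative endpoint).
        foreach (var record in chunk)
        {
            await _httpClient.PostAsync($"records/{record.Id}", content: null);
        }

        return new ChunkResult { Updated = chunk.Count };
    }
}
```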

And then for the blob service:
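
Along similar lines, with IBlobService being the hypothetical abstraction registered in Startup:

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

// Hypothetical abstraction over blob storage, registered in Startup.
public interface IBlobService
{
    Task UploadTextAsync(string container, string name, string content);
}

public class WriteSummaryFunction
{
    private readonly IBlobService _blobService;

    public WriteSummaryFunction(IBlobService blobService)
    {
        _blobService = blobService;
    }

    [FunctionName("WriteSummary")]
    public async Task Run([ActivityTrigger] IDurableActivityContext context)
    {
        var results = context.GetInput<ChunkResult[]>();
        var summary = $"Updated {results.Sum(r => r.Updated)} records.";

        // Write the summary file that the web application polls for
        // (in practice the blob would be keyed by the process identifier).
        await _blobService.UploadTextAsync("results", "summary.txt", summary);
    }
}
```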

Unit testing

With dependencies injected rather than instantiated inline, each type of function in the workflow becomes straightforward to unit test. Firstly, the trigger function. This one usually isn’t doing much, but we can at least verify that we’re calling the appropriate orchestration and providing the correct input data:
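
A sketch of such a test, assuming xUnit and Moq and reusing the illustrative types from earlier:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Extensions.Logging;
using Moq;
using Xunit;

public class TriggerFunctionTests
{
    [Fact]
    public async Task Trigger_StartsOrchestration_WithProvidedInput()
    {
        // Arrange: mock the durable client the trigger depends on.
        var input = new UploadProcessInput { UploadId = "abc" };
        var client = new Mock<IDurableOrchestrationClient>();

        // Act
        var function = new ProcessUploadFunctions();
        await function.Trigger(input, client.Object, Mock.Of<ILogger>());

        // Assert: the correct orchestration was started with the correct input.
        client.Verify(
            c => c.StartNewAsync("ProcessUploadOrchestration", input),
            Times.Once);
    }
}
```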

Then the orchestration. The responsibility of the orchestration function in a workflow is not to do any work with external services itself, and in fact there are code constraints you must adhere to in order to ensure it doesn’t; rather, it calls the appropriate activity functions, passing the necessary input data and collating the results. We can write a test ensuring that it’s doing this, such as in this simplified example:
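
Again assuming Moq, this time mocking the orchestration context rather than the client:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Moq;
using Xunit;

public class OrchestrationFunctionTests
{
    [Fact]
    public async Task Orchestration_FansOutOverChunks_AndWritesSummary()
    {
        // Arrange: mock the context to supply the input and stub the activity calls.
        var context = new Mock<IDurableOrchestrationContext>();
        context.Setup(c => c.GetInput<UploadProcessInput>())
            .Returns(new UploadProcessInput { UploadId = "abc" });
        context.Setup(c => c.CallActivityAsync<List<RecordUpdate>>("LoadRecords", It.IsAny<object>()))
            .ReturnsAsync(new List<RecordUpdate> { new RecordUpdate(), new RecordUpdate() });
        context.Setup(c => c.CallActivityAsync<ChunkResult>("UpdateRecords", It.IsAny<object>()))
            .ReturnsAsync(new ChunkResult { Updated = 2 });
        context.Setup(c => c.CallActivityAsync("WriteSummary", It.IsAny<object>()))
            .Returns(Task.CompletedTask);

        // Act
        var function = new ProcessUploadFunctions();
        await function.Orchestrate(context.Object);

        // Assert: the summary activity was called once all chunks completed.
        context.Verify(c => c.CallActivityAsync("WriteSummary", It.IsAny<object>()), Times.Once);
    }
}
```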

Finally, the activity functions themselves. In a similar way we can mock the durable activity context, along with any injected dependencies, and verify that the function operates as it should:
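
For example, for the blob-writing activity sketched earlier:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Moq;
using Xunit;

public class ActivityFunctionTests
{
    [Fact]
    public async Task WriteSummary_UploadsSummaryBlob()
    {
        // Arrange: mock the activity context and the injected blob service.
        var context = new Mock<IDurableActivityContext>();
        context.Setup(c => c.GetInput<ChunkResult[]>())
            .Returns(new[] { new ChunkResult { Updated = 100 } });

        var blobService = new Mock<IBlobService>();
        var function = new WriteSummaryFunction(blobService.Object);

        // Act
        await function.Run(context.Object);

        // Assert: a summary was written to blob storage.
        blobService.Verify(
            b => b.UploadTextAsync("results", "summary.txt", It.IsAny<string>()),
            Times.Once);
    }
}
```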

Retry logic

One thing we lose in moving from a single queue-triggered function to a durable workflow is the automatic retry behaviour, whereby a message that fails processing is returned to the queue to be picked up again. We do however have some control over retry logic within the durable functions workflow itself.

Firstly, we can call activity functions with retry behaviour, using an object indicating how many times we’d like to retry, and with what delay:
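
As a fragment within the orchestration function, reusing names from the earlier sketch:

```csharp
// Make up to three attempts in total, waiting five seconds before the first retry.
var retryOptions = new RetryOptions(
    firstRetryInterval: TimeSpan.FromSeconds(5),
    maxNumberOfAttempts: 3);

var result = await context.CallActivityWithRetryAsync<ChunkResult>(
    "UpdateRecords", retryOptions, chunk);
```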

And we can use try/catch logic, with exceptions bubbling up to the orchestration function, where, if appropriate, the FunctionFailedException can be caught and compensatory actions can be made.
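
Along these lines, with RecordFailure being a hypothetical compensating activity:

```csharp
try
{
    await context.CallActivityWithRetryAsync<ChunkResult>(
        "UpdateRecords", retryOptions, chunk);
}
catch (FunctionFailedException)
{
    // All attempts were exhausted; take a compensating action instead.
    await context.CallActivityAsync("RecordFailure", chunk);
}
```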

To restore the behaviour of a single Azure function, where the message is put back on the queue for processing again, it would be necessary to take specific action to put a new copy of the message on the queue. If the message keeps failing though, you wouldn’t want to keep processing it indefinitely; normally it would be moved after a few attempts to a “poison queue” for further, often manual, handling. As far as I can see though, the dequeue count of a message can only be read, not set, hence some custom means of tracking the number of attempts, via the message content or a header, would be needed.

Summary

Coming back to durable functions a couple of years on, the framework has clearly matured: the abstraction over the underlying storage now holds even as data sizes grow, dependency injection is fully supported as of Azure Functions 2.0, and with it each type of function, trigger, orchestration and activity, is straightforward to unit test. Together with the control available over retry behaviour, that made it a good fit for moving our large file uploads over to asynchronous processing.
