Azure Durable Functions (part 1 — Intro)
The goal of this article is to introduce Azure Durable Functions. Starting with the basics, we’ll walk through the simplest setup. The plan is to add more articles building on this initial setup to explore more complex architectures and designs in the near future. A basic understanding of Azure Functions will be helpful.
Update 3/14/19 — I’ve added another article digging deeper into Durable Functions available here
The Serverless Architecture movement makes a lot of sense as workloads shift to cloud platforms: consumption billing, microservice-oriented and event-driven design, and the general decoupling of workloads from their associated hardware. Cloud vendors and developers alike see the power in simply writing code to perform an isolated unit of work, and then only paying when it's performing work. Isolated workers that each handle a single, independent piece of functionality can also achieve massive scale, be globally distributed, and support high availability (HA) and many other enterprise features. While it can be difficult to adapt an existing application to leverage serverless, it can be a great (and fast) way to get new apps and features off the ground.
Enter Azure Functions, one of Microsoft's services aimed at this type of application workload. Azure Functions allow you to write the functions that constitute your business logic, without most of the framework and infrastructure typically required. You are billed on consumption and have an ever-growing list of supported triggers (storage, queue, HTTP, Cosmos DB, and many more), making integration and initial configuration easy. Full documentation here.
So far, Azure Functions seem great: pay for what you use, lots of integration triggers, easy deployment, and you just write your logic and little else. So what's the catch? Well, once an application begins to scale to many Azure Functions, each performing part of a larger process (take processing a Purchase Order as an example), coordination, retrying, logging, and concurrency all become issues that need to be addressed in the system. Many times your application will require state for the processing of complex tasks (e.g. what's been processed so far, what's left, what errors have occurred, or just driving different processing paths depending on the input or intermediate results). With a single Azure Function holding all your code, life was much easier: everything was right there, and you could share and process whatever common state you needed. This is not so easy when dealing with multiple Azure Functions. They are meant to be stateless and, much like a function, take input, perform some operations, and give an output (although Azure Functions often have side effects, change state, etc., so they are rarely pure functions). What you are then left with is writing your own orchestrator/coordinator to deal with all the dependencies between multiple interconnected Azure Functions.
Enter Durable Functions
Durable Functions are essentially Azure Functions that use the Durable Functions extension to allow for new types of Functions and triggers (we’ll drop the Azure part and just call them Functions from now on). The Orchestrator Function and the Activity Function are the main ones we’ll focus on for this demo (you can find Microsoft’s explanation and documentation for Durable Functions here).
The Orchestrator Function, as the name suggests, allows for scheduling and calling Activity Functions in a repeatable and reliable way. It can chain Functions together, call multiple Functions in parallel, retry Functions, and perform many other complex coordination activities. The Activity Functions become the workers that get called to do the business logic (calling APIs, accessing/manipulating data, etc.). Behind the scenes, the Orchestrator and Activity Functions use queues and other storage structures to interact and to maintain state/history, but this is all handled for you. Your Orchestrator can compose complex business logic across multiple Functions with minimal code.
We are going to build out a demo Durable Function using Visual Studio, which chains several calls together. The Orchestrator Function will call several Activity Functions one after another, waiting for each to complete, then aggregate their responses together. To follow along, you'll need:
- Visual Studio 2017 (v 15.3 +, this demo was made with 15.8.1 Community Edition)
- Latest Azure SDK for .NET (here)
- Azure Storage Emulator (this should come with the SDK, but check just in case)
- Azure Functions Extensions (installed through Visual Studio Extensions, the version used here was 15.8.5023.0)
Create a new project:
Create V2 Function type with empty trigger and access to storage emulator:
Let's build and run our empty project to make sure everything is installed and working. You should have no build errors and see the Azure Functions startup screen, similar to:
If you get a warning about setting the Function Worker Runtime, make sure your local.settings.json file has a FUNCTIONS_WORKER_RUNTIME entry set to dotnet under the “Values” section.
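For reference, a local.settings.json for this setup typically looks something like the sketch below. The `FUNCTIONS_WORKER_RUNTIME` value of `dotnet` is the setting the warning is about for a C# project. The `UseDevelopmentStorage=true` connection string assumes you chose the storage emulator when creating the project; your file may contain other entries as well.

```json
{
  "IsEncrypted": false,
  "Values": {
    "AzureWebJobsStorage": "UseDevelopmentStorage=true",
    "FUNCTIONS_WORKER_RUNTIME": "dotnet"
  }
}
```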
Great, our Function is up and running. Now let's add the NuGet package for creating a Durable Function:
- Microsoft.Azure.WebJobs.Extensions.DurableTask (v 1.5)
NOTE: I actually got an error when trying to install the above NuGet package: a vague Newtonsoft version-mismatch error. I ended up having to upgrade Microsoft.NET.Sdk.Functions from 1.0.6 to 1.0.14, and then everything worked. Depending on your SDK/Visual Studio version you may see some strangeness; installing the latest version of both should fix it.
Next we need to add a Function so we can actually do something useful with our project. Press Ctrl+Shift+A to bring up the screen to add a new item and select Azure Function:
Hit “Add” and then set the type to Orchestrator Trigger:
This will now scaffold your “hello world” Durable Function. Let's walk through the Functions that were created.
If the example Visual Studio creates for you differs from the code highlighted below, the full example file is available here:
The Orchestration Client is how we get the process started. The key here is the DurableOrchestrationClient: the client is how you interact with Orchestrator Functions, including starting them, stopping them, getting their status, etc. Here you can see we are calling StartNewAsync on the Orchestrator Function “Function1” (NOTE: the Function names are the names that come from [FunctionName("FunctionNameHere")], not the actual method names). We can also pass input into the Orchestrator using the second parameter, but for now it's just null.
Function1_HttpStart happens to be an HTTP trigger, but it could be any type of trigger. The important part is starting the Orchestrator.
The other important part is that we are returning a status, not the finished result. Once we hit this HTTP trigger, the client starts the Orchestrator and immediately responds with URLs for checking status. That way the HTTP caller is not waiting on the wire for what may be hours while the Functions process.
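As a point of reference, here is roughly what the scaffolded client looks like with the 1.x version of the Durable Functions extension used in this article. Treat it as a sketch: the class name `Function1Client` is my own, and the template in your version of Visual Studio may differ slightly.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Extensions.Logging;

// Class name is illustrative; the template puts all three Functions in one class.
public static class Function1Client
{
    [FunctionName("Function1_HttpStart")]
    public static async Task<HttpResponseMessage> HttpStart(
        [HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")] HttpRequestMessage req,
        [OrchestrationClient] DurableOrchestrationClient starter,
        ILogger log)
    {
        // Start the orchestrator registered as "Function1".
        // The second argument is the orchestrator's input (none here).
        string instanceId = await starter.StartNewAsync("Function1", null);

        log.LogInformation($"Started orchestration with ID = '{instanceId}'.");

        // Respond immediately with the management URLs (status query, terminate, etc.)
        // instead of waiting for the orchestration to finish.
        return starter.CreateCheckStatusResponse(req, instanceId);
    }
}
```

This only compiles inside a Functions project with the Microsoft.Azure.WebJobs.Extensions.DurableTask package installed.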
Note that the trigger type of this Function is OrchestrationTrigger; it is triggered by our Client in the first Function. We are passed a DurableOrchestrationContext, which is how we call the Activity Functions being orchestrated, read the input parameter, or start a timer. In this case you can see we call Activity Functions.
outputs.Add(await context.CallActivityAsync<string>("Function1_Hello", "Tokyo"));
outputs.Add(await context.CallActivityAsync<string>("Function1_Hello", "Seattle"));
outputs.Add(await context.CallActivityAsync<string>("Function1_Hello", "London"));
Above is the real work in the Orchestrator. Using our DurableOrchestrationContext, we call the Function “Function1_Hello” three times with different inputs, then put the result of each call into our outputs list. Since we await each call before starting the next, this is an example of Function chaining. This is also where we could implement retry policies, fan out (run the calls in parallel), or add other processing flow logic.
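To make the fan-out and retry ideas concrete, here is a minimal sketch of an orchestrator that runs the same three calls in parallel with a retry policy, using the 1.x API. The names `FanOutExample` and `Function1_FanOut` are hypothetical, and the retry values are only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class FanOutExample
{
    // "Function1_FanOut" is a hypothetical name chosen for this sketch.
    [FunctionName("Function1_FanOut")]
    public static async Task<string[]> RunOrchestrator(
        [OrchestrationTrigger] DurableOrchestrationContext context)
    {
        // Up to 3 attempts per call, waiting 5 seconds before the first retry
        // (illustrative values, not a recommendation).
        var retry = new RetryOptions(TimeSpan.FromSeconds(5), 3);

        // Fan out: schedule all three activities without awaiting them one by one.
        var tasks = new List<Task<string>>();
        foreach (string city in new[] { "Tokyo", "Seattle", "London" })
        {
            tasks.Add(context.CallActivityWithRetryAsync<string>("Function1_Hello", retry, city));
        }

        // Fan in: resume once every activity has completed.
        // Results come back in the same order as the tasks list.
        return await Task.WhenAll(tasks);
    }
}
```

Compared with the chained version, the only real difference is that the tasks are collected first and awaited together, so the three Activity Functions can run at the same time.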
The Orchestrator Function, as the name says, should be the Function that calls the Activity Functions, which do the real business logic. The Orchestrator’s job is to orchestrate: calling Activities and passing inputs and outputs between them.
Orchestrator Functions should be deterministic (i.e. multiple re-runs should produce the same result). This is because, behind the scenes, the runtime replays the Orchestrator's code to rebuild its state whenever it resumes, for example after an awaited Activity completes, or after a failure or timeout. If the Orchestrator generates random values, reads the current time, or performs other non-deterministic operations, a replay can take a different path and cause problems. We can handle retry logic, try/catch, etc. in the Orchestrator as well.
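Here is a small sketch of what replay-safe orchestrator code can look like with the 1.x API. It uses the context's clock and a durable timer instead of `DateTime.UtcNow` and `Task.Delay`, and wraps the Activity call in try/catch. The names `DeterministicExample` and `Function1_Delayed`, the 30 second delay, and the fallback string are all made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class DeterministicExample
{
    // "Function1_Delayed" is a hypothetical name chosen for this sketch.
    [FunctionName("Function1_Delayed")]
    public static async Task<string> RunOrchestrator(
        [OrchestrationTrigger] DurableOrchestrationContext context)
    {
        // Avoid DateTime.UtcNow here: it would return a different value on each replay.
        // CurrentUtcDateTime returns the same value every time this line is replayed.
        DateTime startedAt = context.CurrentUtcDateTime;

        // Use a durable timer rather than Task.Delay or Thread.Sleep.
        await context.CreateTimer(startedAt.AddSeconds(30), CancellationToken.None);

        try
        {
            return await context.CallActivityAsync<string>("Function1_Hello", "Tokyo");
        }
        catch (Exception)
        {
            // The Activity failed; return an illustrative fallback value.
            return "Hello (fallback)!";
        }
    }
}
```

Anything genuinely non-deterministic, such as calling an external API or generating random values, belongs in an Activity Function, with the result passed back to the Orchestrator.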
Notice the trigger type is ActivityTrigger, meaning it's started from an Orchestrator. This is the real worker of the system: the Activity Function is the one that makes the API calls or does the processing, and actually contains the business logic. In this case it's just taking a string and returning a string back.
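For completeness, the scaffolded Activity Function looks roughly like this sketch. The class name `Function1Activity` is my own, and the exact log message in your template may differ.

```csharp
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

// Class name is illustrative; the template puts all three Functions in one class.
public static class Function1Activity
{
    [FunctionName("Function1_Hello")]
    public static string SayHello([ActivityTrigger] string name, ILogger log)
    {
        // The input is whatever the Orchestrator passed to CallActivityAsync.
        log.LogInformation($"Saying hello to {name}.");

        // The return value is handed back to the awaiting Orchestrator.
        return $"Hello {name}!";
    }
}
```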
Let's try it all out:
When you start the project, it will tell you the path that the HTTP trigger is running on.
Try the URL in your favorite REST client. I’m using Insomnia below:
Notice we get back the URLs to perform various operations against the running Durable Function, not the actual processing response. If we hit the statusQueryGetUri we can see what is happening. Paste it into a browser and let's check out our response…
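If you'd rather do this from code than from a REST client and browser, a caller could start the orchestration and poll the status URL along these lines. This is only a sketch: `StatusPoller` and `RunAsync` are hypothetical names, the one second polling interval is arbitrary, and you need to pass in the start URL that the Functions host prints at startup.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of the scaffolded project.
public static class StatusPoller
{
    public static async Task<JToken> RunAsync(HttpClient http, string startUrl)
    {
        // Hit the HTTP trigger; the response body contains the management URLs.
        HttpResponseMessage startResponse = await http.PostAsync(startUrl, null);
        startResponse.EnsureSuccessStatusCode();

        JObject management = JObject.Parse(await startResponse.Content.ReadAsStringAsync());
        string statusUrl = (string)management["statusQueryGetUri"];

        while (true)
        {
            // The status document reports runtimeStatus and, once finished, the output.
            JObject status = JObject.Parse(await http.GetStringAsync(statusUrl));
            string runtimeStatus = (string)status["runtimeStatus"];

            if (runtimeStatus == "Completed")
            {
                return status["output"];
            }
            if (runtimeStatus == "Failed" || runtimeStatus == "Terminated")
            {
                throw new InvalidOperationException($"Orchestration ended as {runtimeStatus}.");
            }

            // Still pending or running; wait a bit before asking again.
            await Task.Delay(TimeSpan.FromSeconds(1));
        }
    }
}
```

You would call it with something like `await StatusPoller.RunAsync(new HttpClient(), url)`, where `url` is the Function1_HttpStart address shown in the console.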
Success! You can see the three Activity Functions each ran and reported their values back to the Orchestrator, which aggregated them together. We could extend the Orchestrator to call another Activity Function to save these values to Cosmos DB, write a result to a status queue, or similar.
This is an extremely simple example, but it shows how the Durable Functions framework can coordinate Functions to chain logic and work together. You can build from this base to create more complex workflows. The plan is to write another article showing some more advanced flows building on this setup, so stay tuned.