Serverless Test Drive: Orchestrating Azure Functions With Python

Matt Hergott
Sep 11, 2020 · 8 min read

One of the top trends in cloud computing is serverless computing, which allows organizations to access powerful cloud services through code without managing the underlying infrastructure. This allows businesses and governments to focus on functionality and analytics rather than on provisioning virtual machines and administering the software stacks on those machines.

Function as a Service (FaaS) is a serverless offering in which developers can publish code to a cloud provider and have the functions run as needed. Popular cloud functions services include AWS Lambda, Azure Functions, and Google Cloud Functions.

There are many benefits to serverless functions, including easy deployment, automatic scaling, and reduced cost since a function typically incurs charges only while it’s in use.

FaaS functions are usually short-lived and lightweight; they perform a task and then disappear. What if we want to combine Azure Functions into a larger system that achieves a more complex goal? We can now do this easily in Python with Azure Durable Functions. Durable Functions allow us to orchestrate our code into sophisticated patterns that can complete jobs far beyond the ability of individual FaaS functions.

The GitHub repository linked at the end of this article uses the new Python extension for Azure Durable Functions to deploy an advanced regime-switching regression model written entirely in Python. It coordinates a collection of smaller functions into a greater whole that estimates a regime-switching model and produces results in the form of charts, JSON files, and email notifications.

Durable Functions in Python

Azure Durable Functions for the C# programming language became widely available on May 7, 2018, with support for JavaScript and F# coming thereafter.

Microsoft released function orchestration in Python on June 24, 2020. This is a public preview, meaning that users can work with it while it is in development. A preview version doesn’t have the standard Azure warranties and service-level agreements.

The Azure team continues to add more options for using Python. For instance, nested orchestrations became available on July 28, 2020, although the durable entities described below are not yet supported.

Orchestrated Patterns

Azure Durable Functions are triggered by a client function, which starts the broader choreography by calling an orchestrator function. The orchestrator organizes the arrangement of the activity functions, which do most of the work. Another type of Durable Function is the entity function, which holds a state and can change that state as additional information arrives.

There are several common orchestration patterns:

Chaining: This runs Azure Functions in a sequence where each function begins after the previous one finishes. This differs from traditional sequential programming because each activity function can be its own Azure Function instance, and the rest of the architecture scales to zero (goes to sleep) while an individual component works on its task.

Fan out/fan in: This pattern runs a series of Azure Functions in parallel. It speeds up response time by running independent tasks concurrently. For example, a large data transformation job can be split up into many pieces that get processed at the same time.

Async HTTP: This exposes a status endpoint that indicates when a job finishes. This eliminates the need to hold an HTTP connection open for an extended time, which is an unreliable way to communicate over the internet.

Monitor: This is a long-running Durable Function that runs at intervals to check on the status of another process and take corrective action if needed.

Watcher: Similar to the monitor pattern, this is an Azure Function that runs at intervals and completes an assignment, such as reading data from a website.

External interaction: A Durable Function architecture can scale to zero while it waits for information from an external source, such as a human’s manual input. After receiving the information, the Durable Function wakes up to carry on with its work.

Aggregator: This is a Durable Function that receives a stream of data and organizes the information. This pattern makes use of entity functions to create stateful durable entities.

Durable entities are not yet available in Python, although Microsoft is accepting feedback about how Azure users would want to use this pattern.

Sub-orchestrations: Orchestrated designs can be organized into nested sub-patterns. This increases the range of possibilities for what can be accomplished with Azure Functions.
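The chaining and fan-out/fan-in patterns above can be sketched with plain Python generators. This is a minimal local illustration, not the Azure SDK: LocalContext and run are hypothetical stand-ins written for this example, and the activity names (clean, sort, total) are invented. Only the method names call_activity and task_all are chosen to mirror the orchestration context in the Durable Functions Python library, where an orchestrator likewise yields tasks and receives their results. The stand-in runs fanned-out tasks one after another; in Azure they could run on separate function instances.

```python
from typing import Any, Callable, Dict, Generator, List, Optional


# Plain functions standing in for activity functions (hypothetical names).
def clean(values: List[Optional[float]]) -> List[float]:
    return [v for v in values if v is not None]


def sort_values(values: List[float]) -> List[float]:
    return sorted(values)


def total(values: List[float]) -> float:
    return sum(values)


ACTIVITIES: Dict[str, Callable[[Any], Any]] = {
    "clean": clean,
    "sort": sort_values,
    "total": total,
}


class LocalContext:
    """Hypothetical stand-in for an orchestration context (not the Azure class)."""

    def call_activity(self, name: str, input_: Any = None) -> Callable[[], Any]:
        # Return a deferred call; the driver below decides when it runs.
        return lambda: ACTIVITIES[name](input_)

    def task_all(self, tasks: List[Callable[[], Any]]) -> Callable[[], List[Any]]:
        # Fan in: collect every result. Sequential here, parallel in Azure.
        return lambda: [task() for task in tasks]


def run(orchestrator: Callable[[LocalContext], Generator]) -> Any:
    """Drive an orchestrator generator: run each yielded task, send back its result."""
    steps = orchestrator(LocalContext())
    try:
        task = next(steps)
        while True:
            task = steps.send(task())
    except StopIteration as finished:
        return finished.value


def chain(context: LocalContext):
    # Chaining: each activity starts only after the previous one returns.
    cleaned = yield context.call_activity("clean", [3.0, None, 1.0, 2.0])
    ordered = yield context.call_activity("sort", cleaned)
    return (yield context.call_activity("total", ordered))


def fan_out_fan_in(context: LocalContext):
    # Fan out: schedule one activity per chunk, then wait for all of them.
    chunks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    tasks = [context.call_activity("total", chunk) for chunk in chunks]
    partial_sums = yield context.task_all(tasks)
    return sum(partial_sums)


print(run(chain))           # 6.0
print(run(fan_out_fan_in))  # 21.0
```

The generator style is what lets the orchestrator pause at each yield; in the hosted service that pause is the point at which the orchestration can go idle while an activity does its work.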

Billing Plans

This project uses Azure Durable Functions on the Consumption plan, in which ephemeral functions appear and disappear, and they only accumulate costs while they are running.

There are different function plans available on Azure. For instance, if a service has a predictable demand, the Premium or App Service plans can ensure that there are always virtual machines ready to respond quickly. These plans can also accommodate more conventional long-running programs.

The number of function calls is one factor in pricing for the Consumption plan, and a Durable Functions architecture likely increases the number of function requests. For high-volume applications, this might make an alternative billing plan look more appealing.

There are other costs besides function execution. For instance, Durable Functions communicate with each other through messages, and these messages may be quite large for some applications. There are several ways to purge instance history to avoid unnecessary storage charges.
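To make the effect of extra function calls concrete, here is a rough back-of-the-envelope calculator. It is only a sketch: it assumes a bill made of a per-execution charge plus a per-GB-second charge, each with an optional free allowance, and it ignores storage and messaging costs. The function name and every rate passed to it are placeholders supplied by the reader, not quoted Azure prices.

```python
def consumption_cost(
    executions: int,
    gb_seconds: float,
    price_per_million_executions: float,
    price_per_gb_second: float,
    free_executions: int = 0,
    free_gb_seconds: float = 0.0,
) -> float:
    """Estimate a monthly bill from execution count and memory-time (assumed model)."""
    billable_executions = max(0, executions - free_executions)
    billable_gb_seconds = max(0.0, gb_seconds - free_gb_seconds)
    execution_charge = billable_executions / 1_000_000 * price_per_million_executions
    memory_charge = billable_gb_seconds * price_per_gb_second
    return execution_charge + memory_charge


# Placeholder rates: look up current prices before relying on any number here.
RATES = dict(price_per_million_executions=0.20, price_per_gb_second=0.000016)

jobs = 1_000_000
# One monolithic function per job versus an orchestration that makes
# twelve function calls per job (client, orchestrator, ten activities).
single = consumption_cost(jobs * 1, gb_seconds=500_000, **RATES)
orchestrated = consumption_cost(jobs * 12, gb_seconds=500_000, **RATES)
print(f"single: {single:.2f}  orchestrated: {orchestrated:.2f}")
```

Holding memory-time fixed, the only difference between the two estimates is the execution count, which is the factor the paragraph above warns about.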

Regime-switching Models

In a traditional linear regression, the coefficients on the independent (x) variables are constant throughout the sample. But many times, a sequence has distinct regimes in which the influence of the independent variables on the dependent (y) variable changes.

If these regimes are predictable, such as day vs. night, or if these changes develop consistently through time, it’s possible to capture these regimes in a standard regression by having the coefficients change dynamically based on other factors (such as time).

Regime-switching models are useful for unpredictable regime changes, such as what we see in weather, stock market volatility, and sentiment analysis. This topic is important enough that it’s inspired the development of regime-switching neural networks.

The accompanying GitHub repository is an Azure Durable Functions project that calculates the regime-switching model of James D. Hamilton. [1, 2] There has been a great deal of innovation in these techniques since Hamilton’s initial publications, and a web search for endogenous regime-switching models gives examples of more recent work. The original Hamilton model does a decent job of finding distinct regimes, and it makes a good demonstration of Azure Durable Functions.
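For readers who prefer notation, the contrast between a standard regression and a regime-switching one can be written as follows. This is a common textbook form of a Markov-switching regression in the spirit of Hamilton [1, 2]; the symbols are chosen here for illustration, and the exact specification in the repository (for example, whether the error variance also switches) may differ.

```latex
% Standard linear regression: one coefficient vector for the whole sample
y_t = x_t^{\top} \beta + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2)

% Regime-switching regression: coefficients depend on an unobserved state s_t
y_t = x_t^{\top} \beta_{s_t} + \varepsilon_t, \qquad
\varepsilon_t \sim N(0, \sigma_{s_t}^2), \qquad s_t \in \{1, \dots, K\}

% The state follows a first-order Markov chain with transition probabilities
\Pr(s_t = j \mid s_{t-1} = i) = p_{ij}, \qquad \sum_{j=1}^{K} p_{ij} = 1
```

Because the state s_t is never observed, estimation has to infer both the coefficients and the probability of each regime at each point in time.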

Azure Architecture

The client function for this application receives an HTTP request that contains settings for the model and the data to be analyzed. It’s possible to send such a request directly to an Azure Function App, but this program uses Azure API Management. Azure’s API service offers many benefits, such as throttling, input/output transformations, authentication and authorization, caching, and a developer portal where users can register subscriptions.

The API gateway for this program passes the request information to the client function, which validates the data and returns an immediate response. If the data fails validation, this reply will contain a description of the error. Otherwise, this message will contain a list of URLs where the outputs will be located.
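The validate-then-reply step can be sketched as below. This is an illustration under stated assumptions rather than the repository's code: the field names (y, regimes), the status codes, the output file extensions, and the storage account name in the base URL are all hypothetical. What it takes from the description above is the shape of the reply, either an error description or a list of URLs where outputs will appear.

```python
import uuid
from typing import Any, Dict, List, Tuple

# Hypothetical storage account; "results" is the container described in this article.
RESULTS_BASE_URL = "https://examplestorage.blob.core.windows.net/results"


def is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_request(payload: Dict[str, Any]) -> Tuple[int, Dict[str, Any]]:
    """Return (status_code, body) for an incoming model request."""
    errors: List[str] = []

    y = payload.get("y")
    if not isinstance(y, list) or len(y) < 2:
        errors.append("'y' must be a list with at least two observations")
    elif not all(is_number(v) for v in y):
        errors.append("'y' must contain only numbers")

    regimes = payload.get("regimes")
    if not isinstance(regimes, int) or isinstance(regimes, bool) or regimes < 2:
        errors.append("'regimes' must be an integer of 2 or more")

    if errors:
        return 400, {"error": "; ".join(errors)}

    # Name the outputs with a UUID so the caller knows where to look later.
    job_id = str(uuid.uuid4())
    outputs = [f"{RESULTS_BASE_URL}/{job_id}.{ext}" for ext in ("json", "png")]
    return 202, {"job_id": job_id, "outputs": outputs}


print(validate_request({"y": [1.0, "oops"], "regimes": 3}))
```

Replying before the model runs is what keeps the HTTP exchange short; the heavy work starts only after the caller already has the output locations.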

After passing validation, the client function calls the orchestrator, which initiates multiple instances of the activity function that estimates the regime-switching model. This is the fan-out/fan-in pattern, and it’s helpful because this model uses a specialized iterative procedure created by James D. Hamilton. The algorithm is fast, but this implementation sometimes takes a suboptimal path and returns an inadequate result. Running this routine several times in parallel lets us select a high-quality optimization without sacrificing response time.
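The idea of running an estimator several times and keeping the best result can be shown with a toy problem. In this sketch a simple gradient climb on a made-up function with one local and one global maximum stands in for the model estimation, and a thread pool stands in for parallel activity functions. None of it is Hamilton's algorithm or the repository's code; the function names and starting values are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def objective(x: float) -> float:
    # Toy stand-in for a log-likelihood with one local and one global maximum.
    return -(x * x - 1.0) ** 2 + 0.3 * x


def gradient(x: float) -> float:
    # Derivative of the toy objective above.
    return -4.0 * x * (x * x - 1.0) + 0.3


def estimate(start: float, step: float = 0.05, iterations: int = 200) -> Dict[str, float]:
    """One 'activity': climb from a starting value to the nearest maximum."""
    x = start
    for _ in range(iterations):
        x += step * gradient(x)
    return {"start": start, "x": x, "value": objective(x)}


def best_of(starts: List[float]) -> Dict[str, float]:
    # Fan out one run per starting value, fan in, keep the highest objective.
    with ThreadPoolExecutor(max_workers=len(starts)) as pool:
        results = list(pool.map(estimate, starts))
    return max(results, key=lambda result: result["value"])


best = best_of([-1.5, -0.5, 0.5, 1.5])
print(best)
```

Runs that start on the negative side settle on the lower peak, so a single unlucky start would return an inferior answer; taking the maximum across runs avoids that, and because the runs are concurrent the wall-clock time is roughly that of one run.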

After calculating the regime-switching model, the orchestrator creates a second fan-out/fan-in pattern in which certain time-consuming tasks, such as calculating t-statistics, are performed in parallel. Running these jobs concurrently reduces the time it takes to deliver a result.

The orchestrator then calls an activity function that saves the final output to Azure Blob storage. After everything is completed, the orchestrator calls an optional activity function that emails the customer notifying them that the output files are ready. (Alternatively, the customer can poll the URLs from the initial HTTP response until the files appear.)
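On the customer's side, polling for an output file might look like the following sketch. It assumes the output URL answers an HTTP HEAD request with status 200 once the file exists; that behavior, the function names, and the default interval and timeout are assumptions for illustration. The existence check and the sleep function are parameters so the loop can be exercised without a network.

```python
import time
import urllib.request
from typing import Callable


def url_exists(url: str) -> bool:
    """Return True if a HEAD request to the URL succeeds with status 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except OSError:
        # urllib's HTTP and network errors derive from OSError.
        return False


def wait_for_output(
    url: str,
    exists: Callable[[str], bool] = url_exists,
    interval_s: float = 5.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the URL is available; give up once the timeout would be exceeded."""
    deadline = time.monotonic() + timeout_s
    while True:
        if exists(url):
            return True
        if time.monotonic() + interval_s > deadline:
            return False
        sleep(interval_s)


# Usage, with one of the URLs returned by the initial response:
# ready = wait_for_output(output_url)
```

A fixed interval keeps the example short; a real client might lengthen the wait between attempts to reduce the number of requests.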

Requirements

To install and run this application in Azure, one needs an Azure account, an Azure storage account to host the function app and the results, and an API Management subscription.

Within the storage account, this program needs a container named “results” to store the output. This container should have the access level “anonymous read access for blobs only.” This will allow customers to read their output files, which are named with a universally unique identifier.

A SendGrid account is also required to send the optional notification emails.

The storage endpoint, storage access key, and SendGrid key are located in the “local.settings.json” file to avoid placing sensitive information into programming code. The values in this file can be uploaded to Azure when deploying a Function App through Visual Studio Code.
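Inside the function code, settings of this kind are typically read from environment variables, which is how Azure Functions exposes application settings to Python. The sketch below shows one way to load and check them at startup. The three setting names are hypothetical; the actual keys used in the repository's settings file may differ.

```python
import os
from typing import Dict, Mapping

# Hypothetical setting names, chosen for this example only.
REQUIRED_SETTINGS = ("STORAGE_ENDPOINT", "STORAGE_ACCESS_KEY", "SENDGRID_API_KEY")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read required settings from the environment, failing early if any is absent."""
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing application settings: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SETTINGS}


# Usage with an explicit mapping, so the example runs without real secrets.
settings = load_settings(
    {
        "STORAGE_ENDPOINT": "https://examplestorage.blob.core.windows.net",
        "STORAGE_ACCESS_KEY": "not-a-real-key",
        "SENDGRID_API_KEY": "not-a-real-key",
    }
)
print(sorted(settings))
```

Failing at startup with the names of the missing settings is usually easier to diagnose than a storage or email error deep inside an orchestration.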

For enhanced security, access keys can be stored in Azure Key Vault rather than in the settings file.

Testing & Results

This repository has a folder named “test-demo” that contains a Python program for testing the regime-switching application. The demo program creates a data set with three regimes and runs a separate linear regression for each regime, printing the results to the console. The script then calls the regime-switching application through the API gateway. (This testing program therefore requires an API subscription key that grants access to a deployed version of the regime-switching application.)
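The benchmark half of such a test can be sketched in a few lines of standard-library Python: simulate three regimes with known coefficients, then fit an ordinary least-squares line to each regime separately. This is not the repository's demo program; the coefficients, sample size, and noise level are invented, and a single regressor is used so the regression has a simple closed form.

```python
import random
from typing import List, Tuple

# Hypothetical (intercept, slope) pairs, one per regime.
TRUE_PARAMS = [(0.0, 1.0), (2.0, -1.5), (-1.0, 3.0)]


def simulate(n_per_regime: int = 200, noise_sd: float = 0.1, seed: int = 0):
    """Return rows of (regime, x, y) with a different line in each regime."""
    rng = random.Random(seed)
    rows = []
    for regime, (intercept, slope) in enumerate(TRUE_PARAMS):
        for _ in range(n_per_regime):
            x = rng.uniform(-1.0, 1.0)
            y = intercept + slope * x + rng.gauss(0.0, noise_sd)
            rows.append((regime, x, y))
    return rows


def ols(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Closed-form least squares for one regressor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_by_regime(rows) -> List[Tuple[float, float]]:
    # Uses the true regime labels, which the regime-switching model never sees.
    fits = []
    for regime in range(len(TRUE_PARAMS)):
        xs = [x for r, x, _ in rows if r == regime]
        ys = [y for r, _, y in rows if r == regime]
        fits.append(ols(xs, ys))
    return fits


for truth, fit in zip(TRUE_PARAMS, fit_by_regime(simulate())):
    print("true", truth, "estimated", tuple(round(v, 3) for v in fit))
```

These per-regime fits are the reference values; the regime-switching application is given only the pooled data and has to recover similar coefficients without being told which observation belongs to which regime.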

If all goes well, the results from the regressions should be close to the regime-switching outputs. This is what we see from the testing routine, and this indicates we’ve completed a working example of the new Azure Durable Functions library for Python!

This application gives an example of the type of intelligence that can be a good candidate for Azure Durable Functions. This technique is too involved for most people to program themselves, and it’s something people might seek from a service. On the other hand, this model isn’t nearly at the scale of training a deep learning neural network, which often requires GPU-capable virtual machines.

The new Python extension for Azure Durable Functions is available and capable, and it lowers the barrier to entry for deploying intelligent applications to the cloud. With Durable Functions, people can orchestrate function patterns that use the full array of Azure serverless options, including compute, storage, databases, artificial intelligence, events, and messaging.

GitHub repository: https://github.com/hergott/regime-switching

Matt Hergott is a certified Azure Solutions Architect Expert and AWS Solutions Architect — Professional.

Hamilton Model

[1] Hamilton, James D. (1989), “A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle,” Econometrica 57, 357–384.

[2] Hamilton, James D. (1994), Time Series Analysis, Princeton, NJ: Princeton University Press.
