AWS Step Functions: Orchestrating Serverless Workflows Like a Pro

Andreas Kihlberg
Published in AWS Specialists
7 min read · Jul 11, 2023

As businesses grow and scale, it becomes increasingly important to orchestrate cloud-based workflows effectively. Good orchestration streamlines your microservices, optimizes resource usage, and keeps costs down. This is where AWS Step Functions comes in.

What are AWS Step Functions?

AWS Step Functions is a fully managed service from Amazon Web Services (AWS) that enables developers to create visual workflows for applications at scale. It lets you coordinate multiple AWS services into serverless workflows, so you can build and update applications quickly.

An essential feature of Step Functions is that it automatically triggers and tracks each step in your workflow. It can retry tasks when errors occur, handling the underlying logic so that your application executes in the correct order and behaves as expected.

How to Orchestrate Serverless Workflows

Step 0: Choosing Between Standard or Express Workflows

Before defining your workflow, it’s crucial to understand the two types of workflow AWS Step Functions offers: Standard and Express. They share many features but differ in significant ways, so choose based on your specific needs, or combine both in the same application.

One key difference lies in their pricing models. Standard workflows are charged per state transition, so the cost rises with the number of state changes. Express workflows are instead charged by the number of executions plus their duration and memory consumption, so you can have numerous state changes and still pay only for requests and run time. Remember that this pricing covers only Step Functions itself; the services it invokes, such as Lambda, are billed separately.
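To make the two pricing models concrete, here is a minimal sketch that compares them for one workload. It assumes Express is billed per request plus duration weighted by memory, and it ignores free tiers and billing granularity. The function names are made up for this article, and the rates in the usage lines are placeholders, not real AWS prices; look up the current rates for your region before drawing conclusions.

```python
def standard_cost(executions, transitions_per_execution, price_per_transition):
    """Standard workflows: every state transition is billed."""
    return executions * transitions_per_execution * price_per_transition


def express_cost(executions, avg_duration_seconds, memory_gb,
                 price_per_request, price_per_gb_second):
    """Express workflows: a per-request charge plus duration weighted by memory."""
    request_charge = executions * price_per_request
    duration_charge = (executions * avg_duration_seconds
                       * memory_gb * price_per_gb_second)
    return request_charge + duration_charge


# Placeholder rates for illustration only; these are NOT real AWS prices.
standard_total = standard_cost(
    executions=1000, transitions_per_execution=10, price_per_transition=0.001)
express_total = express_cost(
    executions=1000, avg_duration_seconds=2, memory_gb=0.5,
    price_per_request=0.01, price_per_gb_second=0.001)

print(f"Standard: {standard_total:.2f}  Express: {express_total:.2f}")
```

The point of the comparison is the shape of the formulas: the Standard cost grows with the number of states an execution passes through, while the Express cost grows with how long it runs and how much memory it uses.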

When creating a workflow, you can select an express or standard workflow

Apart from the cost factor, the most notable difference between the two workflows is their execution duration and runtime. A Standard workflow can run for up to a year, providing extensive flexibility for long-running tasks. On the other hand, an Express workflow has a maximum duration of five minutes, making it suitable for shorter tasks.

Standard workflows also support a unique feature called ‘Step Functions activities,’ enabling you to run specific functions as an activity outside the state machine. For instance, you could execute a worker on an EC2 instance. This is not available in Express workflows.

In summary, understanding the operational and cost-related differences between Standard and Express workflows can assist you in effectively orchestrating your serverless workflows and tailoring the solution to fit your particular use case.
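The two hard constraints mentioned above (the five-minute Express limit and activities being Standard-only) can be turned into a rough first filter. This is only a rule of thumb written for this article, not an official decision procedure; cost and other feature differences still need to be weighed separately.

```python
EXPRESS_MAX_SECONDS = 5 * 60  # Express workflows are limited to five minutes


def suggest_workflow_type(expected_duration_seconds, needs_activities=False):
    """Return "STANDARD" when a hard constraint rules Express out.

    Otherwise return "EXPRESS", meaning Express is a candidate worth pricing,
    not that it is automatically the better choice.
    """
    if needs_activities:
        return "STANDARD"  # activities are not available in Express workflows
    if expected_duration_seconds > EXPRESS_MAX_SECONDS:
        return "STANDARD"  # too long for a single Express execution
    return "EXPRESS"


print(suggest_workflow_type(30))                         # short task
print(suggest_workflow_type(3600))                       # one-hour task
print(suggest_workflow_type(30, needs_activities=True))  # needs an activity worker
```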

Step 1: Define Your Workflow and Utilize the Workflow Studio

The first task is to define your state machine, a JSON-based, textual representation of your workflow. Each step in your workflow is defined as a state within this structure, written in Amazon States Language, a structured JSON-based language.

Consider the following example of a state machine:

{
  "StartAt": "FirstState",
  "States": {
    "FirstState": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:HelloWorld",
      "End": true
    }
  }
}

In this JSON structure, the ‘StartAt’ field dictates the initial state, and the ‘States’ object encompasses all the individual states of your workflow. Note that the ‘Resource’ is not strictly limited to a Lambda function of your own but can also be an existing action (see step 2).
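Because the definition is plain JSON, simple structural mistakes can be caught before deployment. The sketch below is a hypothetical helper, not part of any AWS tooling: it checks only that ‘StartAt’ and every top-level transition target name an existing state, and it does not look inside Parallel or Map states.

```python
import json


def find_broken_references(definition):
    """Return a list of problems: unknown StartAt or unknown transition targets."""
    states = definition.get("States", {})
    problems = []

    start = definition.get("StartAt")
    if start not in states:
        problems.append(f"StartAt points to unknown state {start!r}")

    for name, state in states.items():
        # Collect every place a state can name its successor.
        targets = [state.get("Next"), state.get("Default")]
        targets += [rule.get("Next") for rule in state.get("Choices", [])]
        for target in targets:
            if target is not None and target not in states:
                problems.append(f"{name} transitions to unknown state {target!r}")
    return problems


definition = json.loads("""
{
  "StartAt": "FirstState",
  "States": {
    "FirstState": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-west-2:123456789012:function:HelloWorld",
      "End": true
    }
  }
}
""")

print(find_broken_references(definition))
```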

In the process of defining your workflow, it’s crucial to make your states independent, facilitating reuse in different workflows and reducing redundancy. Lambda Layers can aid in sharing common code across workflows, further promoting reusability.

A key aspect of workflow definition is incorporating error handling. AWS Step Functions provides built-in capabilities for this, allowing you to specify an error handling strategy for each state in your state machine using the Retry and Catch fields. This saves time and effort because you no longer need to handle every error manually inside each function. For instance, if a Lambda function’s call to an external API times out, you can let the state machine handle the retries instead of implementing a retry mechanism in your code.
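As a sketch of what such a state could look like, the snippet below builds a Task state with a Retry and a Catch block and prints it as JSON. The error name ‘ExternalApiTimeout’ and the fallback state ‘NotifyFailure’ are hypothetical; the sketch assumes the Lambda function raises an exception with that name when the API call times out. The small helper shows how the wait between retries grows with the backoff rate.

```python
import json

call_api_state = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-west-2:123456789012:function:HelloWorld",
    "Retry": [
        {
            "ErrorEquals": ["ExternalApiTimeout"],  # hypothetical error name
            "IntervalSeconds": 2,   # wait before the first retry
            "MaxAttempts": 3,       # number of retries, not counting the first try
            "BackoffRate": 2.0,     # multiplier applied to the wait on each retry
        }
    ],
    "Catch": [
        {
            "ErrorEquals": ["States.ALL"],  # anything still failing after retries
            "Next": "NotifyFailure",        # hypothetical fallback state
        }
    ],
    "End": True,
}


def retry_delays(retrier):
    """Seconds waited before each retry: the interval times backoff^attempt."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


print(json.dumps(call_api_state, indent=2))
print(retry_delays(call_api_state["Retry"][0]))
```

With these values the state machine waits 2, 4, and then 8 seconds before giving up and moving to the Catch target.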

With workflow studio, you can easily design and implement your workflow

While the manual specification of your state machine aligns with Infrastructure as Code (IaC) best practices, AWS offers an alternative, more intuitive method for designing your workflow — the Workflow Studio. With its drag-and-drop interface, the Workflow Studio simplifies the process of arranging components, states, and flow structures. Once your design is complete, the studio generates a JSON representation that can be seamlessly integrated into your code using tools like CloudFormation or Terraform.

Step 2: Create AWS Lambda Functions and Leverage Predefined Actions

AWS Lambda allows you to run your code without provisioning or managing servers. As a result, it becomes a fundamental building block when defining your workflow, where you create Lambda functions for each task in your state machine.

In the ‘FirstState,’ the ‘Resource’ attribute points to a specific Lambda function. When AWS Step Functions executes this state, it triggers the indicated Lambda function. This ‘Resource’ could also refer to an existing predefined action, adding a significant layer of flexibility.

But what if you could avoid writing code for standard operations altogether? That’s where predefined actions come into play. AWS Step Functions has an extensive library of ready-to-use functions called actions, allowing you to avoid coding or running Lambda functions for common tasks.

Take a look at the following state machine example:

A State machine using predefined actions
{
  "StartAt": "FetchUser",
  "States": {
    "FetchUser": {
      "Type": "Task",
      "Resource": "arn:aws:states:::dynamodb:getItem",
      "Next": "PublishToSNS"
    },
    "PublishToSNS": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "End": true
    }
  }
}

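In this example, we use two predefined actions: ‘dynamodb:getItem’ and ‘sns:publish’. The state machine fetches a user from a DynamoDB table (FetchUser) and subsequently publishes a message to an SNS topic (PublishToSNS). This could, for example, trigger a welcome email to a new user. For brevity, the definition omits the ‘Parameters’ block that each action needs in practice: a table name and key for ‘dynamodb:getItem’, and a topic ARN and message for ‘sns:publish’.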
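A fuller version of the same state machine might look like the sketch below, written as a Python dictionary so the assumptions can be commented. The table name, key attribute, user attribute, and topic ARN are all hypothetical, and the input is assumed to carry a ‘userId’ field.

```python
import json

definition = {
    "StartAt": "FetchUser",
    "States": {
        "FetchUser": {
            "Type": "Task",
            "Resource": "arn:aws:states:::dynamodb:getItem",
            "Parameters": {
                "TableName": "Users",                    # hypothetical table
                "Key": {"userId": {"S.$": "$.userId"}},  # hypothetical key attribute
            },
            "ResultPath": "$.user",  # keep the original input next to the result
            "Next": "PublishToSNS",
        },
        "PublishToSNS": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {
                # Hypothetical topic; replace with your own ARN.
                "TopicArn": "arn:aws:sns:us-west-2:123456789012:WelcomeTopic",
                # Assumes the fetched item has a string attribute called "name".
                "Message.$": "States.Format('Welcome, {}!', $.user.Item.name.S)",
            },
            "End": True,
        },
    },
}

print(json.dumps(definition, indent=2))
```

Keys ending in ‘.$’ tell Step Functions to evaluate the value as a path or intrinsic function instead of treating it as a literal string.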

By leveraging these predefined actions, you can orchestrate complex workflows without writing any code. Along with intrinsic functions provided by AWS Step Functions, this opens up endless possibilities for efficient and robust serverless workflow management.

Step 3: Deploy Your Workflow

You can use the AWS Management Console, AWS CLI, or an AWS SDK to deploy your workflow. Once you’ve deployed your state machine, Step Functions will generate an ARN (Amazon Resource Name) for your state machine.
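With the Python SDK (boto3), deployment could be sketched roughly as follows. The state machine name and role ARN are placeholders, and boto3 is imported lazily so the request can be built and inspected without AWS credentials; ‘deploy’ is only defined here, not called.

```python
import json


def build_create_request(name, definition, role_arn, workflow_type="STANDARD"):
    """Assemble the arguments for the CreateStateMachine API call."""
    return {
        "name": name,
        "definition": json.dumps(definition),  # the API expects a JSON string
        "roleArn": role_arn,                   # IAM role Step Functions assumes
        "type": workflow_type,                 # "STANDARD" or "EXPRESS"
    }


def deploy(request):
    """Create the state machine and return its ARN (requires boto3 and credentials)."""
    import boto3  # third-party AWS SDK, imported only when actually deploying

    client = boto3.client("stepfunctions")
    response = client.create_state_machine(**request)
    return response["stateMachineArn"]


request = build_create_request(
    name="HelloWorkflow",  # placeholder name
    definition={"StartAt": "Done", "States": {"Done": {"Type": "Succeed"}}},
    role_arn="arn:aws:iam::123456789012:role/StepFunctionsRole",  # placeholder role
)
print(request["type"])
```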

Step 4: Execute and Monitor Your Workflow

Once your workflow is deployed, the next step is to execute it, which can be done manually or automatically. If you opt for automatic triggering, AWS services such as Amazon EventBridge (formerly CloudWatch Events) or API Gateway can start executions for you. This adds flexibility and lets your workflow respond dynamically to different system or application states.
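A manual start from code could be sketched like this with boto3. The ARN and payload are placeholders, and as before the SDK is imported lazily so that only ‘start’ needs AWS access.

```python
import json


def build_start_request(state_machine_arn, payload, name=None):
    """Assemble the arguments for the StartExecution API call."""
    request = {
        "stateMachineArn": state_machine_arn,
        "input": json.dumps(payload),  # execution input is passed as a JSON string
    }
    if name is not None:
        request["name"] = name  # optional execution name
    return request


def start(request):
    """Start an execution and return its ARN (requires boto3 and credentials)."""
    import boto3  # third-party AWS SDK

    client = boto3.client("stepfunctions")
    return client.start_execution(**request)["executionArn"]


start_request = build_start_request(
    "arn:aws:states:us-west-2:123456789012:stateMachine:HelloWorkflow",  # placeholder
    {"userId": "42"},
)
print(start_request["input"])
```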

After execution, monitoring your workflows is crucial for efficient and proactive management. AWS provides CloudWatch metrics for this purpose, such as the number of executions started, succeeded, and failed, the number of failed Lambda functions, and execution time. These metrics show the performance and health of your workflows and tell you where adjustments are needed. Monitoring is therefore as critical as execution in keeping your workflows running smoothly and as expected.
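As an illustration, the sketch below builds a CloudWatch query for the number of failed executions of one state machine over the last day. The helper names and the ARN are made up for this example; the query itself uses the ‘AWS/States’ namespace and the ‘ExecutionsFailed’ metric.

```python
from datetime import datetime, timedelta, timezone


def build_failed_executions_query(state_machine_arn, hours=24):
    """Assemble the arguments for CloudWatch's GetMetricStatistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/States",
        "MetricName": "ExecutionsFailed",
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,          # one datapoint per hour
        "Statistics": ["Sum"],
    }


def failed_executions(query):
    """Sum the failures over the window (requires boto3 and credentials)."""
    import boto3  # third-party AWS SDK

    datapoints = boto3.client("cloudwatch").get_metric_statistics(**query)["Datapoints"]
    return sum(point["Sum"] for point in datapoints)


query = build_failed_executions_query(
    "arn:aws:states:us-west-2:123456789012:stateMachine:HelloWorkflow"  # placeholder
)
print(query["MetricName"])
```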

Bonus Content

Take advantage of built-in flow mechanisms to extend existing workflows

The whole idea of Step Functions is to let AWS handle the transitions between the steps of a workflow. Step Functions therefore supports several flow mechanisms, including “Choice,” “Parallel,” and “Map.”

By leveraging these mechanisms, you can reduce the complexity of your functions. It also becomes easier to extend the workflow with new functionality.

Consider an example using the Choice state: a state machine that processes files of different formats. Of course, we could build a single function with if-statements to handle all the different file types, but as we support more formats, that code will soon become clunky and hard to reuse.

If we instead use a “Choice” state, we can construct cleaner functions that are each responsible for a single file type (the single responsibility principle, SRP).

You can still share common code by using Lambda Layers.

Extend the workflow

Let’s say that your application gets a new requirement to support PDF files. With Step Functions, you do not have to change existing functions and risk introducing new bugs. Instead, you only need to create a new function and add it to the workflow.
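The routing idea and the PDF extension can be sketched as a small generator for the state machine definition. Everything specific here is hypothetical: the ‘$.fileType’ input field, the CSV and JSON formats, the state names, and the function ARNs are stand-ins chosen for illustration.

```python
def build_router(handlers):
    """Build a definition that routes on $.fileType.

    handlers maps a file type (e.g. "csv") to the ARN of the Lambda
    function responsible for that single format.
    """
    states = {
        "RouteByFormat": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.fileType",
                    "StringEquals": file_type,
                    "Next": f"Process{file_type.upper()}",
                }
                for file_type in handlers
            ],
            "Default": "UnsupportedFormat",  # taken when no rule matches
        },
        "UnsupportedFormat": {"Type": "Fail", "Error": "UnsupportedFormat"},
    }
    for file_type, arn in handlers.items():
        states[f"Process{file_type.upper()}"] = {
            "Type": "Task",
            "Resource": arn,
            "End": True,
        }
    return {"StartAt": "RouteByFormat", "States": states}


ARN_PREFIX = "arn:aws:lambda:us-west-2:123456789012:function:"  # placeholder account
handlers = {"csv": ARN_PREFIX + "ProcessCsv", "json": ARN_PREFIX + "ProcessJson"}
before = build_router(handlers)

# Supporting PDF means one new function and one new entry; nothing else changes.
after = build_router(dict(handlers, pdf=ARN_PREFIX + "ProcessPdf"))

print(sorted(after["States"]))
```

The existing CSV and JSON states are identical in both definitions, which is the property the article is after: new formats are added next to the old ones instead of inside them.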

Use Intrinsic functions for basic operations

Intrinsic functions are a relatively new feature in Step Functions that let you manipulate data with built-in functions, for example splitting arrays or doing basic math operations.
The following example uses three intrinsic functions: States.UUID, States.ArrayLength, and States.Format.

{
  "Summary": {
    "Type": "Pass",
    "Parameters": {
      "Original.$": "$",
      "UUID.$": "States.UUID()",
      "Number.$": "States.ArrayLength($.numbers)",
      "message.$": "States.Format('Hello, the array contains {} items', States.ArrayLength($.numbers))"
    },
    "End": true
  }
}

For an input whose ‘numbers’ array holds two items, the result looks as follows (the ‘Original’ field, which echoes the input, is left out here):

{
  "UUID": "96da438d-c32d-4333-a78b-3ecc17bee6d4",
  "Number": 2,
  "message": "Hello, the array contains 2 items"
}
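To see what these intrinsic functions save you from writing, here is a rough local equivalent in Python. It only imitates the three functions for this one state and is not how Step Functions evaluates them; the sample input is made up.

```python
import uuid


def summarize(state_input):
    """Imitate the Pass state's Parameters block in plain Python."""
    count = len(state_input["numbers"])  # States.ArrayLength($.numbers)
    return {
        "Original": state_input,         # "$" echoes the whole input
        "UUID": str(uuid.uuid4()),       # States.UUID() yields a random v4 UUID
        "Number": count,
        # States.Format fills each {} placeholder in order.
        "message": "Hello, the array contains {} items".format(count),
    }


result = summarize({"numbers": [10, 20]})  # made-up input with two items
print(result["message"])
```

Without intrinsic functions, logic like this would typically live in a small Lambda function that exists only to reshape data.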

In conclusion

In summary, AWS Step Functions provides a powerful, flexible way to orchestrate serverless workflows. By following the steps and best practices outlined above, you can streamline your microservices and develop scalable applications like a pro.

Andreas Kihlberg
AWS Specialists

Web developer with passion for great architecture, smart solutions and new technologies