AWS Step Functions

Prabhu Selvaraj
Xebia Engineering Blog
10 min read · May 4, 2022

1. AWS Step Functions

AWS Step Functions is a serverless orchestration service that combines AWS Lambda functions and other AWS services to build business-critical applications.

Through the Step Functions graphical console, we can see the application’s workflow as a series of event-driven steps.

Step Functions are based on state machines and tasks.

  • A state machine is a workflow.
  • A task is a state in a workflow that represents a single unit of work that another AWS service performs.
  • Each step in a workflow is a state.
  • With Step Functions’ built-in controls, it is easy to examine the state of each step in the workflow to make sure that the application runs in order and as expected. Depending on the use case, Step Functions can call AWS services, such as Lambda, to perform tasks.

Step Functions can:

  • Create workflows that process and publish machine learning models.
  • Control AWS services, such as AWS Glue, to create extract, transform, and load (ETL) workflows.
  • Create long-running, automated workflows for applications that require human interaction.

Since it is a serverless orchestration service, it allows developers to create and manage multi-step application workflows in the cloud. It also presents each task as part of a visual workflow, which makes it quicker to build and update the state machine.

This article covers AWS Step Functions features, the limitations of Lambda, and how Step Functions helps to overcome them.

The topics covered in this specific article are:

  • Understanding of AWS Step Functions
  • Step Function Express vs Standard
  • Use Case Demo
  • API Gateway with Step Function.

2. Limitations of Lambda

A Lambda function is a relatively small block of code that integrates with AWS services and runs for a limited period (3 seconds by default, configurable up to a maximum of 15 minutes). It also has storage limits. Suppose I want to do the following in a Lambda function to implement critical business logic. With Lambda alone, the answer at this stage would be no:

  • Sequence Lambda functions
  • Run Lambda functions in parallel
  • Select a function based on data
  • Retry the same function
  • Use try/catch/finally blocks
  • Maintain the state of a Lambda function and pass it to the next Lambda function. For example, based on the result of Lambda function A, call either B or C.
  • Orchestrate, transform, and route based on some condition

3. Benefits of Step Functions

AWS Step Functions is a low-code, visual workflow service that developers use to build distributed applications, automate IT and business processes, and build data and machine learning pipelines using AWS services. Workflows manage failures, retries, parallelization, service integrations, and observability so developers can focus on higher-value business logic.

  • Easy to build with the visual workflow editor, so less code needs to be written
  • Easy to integrate with other AWS services such as SNS, SQS, Batch, and Fargate
  • Automatic scaling
  • Manages state, checkpoints, and restarts to make sure the application executes in order
  • Handles errors, rollbacks, and retries
  • Logs each state, which makes workflows easier to diagnose and debug

4. Step Function State Types

Step Functions are based on the concepts of state machines and tasks. A state machine is a collection of states, the relationship between those states, their input, and output.

States are the individual elements of a state machine. Each state can make decisions based on its input, perform actions, and pass output to other states. A state is referred to by its name, which can be any string, but has to be unique within the scope of the entire state machine.

Tasks are how the states get work done. They can perform work by using an activity, a Lambda function, or by passing parameters to the API actions of other services.
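As a rough illustration of how states and tasks fit together, the sketch below builds a small state machine definition in Amazon States Language (ASL) as a Python dictionary. The state names, the error name, and the Lambda ARN are placeholders invented for this example, not values from a real account.

```python
import json

# Placeholder ARN for illustration only; substitute a real function ARN.
PROCESS_ORDER_ARN = "arn:aws:lambda:us-east-1:123456789012:function:process-order"

definition = {
    "Comment": "A single Task state with retry and error handling",
    "StartAt": "ProcessOrder",
    "States": {
        # A Task state does one unit of work, here by calling a Lambda function.
        "ProcessOrder": {
            "Type": "Task",
            "Resource": PROCESS_ORDER_ARN,
            # Retry the task up to 3 times with exponential backoff.
            "Retry": [
                {
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 2,
                    "MaxAttempts": 3,
                    "BackoffRate": 2.0,
                }
            ],
            # If retries are exhausted, move to a failure state.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "OrderFailed"}],
            "End": True,
        },
        "OrderFailed": {"Type": "Fail", "Error": "OrderProcessingFailed"},
    },
}


def state_names(defn):
    """Return the state names; each must be unique within the state machine."""
    return sorted(defn["States"])


# The JSON text is what would be pasted into the state machine definition.
asl_json = json.dumps(definition, indent=2)
print(asl_json)
```

Because the states live in a dictionary keyed by name, the uniqueness requirement for state names is enforced by the data structure itself.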

5. Types of Step Function: Express & Standard

A state machine can be created as either Standard (default) or Express type, and both use Amazon States Language (ASL) to define the workflow.
State machine executions behave differently depending on the type selected. The selected type cannot be changed after the state machine has been created.

As mentioned, Step Functions has two workflow types: Standard and Express. Standard workflows have exactly-once workflow execution and can run for up to one year; Express workflows have at-least-once workflow execution and can run for up to five minutes.

Standard Workflows are ideal for long-running, durable, and auditable workflows. They can run for up to a year, and their full execution history can be retrieved through the Step Functions API for up to 90 days after the execution completes. Standard Workflows employ an exactly-once model, where tasks and states are never executed more than once unless Retry behaviour has been specified in ASL. This makes them suited to orchestrating non-idempotent actions, such as starting an Amazon EMR cluster or processing payments. Standard Workflow executions are billed according to the number of state transitions processed.

Express Workflows are ideal for high-volume, event-processing workloads such as IoT data ingestion, streaming data processing and transformation, and mobile application backends. They can run for up to five minutes. Express Workflows employ an at-least-once model, where there is a possibility that an execution might be run more than once. This makes them ideal for orchestrating idempotent actions such as transforming input data and storing it via PUT in Amazon DynamoDB. Express Workflow executions are billed by the number of executions, the duration of execution, and the memory consumed.

Standard and Express Workflows can automatically start in response to events such as HTTP requests via Amazon API Gateway (fully-managed APIs at scale), IoT Rules, and over 140 event sources in Amazon EventBridge.

[Figure: Standard vs Express Workflows]

6. Service Integration Patterns

Step Functions integrates with multiple AWS services. To combine Step Functions with these services, use the following service integration patterns:

6.1. Request Response (default)

  • Call a service and let Step Functions progress to the next state after it gets an HTTP response.

6.2. Run a job (.sync)

  • Call a service and have Step Functions wait for the job to complete.

6.3. Wait for a callback with a task token (.waitForTaskToken)

  • Call a service with a task token and have Step Functions wait until the task token is returned with a callback.
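In ASL, the pattern is chosen by the suffix on the task's Resource ARN. The sketch below shows one Task state per pattern, built as Python dictionaries. The topic ARN, job name, and queue URL are placeholders, and `task_state` is a small helper written for this example only.

```python
import json


def task_state(resource, parameters, next_state=None):
    """Build a Task state; the Resource suffix selects the integration pattern."""
    state = {"Type": "Task", "Resource": resource, "Parameters": parameters}
    if next_state is not None:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


# 6.1 Request Response: no suffix, continue as soon as the service responds.
request_response = task_state(
    "arn:aws:states:::sns:publish",
    {
        "TopicArn": "arn:aws:sns:us-east-1:123456789012:example-topic",  # placeholder
        "Message.$": "$.message",
    },
)

# 6.2 Run a job: the ".sync" suffix makes Step Functions wait for the job to finish.
run_a_job = task_state(
    "arn:aws:states:::glue:startJobRun.sync",
    {"JobName": "example-etl-job"},  # placeholder job name
)

# 6.3 Wait for callback: the task token is passed out via the context object
# ("$$.Task.Token") and the state pauses until the token is sent back.
wait_for_callback = task_state(
    "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
    {
        "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue",  # placeholder
        "MessageBody": {"input.$": "$", "taskToken.$": "$$.Task.Token"},
    },
)

for state in (request_response, run_a_job, wait_for_callback):
    print(json.dumps(state, indent=2))
```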

The table below shows the available service integrations and service integration patterns for Step Functions.
Standard Workflows and Express Workflows support the same integrations but do not support the same integration patterns.

  • Standard Workflows Integrations:
  • Express Workflows Integrations:

Express Workflows do not support Run a Job (.sync) or Wait for Callback (.waitForTaskToken). Optimized integrations pattern support is different for each integration.

7. Use Case Demo for Express and Standard

7.1. Standard Workflow

Create a standard workflow.

Note: The console provides an option to choose either Standard or Express as the type, based on the requirement.

Once the Standard type is selected, the screen below appears. Based on the business logic, select Actions and Flow states to build the state machine, as shown below.

In the screenshot below, a Choice state has been selected to invoke the Lambda function that satisfies the condition.

As in the screenshot below, click each action in the workflow, such as “Lambda Invoke (1)”, and configure the Lambda function’s ARN. Do the same for the other “Lambda Invoke” actions.

Select the “Choice State” and configure the rules and conditions to evaluate.

The workflow built in the graphical editor can also be written as code in the Definition section, which can be viewed by clicking Definition on the right-hand side.

The state machine can be executed by clicking Start Execution and providing the input message.
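A definition for a workflow like this one might look roughly like the sketch below, again expressed as a Python dictionary that serializes to ASL. The function ARNs, the `$.type` input field, and its values "A" and "B" are assumptions made for illustration, since the article's screenshots are not reproduced here. The `next_state` helper is a toy local evaluator for these two rules, not part of Step Functions.

```python
import json

# Placeholder ARNs; replace with the functions configured in the console.
LAMBDA_A = "arn:aws:lambda:us-east-1:123456789012:function:function-a"
LAMBDA_B = "arn:aws:lambda:us-east-1:123456789012:function:function-b"


def lambda_invoke(function_arn):
    """Task state that invokes a Lambda function with the state input as payload."""
    return {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {"FunctionName": function_arn, "Payload.$": "$"},
        "OutputPath": "$.Payload",
        "End": True,
    }


definition = {
    "StartAt": "Choice State",
    "States": {
        "Choice State": {
            "Type": "Choice",
            # Hypothetical rules: route on the "type" field of the input.
            "Choices": [
                {"Variable": "$.type", "StringEquals": "A", "Next": "Lambda Invoke (1)"},
                {"Variable": "$.type", "StringEquals": "B", "Next": "Lambda Invoke (2)"},
            ],
            "Default": "Unsupported Type",
        },
        "Lambda Invoke (1)": lambda_invoke(LAMBDA_A),
        "Lambda Invoke (2)": lambda_invoke(LAMBDA_B),
        "Unsupported Type": {"Type": "Fail", "Error": "UnsupportedType"},
    },
}


def next_state(defn, event):
    """Evaluate the StringEquals rules locally to see which branch an input takes."""
    choice = defn["States"][defn["StartAt"]]
    for rule in choice["Choices"]:
        field = rule["Variable"][2:]  # strip the leading "$."
        if event.get(field) == rule["StringEquals"]:
            return rule["Next"]
    return choice["Default"]


print(json.dumps(definition, indent=2))
print(next_state(definition, {"type": "B"}))
```

Including a Default branch keeps the execution from failing with an unmatched-choice error when the input satisfies none of the rules.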

7.2. Express Workflow

Similarly, an Express workflow can be created by selecting the Express option while creating the state machine; the rest of the process is the same as described in the Standard section above.

When running an Express workflow, however, there are two options, which are discussed in detail in the next section:

a. Run as synchronous

b. Run as asynchronous

7.3. Express Workflow with Synchronous and Asynchronous options

Two types of Express Workflows are available to choose from when executing an Express state machine.

  • Synchronous Express Workflows start a workflow, wait until it completes, and then return the result. Synchronous Express Workflows can be used to orchestrate microservices and allow the development of applications without the need to write additional code to handle errors, retries, or parallel tasks. Synchronous Express Workflows can be invoked from Amazon API Gateway, AWS Lambda, or by using the StartSyncExecution API call.
  • Synchronous Express execution API calls do not contribute to the existing account capacity limits. Step Functions will provide capacity on-demand and will automatically scale with sustained workload. Surges in workload may be throttled until capacity is available.

After a synchronous execution, the output is returned directly in the response.

  • Asynchronous Express Workflows return confirmation that the workflow was started, but do not wait for the workflow to complete. To get the result, the client application must poll the service’s CloudWatch Logs. Asynchronous Express Workflows can be used where an immediate response is not required, such as messaging services, or data processing that other services don’t depend on. Asynchronous Express Workflows can be started in response to an event, by a nested workflow in Step Functions, or by using the StartExecution API call.

After an asynchronous execution, the output is not returned in the response; the client application must poll CloudWatch Logs, using the “executionArn”, to see the output.

8. Integrate API Gateway with Step Functions

The example below shows how to invoke a Standard workflow (or an asynchronous Express workflow) with the “StartExecution” action.

Note:

  • StartExecution is idempotent for STANDARD workflows but not for EXPRESS workflows. It is still supported for EXPRESS workflows, where it starts an asynchronous execution.
  • For STANDARD workflows, use the DescribeExecution action with the ExecutionArn value to get the status and output. DescribeExecution is not supported for EXPRESS workflows; for those, read the output from CloudWatch Logs.
  • StartSyncExecution is not available for STANDARD workflows; it is supported only for EXPRESS workflows.
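For reference, the request body that API Gateway forwards to the StartExecution (or StartSyncExecution) action carries the state machine ARN and the workflow input, with the input encoded as a JSON string rather than a nested object. A minimal sketch of building that body follows; the ARN and the payload are placeholders, and `start_execution_body` is a helper named for this example only.

```python
import json


def start_execution_body(state_machine_arn, payload, name=None):
    """Build the JSON request body for the StartExecution action.

    The "input" field must be a string containing JSON, so the payload
    is serialized once for "input" and again for the whole body.
    """
    body = {
        "stateMachineArn": state_machine_arn,
        "input": json.dumps(payload),
    }
    if name is not None:
        # Optional execution name.
        body["name"] = name
    return json.dumps(body)


# Placeholder ARN and payload for illustration.
body = start_execution_body(
    "arn:aws:states:us-east-1:123456789012:stateMachine:example-workflow",
    {"type": "A"},
)
print(body)
```

A common mistake is to pass the payload as a nested JSON object; the double serialization above avoids that.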

9. Pricing of Step Functions

9.1. Standard Workflows

Step Functions counts a state transition each time a step of the workflow is executed. You are charged for the total number of state transitions across all state machines, including retries and error handling. The AWS Free Tier includes 4,000 free state transitions per month. All charges are metered daily and billed monthly.

E.g.: $0.025 per 1,000 state transitions, which means $0.000025 per state transition.

Billing applies to the number of state transitions per month above the Free Tier usage.
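The arithmetic can be sketched as follows, using the example rate and Free Tier allowance quoted above. Actual rates vary by region and over time, so treat the constants as illustrative; the function name is made up for this example.

```python
from decimal import Decimal

# Figures taken from the example above; check current pricing before relying on them.
FREE_TRANSITIONS_PER_MONTH = 4_000
PRICE_PER_TRANSITION = Decimal("0.000025")  # $0.025 per 1,000 transitions


def standard_monthly_cost(executions, transitions_per_execution):
    """Estimate the monthly Standard Workflow charge in USD."""
    total = executions * transitions_per_execution
    # Only transitions above the Free Tier allowance are billed.
    billable = max(0, total - FREE_TRANSITIONS_PER_MONTH)
    return billable * PRICE_PER_TRANSITION


# 10,000 executions x 5 transitions = 50,000 transitions, 46,000 billable.
print(standard_monthly_cost(10_000, 5))
```

With these numbers the estimate comes to $1.15 for the month, and a workload of 4,000 transitions or fewer costs nothing.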

9.2. Express Workflows

With Step Functions Express Workflows, pricing is pay-as-you-go. You are charged based on the number of requests for your workflow and its duration.

Step Functions counts a request each time it starts executing a workflow, and you are charged for the total number of requests across all workflows. This includes tests from the console.

Duration is calculated from the time the workflow begins executing until it completes or otherwise terminates, rounded up to the nearest 100 ms. The amount of memory used in the execution of the workflow is billed in 64 MB chunks.

Memory consumption is based on the size of a workflow definition, the use of a map or parallel states, and the execution (payload) data size.
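The rounding rules can be sketched as below. The article does not quote Express rates, so both prices are left as caller-supplied parameters, and the sketch assumes that duration is priced per GB-second; verify both against the current pricing page. All function names are made up for this example.

```python
from decimal import Decimal


def billed_duration_ms(duration_ms):
    """Round a duration in milliseconds up to the nearest 100 ms."""
    return -(-duration_ms // 100) * 100


def billed_memory_mb(memory_mb):
    """Round memory in MB up to the next 64 MB chunk."""
    return -(-memory_mb // 64) * 64


def express_cost(requests, duration_ms, memory_mb,
                 price_per_request, price_per_gb_second):
    """Estimate an Express Workflow charge.

    Assumes every request has the same duration and memory use, and that
    duration is charged per GB-second (an assumption of this sketch).
    """
    seconds = Decimal(billed_duration_ms(duration_ms)) / 1000
    gigabytes = Decimal(billed_memory_mb(memory_mb)) / 1024
    request_charge = requests * price_per_request
    duration_charge = requests * seconds * gigabytes * price_per_gb_second
    return request_charge + duration_charge


# A 250 ms execution using 50 MB is billed as 300 ms at 64 MB.
print(billed_duration_ms(250), billed_memory_mb(50))
```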

9.3. Additional Charges

There are additional charges if the application workflow uses other AWS services or transfers data. For example, if the workflow invokes an AWS Lambda function, you are also billed for each request and for the duration of each Lambda invocation.

Data Transfer — External data transfer to and from Amazon EC2

AWS Lambda Price — Request and duration

Amazon EC2 pricing — On-Demand, Reserved, and Spot instances

10. References:

1. What are AWS Step Functions? — AWS Step Functions (amazon.com)

2. AWS Step Functions Pricing | Serverless Microservice Orchestration | Amazon Web Services

3. Actions — AWS Step Functions (amazon.com)
