Posts in this Series
Part I (This post)
Part III (TBD)
Coordinating individual components in a distributed architecture can be a challenging feat. This is especially true when you are not clear on the overall model and interactions between those various components. There are several approaches you can use to coordinate multiple components, but many of these approaches can quickly become complicated and offer little value.
That’s where AWS Step Functions comes in. Step Functions is a service that lets you coordinate multiple AWS services or distributed components of your architecture into a serverless workflow. This allows you to quickly build and update fault-tolerant apps in the cloud. By stitching multiple components together into a single application, Step Functions lets you build powerful cloud applications without wiring each component to the next by hand.
This post is the first part of a three-part blog post series. In this first post, I will explain what AWS Step Functions is, why it’s important, and provide a high-level overview of the concepts. In the second post, I will walk you through the process of designing Step Functions workflows and help you understand how to think about building applications with Step Functions. In the third and final post, I will walk you through a step-by-step implementation of a simple Step Functions workflow.
The goal of this series is for you to learn the concepts behind AWS Step Functions and the benefits of developing with it. I also want you to learn how to design a workflow in Step Functions and how to implement the workflow you designed.
Before we dive in, let me give you a brief history of Step Functions. Step Functions was first introduced as a way to orchestrate AWS Lambda functions, but its flexibility quickly led to use cases outside of Lambda. Before the introduction of Step Functions, if you had multiple functions that were part of a broader “application,” they were difficult to orchestrate. You would have to write code in each function to call the subsequent function. For example, if you had three functions, function one would call function two, and function two would call function three. If the second function failed for whatever reason, function three would never be triggered. This made it challenging to manage the overall process. That’s where Step Functions comes into play: Step Functions is responsible for orchestrating and triggering each Lambda function in the process.
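To make the problem concrete, here is a minimal sketch of that chained style in plain Python. The three functions are hypothetical stand-ins for Lambda functions (no AWS calls are made), and the `fail_in_two` flag is an assumption added only to simulate a failure in the second function.

```python
# Hypothetical stand-ins for three Lambda functions. Names and payloads
# are illustrative only; nothing here calls AWS.

def function_three(order):
    order["steps"].append("three")
    return order


def function_two(order):
    # Simulated failure switch (an assumption for this sketch).
    if order.get("fail_in_two"):
        raise RuntimeError("function two failed")
    order["steps"].append("two")
    return function_three(order)  # glue: function two must know about function three


def function_one(order):
    order["steps"].append("one")
    return function_two(order)  # glue: function one must know about function two


def run_chain(order):
    """Run the chain and report which steps completed."""
    try:
        function_one(order)
    except RuntimeError:
        pass  # the caller only learns that something, somewhere, failed
    return order["steps"]


happy_path = run_chain({"steps": []})
failed_path = run_chain({"steps": [], "fail_in_two": True})
print(happy_path)
print(failed_path)
```

Notice that each function has to know which function comes next, and when the second one fails, the third never runs and nothing in the chain is in a position to retry or recover.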
A workflow in Step Functions is known as a state machine. Each state machine is made up of a series of steps, with the output of one step becoming the input to the next. Individual states within the state machine make decisions based on their input, perform some action, and pass their output to other states. Step Functions supports several types of states, defined as part of the Amazon States Language. The state types covered in this post are:
- Task: performs some action within the state machine
- Choice: chooses between branches of execution
- Fail or Succeed: stops execution with either failure or success
- Pass: passes its input to its output; typically used for testing
- Wait: delays the state machine for a set amount of time or until a specified time in the future
- Parallel: runs multiple branches of execution at the same time
While this seems like a small number of states, you can build compelling solutions with these states.
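As a rough illustration, here is what a minimal definition of each state type might look like in the Amazon States Language, written as Python dictionaries and serialized to JSON. The ARN, the state names used in `Next`, and the `$.inStock` field are placeholders I made up for this sketch, not part of any real workflow.

```python
import json

# Placeholder ARN for illustration only; it does not point at a real function.
EXAMPLE_LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:Example"

# One minimal example per state type. "NextState", "InStock", and
# "OutOfStock" are hypothetical state names.
state_examples = {
    "Task": {"Type": "Task", "Resource": EXAMPLE_LAMBDA_ARN, "Next": "NextState"},
    "Choice": {
        "Type": "Choice",
        "Choices": [
            {"Variable": "$.inStock", "BooleanEquals": True, "Next": "InStock"}
        ],
        "Default": "OutOfStock",
    },
    "Fail": {"Type": "Fail", "Error": "OutOfStock", "Cause": "Item is unavailable"},
    "Succeed": {"Type": "Succeed"},
    "Pass": {"Type": "Pass", "Result": {"inStock": True}, "Next": "NextState"},
    "Wait": {"Type": "Wait", "Seconds": 10, "Next": "NextState"},
    "Parallel": {
        "Type": "Parallel",
        "Branches": [
            {"StartAt": "BranchA", "States": {"BranchA": {"Type": "Succeed"}}},
            {"StartAt": "BranchB", "States": {"BranchB": {"Type": "Succeed"}}},
        ],
        "End": True,
    },
}

# Amazon States Language documents are JSON, so each example serializes directly.
for name, state in state_examples.items():
    print(name, json.dumps(state))
```

These fragments are individual states, not a complete state machine; a full definition also needs a top-level `StartAt` and a `States` map that ties them together.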
To get a better understanding of Step Functions, let’s go through an example. In our case, we are going to use an e-commerce checkout process. In this process, multiple components perform different actions in the workflow.
The general checkout flow starts after a user has added items to their shopping cart and clicks the checkout button. This event kicks off the process. The first step is to confirm the item by checking with the inventory system. Once the item is confirmed, the next steps are to check the user’s information, charge the user, and then send out emails confirming the order. Here is a simple diagram of the process:
In Step Functions, the workflow is modeled as a state machine like so:
Visually, the state machine is not that different from the workflow diagram, because it describes the same process. Each state is a Task state that executes the code of a specific component in the overall architecture. When the state machine starts, it first triggers the ConfirmItem state, passing in the data provided at the start of the execution. ConfirmItem is a Lambda function that executes and returns its output. Step Functions takes that output and passes it into the next state, the CheckUserInfo state. This process continues until the state machine is complete.
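Here is a minimal sketch of how this checkout state machine could be written in the Amazon States Language, built as a Python dictionary. ConfirmItem and CheckUserInfo come from the example above; the ChargeUser and SendConfirmationEmail state names, the ARNs, and the local handler functions are assumptions for illustration. The small `simulate` function is a local toy that only mimics how output flows from one Task state to the next; it is not how the service itself is implemented.

```python
import json


def lambda_arn(name):
    # Placeholder account and region; these ARNs are not real resources.
    return f"arn:aws:lambda:us-east-1:123456789012:function:{name}"


checkout_definition = {
    "Comment": "Sketch of the checkout workflow",
    "StartAt": "ConfirmItem",
    "States": {
        "ConfirmItem": {"Type": "Task", "Resource": lambda_arn("ConfirmItem"), "Next": "CheckUserInfo"},
        "CheckUserInfo": {"Type": "Task", "Resource": lambda_arn("CheckUserInfo"), "Next": "ChargeUser"},
        "ChargeUser": {"Type": "Task", "Resource": lambda_arn("ChargeUser"), "Next": "SendConfirmationEmail"},
        "SendConfirmationEmail": {"Type": "Task", "Resource": lambda_arn("SendConfirmationEmail"), "End": True},
    },
}

# Local stand-ins for the Lambda functions (hypothetical behavior).
handlers = {
    "ConfirmItem": lambda data: {**data, "item_confirmed": True},
    "CheckUserInfo": lambda data: {**data, "user_verified": True},
    "ChargeUser": lambda data: {**data, "charged": True},
    "SendConfirmationEmail": lambda data: {**data, "email_sent": True},
}


def simulate(definition, handlers, initial_input):
    """Walk the Task states in order, feeding each output into the next state."""
    name = definition["StartAt"]
    data = initial_input
    visited = []
    while True:
        state = definition["States"][name]
        data = handlers[name](data)  # output of this state becomes the next input
        visited.append(name)
        if state.get("End"):
            return visited, data
        name = state["Next"]


visited, result = simulate(checkout_definition, handlers, {"cart_id": "cart-123"})
print(json.dumps(checkout_definition, indent=2))
print(visited)
print(result)
```

The point to notice is that no handler knows which state comes next. The ordering lives entirely in the definition, so reordering or inserting a step means editing the state machine rather than the functions.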
The main benefit of using Step Functions over glue code is that it simplifies the entire process. With glue code, you are responsible for handling retry logic, dealing with errors in the workflow, logging each of the individual steps, and building a fault-tolerant service. Step Functions relieves you from having to write that “glue” code to integrate all of the components; the service handles it for you.
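For example, retry and error handling can be declared on a Task state with the `Retry` and `Catch` fields of the Amazon States Language instead of being coded inside each function. The sketch below is an assumption-laden illustration: the ChargeUser ARN and the NotifyFailure state are hypothetical, and `retry_delays` is a helper I wrote to show the nominal wait before each retry implied by `IntervalSeconds` and `BackoffRate`. It ignores any other retry options.

```python
# Hypothetical Task state with declarative retry and error handling.
charge_user_state = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ChargeUser",  # placeholder ARN
    "Retry": [
        {
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 2,   # wait before the first retry
            "MaxAttempts": 3,       # maximum number of retries
            "BackoffRate": 2.0,     # multiplier applied to the wait after each retry
        }
    ],
    "Catch": [
        # If retries are exhausted (or any other error occurs), move to a
        # hypothetical failure-handling state instead of failing outright.
        {"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}
    ],
    "Next": "SendConfirmationEmail",
}


def retry_delays(retrier):
    """Nominal seconds to wait before each retry for one retrier entry."""
    interval = retrier.get("IntervalSeconds", 1)
    attempts = retrier.get("MaxAttempts", 3)
    rate = retrier.get("BackoffRate", 2.0)
    return [interval * rate ** n for n in range(attempts)]


delays = retry_delays(charge_user_state["Retry"][0])
print(delays)
```

With glue code, this backoff schedule and the fallback path would have to be written and tested inside your own functions; here they are a few lines of configuration on the state.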
That’s pretty much it for the concepts. To wrap up, AWS Step Functions allows you to build serverless workflows by orchestrating distributed components. It reduces the complexity of building cloud-based workflows by reducing the need for glue code and providing fault tolerance for your serverless applications. In this post, I gave an overview of Step Functions, covered its general concepts, and reviewed an example workflow. In Part II of this series, I will go through the design process for creating a state machine in Step Functions. This will give you a deeper understanding of Step Functions and help you design your own. If you have any comments or feedback, please feel free to share.