AWS Step Function & Serverless Framework

Published in CreditorWatch · 6 min read · Feb 28, 2021

A Serverless Tale: How we built a bankruptcy search in less than 2 weeks

If you like the Medium platform, consider supporting all the writers here: https://acubeddu87.medium.com/membership

Here at CreditorWatch (the most innovative and customer-centric credit bureau in Australia), we love to play with technology, and we are passionate about the whole AWS ecosystem! By listening to feedback from our customers and the industry, we had the opportunity to create CreditorWatch Bankruptcy Search, which reduces risk when providing credit to companies, sole traders, and partnerships.

With the support of the Australian Financial Security Authority, CreditorWatch now stores current and historical bankruptcies, updated hourly.

Just a little background

Some time ago (two or three years ago, actually…), we discovered Lambda functions and the FaaS world. Everything started when one of us brought in a Raspberry Pi and began to customise it using an endpoint that triggers a Lambda function in response to events. From that moment, we fell in love with these products and used them more and more for our small projects and non-customer-facing products, until the day we decided to create the CreditorWatch Bankruptcy Search. We deployed it to production in less than 2 weeks!

AWS Lambda is a serverless compute service that runs your code in response to events and automatically manages the underlying compute resources for you.

What do we want to achieve?

The CreditorWatch Bankruptcy Search is a product created to proactively flag current and previous bankruptcies so that you are never left in the dark!

The final goal is a secure, high-performing, resilient, and efficient product that delivers a high match rate, using algorithms that account for variations in the data, and that can be integrated into any service while providing extraordinary results!

Which technology stack did we choose?

To handle the number of requests, achieve a fast time to market, and decouple the logic as much as possible, we decided to use a combination of Amazon Elasticsearch Service, AWS Lambda, and AWS Step Functions, and to orchestrate everything with the Serverless Framework.

At the time, we already had experience with Elasticsearch (we are proudly serving information on tens of millions of businesses to more than 50,000 users). AWS Lambda was a piece of cake for us (we had created lots of automation in response to events, plus various background jobs); the only question mark was AWS Step Functions.

Why did we choose this?

We decided to write the project using Lambda functions because they give us the serenity of not managing any servers and of scaling up without even worrying about it!
The Python language, combined with a few fantastic plugins for the Serverless Framework, helped us reduce the time to market dramatically.
AWS Step Functions was the only unknown variable in the equation; none of us had used it before, but as our employee values say:

We take calculated risks because we know our business and are confident in our abilities. Our innovative and nimble approach ensures we proactively deliver quality solutions for our clients, fast.

How do we achieve it?

Whatever we build, we always follow the AWS Well-Architected Framework. Dividing our solution into 4 different components lets us build and deploy quickly and helps us mitigate risk, because each part is independent of the others. Moreover, with all the metrics collected, we can make informed decisions.

The Fetcher component
The fetcher retrieves all the information we need to determine whether anything is new since the last time we checked, and stores it in a data store.
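
As an illustration only, here is a minimal sketch of how a fetcher could decide what is new, assuming each upstream record carries an identifier and a last-updated timestamp. The names `find_new_records`, `record_id`, and `updated_at`, as well as the event shape, are hypothetical, and a plain dict stands in for the data store.

```python
from typing import Dict, Iterable, List


def find_new_records(upstream: Iterable[dict], store: Dict[str, str]) -> List[dict]:
    """Return records that are missing from the store or newer than the stored copy.

    `store` maps record_id -> last seen `updated_at`. The timestamps are ISO 8601
    strings in one fixed format, so plain string comparison orders them correctly.
    """
    pending = []
    for record in upstream:
        seen = store.get(record["record_id"])
        if seen is None or record["updated_at"] > seen:
            pending.append(record)
            # Remember the newest version we have seen for the next run.
            store[record["record_id"]] = record["updated_at"]
    return pending


def handler(event, context):
    """Lambda-style entry point; the event shape is an assumption for this sketch."""
    store = event.get("store", {})
    new_items = find_new_records(event.get("records", []), store)
    return {"pending": [item["record_id"] for item in new_items]}
```

Only the records returned by `find_new_records` would need to be handed to the next component, which keeps repeated hourly runs cheap when nothing has changed.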

The Worker component
The worker queries the data store for the items retrieved by the fetcher and requests detailed information for each of them, following this ETL process:

Extract the data for the specific item(s).
Transform the retrieved data so that it can be ingested into different data stores.
Load the data in a few different data stores for analysis and/or future retrieval.
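
The three ETL steps can be sketched as three small functions. This is a simplified illustration, not the production worker: the field names (`id`, `name`, `dob`) are hypothetical, a dict stands in for the upstream source, and each "data store" is just a callable that accepts a document.

```python
from typing import Callable, Dict, Iterable, List


def extract(item_id: str, source: Dict[str, dict]) -> dict:
    """Extract: look up the detailed record for one item."""
    return source[item_id]


def transform(raw: dict) -> dict:
    """Transform: normalise the raw record into one flat document."""
    return {
        "id": str(raw["id"]),
        # Collapse repeated whitespace and use consistent capitalisation.
        "full_name": " ".join(raw["name"].split()).title(),
        "date_of_birth": raw.get("dob"),
    }


def load(doc: dict, sinks: List[Callable[[dict], None]]) -> None:
    """Load: hand the same document to every configured data store."""
    for sink in sinks:
        sink(doc)


def run_etl(item_ids: Iterable[str], source: Dict[str, dict],
            sinks: List[Callable[[dict], None]]) -> None:
    """Run extract, transform, and load for each pending item."""
    for item_id in item_ids:
        load(transform(extract(item_id, source)), sinks)
```

Keeping the sinks as plain callables means a search index and an analytics store can be fed from the same transformed document without the transform step knowing about either.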

The Searcher component
Here is the magic: this is the component that delivers a high match rate, using algorithms that account for variations (director name, surname, date of birth, etc.) while also complying with the Privacy Act (a series of rules that need to be followed when exposing this kind of data).
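
To give a flavour of what matching with variations can mean, here is a small sketch built on Python's standard library. It is not the algorithm used in the product: the normalisation rules, the order-insensitive comparison, the exact date-of-birth check, and the 0.85 threshold are all assumptions chosen for illustration.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = "".join(ch if ch.isalnum() else " " for ch in ascii_only.lower())
    return " ".join(cleaned.split())


def name_score(query: str, candidate: str) -> float:
    """Similarity in [0, 1] that ignores word order ('Doe, Jane' vs 'Jane Doe')."""
    a = " ".join(sorted(normalise(query).split()))
    b = " ".join(sorted(normalise(candidate).split()))
    return SequenceMatcher(None, a, b).ratio()


def is_match(query: dict, record: dict, threshold: float = 0.85) -> bool:
    """Require an exact date of birth and a close-enough name."""
    if query["date_of_birth"] != record["date_of_birth"]:
        return False
    return name_score(query["name"], record["name"]) >= threshold
```

With these rules, "Jon Smith" and "John Smith" born on the same day count as a match, while the same names with different dates of birth do not.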

The Amazon API Gateway component

API Gateway is designed to provide secure, reliable access to backend APIs from any app, whether built internally or by third-party ecosystem partners.

This is the customer-facing entry point for our product; using this service from AWS, we can provide:

Different Lambda authorizers (such as OAuth or SAML), which give flexibility to the people who want to integrate with us.

A usage plan custom-tailored for each customer, with the desired throttle and quota limits.

Lots of useful statistics.
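
For readers who have not written one, a token-based Lambda authorizer boils down to returning an IAM policy for the caller. The sketch below assumes a REST API with a TOKEN authorizer and the API key source set to AUTHORIZER; the token table, customer id, and API key are hypothetical placeholders, and a real authorizer would validate an OAuth token or SAML assertion against the identity provider instead.

```python
# Hypothetical token table standing in for real OAuth/SAML validation.
KNOWN_TOKENS = {
    "demo-token": {"customer": "customer-123", "api_key": "demo-api-key"},
}


def build_policy(principal_id: str, effect: str, resource: str) -> dict:
    """Build the authorizer response that allows or denies invoking the method."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Action": "execute-api:Invoke", "Effect": effect, "Resource": resource}
            ],
        },
    }


def handler(event, context):
    """TOKEN-type Lambda authorizer entry point."""
    token = event.get("authorizationToken", "")
    if token.startswith("Bearer "):
        token = token[len("Bearer "):]
    customer = KNOWN_TOKENS.get(token)
    if customer is None:
        return build_policy("anonymous", "Deny", event["methodArn"])
    response = build_policy(customer["customer"], "Allow", event["methodArn"])
    # With the API key source set to AUTHORIZER, API Gateway uses this key to
    # select the caller's usage plan (throttle and quota limits).
    response["usageIdentifierKey"] = customer["api_key"]
    return response
```

Returning the key from the authorizer is one way to tie each authenticated customer to their own usage plan without asking them to send an API key header.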

Behind the Scenes

A really small codebase, written in Python and split across the different Lambda functions, is ready to be deployed. But wait a second: how do we orchestrate all the possible steps? Easy answer!

AWS Step Functions:

After CloudWatch triggers this workflow, the fetcher starts to collect all the information needed to understand whether we have the latest update, then hands over to the worker, which does the ETL job. Finally, we double-check whether any other jobs are pending (Check), and we decide to start again or finish the step function (Evaluate Check).
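
The workflow described above could be expressed in Amazon States Language roughly as follows, written here as a Python dict so it can be serialised with `json.dumps`. This is a sketch and not the actual definition: the Lambda ARNs are placeholders, the `$.pending` output field is hypothetical, and looping back to the fetcher when jobs are pending is an assumption about what "start again" means.

```python
import json


def build_definition(fetcher_arn: str, worker_arn: str, check_arn: str) -> dict:
    """Return a state machine definition: Fetcher -> Worker -> Check -> Evaluate Check."""
    return {
        "Comment": "Fetch, run ETL, then loop while jobs are still pending",
        "StartAt": "Fetcher",
        "States": {
            "Fetcher": {"Type": "Task", "Resource": fetcher_arn, "Next": "Worker"},
            "Worker": {"Type": "Task", "Resource": worker_arn, "Next": "Check"},
            "Check": {"Type": "Task", "Resource": check_arn, "Next": "Evaluate Check"},
            "Evaluate Check": {
                "Type": "Choice",
                # Assumes the Check task returns {"pending": true/false}.
                "Choices": [
                    {"Variable": "$.pending", "BooleanEquals": True, "Next": "Fetcher"}
                ],
                "Default": "Done",
            },
            "Done": {"Type": "Succeed"},
        },
    }


def transition_targets(definition: dict) -> set:
    """Collect every state name that the definition can transition to."""
    targets = {definition["StartAt"]}
    for state in definition["States"].values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets


if __name__ == "__main__":
    placeholder = "arn:aws:lambda:REGION:ACCOUNT_ID:function:NAME"
    print(json.dumps(build_definition(placeholder, placeholder, placeholder), indent=2))
```

A quick check that `transition_targets(definition)` is a subset of the declared state names catches typos in `Next` before anything is deployed.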

Using AWS Step Functions and the serverless-step-functions plugin, we can create a workflow like this, orchestrating multiple Lambda functions, in exactly 62 lines of code. Oh, and we almost forgot: we even get monitoring:

Execution Event History for the entire Step Function Life Cycle

Retry before failing, with throttling and a maximum number of attempts
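
As a rough idea of what such a retry rule looks like, a Task state in Amazon States Language accepts a `Retry` list with `ErrorEquals`, `IntervalSeconds`, `MaxAttempts`, and `BackoffRate`. The values below are illustrative and not taken from our configuration; the helper simply computes the waits that such a rule implies.

```python
from typing import List

# Illustrative retry rule for a Task state; the numbers are assumptions.
RETRY_RULE = {
    "ErrorEquals": ["Lambda.TooManyRequestsException"],
    "IntervalSeconds": 2,
    "MaxAttempts": 3,
    "BackoffRate": 2.0,
}


def retry_waits(rule: dict) -> List[float]:
    """Seconds waited before each retry: the interval grows by BackoffRate each time."""
    return [
        rule["IntervalSeconds"] * rule["BackoffRate"] ** attempt
        for attempt in range(rule["MaxAttempts"])
    ]
```

For the rule above, `retry_waits(RETRY_RULE)` gives waits of 2, 4, and 8 seconds, after which the state fails unless a `Catch` handles the error.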

Configuration related to the serverless framework

and of course, the ability to inject environment variables

AWS Lambda Dashboard showing some environment variables
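
Inside a Python Lambda function, injected environment variables are read through `os.environ`. A minimal sketch, assuming hypothetical variable names (`ES_ENDPOINT`, `INDEX_NAME`, `BATCH_SIZE`) and defaults that are not taken from our configuration:

```python
import os
from typing import Mapping


def load_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Read configuration from environment variables (names are hypothetical)."""
    return {
        # Required: raises KeyError if the variable was not injected.
        "es_endpoint": environ["ES_ENDPOINT"],
        # Optional, with defaults chosen for this sketch.
        "index_name": environ.get("INDEX_NAME", "bankruptcies"),
        "batch_size": int(environ.get("BATCH_SIZE", "100")),
    }
```

Passing the mapping as a parameter keeps the function easy to test locally with a plain dict instead of the real process environment.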

How did we achieve it, infrastructure-wise?

Actually, we did nothing apart from configuring the Serverless Framework and deploying; behind the scenes, it creates the CloudFormation stack with all the resources you need!

Oh wait, do you remember the 62 lines of code for the AWS Step Functions workflow? OK, add 148 more lines, and you have a CloudFormation stack bringing up 46 resources, including IAM policies, log groups, IAM roles, API Gateway, Lambda functions, Step Functions, CloudWatch alarms, CloudWatch events, and so on.

Summary

We definitely enjoy working with AWS Lambda and AWS Step Functions; they are really great products. I think that in the future they will dominate the market, taking over from containers and EC2 instances. They are definitely something to look into if you are passionate about serverless and event-driven architecture.