Unleash Your Legacy Systems: Build a Serverless Layer for Safe Traffic Management with AWS

Salvatore Cirone
Storm Reply
8 min read · Jul 3, 2023

This article explores the use of Serverless computing and Amazon Web Services (AWS) to create a middle layer between the world and a legacy system, which can effectively handle sudden spikes in requests and maintain a consistent and manageable throughput to the legacy system.

Photo by Karla Alexander on Unsplash

Managing a legacy system can be a challenging task for many businesses. Legacy systems are often outdated, complex, and difficult to maintain due to their age and lack of documentation.

One of the significant issues with legacy systems is their inability to handle sudden spikes in requests. When a legacy system is overwhelmed with requests, it may crash or become unresponsive, leading to a poor user experience and potentially damaging the business's reputation. Additionally, scaling a legacy system can be time-consuming and expensive, requiring significant investment in hardware and software upgrades.

Despite this, industries such as financial services and healthcare still rely heavily on legacy systems because of the high cost of upgrades, the risk of data loss or errors, and the need to maintain regulatory compliance and data security.

Unlock the data in your Legacy System

Exposing your legacy system through an API can unlock the value of your stored data. Integrations can be built to enable data flow and streamline processes.
Developers can build new applications on top of it, bringing innovation and possibly new revenue streams to your business.

Furthermore, this process can speed up the digital transformation, allowing you to gradually modernize your system, reducing the risks associated with large-scale system replacement.

All of this comes with the complexity of dealing with a system that often cannot sustain unpredictable traffic. Therefore, we need to take the necessary precautions and ensure that our system is secure, so that it neither fails nor gets compromised.

Let’s see how to do that in this article.

The Solution 💡

The idea behind this is to build a Middle layer using a public cloud vendor that will absorb the unpredictable traffic coming from outside while pumping requests at a fixed throughput to the Legacy System.

This process will be asynchronous; with REST APIs alone, the client would therefore have to poll our API for the result from the legacy system.
Instead, we can leverage WebSockets to hold a connection open and be notified when the server has processed the request.

Using a cloud vendor, such as AWS (which we will employ for this solution), also means taking a big step in digital transformation, which could lead your business into its journey to the cloud.

The service layer we will create will leverage AWS Serverless services only to ensure a robust and scalable system regardless of traffic.

The Architecture 📝

The general idea is to divide our layer into two components: a Buffer and a Consumer.

  • The Buffer (left) will accumulate the requests from the clients.
  • The Consumer (right) will retrieve the requests at a steady pace, feeding them to the Legacy System and processing the response.

Initially, the request is made to the AWS AppSync GraphQL API. This API then invokes an AWS Lambda function that packs the request with an identifier and sends it to an Amazon SQS queue.

This request identifier can then be passed as the argument of a GraphQL subscription, creating an active connection that waits for a mutation on a model with that ID; when that mutation arrives, every listener on the subscription is notified.
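As a sketch, the buffer Lambda could look like the following. The queue URL, the resolver event shape, and the field names are illustrative assumptions, not part of the original design:

```python
import json
import uuid

# Hypothetical queue URL -- replace with your own SQS queue.
QUEUE_URL = "https://sqs.eu-west-1.amazonaws.com/123456789012/legacy-buffer-queue"


def build_enqueue_payload(request_body: dict) -> dict:
    """Attach a unique request identifier to the incoming request.

    The same identifier is returned to the client, which uses it as the
    argument of a GraphQL subscription to wait for the result.
    """
    return {
        "requestId": str(uuid.uuid4()),
        "body": request_body,
    }


def handler(event, context):
    """AppSync Lambda resolver: pack the request and push it to SQS."""
    payload = build_enqueue_payload(event.get("arguments", {}))
    # In a real deployment this would be a boto3 call, e.g.:
    # boto3.client("sqs").send_message(
    #     QueueUrl=QUEUE_URL, MessageBody=json.dumps(payload))
    return {"requestId": payload["requestId"]}
```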

Meanwhile, the second component (the Consumer) retrieves and processes requests from the SQS queue.
To create a consistent service that pulls from the queue multiple times each minute, I borrowed a clever solution described in this article by Zac Charles.
The basic idea is to set up an Amazon EventBridge rule that schedules an AWS Step Functions execution every minute, while inside the state machine we use a “Map” state with MaxConcurrency = 1 to iterate multiple times within each execution.
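The scheduled state machine could be sketched in Amazon States Language, expressed here as a Python dict; the state names and the Lambda ARN are placeholders:

```python
# Sketch of the consumer state machine. EventBridge triggers an execution
# once per minute; the Map state with MaxConcurrency=1 runs its
# iterations sequentially rather than in parallel.
CONSUMER_STATE_MACHINE = {
    "StartAt": "IterateWithinTheMinute",
    "States": {
        "IterateWithinTheMinute": {
            "Type": "Map",
            # One input list element per iteration inside the minute.
            "ItemsPath": "$.iterations",
            "MaxConcurrency": 1,  # run iterations one after another
            "Iterator": {
                "StartAt": "GetBatchMessages",
                "States": {
                    "GetBatchMessages": {
                        "Type": "Task",
                        "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:get-batch-messages",
                        "End": True,
                    }
                },
            },
            "End": True,
        }
    },
}
```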

During this iteration, a first AWS Lambda function pulls multiple items from the queue, then a “Parallel” branch is created for each item retrieved.
In these parallel blocks, another Lambda function processes the request, interacting with the legacy system, parsing the response, and posting a mutation to the original AppSync GraphQL API.

The mutation does not need to be connected to any resolver or persist any data; it is used only to trigger the related subscription. To do so, a NONE data source can be leveraged when creating the mutation in AppSync.
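A minimal sketch of the GraphQL request the consumer Lambda might post back to AppSync; the mutation name `publishResult` and its fields are assumptions for illustration:

```python
import json

# Illustrative mutation for the NONE data source; it only needs to echo
# the ID and payload so the subscription listeners receive them.
PUBLISH_MUTATION = """
mutation PublishResult($id: ID!, $payload: AWSJSON!) {
  publishResult(id: $id, payload: $payload) {
    id
    payload
  }
}
"""


def build_mutation_request(request_id: str, result: dict) -> dict:
    """Build the GraphQL request body the consumer Lambda posts to AppSync."""
    return {
        "query": PUBLISH_MUTATION,
        "variables": {
            "id": request_id,
            "payload": json.dumps(result),
        },
    }
```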

Finally, thanks to the subscription, the client will be notified of the response passed from the legacy system and can retrieve and use the requested data.

Controlled Access with AppSync fine-grained authorization

AppSync lets you manage authentication and authorization in depth, allowing you to specify a different authorization type for each model defined in your GraphQL schema.

In our case, we do not want anyone but our Lambda function to access the mutation that will update the client through a subscription. Therefore, we can protect it with an AWS_IAM authorization type, so that only the IAM role of the Lambda function can invoke that mutation, while keeping a more generic authorization type for all the other resources (such as OPENID_CONNECT or AMAZON_COGNITO_USER_POOLS).
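As a hedged illustration, the schema could mark the mutation and its return type roughly like this; the type and field names are assumptions, so check your own AppSync multi-auth setup before copying:

```graphql
type Result @aws_cognito_user_pools @aws_iam {
  id: ID!
  payload: AWSJSON
}

type Mutation {
  # Only the consumer Lambda's IAM role may call this mutation.
  publishResult(id: ID!, payload: AWSJSON!): Result @aws_iam
}

type Subscription {
  # Cognito-authenticated clients listen here for their request ID.
  onResult(id: ID!): Result
    @aws_subscribe(mutations: ["publishResult"])
    @aws_cognito_user_pools
}
```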

Configuring the Step function based on the desired Throughput

The Consumer component is designed to be fine-tuned to the specific needs of the legacy system. We have the following “free variables” to play with:

  • Number of Iterations (Map block)
  • Number of parallel executions (Parallel block)

The former can be increased freely, with the only constraint that the entire Step Functions execution must last less than a minute.
The latter needs to be tailored to the maximum concurrency the legacy system allows. If the system can only manage 10 requests per second, we can set our layer to produce them consistently by creating 10 parallel branches.
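This tuning boils down to simple arithmetic; a minimal sketch, assuming one-second iterations and one execution per minute:

```python
def requests_per_minute(iterations: int, parallel_branches: int) -> int:
    """Steady-state throughput of the consumer.

    Each Step Functions execution runs `iterations` Map passes; every pass
    polls for one second and fans out into `parallel_branches` requests.
    With one execution scheduled per minute, throughput is the product.
    """
    return iterations * parallel_branches


# If the legacy system tolerates 10 concurrent requests and we fit
# 50 one-second iterations into a minute-long execution:
print(requests_per_minute(iterations=50, parallel_branches=10))  # 500 requests/minute
```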

Another important factor to consider is the execution time of each Lambda Function.

  • Get Batch Messages
    This Lambda always needs to poll the queue for 1 second, to avoid sending more requests per second than configured.
  • Process Requests
    This service interacts with the legacy system, so we cannot accurately predict how long it will take; in general, its duration may affect the whole system's efficiency.
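For the first point, here is a sketch of the SQS `ReceiveMessage` parameters the polling Lambda might use; the queue URL and batch size are illustrative:

```python
# Illustrative parameters for the SQS ReceiveMessage call made by the
# "Get Batch Messages" Lambda. WaitTimeSeconds=1 makes each poll long-poll
# for exactly one second, so the iterations pace out the minute.
RECEIVE_PARAMS = {
    "QueueUrl": "https://sqs.eu-west-1.amazonaws.com/123456789012/legacy-buffer-queue",
    "MaxNumberOfMessages": 10,  # at most one message per parallel branch
    "WaitTimeSeconds": 1,       # poll for exactly one second
}
# In a deployed function:
# messages = boto3.client("sqs").receive_message(**RECEIVE_PARAMS)
```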

With Step Functions, it is possible to specify whether to invoke a Lambda function synchronously or asynchronously by passing a different invocation type. Of course, both types have drawbacks for our design:

Synchronous Invocation:
✅ Easy error handling and retry of failed requests;
❌ Long-running Lambdas can heavily impact the number of requests per minute our system can handle;

Asynchronous Invocation:
✅ Throughput will be steady as we don’t wait for the service to return;
❌ Hard to manage and recover from errors;
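In Amazon States Language, this choice is a single parameter on the `lambda:invoke` task; a sketch with an illustrative function name:

```python
# Sketch of the "Process Requests" task. Setting InvocationType to
# "Event" invokes the Lambda asynchronously (fire-and-forget); omitting
# it, or using "RequestResponse", makes the invocation synchronous.
PROCESS_TASK = {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {
        "FunctionName": "process-requests",  # illustrative name
        "InvocationType": "Event",           # asynchronous invocation
        "Payload.$": "$",                    # forward the queue message
    },
    "End": True,
}
```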

Why AppSync, and possible alternatives to GraphQL

Using AWS AppSync in this architecture helps simplify the design and reduce costs.
Unlike a REST API, GraphQL natively supports long-lived connections through its subscription mechanism. This saves us the time of setting up services and configuring them to accept complex payloads and keep a connection open to catch the response.

Moreover, using a GraphQL API, we can create a single powerful interface to our Legacy System, where we expose data through models, then leave it to the client to decide what combination of data they need to achieve their goal. With this technology, it is possible to build custom queries to retrieve and assemble specific fields of different models in a single request.

There are some cases, though, where a REST API is still preferred over GraphQL. If you are in one of these situations, you can still pursue another path to achieve a similar result without compromising our serverless design.

It is possible to use Amazon API Gateway to create a REST API with the help of an auxiliary WebSocket API to implement the same design pattern. However, this approach would, of course, need some extra engineering to achieve the same result. Therefore, the details are left to be further analyzed in a follow-up article.

Reaching your Legacy System without passing through the public Internet

For security and performance reasons, it is common not to connect to on-premises resources through the public internet.
In practice, most of the time either an AWS VPN connection or a private connection with AWS Direct Connect will be configured to allow cloud services to access on-premises servers.

If this is your case, you must slightly update the architecture to access the legacy system. The good news is that an AWS Lambda function can be attached to a VPC through a Hyperplane ENI, allowing your serverless architecture to connect to the on-premises network.
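A minimal sketch of the corresponding Lambda VPC configuration via boto3; the subnet and security-group IDs are placeholders:

```python
# Attaching the consumer Lambda to a VPC lets it reach the on-premises
# network over Direct Connect or VPN. IDs below are placeholders.
VPC_CONFIG = {
    "SubnetIds": ["subnet-0123456789abcdef0"],
    "SecurityGroupIds": ["sg-0123456789abcdef0"],
}
# In a deployment script:
# boto3.client("lambda").update_function_configuration(
#     FunctionName="process-requests", VpcConfig=VPC_CONFIG)
```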

Below you can find an updated architecture schema showing how our system would look if we leveraged Direct Connect to access the legacy system:

Other Enhancements

Finally, some other enhancements could save money on the Consumer component.

It would be nice, for instance, to have our polling consumer paused if no messages are waiting to be consumed in the Queue.

This could be added to our architecture by switching the EventBridge rule on and off based on the number of messages in the queue.
It is possible to monitor the number of messages sent to the SQS queue with the CloudWatch metric NumberOfMessagesSent, triggering an alarm when there are no messages left.
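A hedged sketch of such an alarm via boto3's `put_metric_alarm`; the names, periods, and thresholds are illustrative and should be tuned to your traffic:

```python
# Alarm that fires when no messages were sent to the queue for five
# consecutive minutes; a rule or small Lambda can react to the alarm
# state change and disable the EventBridge schedule.
ALARM_PARAMS = {
    "AlarmName": "legacy-buffer-queue-idle",  # illustrative name
    "Namespace": "AWS/SQS",
    "MetricName": "NumberOfMessagesSent",
    "Dimensions": [{"Name": "QueueName", "Value": "legacy-buffer-queue"}],
    "Statistic": "Sum",
    "Period": 60,                     # one-minute buckets
    "EvaluationPeriods": 5,           # five quiet minutes before pausing
    "Threshold": 1,
    "ComparisonOperator": "LessThanThreshold",
    "TreatMissingData": "breaching",  # no data means no messages
}
# boto3.client("cloudwatch").put_metric_alarm(**ALARM_PARAMS)
```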

An example of the metric NumberOfMessagesSent being used to trigger an alarm

Be sure to configure the alarm properly: each time the rule is re-enabled, it takes around a minute to start processing requests again, so making the alarm too sensitive may introduce heavy delays in your system.
Look at this guide if you need help configuring your CloudWatch alarms.

Conclusion 🌄

We saw how to put a serverless middle layer between your customer and your old legacy systems to expose them to the world without worrying about spikes in traffic.

With the architecture in place, it is now time for you to dig deeper into the implementation. Here are some guides and articles that will help you build each part:

Thanks to all of you for reading this article. Please reach out with any questions, and let's catch up in some insightful discussions!
As always:

Happy Building 🍻


Computer Engineer with a passion for Cloud Computing and Automating Processes. Currently working in Italy @StormReply