Monitoring Serverless Applications Across Multiple AWS Environments
Welcome to my first blog! Let’s imagine stones in water as microservices. It seems easy to monitor one stone based on the movements of the water, but how easy is it to monitor all of them as the current changes (and, with it, your view)?
The same situation arises with a simple serverless application. Imagine an application with no provisioning and no servers to upgrade, where someone simply logs in to the account and browses CloudWatch for errors or warnings. There is no agent to install to collect logs and no metrics to set up, because there is no server for you to maintain. There aren’t many scalability or high-availability metrics to worry about either, because AWS takes care of it all behind the scenes. Seems so easy, but what happens when we have a complex serverless application running across multiple AWS environments?
In an enterprise organization, some of the most critical tasks are controlling and managing logs, storing them securely, and centralizing their storage. Centralizing your logs saves time, increases the reliability of your log data, and allows you to filter for the most significant security data, which is critical for auditing and compliance in a large organization. First, let me show you a solution to centralize logging across multiple AWS accounts.
Centralized Logging Across Multiple AWS Accounts
Let’s suppose you have two AWS accounts: the first is the Application Account, which hosts the serverless application, and the second is the destination, the Monitoring Account, which collects the logs generated by the application hosted in the Application Account.
Now Let’s Get Started with the Monitoring Account
The diagram shows how source log groups subscribe, via subscription filters, to stream all CloudWatch logs to a defined destination in the Monitoring Account. In the Monitoring Account, a Kinesis Data Stream is created to receive the streamed log data, and a log destination is created to facilitate remote streaming, configured to use the Amazon Kinesis Data Firehose stream as its target. The Amazon Kinesis Data Firehose stream delivers the log data from the data stream to Amazon S3 for debugging and to Elasticsearch for analytics. The delivery stream uses a generic AWS Lambda function for data validation and transformation.
CloudWatch Destination (Input Logs)
This service enables you to subscribe a resource to a stream of log events. All the logs that the Application Account puts into the CloudWatch destination are forwarded to Kinesis Firehose.
The code below shows how you can define this in Terraform:
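As a sketch (the resource names, IAM role, and Application Account ID are placeholders), a cross-account CloudWatch Logs destination and its access policy might look like this:

```hcl
# CloudWatch Logs destination in the Monitoring Account,
# targeting the Kinesis stream that receives the log data.
resource "aws_cloudwatch_log_destination" "central" {
  name       = "central-log-destination"
  role_arn   = aws_iam_role.cwl_to_kinesis.arn
  target_arn = aws_kinesis_stream.logs.arn
}

# Allow the Application Account to create subscription filters
# that point at this destination.
data "aws_iam_policy_document" "destination_policy" {
  statement {
    effect    = "Allow"
    actions   = ["logs:PutSubscriptionFilter"]
    resources = [aws_cloudwatch_log_destination.central.arn]

    principals {
      type        = "AWS"
      identifiers = ["111111111111"] # Application Account ID (placeholder)
    }
  }
}

resource "aws_cloudwatch_log_destination_policy" "central" {
  destination_name = aws_cloudwatch_log_destination.central.name
  access_policy    = data.aws_iam_policy_document.destination_policy.json
}
```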
AWS Kinesis Firehose
Kinesis Firehose allows you to ingest data from different data producers: a Kinesis agent running on a machine, your own custom application, or a CloudWatch destination in the case of a serverless application (refer to CloudWatch Destination above). Amazon Kinesis Firehose is the easiest way to reliably load streaming data into different destination endpoints to accomplish different use cases in parallel.
You can also set up rules on how the data is put into the destination, buffering incoming data by time and by size (e.g., push data to Elasticsearch every 5 minutes or every 100 megabytes, whichever comes first). In AWS, there are two services (Kinesis Firehose and Kinesis Streams) that can ingest data in real time and deliver it to destinations. To simplify the choice, I will explain some of the differences between the two services to better understand the use cases each one fits.
Kinesis Streams vs. Kinesis Firehose
There are three stages to keep in mind when using Kinesis for streaming data: ingesting data, transforming/validating it, and delivering it to the destination. The way Kinesis Streams and Kinesis Firehose ingest data is quite similar, but delivering the data to the destination differs:
Kinesis Streams requires custom code to take the data from the stream and apply your business logic to it; the stream doesn’t track whether a consumer has actually processed the data. Once the retention window expires (24 hours by default), the data is gone from the stream. Kinesis Streams has no managed consumer to take data off the stream and put it into a destination.
Kinesis Firehose, on the other hand, delivers data automatically, so you don’t need to write or manage consumer code. Firehose also allows you to transform/validate data before delivering it to the destination.
So, which service should you use? It depends on your use case. Personally, I prefer Kinesis Firehose: I don’t want to write consumer code, and Firehose does all of this automatically.
Here is example code for Kinesis Firehose using Terraform syntax:
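As a sketch, assuming the AWS provider v5 block layout (the domain, role, bucket, and Lambda references are placeholders defined elsewhere), a delivery stream to Elasticsearch with an S3 backup and a Lambda processor might look like this:

```hcl
resource "aws_kinesis_firehose_delivery_stream" "logs" {
  name        = "central-logs-firehose"
  destination = "elasticsearch"

  elasticsearch_configuration {
    domain_arn = aws_elasticsearch_domain.logs.arn
    role_arn   = aws_iam_role.firehose.arn
    index_name = "app-logs"

    # Buffer by time or size, whichever limit is reached first
    buffering_interval = 300 # seconds
    buffering_size     = 5   # MB

    # Keep a raw copy of every record in S3 for later debugging
    s3_backup_mode = "AllDocuments"

    s3_configuration {
      role_arn   = aws_iam_role.firehose.arn
      bucket_arn = aws_s3_bucket.log_backup.arn
    }

    # Invoke a Lambda function to transform/validate records in flight
    processing_configuration {
      enabled = true

      processors {
        type = "Lambda"

        parameters {
          parameter_name  = "LambdaArn"
          parameter_value = aws_lambda_function.transform.arn
        }
      }
    }
  }
}
```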
The code below shows you Kinesis Stream resources using Terraform syntax:
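A minimal sketch of the stream that receives the cross-account log data (name and tags are placeholders); note the `retention_period`, which is the 24-hour window mentioned above:

```hcl
resource "aws_kinesis_stream" "logs" {
  name             = "central-log-stream"
  shard_count      = 1
  retention_period = 24 # hours; records expire after this window

  tags = {
    Environment = "monitoring"
  }
}
```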
After Kinesis ingests the logs, we need to validate and transform them before delivering them to the destination. The right service for that is a Lambda function. Lambda is a serverless service that allows you to run code without provisioning or managing servers. Firehose triggers the Lambda, sending the buffered records as the input event. The function transforms and validates the data residing in Kinesis Firehose and changes it before it is written to the destination. You can strip out data components that your requirements don’t need, and records that fail transformation can be put into an S3 bucket if you want to debug them later. A Lambda function written in Python can perform this log transformation and validation before delivery: it decodes the data attribute of each record (the data attribute in the Kinesis record is Base64-encoded and compressed in gzip format) and writes the records back to Kinesis.
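A minimal sketch of such a transformation Lambda (the fields kept in the output are an illustrative choice; the `messageType`, `owner`, `logGroup`, and `logEvents` keys follow the CloudWatch Logs subscription payload format):

```python
import base64
import gzip
import json


def lambda_handler(event, context):
    """Decode, validate, and transform Firehose records, then return them."""
    output = []
    for record in event["records"]:
        # The data attribute is Base64-encoded and gzip-compressed
        payload = gzip.decompress(base64.b64decode(record["data"]))
        log_data = json.loads(payload)

        # Drop the control messages CloudWatch Logs sends when a
        # subscription is first established
        if log_data.get("messageType") == "CONTROL_MESSAGE":
            output.append({"recordId": record["recordId"], "result": "Dropped"})
            continue

        # Keep only the fields we care about (placeholder selection)
        transformed = json.dumps({
            "owner": log_data.get("owner"),
            "logGroup": log_data.get("logGroup"),
            "messages": [e["message"] for e in log_data.get("logEvents", [])],
        }) + "\n"

        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(transformed.encode()).decode(),
        })
    return {"records": output}
```

Records marked `"result": "Ok"` continue to the destination; anything marked `"ProcessingFailed"` or dropped can be sent by Firehose to the S3 backup bucket for debugging.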
S3 Bucket (Backup Logs)
S3 is a managed storage service that provides persistent storage, and can be used to provide backup of the logs for debugging later. Define log retention requirements and lifecycle policies early on, and plan to move log files to cost-efficient storage locations as soon as practical.
The code below shows how to define an S3 bucket using Terraform:
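A sketch of the backup bucket with a lifecycle policy that moves logs to cheaper storage and eventually expires them (the bucket name and the 30/365-day thresholds are placeholders to adapt to your retention requirements):

```hcl
resource "aws_s3_bucket" "log_backup" {
  bucket = "my-central-log-backup" # placeholder; bucket names are global
}

# Move older log objects to cost-efficient storage, then expire them
resource "aws_s3_bucket_lifecycle_configuration" "log_backup" {
  bucket = aws_s3_bucket.log_backup.id

  rule {
    id     = "archive-logs"
    status = "Enabled"

    filter {
      prefix = "" # apply to every object in the bucket
    }

    transition {
      days          = 30
      storage_class = "GLACIER"
    }

    expiration {
      days = 365
    }
  }
}
```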
After validation and transformation of the data, we have to ship it to an endpoint for analytics purposes. Elasticsearch is a popular open-source solution for both log analytics and full-text search, and Amazon offers a fully managed service around it. It takes care of the heavy lifting of deploying and scaling, and Kibana is enabled by default, so there is nothing to install. You can focus on your log analytics solution, visualize data using Kibana, and query programmatically if you want. Elasticsearch is easy to use, highly scalable, highly available, and provides automatic failover recovery.
The code below shows the configuration of Elasticsearch using Terraform syntax:
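A minimal sketch of a managed Elasticsearch domain (domain name, version, instance type, and sizing are placeholders; a production domain would also need access policies and VPC options):

```hcl
resource "aws_elasticsearch_domain" "logs" {
  domain_name           = "central-logs"
  elasticsearch_version = "7.10"

  cluster_config {
    instance_type          = "t3.small.elasticsearch"
    instance_count         = 2
    zone_awareness_enabled = true # spread nodes across AZs for availability
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 20 # GB per node
  }
}
```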
“You can build a data stream, aggregate data, process streaming data, and store and query it, loading it continuously to the destination you choose and processing it in real time, whatever kind of data it is: video, audio, application logs, website clickstreams, and IoT telemetry for machine learning, analytics, and other applications.”
Now Let’s Get Started with the Application Account
Let’s assume you have a sample “set time” application in AWS that consists of an API Gateway endpoint and a backend AWS Lambda function. Both of these services generate logs in Amazon CloudWatch. The following diagram illustrates the relationship between the frontend and backend layers: API Gateway invokes the Lambda function, and the IAM Lambda role and IAM API role delegate access so these services can write event logs to CloudWatch.
AWS Serverless Application Model (SAM)
To make deployments easy and fast, I will use AWS SAM, a framework that allows you to build serverless applications on AWS. You can define resources, outputs, and more. AWS SAM is also an extension of AWS CloudFormation: you get the reliable deployment capabilities of AWS CloudFormation, and you can define plain AWS CloudFormation resources in your AWS SAM template.
The following code shows how to define AWS Lambda and API Gateway in AWS SAM:
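A sketch of the SAM template (the function name, handler, runtime, and path are placeholders; `ServerlessRestApi` is the implicit API that SAM creates for `Api` events):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Sample "set time" serverless application

Resources:
  SetTimeFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs18.x
      CodeUri: ./src
      Events:
        SetTimeApi:
          Type: Api
          Properties:
            Path: /time
            Method: get

Outputs:
  ApiUrl:
    Description: Invoke URL of the "set time" API
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/time"
```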
Here is an example of my application “set time” written in NodeJS.
You can deploy easily from your local terminal using this command:
sam package --template-file template.yaml --s3-bucket BucketName --output-template-file packaged.yaml --profile profileName --region AwsRegion
sam deploy --template-file packaged.yaml --stack-name myStack --capabilities CAPABILITY_IAM --profile profileName --region AwsRegion
CloudWatch Log Subscription
Once the application logs from the Application Account are collected in Amazon CloudWatch, a CloudWatch Logs subscription allows you to deliver the events to the Amazon Kinesis stream located in the destination Monitoring Account.
The following code shows you how to define CloudWatch Log Subscription using Terraform syntax:
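A sketch of the subscription filter in the Application Account (the log group name, region, account ID, and destination name are placeholders; an empty `filter_pattern` forwards every event):

```hcl
# Stream a Lambda function's log group to the cross-account
# destination created in the Monitoring Account.
resource "aws_cloudwatch_log_subscription_filter" "to_monitoring" {
  name            = "to-monitoring-account"
  log_group_name  = "/aws/lambda/set-time" # placeholder log group
  filter_pattern  = ""                     # empty pattern matches all events
  destination_arn = "arn:aws:logs:eu-west-1:222222222222:destination:central-log-destination"
}
```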
“CloudWatch acts as the central log management for your applications running on AWS. You can send application logs to CloudWatch and see how easy it is to move these logs to the Monitoring Account. No matter how many applications or AWS accounts you have, in one minute you can move all the logs to the Monitoring Account.”
This is a cloud-native centralized logging solution that enables you to collect, analyze, and display logs on AWS across multiple accounts, leveraging a combination of serverless services (which enable you to build modern applications with increased agility and lower total cost of ownership) and managed services (which enable you to quickly and easily deploy your cloud infrastructure). Amazon Elasticsearch Service handles log analytics on the streamed log data in a scalable way, with custom business logic applied along the way. The following image shows the full centralized logging flow across the two AWS accounts, the Application Account and the Monitoring Account.
Once the Monitoring Account is set up, you can easily add as many accounts as needed, or deliver logs to different destinations. This solution can also be adapted to third-party tools (Splunk, Datadog, Sumo Logic); there are some additional costs, but you can create or use your own custom analytics solution.
Author: Zamira Jaupaj
Feel free to share, drop a comment below, or send me comments on Twitter @zamirajaupaj