Scheduling automated deletion of AWS CloudFormation stacks

Prathamesh Wagh
Published in Capmo Stories
Oct 22, 2021

When developers on a team work on the same feature, each of them often needs a different version of the infrastructure than the one currently deployed via a CloudFormation (CF) stack, be it an ECS (Elastic Container Service) cluster with multiple task definitions or any other service provided by AWS.

We at Capmo ran into this situation within our teams. The challenge was not deploying multiple versions of our infrastructure; it was how to clean up the CloudFormation stacks once their work was done. That’s when the idea of auto-deleting the CF stacks came up. After some research, we concluded that giving each developer their own infrastructure stack for testing a feature makes the process more agile and flexible, and removes the dependency that arises when a subsequent infrastructure deployment overwrites the previous one.

Since our infrastructure deployments are handled by CDK, we wanted a solution that we could integrate with our existing CloudFormation stacks without any hassle. So we decided to package it in a way that lets all our microservices use it if needed, with the possibility of open-sourcing it at some point in the future.

Automating deletion of CloudFormation stacks

Now that we knew what we wanted, the question was how to build it. We needed a trigger that would initiate the auto-deletion of the CloudFormation stack. We used an EventBridge rule for this: a rule matches an event pattern or fires on a schedule, and then invokes its targets. Our target of choice was a Lambda function that starts the deletion. A serverless solution was the clear answer, as it is easy to write and deploy and leaves no deployed resources to manage. All of this is deployed within the same CloudFormation stack that we want to delete. Now let’s dig deeper into how we implemented this.

The first thing we had to do was define an EventBridge rule. For this, we needed a Schedule object built from a computed cron expression that matches once the specified number of minutes (referred to as TTL in this blog) has elapsed since the stack was launched. You can find more details about scheduling expressions here.

The cron expression is calculated from the stack's launch time and the TTL in minutes.
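One way to compute such an expression is sketched below. The helper name `cronExpression` and its `(date, ttl)` arguments follow the call used later in this post; the body is an assumption, not necessarily the original implementation. It relies on two documented EventBridge behaviours: cron schedules are evaluated in UTC, and the six fields are minutes, hours, day-of-month, month, day-of-week and year, with `?` in day-of-week when day-of-month is fixed.

```typescript
// Hypothetical helper: builds a one-off EventBridge cron expression that
// matches `ttlMinutes` minutes after `launch`.
function cronExpression(launch: Date, ttlMinutes: number): string {
  // Work on a copy so the caller's Date object is not mutated.
  const expiry = new Date(launch.getTime() + ttlMinutes * 60 * 1000);

  const minutes = expiry.getUTCMinutes();
  const hours = expiry.getUTCHours();
  const dayOfMonth = expiry.getUTCDate();
  const month = expiry.getUTCMonth() + 1; // JavaScript months are 0-based
  const year = expiry.getUTCFullYear();

  // Field order: minutes hours day-of-month month day-of-week year.
  // Day-of-week is "?" because day-of-month is already pinned.
  return `cron(${minutes} ${hours} ${dayOfMonth} ${month} ? ${year})`;
}

// A launch at 2021-10-22 23:55 UTC with a TTL of 10 minutes rolls over to
// the next day: cron(5 0 23 10 ? 2021)
console.log(cronExpression(new Date(Date.UTC(2021, 9, 22, 23, 55)), 10));
```

Because the year is part of the expression, the rule matches exactly once instead of recurring every year.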

Once we had the cron expression, we used it to create an object of Schedule type.

const cronExpirationSchedule = Schedule.expression(cronExpression(new Date(), 10));

Using the above cron expiration schedule object, we defined a Rule that can target a Lambda function.

Now that we had a rule in place, we needed a Lambda function for the rule to target. For the sake of simplicity, we went with a Python runtime and supplied the handler as inline code, which is executed when the Lambda function is invoked.
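A minimal sketch of what such an inline handler could look like, held as a TypeScript string so it can be handed to CDK as inline code. The constant name `deleteStackHandler` is hypothetical, and reading the stack name from a `STACK_NAME` environment variable is an assumption about how the parameter reaches the function. The Python itself only uses boto3's CloudFormation `delete_stack` call.

```typescript
// Python source for the Lambda handler, kept as a string for inline use.
// Assumption: the stack name is provided via the STACK_NAME environment variable.
const deleteStackHandler: string = [
  "import os",
  "import boto3",
  "",
  "def handler(event, context):",
  "    # Ask CloudFormation to delete the stack this function belongs to.",
  "    stack_name = os.environ['STACK_NAME']",
  "    boto3.client('cloudformation').delete_stack(StackName=stack_name)",
  "    return {'deleted': stack_name}",
].join("\n");

console.log(deleteStackHandler);
```

Keeping the handler this small means it stays well within the size limit for inline Lambda code and needs no packaging step.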

Once we had the code that would be executed by our Lambda function, we defined the Lambda function with its runtime (Python in this case) and the value of the STACK_NAME parameter that the handler uses.

The final step in completing the Lambda function definition was to add a policy statement to the function's role that allows it to perform the necessary action on the CloudFormation stack, in this case deleting the stack.

const statement = new PolicyStatement();
statement.addActions('cloudformation:DeleteStack');
statement.addResources(`arn:aws:cloudformation:${stack.region}:${stack.account}:stack/${stack.stackName}/*`);
lambdaFn.addToRolePolicy(statement);

Please note that the stack.stackName value should match the value being provided to the STACK_NAME parameter of the Lambda function that will be deleting the Cloudformation stack.

Once you have all pieces put together, your code will look something like this (I have added an interface ITtlProps for typing the parameter of the class constructor):
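A minimal sketch of how these pieces might fit together in a single construct, assuming CDK v2 import paths (`aws-cdk-lib`). The construct IDs (`DeleteStackFn`, `DeleteStackRule`), the Python 3.12 runtime, the 30-second timeout and passing `STACK_NAME` as an environment variable are illustrative assumptions rather than details confirmed by the text; only `Ttl`, `ITtlProps`, `cronExpression` and the policy statement come from this post.

```typescript
import { App, Duration, Stack } from "aws-cdk-lib";
import { Rule, Schedule } from "aws-cdk-lib/aws-events";
import { LambdaFunction } from "aws-cdk-lib/aws-events-targets";
import { PolicyStatement } from "aws-cdk-lib/aws-iam";
import { Code, Function as LambdaFn, Runtime } from "aws-cdk-lib/aws-lambda";
import { Construct } from "constructs";

// Inline Python handler (assumed): deletes the stack named in STACK_NAME.
const HANDLER_CODE = [
  "import os",
  "import boto3",
  "",
  "def handler(event, context):",
  "    boto3.client('cloudformation').delete_stack(StackName=os.environ['STACK_NAME'])",
].join("\n");

// One-off EventBridge cron expression (UTC) for `ttl` minutes after `date`.
function cronExpression(date: Date, ttl: number): string {
  const d = new Date(date.getTime() + ttl * 60 * 1000);
  return `cron(${d.getUTCMinutes()} ${d.getUTCHours()} ${d.getUTCDate()} ${d.getUTCMonth() + 1} ? ${d.getUTCFullYear()})`;
}

interface ITtlProps {
  /** Minutes after synthesis at which the stack should delete itself. */
  readonly ttl: number;
}

class Ttl extends Construct {
  constructor(scope: Construct, id: string, props: ITtlProps) {
    super(scope, id);
    const stack = Stack.of(this);

    // Lambda function that calls DeleteStack on the stack it lives in.
    const lambdaFn = new LambdaFn(this, "DeleteStackFn", {
      runtime: Runtime.PYTHON_3_12,
      handler: "index.handler", // inline code is exposed as the "index" module
      code: Code.fromInline(HANDLER_CODE),
      timeout: Duration.seconds(30),
      environment: { STACK_NAME: stack.stackName },
    });

    // Allow the function to delete this stack and nothing else.
    const statement = new PolicyStatement();
    statement.addActions("cloudformation:DeleteStack");
    statement.addResources(
      `arn:aws:cloudformation:${stack.region}:${stack.account}:stack/${stack.stackName}/*`
    );
    lambdaFn.addToRolePolicy(statement);

    // Rule that matches once the TTL has elapsed and invokes the function.
    new Rule(this, "DeleteStackRule", {
      schedule: Schedule.expression(cronExpression(new Date(), props.ttl)),
      targets: [new LambdaFunction(lambdaFn)],
    });
  }
}

// Example: a throwaway stack that schedules its own deletion after 10 minutes.
const app = new App();
const featureStack = new Stack(app, "FeatureStack");
new Ttl(featureStack, "stack-ttl", { ttl: 10 });
```

Note that in this sketch the expiry time is computed at synthesis time, so every new `cdk deploy` moves the deletion time forward by the TTL.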

This serverless solution is packaged as a CDK construct that you can add to your CloudFormation stack via CDK like this:

new Ttl(this, "stack-ttl", { ttl: 10 });

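One important thing to note is that in order to deploy the stack, you need to provide a CloudFormation service role that executes the deletion. Otherwise, the stack cannot delete itself: CloudFormation would act with the permissions of the Lambda function's role, which only allows cloudformation:DeleteStack and is itself one of the resources being deleted.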

cdk deploy --role-arn <service-role>

Here, <service-role> must have a trust relationship that allows CloudFormation itself to assume it, such as:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "cloudformation.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Adding this serverless solution to your CloudFormation stack makes it easy to deploy infrastructure for every PR a developer creates, without any hassle and without worrying about the infrastructure cleanup. Another benefit: developers working on a feature do not need to worry about conflicting infrastructure updates during development and can focus on the feature instead.

However, we need to keep certain edge cases in mind when auto-deleting CloudFormation stacks. Non-empty S3 buckets, for example, cannot be deleted. A Lambda@Edge function cannot be deleted until CloudFront has deleted the function’s replicas. If another CloudFormation stack uses an output exported by the stack, the stack won’t be deleted, since its resources are still referenced elsewhere. Before implementing the auto-deletion of stacks, make sure edge cases like these are handled.

If you are interested in checking out our code on GitHub, you can find it here. It is an open-source solution. Feel free to contribute by creating issues to provide feedback or suggest enhancements.
