On DevOps — 10. Infrastructure as Code — Introduction to CloudFormation, AWS CDK, EKSCTL, and AWS SAM

Tiexin Guo
Jun 10 · 8 min read
Some cloud, at some Fjord in Norway

In my last article, I talked in depth about Terraform: how it works, some advantages and disadvantages, and best practices. If you haven’t read it yet, here’s the link to it:

Given its benefits and disadvantages, Terraform could be a go-to option for many people who want to get started managing their cloud infrastructure as code. However, there are certain cases in which you might want to consider other options, so let’s go over them one by one.

CloudFormation

CloudFormation is AWS’s answer to infrastructure as code. It was launched in 2011, way earlier than Terraform, and over the years it has become an indispensable tool for many AWS customers. You can define a template once, then re-use it to provision your AWS resources.

At the beginning of its lifecycle, JSON was the only supported format. Let’s face it: JSON isn’t easy to read, especially in a relatively large file (which a CloudFormation template usually is), even though the official JSON website describes it as a lightweight data-interchange format that is easy for humans to read and write.

In my opinion, JSON is mainly designed for machines to parse, generate, and exchange data, not for humans to read.
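To make the readability point concrete, here is a minimal sketch that builds a tiny CloudFormation template as a Python dictionary and renders it as JSON. The logical ID "ExampleBucket" and the helper name render_json are made up for illustration; the template keys themselves (AWSTemplateFormatVersion, Resources, Type, Properties) are standard CloudFormation.

```python
import json

# A minimal CloudFormation template describing a single S3 bucket.
# "ExampleBucket" is a hypothetical logical ID chosen for this example.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal example: a single versioned S3 bucket",
    "Resources": {
        "ExampleBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {"Status": "Enabled"},
            },
        },
    },
}


def render_json(tpl: dict) -> str:
    """Render the template the way it would look in a .json template file."""
    return json.dumps(tpl, indent=2)


if __name__ == "__main__":
    print(render_json(template))
```

Even for one bucket with one property, the output is mostly braces and quotes, and a real template with dozens of resources grows from there.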

It was not until 2016 that AWS introduced YAML into CloudFormation. Before 2016, I personally wouldn’t have considered writing JSON to manage my infrastructure at all, because it wasn’t manageable.

YAML, on the other hand, is disliked by many people as well (have a look here; I don’t want to get into the topic of “why YAML is bad”). Still, I have to admit that it is more human-readable than JSON, because at least it was designed to be human-readable.

Syntax affects readability greatly, but there is another major factor in how readable your infrastructure code is: the length of each file. If a file is too long, you lose focus while scrolling back and forth looking for a specific item. Even though YAML is easier to read, in a very long file it is still hard to see where the chunk you are currently viewing belongs.

Most CloudFormation templates are defined in one single file, even the official examples from AWS, and this really is one of its disadvantages. Although the template is declarative, reading it doesn’t feel like reading a description of your infrastructure; it’s hard to form a mental picture of the infrastructure you are managing from one single large file.

In late 2020, this situation improved greatly with the introduction of CloudFormation modules. You can create your own module and refer to it in your CloudFormation template. There is one slight downside: modules must be registered in the CloudFormation registry, so even if you keep all your source code in a GitHub repo, you still have to publish your module files to the CloudFormation registry as a separate step; otherwise, your template can’t load the modules.

CloudFormation is NOT idempotent.

If you try to create the same stack twice, you will get some errors. If, after creating a stack, you want to update it, you need to run the “update” command instead of “create” again. However, if you run “update” on a stack where there is nothing to update, you will get an error as well.

If you want to see what would change before applying it, you need to use another command to create something called a “changeset.”

Being idempotent helps a lot, especially in your CI/CD pipelines, because all you gotta do is something like “apply” without having to write if/else to see if you need a create, or an update, or a create changeset here. From this point of view, it’s not the best tool out there, but you can certainly achieve the same goal by using the changeset.
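The if/else a pipeline ends up needing can be sketched roughly like this. It is a minimal illustration, not a real deployment script: the function deploy and its create/update callables are hypothetical stand-ins for whatever actually calls CloudFormation (the CLI or an SDK), and the error text it matches is my assumption about the message CloudFormation returns when an update contains no changes.

```python
from typing import Callable

# Assumed wording of the error raised when an update has nothing to change.
NO_UPDATES_MESSAGE = "No updates are to be performed"


def deploy(stack_exists: bool,
           create: Callable[[], None],
           update: Callable[[], None]) -> str:
    """Make create/update behave like a single idempotent "apply".

    `create` and `update` are hypothetical callables wrapping the real
    CloudFormation calls; they are injected so the logic can be tested.
    """
    if not stack_exists:
        create()
        return "created"
    try:
        update()
    except Exception as err:
        # An update with no changes is reported as an error; treat it as success.
        if NO_UPDATES_MESSAGE in str(err):
            return "unchanged"
        raise
    return "updated"
```

With a wrapper like this, the pipeline can call deploy on every run and only fail on genuine errors, which is the behavior you get for free from an idempotent “apply”.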

Although we have mentioned a few limitations or disadvantages of CloudFormation above, there are real benefits:

  • Since it’s an AWS product, you get official support. This could be a significant advantage for teams that are not so technical yet.
  • AWS has many official templates ready for you to re-use, so chances are you don’t have to write any JSON or YAML at all if what you want to provision is something relatively standard, because it is already available in an existing template. This can get you to a quick start because you don’t need to learn much (unlike learning Terraform).
  • Integration with other services. For example, if you use a multi-account setup with AWS Control Tower, Control Tower actually relies on CloudFormation to do the initial setup. If you use CloudFormation, you can have everything set up when creating a new account. If you want to achieve the same with Control Tower + Terraform, you will have to rely on CloudWatch Events listening for Control Tower events, then use Lambda as a target to trigger a webhook that runs your Terraform code to continue from there.
  • Multi-account support: CloudFormation relies on a “stack set” to deploy to multiple accounts simultaneously, and there isn’t any equivalent in Terraform, where you have to run your code on each account separately.

To me, personally, CloudFormation isn’t the best choice because it’s not so easy to read compared to other choices, and the module support came too late, but in specific scenarios like the above-mentioned ones, it could be the best tool for the job.

AWS CDK

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework to define your cloud application resources using familiar programming languages.

AWS CDK uses the familiarity and expressive power of programming languages for modeling your applications. At the moment, it supports JavaScript, TypeScript, Python, C#, and Java. The list will grow.

If you don’t want to invest so much time in learning Terraform or CloudFormation and you want to get started quickly with a programming language which you are already familiar with, this could be the best choice for you.

Obviously, the biggest advantage is that you already know how to use it, because it supports your programming language. If your team is mainly JavaScript developers or Python developers, you can get started now without too much learning.

And because of the nature of the programming language, you get maximum flexibility. There will be less pain compared to Terraform when you are trying to figure out how to do for loops for modules or how to form a specific complex output format.

CDK is still evolving rapidly, so in some cases you might not find the feature you want. I remember that last year, when I first started using CDK and tried to create a VPC, it created everything for me, including an internet gateway and a NAT gateway. If what you wanted was limited networking, for example a VPC without an internet gateway, that wasn’t possible back then.

There are still a lot of open feature requests, so if you want maximum flexibility and support, Terraform or CloudFormation is still the way to go at the moment, but we do see more adoption of CDK over time. At the moment, it has got more than 6k stars on GitHub, and I really think this could be the future.

CDK itself is written in TypeScript. It is made polyglot by a library called jsii, created by AWS as well. jsii allows code in any language to naturally interact with JavaScript classes, and it is the foundation that enables CDK to support languages other than JavaScript. CDK then translates the TypeScript code into CloudFormation templates.

If you write Python, your Python code uses jsii to call JavaScript classes, then those JavaScript classes translate the code into a CloudFormation template, and it sends the template to CloudFormation to run the stack. This process seems to be a little bit too complicated. It works, and it’s efficient enough, but if you like things simple, you’d be better off writing those CloudFormation templates yourself or even using Terraform.
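The idea of “code in, CloudFormation template out” can be illustrated with a toy sketch. To be clear, this is not the real CDK API and it involves no jsii: the Bucket and Stack classes below are hypothetical stand-ins I made up, and they only mimic the final step, where objects built in a programming language are turned into a template. It also shows the kind of plain for loop that is awkward in a template language.

```python
class Bucket:
    """Toy stand-in for a CDK construct (hypothetical, not aws_cdk)."""

    def __init__(self, logical_id: str, versioned: bool = False):
        self.logical_id = logical_id
        self.versioned = versioned

    def to_resource(self) -> dict:
        # Translate this object into a CloudFormation resource definition.
        resource = {"Type": "AWS::S3::Bucket"}
        if self.versioned:
            resource["Properties"] = {
                "VersioningConfiguration": {"Status": "Enabled"}
            }
        return resource


class Stack:
    """Toy stand-in for a CDK stack: collects constructs, emits a template."""

    def __init__(self):
        self.constructs = []

    def add(self, construct: Bucket) -> Bucket:
        self.constructs.append(construct)
        return construct

    def synth(self) -> dict:
        return {
            "AWSTemplateFormatVersion": "2010-09-09",
            "Resources": {c.logical_id: c.to_resource() for c in self.constructs},
        }


def build_stack() -> Stack:
    stack = Stack()
    # An ordinary loop and a condition: one bucket per environment,
    # with versioning switched on only for production.
    for env in ("Dev", "Staging", "Prod"):
        stack.add(Bucket(f"Logs{env}", versioned=(env == "Prod")))
    return stack
```

Calling build_stack().synth() returns a dictionary with three bucket resources, which could then be serialized and handed to CloudFormation. The real CDK does far more than this, but the shape of the pipeline is the same.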

Is CDK code easier to read than CloudFormation? Yes, and I think we can all agree on this. But is CDK code easier to read and understand than Terraform code and modules? There might be some different opinions because, for some, programs are easier to read, while for others, resources put into different small chunks are simpler to understand.

EKSCTL

If you are mainly using Kubernetes to orchestrate your workloads, there is another choice: eksctl.

eksctl is a simple CLI tool for creating clusters on EKS - Amazon's new managed Kubernetes service. It is written in Go, and it translates your YAML input file into CloudFormation stacks and executes them.
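To give a feel for how small that input file can be, here is a rough sketch that builds a minimal cluster definition in Python and prints it as JSON. The cluster name, region, node group name, and instance type are placeholders I picked for illustration, and cluster_config is a hypothetical helper. The field names follow the eksctl ClusterConfig schema as I understand it. Since JSON is, for practical purposes, a subset of YAML, the output should be usable as a config file, but treat that as an assumption to verify against your eksctl version.

```python
import json


def cluster_config(name: str, region: str, node_count: int) -> dict:
    """Build a minimal eksctl ClusterConfig as a plain dictionary.

    All values passed in here are illustrative placeholders.
    """
    return {
        "apiVersion": "eksctl.io/v1alpha5",
        "kind": "ClusterConfig",
        "metadata": {"name": name, "region": region},
        "nodeGroups": [
            {
                "name": "workers",            # hypothetical node group name
                "instanceType": "m5.large",   # placeholder instance type
                "desiredCapacity": node_count,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(cluster_config("demo-cluster", "eu-north-1", 2), indent=2))
```

Compare this with the control plane, auto-scaling group, and launch template you would have to describe yourself in Terraform or CloudFormation.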

You can of course create Kubernetes clusters inside AWS using Terraform, but it’s not an easy task because you have to create both the control plane and the worker nodes. It becomes even more complicated if you want to create self-managed worker nodes instead of AWS-managed worker nodes because, in this case, you need to create the auto-scaling group, the launch template, and other components.

You can of course create the cluster using CloudFormation, but again, in that case, you will need to face the downsides of CloudFormation itself, like the readability.

As of early 2020, eksctl still didn’t support customer-defined launch templates for a deeply customized cluster, so back then, Terraform and CloudFormation were your only choices. Things are different now, and it has become more mature than ever. It’s probably the easiest and quickest way to create a cluster. If you are using Kubernetes inside AWS, definitely give this choice a try.

Serverless

If you are mainly running serverless workloads and Lambda functions, you might have already found that Terraform isn’t specifically designed for that.

Sure, you can create Lambda functions with Terraform, and you can use a Terraform template to generate the code; but then the “template” isn’t real source code anymore, and you can’t unit-test it. Alternatively, you can upload your source code to an S3 bucket and then have your Terraform code load it from there, but in that case you need some extra automation in your CI pipeline for the Lambda deployment.

This is mainly because Terraform is good at managing infrastructure rather than configuration, and deploying Lambda functions has a lot to do with configuration.

In this case, if you want to easily create an API Gateway with some Lambda functions as its endpoints, AWS SAM comes to the rescue.
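For context, the kind of function you would put behind such a gateway can be as small as the sketch below. It is only an illustration: the handler name and the greeting logic are made up, and I am assuming the usual API Gateway proxy integration, where the query string arrives under queryStringParameters and the function returns a dictionary with statusCode and body.

```python
import json


def handler(event, context):
    """Hypothetical Lambda handler behind an API Gateway endpoint."""
    # queryStringParameters may be missing or None when no query string is sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because this is plain source code rather than a rendered template, it can be unit-tested locally by calling handler with a fake event, which is exactly what the Terraform template approach makes difficult.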

4th Coffee

DevOps chat

Tiexin Guo

Written by

Sr. DevOps Consultant | Global Financial Services | Professional Services at AWS
