Running Enterprise Workloads at Scale with a Next-Gen Infrastructure-as-Code Platform
A long history of innovation in the cloud with AWS
Intuit has been at the forefront of innovation in the cloud, adopting AWS technologies for infrastructure, machine learning (ML), data analytics, and more, while integrating our own capabilities to create best-in-class customer experiences for more than 100 million customers worldwide.
Today, our core tax, accounting and personal finance workloads operate exclusively in AWS. We execute hundreds of deployments every day to manage thousands of microservices running on EC2, Kubernetes and Lambda, within more than a dozen AWS regions (collections of resources in a geographic area) and thousands of AWS accounts (containers for our resources).
Along the way, we’ve learned a lot about running enterprise workloads at scale, including the value of using infrastructure-as-code (IaC) frameworks to define our configuration of cloud resources. As an early adopter of CloudFormation, we began investing in reusable patterns shortly after its initial release in 2011.
Fast forward to 2022, and we’ve made great strides in building out our next-generation IaC platform based on the AWS Cloud Development Kit (CDK). This open-source software development framework enables us to define our cloud application resources using familiar programming languages, and provision them through AWS CloudFormation. As part of this journey, we created and open-sourced Cello, a service for running CDK (as well as other IaC frameworks) via a GitOps workflow.
The result is that our developers no longer have to manually set up, provision or configure new hardware or software systems to support their applications. Since everything happens at the code level, in the cloud, they can focus on application development and deployment rather than its resource needs.
Ultimately, this approach is enabling us to accelerate the speed at which cloud applications are developed, deployed, and scaled at a reduced cost. In the rest of this blog post, we’ll explain how and why our experiences have led us to invest in CDK at Intuit, and to build a framework for its secure, scalable execution.
Step 1, Consolidating CloudFormation patterns
Back in 2017, we presented at AWS re:Invent on a company-wide initiative to create a reusable library of CloudFormation templates.
With our initial patterns library, we attempted to wrangle the sprawl of infrastructure code into a common location, and to enforce governance, quality and consistency across commonly used patterns. By introducing these concepts for IaC, we were able to apply an open source model of contribution internally (also known as InnerSource), and give teams more confidence in reusing CloudFormation templates. While this was a great first version for us, it had the following limitations:
- Lack of extensibility. Everything had to be vanilla CloudFormation.
- Lack of composability. It was not easy to combine patterns in reusable ways.
- Lack of repeatable deployments. Consumers were responsible for their deployment pipeline.
CloudFormation part deux
To build on our success, we released a second version of our CloudFormation template library:
- We invested in creating a library of Custom Resources for commonly used Intuit capabilities to extend CloudFormation support for configuring non-AWS resources. We provided a framework that lets a developer add a custom resource locally or share it with other internal consumers.
- We iterated on our CloudFormation templates, standardizing our stack output values to make it possible to compose a new pattern from existing patterns. For example, a common deployment is a bastion host in AWS. This can be complicated by security policies an organization may have in place, requiring VPC, WAF and security groups to go along with that host. Rather than create a single template with all the components, we created multiple templates which could be composed by consuming standardized exports. Altogether, this has allowed for greater flexibility and compatibility.
- Finally, we addressed a gap in deployments. Our first library did not have a common deployment mechanism, so each engineering team had to build its own, repeating the same work. So, we developed an Intuit-specific, centralized deployment service that understands our artifact storage and CloudFormation conventions, includes CI (continuous integration), and can deploy patterns to our AWS accounts.
This evolution was a big step forward in enabling our adoption of IaC. For the first time, Intuit engineers could assemble infrastructure from high quality, tested patterns, which could then be deployed via a fully automated CI/CD (continuous integration, continuous delivery) pipeline. A developer could set all of this up in minutes via a self-serve experience.
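The composition idea in the second bullet can be sketched in a few lines. This is a minimal illustration, not Intuit's actual library: the naming convention (`<stack>-<key>`), the helper names, and the stack name `shared-network` are all assumptions. It shows one pattern publishing a standardized export and another consuming it through CloudFormation's `Fn::ImportValue`.

```typescript
// Hypothetical convention: every pattern exports values as "<stackName>-<key>".
function exportName(stackName: string, key: string): string {
  return `${stackName}-${key}`;
}

// Outputs section of a (hypothetical) network pattern that publishes its VPC ID.
function networkOutputs(stackName: string) {
  return {
    VpcId: {
      Value: { Ref: "Vpc" },
      Export: { Name: exportName(stackName, "VpcId") },
    },
  };
}

// Resources section of a (hypothetical) bastion pattern that consumes the
// export by name, so it never needs the network template's internals.
function bastionResources(networkStackName: string) {
  return {
    BastionSecurityGroup: {
      Type: "AWS::EC2::SecurityGroup",
      Properties: {
        GroupDescription: "Bastion host access",
        VpcId: { "Fn::ImportValue": exportName(networkStackName, "VpcId") },
      },
    },
  };
}

const outputs = networkOutputs("shared-network");
const bastion = bastionResources("shared-network");
```

Because both sides derive the name from the same function, the producer and consumer cannot drift apart, which is the point of standardizing exports.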
Challenges of CloudFormation at scale
While we were able to improve the process of our IaC deployments, we encountered new challenges with managing CloudFormation templates at scale across a large enterprise.
CloudFormation templates become hard to manage as complexity grows, because they lack the flow control (loops and general-purpose conditionals) that would traditionally handle it. For example, creating multiple sets of subnets within your Virtual Private Cloud (VPC) based on inputs can result in code like the snippet below:
Shifting to YAML makes templates easier to read, but the complexity is still there due to the lack of flow control and structures. In addition, iterating on and testing changes in CloudFormation is a slow process. The AWS CloudFormation Linter (cfn-lint) and other CI tools help validate your templates, but no true unit testing framework exists, so developers are forced to iterate by deploying changes. This can mean long wait times to validate even simple changes.
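As a rough illustration of that kind of template (not Intuit's actual code), the fragment below writes a JSON-shaped CloudFormation template as a TypeScript object literal so its structure can be inspected. The parameter name `SubnetCount`, the logical IDs, and the CIDR ranges are hypothetical.

```typescript
interface CfnResource {
  Type: string;
  Condition?: string;
  Properties: Record<string, unknown>;
}

interface CfnTemplate {
  Parameters: Record<string, unknown>;
  Conditions: Record<string, unknown>;
  Resources: Record<string, CfnResource>;
}

const template: CfnTemplate = {
  Parameters: {
    SubnetCount: { Type: "Number", AllowedValues: [1, 2, 3], Default: 1 },
  },
  Conditions: {
    // Without a loop, each optional subnet needs its own named condition.
    CreateSubnet2: { "Fn::Not": [{ "Fn::Equals": [{ Ref: "SubnetCount" }, "1"] }] },
    CreateSubnet3: { "Fn::Equals": [{ Ref: "SubnetCount" }, "3"] },
  },
  Resources: {
    Vpc: { Type: "AWS::EC2::VPC", Properties: { CidrBlock: "10.0.0.0/16" } },
    // The three subnet blocks are near-duplicates written out by hand.
    PublicSubnet1: {
      Type: "AWS::EC2::Subnet",
      Properties: { VpcId: { Ref: "Vpc" }, CidrBlock: "10.0.0.0/24" },
    },
    PublicSubnet2: {
      Type: "AWS::EC2::Subnet",
      Condition: "CreateSubnet2",
      Properties: { VpcId: { Ref: "Vpc" }, CidrBlock: "10.0.1.0/24" },
    },
    PublicSubnet3: {
      Type: "AWS::EC2::Subnet",
      Condition: "CreateSubnet3",
      Properties: { VpcId: { Ref: "Vpc" }, CidrBlock: "10.0.2.0/24" },
    },
  },
};

// Logical IDs of the resources that exist only when a condition holds.
const conditionalSubnets = Object.keys(template.Resources).filter(
  (name) => template.Resources[name].Condition !== undefined
);
```

Supporting a fourth subnet means adding another condition and another copied resource block, which is where the maintenance cost comes from.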
Enter AWS Cloud Development Kit (CDK)
In 2018, Intuit partnered with AWS on the initial release of CDK. We immediately understood the compelling vision of CDK to describe infrastructure using higher-level programming languages and were excited to contribute.
The results of our migration have been well received by both consumers and authors of IaC. The value became apparent immediately when we began describing infrastructure using traditional programming languages. For example, adopting CDK made it possible to simplify the CloudFormation snippet above to the easy-to-read TypeScript:
When deploying infrastructure, product teams could leverage the power of their integrated development environments (IDEs), higher-level software constructs, and code reuse via libraries in their infrastructure code. They could also easily integrate with existing templates and consume their exports in CDK.
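As a dependency-free sketch of that idea (plain TypeScript that emits CloudFormation-shaped objects, not the CDK API itself and not Intuit's actual code), a loop can replace hand-copied subnet blocks and per-subnet conditions. The helper name `buildSubnets`, the logical IDs, and the `10.0.x.0/24` addressing scheme are assumptions made for illustration.

```typescript
interface CfnResource {
  Type: string;
  Properties: Record<string, unknown>;
}

// Hypothetical helper: builds `count` subnet resources inside a 10.0.0.0/16
// VPC, carving out one /24 per subnet (10.0.0.0/24, 10.0.1.0/24, ...).
function buildSubnets(vpcLogicalId: string, count: number): Record<string, CfnResource> {
  if (!Number.isInteger(count) || count < 1 || count > 256) {
    throw new RangeError("count must be an integer between 1 and 256");
  }
  const resources: Record<string, CfnResource> = {};
  for (let i = 0; i < count; i++) {
    resources[`PublicSubnet${i + 1}`] = {
      Type: "AWS::EC2::Subnet",
      Properties: { VpcId: { Ref: vpcLogicalId }, CidrBlock: `10.0.${i}.0/24` },
    };
  }
  return resources;
}

const subnets = buildSubnets("Vpc", 3);
```

Because the generator is an ordinary function, its output can be checked with a regular unit test before anything is deployed. In CDK itself, the same loop would instantiate constructs rather than build raw objects.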
Running CDK, an open source solution
CDK, like all configuration management frameworks, must be run with very high privileges by an automated system. This left us grappling with an important question: how does Intuit execute CDK with elevated privileges, yet ensure we secure sensitive credentials, provide an audit trail, and enforce minimal privileges?
To address this at Intuit, we created an open source project, Cello, a service for running IaC frameworks via a GitOps workflow. The definition for infrastructure is stored in a Git repository and all configuration updates are pulled by reference to an immutable commit SHA (the hash that uniquely identifies a commit), following the principles of GitOps.
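To make the "immutable SHA" requirement concrete, here is a minimal sketch of the kind of check a GitOps runner could apply before executing anything. It is not Cello's actual API: the `DeployRequest` shape and function names are hypothetical, and it assumes SHA-1 Git object names (40 hexadecimal characters).

```typescript
// Hypothetical request shape; not Cello's actual API.
interface DeployRequest {
  repository: string;
  ref: string;
}

// Assumes SHA-1 object names: exactly 40 lowercase hexadecimal characters.
const FULL_COMMIT_SHA = /^[0-9a-f]{40}$/;

// Branch and tag names can move to a different commit later; a full commit
// SHA cannot, so only a full SHA counts as pinned.
function isPinnedToCommit(ref: string): boolean {
  return FULL_COMMIT_SHA.test(ref);
}

// Rejects any request that is not pinned to a full commit SHA.
function validateRequest(request: DeployRequest): DeployRequest {
  if (!isPinnedToCommit(request.ref)) {
    throw new Error(`ref "${request.ref}" is not a full commit SHA`);
  }
  return request;
}
```

Rejecting mutable refs up front means the audit trail records exactly which revision of the infrastructure code ran with elevated privileges.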
We’re still in the early days…
We’re still in the early days and excited to partner with — and learn from — other organizations in this space. In future blog posts, we’ll dive deeper into our strategy for Cello, how we structure our CDK construct library, and our approach to orchestration of clouds via our developer portal.
If this blog has sparked your interest in tech innovation at Intuit, take a moment to check out open positions at https://www.intuit.com/careers and join our talent community!