Why You Should Adopt Terraform Cloud First

Craig Sloggett
HashiCorp Solutions Engineering Blog
Sep 11, 2023

When starting your infrastructure as code journey, we recommend you avoid building custom Terraform Community pipelines and adopt Terraform Cloud first.

This post walks through a typical path to cloud maturity and highlights both the complexity of setting up a deployment pipeline and the challenges of maintaining it over time. You will also learn how Terraform Cloud eliminates that complexity and makes it easier for your team to manage infrastructure securely, safely, and collaboratively at scale.

From the iron age to the cloud age

The practice of deploying applications to cloud platforms has existed for a decade now. However, organizational processes still reflect the “Iron Age”, where we glue together homegrown solutions for deployment, configuration, and security. As the demand for cloud consumption grows exponentially, these patterns create toil for cloud operators at a rapid rate.

Contrast this with a more cohesive approach to platform management: the “Cloud Age” focuses on a developer-centric experience by using methodologies like declarative infrastructure, GitOps deployment models, and inner-source platform management. Cloud maturity requires that organizations adopt the “Cloud Age” approach to platform management and avoid some of the following complex processes.

Imperative vs. declarative infrastructure

When you are deploying infrastructure by hand, you must click through a console to specify each configuration setting. To deploy multiple resources, you must repeat this process, specifying exactly how resources must be deployed. This hands-on, click-by-click practice has earned the playful moniker of “ClickOps”, a nod to the labor-intensive deployment of the production platform via numerous console clicks.

As teams move further along the maturity curve and expand, they often turn to tools like PowerShell, Bash, CloudFormation, Bicep, or ARM to leverage a Cloud Service Provider’s (CSP) API to automate each click. This process is efficient, but it can prove to be ineffective at scale. Adopting such an approach requires precise specifications for each deployment step, which raises the likelihood of fragility and errors.

Terraform offers declarative, stateful, infrastructure as code with a vast open source community and standard practices for teams to contribute to the same code base. For these reasons, Terraform has rapidly become the tool of choice for practitioners managing cloud infrastructure at scale.
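To make the contrast with click-by-click or step-by-step scripting concrete, here is a minimal sketch of a declarative configuration. It assumes the AWS provider; the region, the resource name `logs`, and the bucket names are placeholders chosen for illustration. Rather than repeating a deployment step three times, the configuration states that three buckets should exist and leaves the ordering of API calls to Terraform.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1" # placeholder region
}

# Desired end state: three buckets exist.
# Bucket names are placeholders and must be globally unique in practice.
resource "aws_s3_bucket" "logs" {
  count  = 3
  bucket = "example-logs-${count.index}"
}
```

Running `terraform plan` against this configuration shows what would change to reach the declared state, and `terraform apply` makes those changes and records the result in the state file.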

The long haul

Over long periods of time, what started out as a simple infrastructure as code repository can turn into using Terraform on “hard mode”: for example, working in a non-versioned monolithic git repository with lots of interdependent modules that manage your entire cloud estate. When managing a few resources, this does not seem too difficult. But as the number of resources, environments, and consumers increases, so does the complexity.

Furthermore, during the time between “nothing is managed by infrastructure as code” and “everything is managed by infrastructure as code”, developers will continue to deploy through the console. The period in which you play catch-up until you can remove access to deploy through the console could be significant.

These day two challenges are solved by spending time on planning and design early on, so that teams develop the code base intentionally and with scale in mind. Opting out of Terraform Cloud (TFC) early on brings unique challenges in addition to these common issues.
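One way to design with scale in mind is to consume modules at pinned versions instead of referencing them inside a non-versioned monolith. The fragment below is a sketch only: the organization `example-org`, the `network` module, and its `cidr_block` input are all hypothetical names, and it assumes the module has been published to a private registry.

```hcl
# Hypothetical module published to a private registry.
# The organization, module name, and input variable are illustrative only.
module "network" {
  source  = "app.terraform.io/example-org/network/aws"
  version = "~> 1.2" # accept 1.x releases from 1.2 onward, but not 2.0

  cidr_block = "10.0.0.0/16"
}
```

Pinning a version range lets each consumer upgrade on its own schedule instead of every change to a shared module rippling through the whole estate at once.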

Accidental complexity

Terraform Community can be scaled to support many development teams, each deploying their own cloud resources; however, this comes at the cost of complexity. To scale at a large organization, there are several responsibilities you must take on that can easily be deferred to TFC.

A significant one is Terraform state file management: a crucial component for efficient collaborative resource management. This task often involves setting up resources like an S3 bucket for storage purposes and DynamoDB for state locking. The process to deploy and configure these resources must then be repeated for every team onboarded to Terraform.
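As a rough illustration of that setup, the fragment below configures the S3 backend with DynamoDB locking and declares the lock table itself. The bucket name, state key, table name, and region are placeholders. In practice the bucket and table usually live in a separate bootstrap configuration, because they must exist before `terraform init` can use them as a backend.

```hcl
# Backend configuration each team would carry (all names are placeholders).
terraform {
  backend "s3" {
    bucket         = "example-team-terraform-state"
    key            = "team-a/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "example-team-terraform-locks"
    encrypt        = true
  }
}

# Lock table used by the backend above. The S3 backend expects a
# string partition key named "LockID".
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "example-team-terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Every onboarded team needs its own equivalent of this, along with the access policies for the bucket and table, which is where the repeated effort comes from.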

Additionally, when reviewing changes to the infrastructure as code repository, we recommend attaching the output of the Terraform plan to the merge request description. This allows teams to validate the proposed changes in the request. Using a central server to generate the plans ensures the output is accurate. To improve reliability further, multiple servers would need to be configured in a load-balanced architecture. If you have hundreds of state files, you might need hundreds of servers to maintain every pipeline.

These tasks underline the complexity and responsibilities involved in managing infrastructure deployments with Terraform Community. A great example that showcases how complex these responsibilities can get is outlined in this article by the Slack engineering team. In this case, Slack chose to build rather than buy (TFC), and while it did work for them, it created a large amount of complexity, accidental or otherwise.

Leveraging Terraform Cloud

Hopefully, you can now see how a more effective approach is to use Terraform Cloud from the start. The defaults set by the HashiCorp team allow you to save considerable time and focus on collaboration.

Terraform Cloud removes many complexities associated with maintaining state files in a multi-team, collaborative environment. This is done by abstracting away the responsibility of handling and securing the state file, eliminating the need for manual management of a centralized server, or set of load-balanced servers. Access control mechanisms provide enhanced security, ensuring only those with appropriate permissions can view or edit the state.
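For comparison, pointing a configuration at Terraform Cloud takes a single `cloud` block. This is a minimal sketch that assumes Terraform 1.1 or later; the organization and workspace names are placeholders.

```hcl
terraform {
  cloud {
    organization = "example-org" # placeholder organization name

    workspaces {
      name = "example-workspace" # placeholder workspace name
    }
  }
}
```

After `terraform login` and `terraform init`, state for this configuration is stored and locked by the Terraform Cloud workspace, so there is no bucket, lock table, or plan server to provision per team.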

Teams may also need to establish a self-service platform for developers, contemplate best practices for operational patterns, and explore tools that can provide visibility into the cost of infrastructure deployment. These are all managed by the Terraform Cloud service with features like cost estimation and no-code provisioning.

Additionally, TFC enables versioning and backup capabilities, ensuring a reliable history of infrastructure configurations, and simplifying auditing using policy as code with Sentinel or OPA. By leveraging Terraform Cloud, teams can focus on collaboration and configuration, knowing that the state file management is handled efficiently and securely.

From day one, teams can simplify state file management, strengthen security, streamline collaboration, and benefit from versioning and backup capabilities. Compared with the upfront work required to do the same with Terraform Community, Terraform Cloud stands as the easiest solution for organizations seeking efficient and secure infrastructure management in a collaborative setting.

HashiCorp Terraform Cloud is the fastest way to adopt Terraform, the world’s most widely used multi-cloud provisioning product. Offered as a service, Terraform Cloud provides everything practitioners, teams, and global businesses need to create and collaborate on infrastructure and manage risks for security, compliance, and operational constraints.

Further reading

  1. How to Migrate to Terraform Cloud and Why You Should Do It
  2. Managing Terraform Cloud with Terraform
  3. Increasing Deployment Velocity at Scale Factory
  4. Automate Terraform Cloud Workflows
  5. Drivvn Automates Kafka with Terraform-based GitOps
