Picking the Best IaC CI Platform
Much like the black plague or TikTok, infrastructure-as-code (IaC) has dominated the DevOps industry, with Terraform as the clear winner (keep dreaming Pulumi). Unlike cloud-specific IaC libraries (e.g. CloudFormation or Resource Manager), cloud-agnostic IaC libraries have options when it comes to determining their continuous integration processes and tooling.
According to my research, Terraform Cloud is no longer best-in-class.
For a long time, the incumbent in this space was Terraform Cloud (run by the creators of Terraform). Because it comes from the creators of the language, many people assume it is also the most feature-rich Terraform CI platform. However, Terraform Cloud is already overshadowed by other players in this space in terms of features, user experience, and security.
After spending a significant amount of time meeting with various vendors, reading documents, and trying the products, my conclusion is as follows.
The purpose of this post is to put a spotlight on the key differentiators between these four platforms. Each of them fills a niche in the market. My goal is to give you a jumping-off point to engage with these vendors. Feel free to contact me on LinkedIn if you want a direct introduction.
If this post is helpful, please 👏 or subscribe to let me know. I write for the fun of it, but it’s nice to know that someone found value out of it.
This post starts by comparing the four platforms using a feature overview. It then dives deeper into each of the platforms to discuss what makes them special. This information can help you to choose the CI platform that best suits the needs of your application.
Boring features that every Terraform CI platform must have:
- a reliable SaaS offering, including a free trial;
- VCS integrations into all the common tools;
- standard Terraform operations, such as plan, apply, and refresh;
- horizontally scalable Terraform execution done in a performant manner;
- an extensive UI and API and a Terraform provider for IaC inception;
- the basic security essentials such as SAML, RBAC, audit logging, and so on; and
- notification integrations for workspace events.
It is important to note here that if these “boring” features aren’t implemented well, it negates all the value of the “interesting” features. Unfortunately, every vendor will claim that their product will scale, and the only way to find out if this is true is to run an extensive trial that mirrors what you expect your workload to be.
Interesting features that are worth comparing include the following:
- On-premise support: Is there a fully self-hosted option?
- Non-TF IaC support: Can the platform support other integrations, like Pulumi, Helm, or CloudFormation?
- Policy enforcement: Is there a rules engine to prevent insecure infrastructure from being deployed?
- Chained workspaces: Can workspaces be applied in a flow from dev → staging → production?
- Terraform module registry: Can I treat the platform as my Terraform module registry?
- Self-service infrastructure: Can I set up a marketplace for developers to self-launch infrastructure?
- Customizable build steps: How configurable are the underlying CI commands?
- Cost calculation: Are infrastructure costs (predicted and actual) integrated into the product?
- Infrastructure visibility: Are there visualizations of the deployed infrastructure?
It’s worth mentioning that just because a product supports a feature doesn’t mean it does it well. A great example is HashiCorp’s proprietary policy language, Sentinel, which is both difficult to write and prone to errors. If you are shopping for a new Terraform CI tool, then you should get demos from multiple vendors.
⚠️ Note: Terraform Cloud implements its customizable build steps using a “run tasks” feature that is significantly more limited than the others.
The following is what I personally consider each product to be best at.
Best UI Design: Spacelift
Hands down, Spacelift has the best UI. Each page is streamlined with a consistent theme, and the information is condensed to show the most critical details and hide the rest. The features were very intuitive to find, helped by the most effective in-app tutorial I’ve ever used. If nothing else, go check out the free trial and see what I mean by the excellent user experience.
Best Data Model: Env0
Env0 has a very good data model that makes it easy to divide things into organizations, projects, templates, and environments. This is especially useful in self-service and limited-control workflows in which the infrastructure team wants to empower downstream teams.
Most Security-Centric Features: Scalr
If you’ve ever used HashiCorp Sentinel, you’ll be delighted at how Scalr makes it easy not just to create policies, but to test and verify them in bulk. It also has the best RBAC system and is the only platform that supports fully customized roles. On top of that, it offers a fully self-hosted option, which is a requirement for many organizations. If you are a security-conscious company, you should absolutely get a demo.
Most Comprehensive: Cloudify
Cloudify can do everything, which comes at the cost of complexity. Cloudify makes the most sense for large organizations with many different audiences and complex, multi-technology deployments. Want to have a CloudFormation template joined with a Terraform environment with Ansible for service orchestration? No problem!
Optional Reading: Product Spotlights
My goal in these posts is to be reasonably comprehensive, which means including details beyond those in the feature table above. Think of this section as an appendix or a part 2: it is optional and only worth reading if you need more information.
Each section includes my overall recommendation, a high-level summary of why each offering is compelling, and a couple of paragraphs on each of the major differentiating features. Where applicable, I’ve included screenshots to provide you a visual of what those features look like.
Product Spotlight: Spacelift
Overall Verdict: Spacelift is excellent SaaS for organizations with centralized infrastructure teams.
Spacelift is probably the best SaaS Terraform CI platform for users already familiar with Terraform. What impressed me most in the demo was the thoughtfulness put into each feature. Its user experience is exceptional, making the experience of using the features just that much better. The features seem targeted to DevOps engineers who are looking for better visibility into their entire infrastructure-as-code process.
Features that Differentiate Spacelift:
Spacelift is unique in its emphasis on infrastructure visibility. This reminds me of the early Datadog days. Spacelift pulls out the Terraform resources and graphs them, allowing users to sort and filter by different properties. The ability to quickly search for all usages of a resource type or cloud provider, or the scope of a workspace, is pretty cool to see. I believe that this visualization could prove useful for many workflows.
Spacelift has spent a lot of engineering power on building a visualization layer for chained executions and their triggers. It’s specifically designed for cases where a single git commit triggers a series of downstream workspaces, which, in turn, trigger more workspaces. The timeline page shows a clear pipeline where you can follow this process. This unique feature of Spacelift is invaluable for adding clarity to the current state.
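To make the chained-execution idea concrete, here is a minimal sketch of the bookkeeping any such pipeline needs: given which workspaces trigger which, work out a valid run order and the full set of downstream workspaces a change touches. The workspace names and the `TRIGGERED_BY` structure are made up for illustration, and this is not how Spacelift implements it.

```python
from graphlib import TopologicalSorter

# Hypothetical workspaces: each key lists the upstream workspaces
# whose successful run triggers it.
TRIGGERED_BY = {
    "network": [],
    "database": ["network"],
    "app-staging": ["network", "database"],
    "app-production": ["app-staging"],
}


def run_order(triggered_by):
    """Return workspaces in an order where every upstream runs first."""
    return list(TopologicalSorter(triggered_by).static_order())


def downstream_of(workspace, triggered_by):
    """Return every workspace that a change to `workspace` eventually triggers."""
    affected = set()
    frontier = [workspace]
    while frontier:
        current = frontier.pop()
        for name, upstreams in triggered_by.items():
            if current in upstreams and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected


if __name__ == "__main__":
    print(run_order(TRIGGERED_BY))
    print(sorted(downstream_of("database", TRIGGERED_BY)))
```

A timeline view like the one described above is essentially a rendering of this graph with run statuses attached to each node.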
Spacelift also includes first-order support for TF modules and testing. The idea is that when a change is made to a module, a fresh environment is created and destroyed to validate the module. The module isn’t released to downstream consumers until it passes the tests, which is unique among all the IaC tools I’ve looked at. If you combine this with the chained workspace support, you get a clean visualization of a change to an abstract module, and all the affected downstream builds.
Product Spotlight: Env0
Overall Verdict: Env0 is great for DevOps teams trying to deliver self-service infrastructure.
Env0 is optimized for medium-to-large organizations, especially those with stronger divides between infrastructure and development teams. Much of Env0 is designed to handle the complexities that arise as an organization scales, such as self-service, ACL control, customization, and cost management.
For example, a core DevOps team can create templates and grant downstream teams the ability to launch from those templates. It can also set variables at the higher levels that are inherited from organizations, to projects, to templates, and to environments (the actual infrastructure). This is a subtle feature that can provide significant leverage in bootstrapping new environments with defaults like credentials, account numbers, billing codes, or team tags.
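The inheritance chain is easier to picture as a layered lookup. The sketch below is my own illustration, not Env0's implementation: it assumes that a more specific level overrides a less specific one, and the variable names and values are invented.

```python
from collections import ChainMap


def resolve_variables(organization, project, template, environment):
    """Merge variables from the four levels into one flat dict.

    Assumption: the most specific level wins, so an environment value
    overrides a template value, which overrides a project value, and so on.
    """
    # ChainMap searches its maps left to right, so the most specific goes first.
    return dict(ChainMap(environment, template, project, organization))


if __name__ == "__main__":
    resolved = resolve_variables(
        organization={"billing_code": "ORG-001", "region": "us-east-1"},
        project={"team_tag": "payments"},
        template={"instance_class": "db.t3.medium"},
        environment={"region": "eu-west-1"},
    )
    print(resolved)
```

The leverage comes from the first argument: set a billing code or credential once at the organization level, and every new environment picks it up without anyone copying it around.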
Env0 also supports customization of the build steps, enabling business-specific commands to be executed in place of the defaults (a need that, again, is more prevalent in medium-to-large organizations). However, its lack of on-premise support means that it won’t work for the more security-conscious enterprises.
Features that Differentiate Env0:
Env0’s killer feature is the ability for a DevOps team to create a template Terraform workspace and let downstream teams self-serve from these templates. Think of this as having a default Terraform module for an AWS RDS database that includes backup policies and KMS encryption. You can use Env0 to make that offering available to non-infrastructure folks, who can then spin up infrastructure with no Terraform knowledge. This is especially relevant for companies with sales teams spinning up customer demo environments.
Customizable Build Steps
Env0 follows a model similar to CircleCI: it supports YAML files for customizing the execution commands on a per-use-case basis. This means you have the flexibility to interweave custom linting, policies, or cost calculators into the execution step. This can also be used to weave in orchestration frameworks like Ansible, Puppet, or custom scripts.
Env0 has an amazing idea for handling cost calculation: query the billing API directly. It is able to display the actual cost on each environment page using a tagging technique. This is hands-down the cleverest idea I’ve seen in this space. Additionally, if you weave Infracost in via a customized workflow, you can see the predicted cost before you apply and the actual cost after you apply.
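The tagging technique boils down to a group-by. Here is a rough sketch of the idea, assuming every resource is stamped with a tag identifying its environment and that the billing data arrives as line items carrying those tags. The `environment_id` tag key, the line-item shape, and the numbers are all hypothetical; Env0's actual tag and the cloud billing API response will look different.

```python
from collections import defaultdict


def cost_per_environment(line_items, tag_key="environment_id"):
    """Sum billing line items by the environment tag on each resource.

    `line_items` is a simplified stand-in for a billing API response:
    a list of dicts with a `cost_usd` number and a `tags` dict.
    """
    totals = defaultdict(float)
    for item in line_items:
        environment = item.get("tags", {}).get(tag_key)
        if environment is not None:  # untagged spend cannot be attributed
            totals[environment] += item["cost_usd"]
    return dict(totals)


if __name__ == "__main__":
    sample = [
        {"cost_usd": 12.5, "tags": {"environment_id": "demo-acme"}},
        {"cost_usd": 7.5, "tags": {"environment_id": "demo-acme"}},
        {"cost_usd": 40.0, "tags": {"environment_id": "staging"}},
        {"cost_usd": 3.0, "tags": {}},
    ]
    print(cost_per_environment(sample))
```

The appeal of this approach is that the number comes from the bill itself rather than from an estimate, so it stays accurate even when usage-based charges dominate.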
Product Spotlight: Scalr
Overall Verdict: Scalr is the most security-friendly and the best drop-in replacement for the self-hosted Terraform Cloud.
Scalr’s core motto is “to make it very easy for you to centralize the Terraform administration while decentralizing the Terraform operations”. Its goal is to be a tightly scoped IaC platform that is laser-focused on providing the best Terraform experience for medium-to-large organizations. It accomplishes this by pushing the administration provided by the DevOps team to the top layer, with downstream teams consuming the approved modules, credentials, and policies.
Scalr’s application structure is clean, with robust RBAC and a tiered hierarchy that would satisfy even the pickiest of security teams. Its star feature is its policy framework, especially how those policies are modified, validated, and rolled out.
Features that Differentiate Scalr
A Policy Engine that Supports Bulk Validation
Scalr has a strong focus on the development, testing, and visibility of policies being applied across an ecosystem. If you’ve ever written an IaC policy before, you’ll know that coming up with your policy is 5% of the work, while the other 95% is iterating on your code until it works. If you push a bad policy into production, you can unintentionally break existing workflows. Scalr put thought into how to handle these problems at scale and created its policy visualization and testing features.
Scalr has a visualization page (shown above) where you can see all the checks and their statuses. A dashboard like this is great for getting a handle on what is happening, and the search feature enables you to slice and dice based on a variety of useful dimensions. Scalr also supports the bulk validation of a new policy against all workspaces. This simple feature is essential when trying to design a new policy, and it’s a major win for Scalr.
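To show why bulk validation matters, here is a small sketch of the concept: run a candidate policy against the planned resources of every workspace and collect the violations before the policy is enforced anywhere. It is plain Python for illustration only; Scalr's policy engine does not work this way under the hood, and the workspace names and the simplified resource dicts (loosely modeled on Terraform's JSON plan output) are assumptions.

```python
def require_encrypted_storage(resource):
    """Hypothetical policy: database instances must enable storage encryption.

    Returns True when the resource is compliant.
    """
    if resource["type"] != "aws_db_instance":
        return True
    return resource["values"].get("storage_encrypted", False) is True


def bulk_validate(policy, workspaces):
    """Return {workspace name: [addresses of resources that violate the policy]}."""
    failures = {}
    for name, resources in workspaces.items():
        violations = [r["address"] for r in resources if not policy(r)]
        if violations:
            failures[name] = violations
    return failures


if __name__ == "__main__":
    workspaces = {
        "billing-prod": [
            {"address": "aws_db_instance.main", "type": "aws_db_instance",
             "values": {"storage_encrypted": True}},
        ],
        "legacy-reports": [
            {"address": "aws_db_instance.reports", "type": "aws_db_instance",
             "values": {}},
            {"address": "aws_s3_bucket.exports", "type": "aws_s3_bucket",
             "values": {}},
        ],
    }
    print(bulk_validate(require_encrypted_storage, workspaces))
```

A non-empty result tells you exactly which existing workflows the new policy would break, which is the information you want before rolling it out.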
Scalr is the only IaC platform that allows you to map individual permissions to a custom role. This is similar to Terraform Cloud’s permission mapping to teams, but it separates out teams vs. roles. This customization can be essential in larger organizations. Especially when you start thinking about self-service infrastructure and reusable Terraform modules, the ability to create a custom role may be essential for your business.
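The distinction between teams and roles is easier to see in a toy model. In the sketch below, a role is a named bundle of individual permissions, and teams are assigned roles separately. The permission strings, role names, and team names are all invented for illustration and do not reflect Scalr's actual permission catalog.

```python
# Hypothetical custom roles: each one is a hand-picked set of permissions.
ROLES = {
    "module-publisher": {"modules:read", "modules:publish"},
    "workspace-operator": {"workspaces:read", "runs:plan", "runs:apply"},
}

# Teams are mapped to roles, not directly to permissions.
TEAM_ROLES = {
    "platform": {"module-publisher", "workspace-operator"},
    "checkout": {"workspace-operator"},
}


def is_allowed(team, permission, team_roles=TEAM_ROLES, roles=ROLES):
    """Return True if any role assigned to the team grants the permission."""
    return any(permission in roles[role] for role in team_roles.get(team, ()))


if __name__ == "__main__":
    print(is_allowed("checkout", "runs:apply"))
    print(is_allowed("checkout", "modules:publish"))
```

Keeping the two mappings separate means a new team can reuse an existing role, and tightening a role updates every team that holds it.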
On-Premise and Air-Gapped Support
In the developer/infrastructure automation space, it feels like there are fewer and fewer tools that support fully self-hosted options. However, for something as critical as your core infrastructure, self-hosting is table stakes for most companies with a lot to lose. The other IaC CI tools support run-your-own-agent models, but this isn’t enough for those with highly conservative security postures.
Product Spotlight: Cloudify
Overall Verdict: Cloudify is the most feature-rich but most complex platform. It works best in large enterprises with diverse infrastructure.
Cloudify’s goal is to be a complete infrastructure platform instead of just a Terraform execution platform. In a large enterprise, you are likely to see a very heterogeneous architecture. Newer applications will be developed directly on Kubernetes, with data stores and legacy applications on Terraform. You are likely to also see Ansible used for inter-service orchestration. All of this will be deployed against a variety of virtual solutions, and there will be multiple geo-distributed teams that are all responsible for parts of it.
Cloudify believes that it’s possible to manage this all on one platform, and it has built a very compelling offering. If you have a significantly diverse infrastructure or require an on-premise solution, then you should absolutely try their community edition or contact their sales team for a demo.
Features that Differentiate Cloudify
Cloudify took a very smart approach with their on-premise offering, which enables users to start simple, and add more complexity as their needs scale. It offers three different architectures: single host, compact cluster, and a fully distributed cluster. Having run enterprise software for years, I can say that this approach is ideal because it enables you to experiment quickly but still have the option of a full HA installation when you are ready to commit. Additionally, the option to just download the community edition is compelling.
Non-TF IaC Support
Cloudify supports a plethora of plugins that should be sufficient for most organizations. These include the following:
- infrastructure plugins (AWS, Azure, Google Cloud, StarlingX, vSphere, vCloud, NSX-T, Openstack, Host-Pool),
- orchestration plugins (Ansible, Docker, Helm3, Kubernetes, Terraform),
- configuration plugins (Fabric, Scripts, Netconf, Diamond), and
- utility plugins: (CloudInit, Configuration, Custom workflow, Deployment, File, FTP, Hooks Workflow, REST, Rollback, Scalelist, Secrets, SSH Key, Suspend, Terminal).
For new or smaller organizations, your entire infrastructure might be handled by Terraform and Helm charts, and thus these plugins provide little value. Cloudify’s target market is large organizations, with both new and legacy code and complex deployment requirements. These plugins ensure that Cloudify can always answer “yes” to “can you integrate this workflow?”.
As an example, its infrastructure plugins can automatically inventory the existing infrastructure without requiring a significant onboarding step. Optimizing the onboarding step is essential when dealing with large (>10,000-person) organizations.
Customizable Build Steps
Cloudify has by far the most detailed options for customizing build steps. Underneath the GUI is a declarative YAML file that details all the relationships. You can go into these files and customize every aspect of your build. As mentioned in the previous section, this is where you can interweave plugins for performing cluster operations before, after, and during the Terraform build process. It’s a very powerful idea.
It is also worth noting that customizing build steps is currently fairly complex and a bit daunting. The company is investing in drag-and-drop builders and improved documentation to smooth this process out. The more Cloudify is able to improve this workflow, the more it will win in the market.
Terraform Cloud used to be the dominant player in the IaC CI space, but its competitors have caught up with and surpassed it. Spacelift, Env0, Scalr, and Cloudify each have compelling offerings, and depending on the needs of your organization, any one of them could be the best choice. Is the end of your contract with Terraform Cloud coming up? Maybe it’s time for a fresh bake-off!
My recommendation is to try out all four of these products, as each puts its own spin on what an “ideal” Terraform experience should be. The one thing that was abundantly clear to me after researching all these offerings was that Terraform Cloud is the least feature-rich among them, so if you are a current customer, a bake-off is absolutely warranted.