Pulumi 1 year later

Olivier Pichon
dzangolab
May 3, 2024

It’s been over 1 year since we migrated from Terraform to Pulumi for our dev/ops and IaC efforts, and I figured it was time to provide some feedback.

Overall, we are pretty satisfied with the move and with Pulumi in general, even though there are a couple of annoyances.

There is no valid evaluation of a technology without a clear understanding of the circumstances under which it will be used, the requirements it has to satisfy, and the goals it is expected to help achieve.

In our case, we are “occasional devops guys”. It is not our primary business activity. We do it to help our customers and to maintain our own internal infrastructure.

Our infrastructure is pretty simple, but for various reasons we tend to reinstall it fairly frequently (our continued growth means our requirements change, we make mistakes and learn, etc.).

Our business is to provide team-as-a-service solutions to startups. Since many are early-stage, they lack devops capabilities, and we naturally help them out. This service is provided for free, so we have tried to streamline our solution (we don’t support anything fancy like HS, although we were able to support SOC2 compliance). Most importantly, we wanted it to be simple enough that the dedicated developers assigned to our clients would be able to do it themselves easily, without it being too much of a distraction from their main responsibility, which is to write software and build the customer’s app.

One of the reasons for switching to Pulumi was that it supports traditional programming languages, in particular JavaScript/TypeScript. Understanding IaC concepts and details about each cloud platform implementation is hard enough; having to learn a new configuration language such as Terraform’s HCL on top of that was impractical. In this respect, Pulumi has delivered, and we’ve found that being able to use a programming language with which the developers were already familiar has definitely boosted adoption of IaC.

We’ve found that a critical aspect of devops is to group together resources that live and die together. For example, a server might get destroyed and re-provisioned much more frequently than a floating IP address, and the two therefore need to be managed separately. On the other hand, if you provision an AWS S3 bucket, you likely want to set up an access policy at the same time, and if you destroy the bucket, you also want to destroy the policy.

Terraform could not handle this and required each resource to be provisioned separately. Terraform modules do not address this issue (although things may have changed since we last looked at it a year ago). We used Terragrunt, an add-on to Terraform, which helped somewhat, but added its own layer of complexity.

With Pulumi, on the other hand, this is a breeze. Because IaC in Pulumi is ordinary code (JavaScript in our case), we can group together resources with a similar lifecycle in a “project” (a very bad name, as I will discuss later), and they will get provisioned or destroyed together.

We can also create component resources, which are reusable, shareable collections of resources that act as a single resource in a Pulumi project (this is somewhat similar to Terraform modules).

Pulumi cloud is a SaaS where you can store the state of your resources. With Terraform on AWS, the usual self-managed setup is to first provision an S3 bucket for storing your resources’ state and a DynamoDB table for locking it. Having this storage feature available out of the box from Pulumi is hugely convenient. Plus there is a generous free tier, so we don’t have to worry about extra costs, for us or for our clients (this helps adoption by clients too).

Pulumi provides a web-based dashboard where we can view our resources and their state. It’s pretty well made and quite user-friendly.

I nevertheless do have 2 gripes about Pulumi.

The first has to do with the very confusing concept of Output. Let’s say you want to provision an S3 bucket and an IAM bucket access policy. To define your policy, you’ll need the bucket’s ARN. But the ARN does not exist until the bucket has been provisioned, and your program runs before that happens. Because Pulumi’s IaC is written in common programming languages, you expect bucket.arn to be an ordinary string, but it is an Output: a placeholder for a value that will only be known once the resource has been created. Essentially, your bucket.arn variable does not hold a value when you expect it to. I’m sorry if my explanations are unclear, but apparently the issue is confusing to Pulumi engineers too. In any case, you can end up tying yourself in knots trying to figure out what is available or not, and you end up with hair-pulling errors that seem impossible to fix.

Thanks to a very helpful Pulumi engineer named Piers, who shared some boilerplate code which we now use systematically, we can ignore this issue. Still, as an engineer I am not 100% comfortable with a system that I don’t fully understand. It makes me nervous that there may be a fundamental flaw in their design.
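For readers who want to see the shape of the problem, here is a minimal sketch in plain TypeScript. It is not the boilerplate Piers shared and it is not Pulumi’s implementation: Deferred is a hypothetical toy class that only models the idea of a value that is unknown until deployment, and the ARN is made up. In a real Pulumi program the corresponding pieces are Output<T>, its apply method and the pulumi.interpolate helper.

```typescript
// Toy stand-in for a value that only exists after deployment.
// Hypothetical: this models the idea, it is NOT Pulumi's Output class.
class Deferred<T> {
  private value: T | undefined;
  private known = false;
  private waiting: Array<(value: T) => void> = [];

  // Called by the (pretend) engine once the cloud provider returns the value.
  resolve(value: T): void {
    this.value = value;
    this.known = true;
    for (const callback of this.waiting) callback(value);
    this.waiting = [];
  }

  // Describe a computation on the future value; the result is itself deferred.
  apply<U>(fn: (value: T) => U): Deferred<U> {
    const result = new Deferred<U>();
    if (this.known) {
      result.resolve(fn(this.value as T));
    } else {
      this.waiting.push((value) => result.resolve(fn(value)));
    }
    return result;
  }

  // Reading the value before it is known is the mistake described above.
  get(): T {
    if (!this.known) throw new Error("value is not known yet");
    return this.value as T;
  }
}

// While the program runs, the bucket has not been created yet.
const bucketArn = new Deferred<string>();

// Wrong: treating the placeholder as if it already held a string.
let earlyReadFailed = false;
try {
  bucketArn.get();
} catch {
  earlyReadFailed = true;
}

// Right: describe the policy as a function of the future ARN.
const policyJson = bucketArn.apply((arn) =>
  JSON.stringify({
    Version: "2012-10-17",
    Statement: [{ Effect: "Allow", Action: ["s3:GetObject"], Resource: `${arn}/*` }],
  }),
);

// Later, the engine creates the bucket and fills in the value (made-up ARN).
bucketArn.resolve("arn:aws:s3:::example-bucket");
const policy = JSON.parse(policyJson.get());
console.log(policy.Statement[0].Resource);
```

The point of the sketch is that the policy is never built from a string you hold in your hand; it is built from a recipe that runs once the ARN exists. Anything that needs the ARN has to live inside that recipe.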

The second gripe is ongoing and frankly a constant frustration.

Pulumi has these concepts of organization, project and stack. A project is a collection of resources that get provisioned and destroyed together, according to settings defined in a stack. A stack pretty much maps to what software developers call an “environment”, eg staging or production. Indeed, this is the only usage example for stacks indicated in Pulumi’s own documentation. So why use a different name? Do devops people have a different convention for this? Since Pulumi’s main feature is to use common programming languages, you’d think their target would be software developers; adopting software developers’ own language would have been a good idea. It’s worth noting that Pulumi’s new “ESC” feature introduces the concept of Environments, which are pretty much what software developers expect.

In addition, some resources are not environment-dependent. For example, AWS SSH key-pairs would be provisioned once irrespective of environment (unless you have separate AWS accounts for staging and production). We’ve adopted the convention of using a global stack in this case. But having to shoehorn our code into an unnecessary framework is annoying.

An organization is like an account, in which all projects live. This is fine. But the tuple [organization, project, stack] has to be globally unique. In practice, the organization is fixed and the stack is one of a limited set of options (production, staging, or perhaps global), so it boils down to the project name being unique. We don’t have an overly complex infrastructure, and yet we constantly run into name collision issues, to the point where we now have to be extremely careful as to how we name our projects.

This is made more confusing as Pulumi cloud recognizes the repo in which the projects reside, and indeed displays it on the dashboard, but:

  • Pulumi ignores the folders in which the project code resides; this sort of makes sense since a Pulumi project is defined in a Pulumi.yaml file. This does allow the project name to be different from the folder name, which is useful sometimes.
  • Pulumi ignores the repo itself to uniquely identify the project. The repo is used only for visually grouping projects in the Pulumi cloud app.
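The collision can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions: the StackRef type, the helper functions, and the organization, repo and project names are all hypothetical. The only part taken from Pulumi is the organization/project/stack form of a fully qualified stack name.

```typescript
// Hypothetical description of where a project lives and how it is named.
interface StackRef {
  repo: string;          // displayed in the dashboard, not part of the identity
  folder: string;        // ignored as well
  organization: string;
  project: string;       // the name declared in Pulumi.yaml
  stack: string;
}

// Identity as described above: only organization, project and stack count.
function stackKey(ref: StackRef): string {
  return `${ref.organization}/${ref.project}/${ref.stack}`;
}

// Return every key that is claimed by more than one entry.
function findCollisions(refs: StackRef[]): string[] {
  const counts: Record<string, number> = {};
  for (const ref of refs) {
    const key = stackKey(ref);
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return Object.keys(counts).filter((key) => (counts[key] ?? 0) > 1);
}

// Two unrelated repos that both name their project "dns" (made-up names).
const refs: StackRef[] = [
  { repo: "client-a-infra", folder: "dns", organization: "acme", project: "dns", stack: "production" },
  { repo: "client-b-infra", folder: "dns", organization: "acme", project: "dns", stack: "production" },
  { repo: "client-b-infra", folder: "vpc", organization: "acme", project: "client-b-vpc", stack: "production" },
];

const collisions = findCollisions(refs);
console.log(collisions);
```

Because the repo and folder never enter the key, the first two entries clash even though they live in different repositories. In practice the only way out is to bake a prefix into the project name, as the third entry does.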

This is all the more infuriating as they seem to be very close to a much better solution, essentially by shifting the concepts of project and stack up one level, and introducing the concept of environments.

A project would be a collection of stacks, where a stack is a collection of resources that are provisioned and destroyed together (ie equivalent to the current project). Environments are exactly what you’d expect, ie staging, production, etc. (ie equivalent to the current stack).

The tuple [organization, project, stack] would have to be globally unique, but this allows stacks to share the same name as long as they belong to different projects. So we could have a cloudflare-dns stack or an aws-vpc stack in multiple projects. I know it sounds like a small thing, but what seems silly to me is that this is not possible today.

The new project concept would allow 2 things. First, project-level config, with settings shared across all stacks (eg the AWS region). Second, the environment could be set (at runtime) across all stacks in the project, with some settings possibly shared across all environments (ie an environment-specific config would override an environment-independent config, similar to the way dotenv handles .env, .env.prod, .env.staging files).
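The override rule is easy to picture in code. The following is a rough sketch of the layering I have in mind, not an existing Pulumi feature: the Settings type, resolveConfig and all the values are hypothetical.

```typescript
type Settings = Record<string, string>;

// Hypothetical project-level settings, shared by every stack and environment.
const projectConfig: Settings = {
  awsRegion: "eu-west-1",
  instanceType: "t3.small",
};

// Hypothetical per-environment overrides.
const environmentConfig: Record<string, Settings> = {
  staging: {},
  production: { instanceType: "t3.large" },
};

// Later sources win: environment-specific values override the shared ones,
// in the same spirit as .env being overridden by .env.prod.
function resolveConfig(environment: string): Settings {
  return { ...projectConfig, ...(environmentConfig[environment] ?? {}) };
}

const production = resolveConfig("production");
const staging = resolveConfig("staging");
console.log(production, staging);
```

With this rule, the region is declared once at project level, staging inherits everything, and production only states what differs.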

In conclusion, we are happy with our choice of Pulumi over Terraform. It has met all our requirements.

Would I recommend Pulumi over Terraform? If you are doing devops full-time, on a large scale, I am not in a position to answer. But if you are, like us, doing devops occasionally, Pulumi’s support for common programming languages makes it a valuable tool, easy to adopt, flexible to use, and just as powerful.

Piers from Pulumi was kind enough to respond to me, and I thought it best to include his comments:

Inputs and Outputs are usually the first big road bumps that people hit on their Pulumi journey. The fact that these have been part of Pulumi since very early on and every so often we’ll take a look and try to find a better solution but have yet to come up with one suggests that this is the best solution for this problem right now. We iterate over our docs on this regularly because we’ll always find a better way of explaining it over time and hopefully make it easier for people to understand.

A bit of Pulumi history: in the beginning, stacks were called “environments” but because that was a word regularly used and different people and companies had different meanings it meant that we had to come up with something else. An environment might mean a “dev” section of an AWS account. It might mean a whole AWS account. It might sit across multiple AWS accounts. Since you might have one or more stacks making up one of those environments, it meant that if we started talking about environments there would be confusion over what we, and our users, meant. And yes you’re quite right to point out the E of ESC stands for “Environments”. That’s the whole principle of that part of the product (and one in fact that solves the problem that you go on to talk about, having config that covers multiple projects and stacks and can inherit from one or more environments).

Olivier Pichon
dzangolab

Tech entrepreneur. CTO at dzango tech accelerator.