Why it is not a good idea to manage Helm releases with Terraform, and why Helmfile is a better fit

Pelasilveira
5 min read · Jun 29, 2024


Intro

In general, as a DevOps/CloudOps/SRE team (or whatever we call ourselves) working with Kubernetes, we can define the following goals for provisioning infrastructure:

  • Manage two big groups of resources: infrastructure services and business apps
  • Manage multiple kinds of resources, including CRDs
  • Manage dependencies and dependency graphs
  • Manage complexity, ensuring maintainability, scalability, agility, and flexibility while reducing human error
  • Security: compliance and best practices
  • Manage multiple environments, maintaining consistency while allowing customization
  • Diff changes before applying them
  • Version control / DRP / managed state / rollback / traceability / CI/CD integration

This list was put together off the cuff, but it serves its purpose: providing some background and a framework for this analysis.

Why Not

Terraform is the “de facto” standard for managing infrastructure as code.

That is a well-known statement.

Through its providers, Terraform interacts directly with an API. That way it can easily plan the execution of declarative code and detect drift.

BUT, if you are considering installing Helm releases with this tool, please read this first:

  • The effectiveness and capabilities of a solution depend on the maturity of the provider.
  • In the case of the Helm provider, it lacks several features, such as the ability to perform diffs.
  • The Helm provider adds an extra layer because it puts a binary (Helm) between Terraform and the Kubernetes API.
  • Like Terraform, Helm has its own state, stored in a Kubernetes secret of type helm.sh/release.v1 (see the snippet after this list); this duplicated state adds unnecessary complexity.
  • The complexity added by Terraform is not compensated by additional features, as templating is available in several other tools.
  • If you need to leverage Terraform outputs, they can be queried externally, so this is not a reason to keep Helm releases in the same state.
  • Having infrastructure and application releases sharing the same code base makes maintainability harder.
  • The shared lifecycle requires planning and applying unnecessary resources, which can pose a risk or, at the very least, slow down the process.
  • Using Terraform adds more components to maintain, such as Terraform binaries, provider versions, and module versions.
  • The definition of releases is harder to understand with HCL on top of Helm releases, as they use different languages (HCL vs. Go templates).
  • The complexity of managing dependencies increases.
  • Provisioning Kubernetes resources with the Helm provider requires authentication against a Kubernetes cluster. Generally, the cluster itself is managed in the same state as the Helm releases, which can lead to errors if not handled correctly.
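As a quick illustration of that duplicated state: Helm 3 keeps each release's state in a Secret of type helm.sh/release.v1, which you can inspect directly. A minimal check, assuming a release named metrics-server in kube-system:

# List the Helm-owned state secrets for a given release
kubectl get secrets -n kube-system \
  --field-selector type=helm.sh/release.v1 \
  -l owner=helm,name=metrics-server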

When and Why: orchestrating resources

In the maturity process of a team or organization, there are various levels and circular paths to consider. If you are already using Infrastructure as Code (IaC), you might approach it as follows:

Firstly, BUILD and OPERATE (Architect → Code → Automate): When multiple environments or instances of our product emerge, we initially focus on architecting, coding, and automating processes. This lays the foundation.

Next, as complexity grows with multiple environments and instances, we move to ORCHESTRATE (Orchestrate → Reuse → Reduce): To manage this complexity effectively, tools like Helm or Kustomize for Kubernetes, or Terraform for infrastructure, become invaluable. They allow us to orchestrate deployments, reuse code, and reduce linear complexity.

Additionally, when working with Kubernetes, it’s common to manage more than one cluster. Each cluster undergoes frequent lifecycle updates, adding complexity due to the management of multiple environments, namespaces, and the continuous integration of new services and features.

Presenting Helmfile

Surely, many of you already know these tools:

Helm: CNCF-graduated Kubernetes package manager with rollback, templating, and diff features.

Helmfile: an orchestrator for Helm. Declarative, with advanced templating, plugins (diff, etc.), management of multiple environments, support for remote Git modules, a live project with an active community, support for Argo CD, compatibility with Kustomize, secret and Terraform state support, and other features.

Helmfile Features

Setup — opinionated approach

This is an opinionated folder structure; in a real-world example, each of these could be a remote repository:

root
├── centralized
├── charts
└── live

The entrypoint is live, charts is a folder for in-house charts, and centralized is the place to store our reused code.

The structure of live is almost the same as that of centralized.

Inside, we can organize all releases in a single "monofolder" structure or have one folder per namespace.

Inside live we can arrange something like this:

├── live
│   ├── helmfile.d
│   │   └── helmfile.yaml
│   ├── values
│   │   ├── release-values.yaml
│   │   └── ...
│   └── variables.env

variables.env is optional; we will review it in the simplest use case.

The workflow implies exporting the content of this file as environment variables.

helmfile actions will be performed against every file in the helmfile.d folder.

A kube context must be selected, and you must be authenticated to the Kubernetes cluster.
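As a rough sketch of that workflow, assuming the layout above (the context name is just an example):

cd live

# Export the optional variables file as environment variables
set -a; source variables.env; set +a

# Make sure kubectl points at the right cluster
kubectl config use-context demo-eks

# By default helmfile picks up every file under helmfile.d/
# (helmfile diff requires the helm-diff plugin)
helmfile diff
helmfile apply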

This is a simple case with env vars:

### helmfile.yaml ###
repositories:
  - name: metrics-server
    url: https://kubernetes-sigs.github.io/metrics-server/

releases:
  - name: metrics-server
    namespace: kube-system
    version: 3.12.1
    chart: metrics-server/metrics-server
    values:
      - ../values/metrics-server/metrics-server.yaml.gotmpl

### variables.env ###
METRICS_SERVER_IMG_TAG=v0.6.3
METRICS_SERVER_REPLICAS=2

### values/metrics-server/metrics-server.yaml.gotmpl ###
replicas: {{ requiredEnv "METRICS_SERVER_REPLICAS" }}
image:
  tag: {{ requiredEnv "METRICS_SERVER_IMG_TAG" }}
apiService:
  create: true
podDisruptionBudget:
  enabled: true
  maxUnavailable: 50%

If you have already deployed this release with Terraform and the values are set correctly, you can run helmfile diff to verify that there are no pending changes.
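If the diff comes back clean, you can hand the release over to Helmfile by removing it from the Terraform state; terraform state rm only makes Terraform forget the resource, it does not uninstall anything, and Helm's own release secret stays in the cluster. A minimal sketch, assuming a hypothetical helm_release.metrics_server resource address:

# Forget the release in Terraform without touching the cluster
terraform state rm 'helm_release.metrics_server'   # hypothetical address, adjust to your code

# From here on, Helmfile owns the release
helmfile diff
helmfile apply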

Helmfile files

You can reference other helmfiles as modules, including from remote Git repositories pinned to tags:

(Example from the official Read the Docs documentation.)
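As a minimal sketch of the same idea (the repository URL, path, and tag below are made up):

helmfiles:
  # Remote helmfile pulled from a Git repository, pinned to a tag
  - path: git::https://github.com/example-org/platform-helmfiles.git@releases/monitoring.yaml?ref=v1.2.0
  # Local helmfiles can be mixed in as well
  - path: ../centralized/helmfile.yaml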

Environment Management

With this tool we can manage different environments, defining environment-specific values files and leveraging the .Environment.Name value.

In this case we override some values using a new values file that is dynamically referenced depending on the environment. This is the calling command:

helmfile --environment demo diff

### ../values/environments/demo/global.yaml.gotmpl
karpenterVersion: 0.36.2
karpenterIAMRole: arn:aws:iam::1111:role/KarpenterController-1212112

### ../values/environments/demo/karpenter.yaml.gotmpl
settings:
  clusterName: {{ .Environment.Name }}-eks

### helmfile.yaml.gotmpl
environments:
  demo:
    values:
      - ../values/environments/demo/global.yaml.gotmpl
  test:
  prod:

---

repositories:
  - name: awskarpenter
    url: public.ecr.aws/karpenter
    oci: true

releases:
  - name: karpenter
    namespace: kube-system
    version: {{ .Values.karpenterVersion }}
    chart: awskarpenter/karpenter
    values:
      - ../values/environments/{{ .Environment.Name }}/karpenter.yaml.gotmpl
      - serviceAccount:
          annotations:
            eks.amazonaws.com/role-arn: {{ .Values.karpenterIAMRole }}

Referencing Secrets and Terraform States

Using vals you can reference secrets in AWS Secrets Manager and dozens of other integrations; Terraform state is also supported.

Terraform states can be referenced either locally or remotely, for example from S3:

public_subnets_0: ref+tfstates3://11111-terraform-state-us-east-1/demo/terraform-network.tfstate/output.public_subnets[0]
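To give an idea of how this looks in practice, here is a minimal sketch of a values template that resolves such references, assuming a recent Helmfile version that provides the fetchSecretValue template function (the secret path and bucket below are made up):

### values/my-app.yaml.gotmpl (hypothetical file)
db:
  # Resolved from AWS Secrets Manager at render time
  password: {{ fetchSecretValue "ref+awssecrets://myteam/demo/db-password" | quote }}
# Resolved from a Terraform state stored in S3
subnetId: {{ fetchSecretValue "ref+tfstates3://11111-terraform-state-us-east-1/demo/terraform-network.tfstate/output.public_subnets[0]" | quote }}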

Other features

  • In addition to YAML and gotmpl, you can reference HCL files.
  • Extended templating functions (Sprig and others).
  • Dependency ordering for installation and deletion using the needs keyword (see the sketch after this list).
  • Support for disabling CRD validation, for example if you want to diff a custom resource whose CRD will be installed in the same run.
  • Compatibility with Kustomize
  • Hooks
  • Compatibility with Argo CD
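A minimal sketch of the needs keyword, with made-up releases:

releases:
  - name: database
    namespace: data
    chart: ./charts/database        # hypothetical in-house chart
  - name: my-app
    namespace: apps
    chart: ./charts/my-app          # hypothetical in-house chart
    needs:
      # namespace/release that must be applied before this one
      - data/database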

Conclusion

From what we have analyzed, Terraform is a great tool, but in the case of Helm releases it increases complexity without measurable added value. Helmfile covers the same functionality and more. You can migrate from Terraform to Helmfile with zero downtime; you will, of course, have to adapt your pipelines.
