Using cert-manager to manage Let’s Encrypt TLS certs and running multiple replicas of Traefik v2.
In my previous post, “Quickstart with Traefik v2 on Kubernetes,” I went over a quick 5-minute end-to-end setup of Traefik, Let’s Encrypt, and Cloudflare to handle HTTPS requests on Kubernetes. While that setup with Traefik CRDs is convenient for automatically creating and renewing certs via IngressRoute definitions, it runs with a single instance of Traefik, meaning that it is not highly available. In other words, Traefik becomes the single point of failure for all ingress traffic to your cluster.
In Traefik v1, there was beta support for clustering / HA mode using a KV store (e.g. Consul or etcd). However, Traefik v2 removed support for storing ACME/Let’s Encrypt certificates in a KV store, citing bugs with the Raft consensus algorithm (#4851, #3487, #5047, #3833). The automatic cert management feature moved to TraefikEE, leaving open-source users to either run a non-HA version or implement a custom solution for certificate management.
The Traefik documentation recommends using cert-manager as the certificate controller and notes its limited support for the IngressRoute CRD:
When using the Traefik Kubernetes CRD Provider, unfortunately Cert-Manager cannot interface directly with the CRDs yet, but this is being worked on by our team. A workaround is to enable the Kubernetes Ingress provider to allow Cert-Manager to create ingress objects to complete the challenges. Please note that this still requires manual intervention to create the certificates through Cert-Manager, but once created, Cert-Manager will keep the certificate renewed.
This post walks through how to get around this limitation and run Traefik v2 in HA mode on Kubernetes. I will be using Cloudflare as my DNS provider and ACME challenge solver, but feel free to use any other Let’s Encrypt supported providers.
All of the code is also available on GitHub.
- Kubernetes Cluster (e.g. GKE)
- Helm v3
- DNS provider (e.g. Cloudflare)
We will deploy Traefik to the traefik namespace:
$ kubectl create namespace traefik
Now let’s deploy Traefik with 3 replicas. You can see the values in traefik/traefik-values.yaml:
$ helm repo add traefik https://containous.github.io/traefik-helm-chart
$ helm install -n traefik traefik traefik/traefik -f traefik/traefik-values.yaml
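For orientation, here is a minimal sketch of what such a values file might contain. It is not the file from the repository: it assumes only the chart's deployment.replicas and additionalArguments keys, and the debug log level is optional.

```yaml
# Hypothetical traefik-values.yaml; the actual file in the repository may differ.
deployment:
  # Run three Traefik pods behind the Service so no single pod is a point of failure.
  replicas: 3

# Extra static-configuration flags passed to the Traefik binary.
additionalArguments:
  # Optional: verbose logging, handy for watching certificates being picked up.
  - "--log.level=DEBUG"
```

Note that no ACME certificate resolver is configured here. In this setup the certificates come from cert-manager as Kubernetes secrets, so the replicas do not need to share any ACME state.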
Wait for the deployments to come up and make note of the Load Balancer IP.
cert-manager is an open-source tool that automates the issuance and renewal of TLS certificates.
We will install it in the cert-manager namespace:
$ kubectl create namespace cert-manager
Add the Jetstack Helm repo and install CRDs:
$ helm repo add jetstack https://charts.jetstack.io
$ helm repo update
$ helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v0.16.0 \
  --set installCRDs=true
Wait for all the cert-manager pods to come up:
$ kubectl get pods -n cert-manager -w
Deploy an Application
For the sake of the demo, we will deploy the whoami app in the default namespace (see the whoami directory for the deployment, service, and ingress files). You can replace this with your own application or a well-known Helm chart (e.g. Grafana, Kibana, etc).
Replace whoami.example.com with your FQDN and deploy:
$ kubectl apply -f whoami
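The ingress file is where Traefik and cert-manager meet: the IngressRoute points at a TLS secret by name instead of asking Traefik to obtain a certificate itself. A rough sketch follows, assuming the chart's default websecure entry point, a Service named whoami on port 80, and a secret named whoami-cert. These names are illustrative and may not match the repository.

```yaml
# Hypothetical IngressRoute for the whoami app; names are assumptions.
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: whoami
  namespace: default
spec:
  entryPoints:
    - websecure            # HTTPS entry point defined by the Helm chart
  routes:
    - match: Host(`whoami.example.com`)
      kind: Rule
      services:
        - name: whoami     # assumed Service name
          port: 80
  tls:
    # Secret of type kubernetes.io/tls in the same namespace, written by cert-manager.
    secretName: whoami-cert
```

Because every Traefik replica reads the same secret through the Kubernetes API, all replicas serve the same certificate without coordinating with each other.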
To issue new certificates, we first need to define an Issuer. In this example, I’ll be using Cloudflare with the ACME Issuer type, pointed at Let’s Encrypt’s staging server. You can also find other supported configurations (SelfSigned, CA, Vault, Venafi, and External Issuer Types) in the documentation.
Update the solvers section in certs/issuer.yaml to match your DNS provider. To use Cloudflare as the DNS01 challenge solver, first create a new API token with the following settings:
- Permissions: Zone - DNS - Edit
- Permissions: Zone - Zone - Read
- Zone Resources: Include - All Zones
Store the token as a Kubernetes secret in the same namespace as the Issuer:
$ kubectl create secret generic cloudflare-token --from-literal=dns-token=<my-api-token>
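As a rough guide to what the Issuer could look like, the sketch below wires the cloudflare-token secret into a DNS01 solver against the Let’s Encrypt staging endpoint. The Issuer name, the account-key secret name, and the email addresses are placeholders of mine, and the apiVersion assumes cert-manager v0.16.

```yaml
# Hypothetical certs/issuer.yaml; names and emails are placeholders.
apiVersion: cert-manager.io/v1alpha2
kind: Issuer
metadata:
  name: letsencrypt-staging
  namespace: default
spec:
  acme:
    # Let's Encrypt staging endpoint: generous rate limits, untrusted certificates.
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: you@example.com
    # Secret in which cert-manager stores the ACME account private key.
    privateKeySecretRef:
      name: letsencrypt-staging-account-key
    solvers:
      - dns01:
          cloudflare:
            email: you@example.com
            # Matches the secret name and key created with kubectl above.
            apiTokenSecretRef:
              name: cloudflare-token
              key: dns-token
```

Once the staging flow works end to end, switching the server URL to the production endpoint (https://acme-v02.api.letsencrypt.org/directory) yields browser-trusted certificates.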
Finally, configure the certificate (modify the dnsNames as needed in certs/whoami-cert.yaml) and deploy:
$ kubectl apply -f certs
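For reference, a Certificate resource along these lines asks cert-manager to obtain a certificate and store it in a secret. This is a sketch, not the repository file: the secretName and the issuerRef name are assumptions, and the apiVersion again assumes cert-manager v0.16.

```yaml
# Hypothetical certs/whoami-cert.yaml; secret and issuer names are assumptions.
apiVersion: cert-manager.io/v1alpha2
kind: Certificate
metadata:
  name: whoami-cert
  namespace: default
spec:
  # Secret that cert-manager creates and keeps renewed; Traefik reads it for TLS.
  secretName: whoami-cert
  dnsNames:
    - whoami.example.com
  issuerRef:
    # Must match the name of an Issuer in the same namespace.
    name: letsencrypt-staging
    kind: Issuer
```

The secret must live in the same namespace as whatever route references it, which is why everything in this demo sits in default.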
Set Up DNS
Check if the certificate has been generated:
$ kubectl describe certificate whoami-cert
You can also look at Traefik’s debug logs to watch the cert become active.
Finally, point the DNS record to the IP address of the Load Balancer to see a TLS-enabled site backed by HA Traefik + cert-manager. Optionally, you can deploy the HTTPS redirect middleware for completeness.
Now we have an HA deployment of Traefik on Kubernetes. The downside to using cert-manager is that the user must now remember to create the cert before deploying the IngressRoute, but achieving HA is more important in production to avoid downtime.
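One way to set up the optional redirect is a redirectScheme Middleware attached to a plain-HTTP route for the same host. The sketch below assumes the chart's default web entry point and a Service named whoami on port 80. The resource names are made up for illustration.

```yaml
# Hypothetical HTTPS redirect; resource names are assumptions.
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: https-redirect
  namespace: default
spec:
  redirectScheme:
    scheme: https
    permanent: true        # respond with a permanent redirect
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: whoami-http
  namespace: default
spec:
  entryPoints:
    - web                  # plain HTTP entry point defined by the Helm chart
  routes:
    - match: Host(`whoami.example.com`)
      kind: Rule
      middlewares:
        - name: https-redirect
      services:
        - name: whoami     # required by the CRD; requests are redirected before reaching it
          port: 80
```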