Kubernetes w/ Let’s Encrypt & Cloud DNS

This may well have been done to death but I’d not used an automated approach to generating certs for Kubernetes Services and cert-manager acknowledges that its documentation needs refinement:

https://cert-manager.readthedocs.io/en/latest/tutorials/acme/dns-validation.html

I happened upon Ross Kukulinski’s post “Let’s Encrypt Kubernetes” but the long-promised “Part 2” using Let’s Encrypt has not been published. So, this post is (a chunk of) that.

In full disclosure, I did not (possibly mistakenly) use Kelsey Hightower’s Kubernetes Cert Manager. If I can find the time, I’ll use that too and update this post.

So, here’s how I was able to use Jetstack’s cert-manager to generate certs using Let’s Encrypt, using Cloud DNS and Kubernetes Engine to secure (w/ caveats) a trivial Golang http service.

Caveats: I used the Let’s Encrypt staging not production service; I port-forward the service locally rather than expose it publicly to test the result. When I have time, I’ll address both of these limitations.

Update 18–05–17

The post now describes the solution four ways:

  • Self-signed certs
  • Network Load-Balancer w/ TLS backend
  • HTTPS Load-Balancer w/ non-TLS backend
  • HTTPS Load-Balancer w/ TLS backend.

Setup

Grab yourself a fresh Kubernetes (Engine) and authenticate to it. Give yourself too-broad (**don’t do this for anything other than personal development/testing**) RBAC creds:

PROJECT=[[YOUR-PROJECT]]
REGION=[[YOUR-REGION]]
CLUSTER=[[YOUR-CLUSTER]]
BILLING=[[YOUR-BILLING]]
LATEST=[[SEE-NOTE-BELOW]]
gcloud projects create $PROJECT
gcloud beta billing projects link $PROJECT \
--billing-account=$BILLING
gcloud services enable container.googleapis.com \
--project=$PROJECT
gcloud beta container clusters create $CLUSTER \
--username="" \
--cluster-version=${LATEST} \
--machine-type=custom-1-4096 \
--image-type=COS \
--num-nodes=1 \
--enable-autorepair \
--enable-autoscaling \
--enable-autoupgrade \
--enable-cloud-logging \
--enable-cloud-monitoring \
--min-nodes=1 \
--max-nodes=2 \
--region=${REGION} \
--project=${PROJECT} \
--preemptible \
--scopes="https://www.googleapis.com/auth/cloud-platform"
gcloud beta container clusters get-credentials $CLUSTER \
--project=${PROJECT} \
--region=${REGION}
kubectl create clusterrolebinding $(whoami)-cluster-admin-binding \
--clusterrole=cluster-admin \
--user=$(gcloud config get-value account)
NB I wrote recently about ways to automate the determination of the most recent (or latest) master|node versions. Use your preferred version or check out the 3 ways described in that post to determine the value of ${LATEST}.

Make a copy of Ross’ Golang code or — if you’re comfortable doing so — use his Docker image (rosskukulinski/secure-go-app). I’ve not used his image and cannot attest to it. Or, if you’re comfortable doing so, grab my Docker image (dazwilkin/securego).

NB Ross’ image uses port 443 whereas mine uses 8443 which is just a preference.

Self-signed Cert

I like Ross’ approach. He builds incrementally. Customarily, I would run the code locally, then locally but containerized, then deployed to Kubernetes. But, let’s cut to the chase and leave that for your homework. I recommend, if you do, that you consider parameterizing the location of the cert and the key.

So, we’ll create a self-signed cert first, deploy this to Kubernetes as a secret and then deploy the image, mounting the secret as a volume (on /secure/).

PREFIX=securego
DOMAIN=[[YOUR-DOMAIN]]
NAME=${PREFIX}.${DOMAIN}
openssl req \
-x509 \
-nodes \
-days 365 \
-newkey rsa:2048 \
-keyout ${NAME}.key \
-out ${NAME}.crt \
-subj "/CN=${NAME}"
NB You’ll need access to a domain for this all to work. There are many ways to acquire these. I use Google Domains and — last time I checked — Google (my employer) didn’t actually make it trivial to use these domains in conjunction with Google Cloud DNS. This post assumes you’re using Cloud DNS to manage your domain but this is not a requirement of the tooling.

I’m making work for myself but I like to keep my X509 certs and keys clearly named by their domain so I can differentiate them. When we create the Kubernetes secret for the cert and key, it expects the files to be called tls.crt and tls.key so note the --from-file mappings in the command below:

kubectl create secret generic selfsigned \
--from-file=tls.key=./${NAME}.key \
--from-file=tls.crt=./${NAME}.crt

Now we just need a deployment spec that combines the image, its port and our certificate and key:
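
As a rough sketch of such a spec (an approximation, assuming the image reads its certificate and key from /secure/tls.crt and /secure/tls.key; the Deployment and volume names are illustrative), the following writes a minimal securego.deployment.yaml that mounts the selfsigned secret on /secure:

```shell
# Write a minimal Deployment manifest. The names mirror those used in this
# post; the volume name "secure" is an illustrative assumption.
cat > securego.deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: securego
spec:
  replicas: 1
  selector:
    matchLabels:
      app: securego
  template:
    metadata:
      labels:
        app: securego
    spec:
      containers:
      - name: securego
        image: dazwilkin/securego
        ports:
        - containerPort: 8443
        volumeMounts:
        - name: secure
          mountPath: /secure
          readOnly: true
      volumes:
      - name: secure
        secret:
          secretName: selfsigned
EOF
```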

and you may:

kubectl apply --filename=securego.deployment.yaml

Rather than expose the Deployment as a Service, we can use kubectl to port-forward to the pod that was created by the deployment. We may then curl against the forwarded port locally:

POD=$(\
kubectl get pods \
--selector=app=securego \
--output=jsonpath="{.items[0].metadata.name}")
kubectl port-forward ${POD} 8443:8443

and, from a separate shell:

curl --insecure https://localhost:8443

If you’re using my image, you should receive:

Hello Henry!

cert-manager

You may deploy cert-manager using Helm. Helm is the de facto “package” manager for Kubernetes. It is excellent and, if you’ve not done so already, I recommend you evaluate it. I’m lazy and so I deployed cert-manager directly using the Kubernetes specs:

https://cert-manager.readthedocs.io/en/latest/getting-started/2-installing.html#with-static-manifests

go get github.com/jetstack/cert-manager
cd ../jetstack/cert-manager/
cd contrib/manifests/cert-manager/rbac
kubectl apply --filename=.
NB: If you relied on the Docker images in the previous step i.e. you did not use the Golang code previously, you’ll need to set up a Golang workspace.

All being well, cert-manager should have created a Namespace called cert-manager containing a Deployment called cert-manager:

[Image: cert-manager]

We must now configure cert-manager to automate the creation of certificates for us. This requires configuring a(n) [Cluster]Issuer and a Certificate. This is the part that took the most ‘twiddling’ in my case so I’ll try to explain what you need and where.

I’m using a ClusterIssuer which will provision Certificates for any Namespace in the cluster. I think you should use this too.

NB In my case, the project containing my Cloud DNS zones is different than the one containing the Kubernetes Engine cluster. This is an entirely possible scenario.
NB Provide a legitimate email where Let’s Encrypt may reach you.
NB privateKeySecretRef is used by cert-manager. You do not need to create the secret called letsencrypt-staging for yourself. I recommend you leave this as-is.
NB You do need to create the clouddns serviceAccountSecretRef entries (see below).
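
A minimal sketch of a ClusterIssuer along these lines, assuming the certmanager.k8s.io/v1alpha1 API that cert-manager used at the time and the Let’s Encrypt staging (ACME v2) directory. The EMAIL and DNS defaults are placeholders, and the secret key name assumes the service account key file is called clouddns.key.json:

```shell
# Placeholder defaults; replace with a real contact address and the project
# that hosts your Cloud DNS zones.
EMAIL="${EMAIL:-you@example.com}"
DNS="${DNS:-my-dns-project}"

# Write the ClusterIssuer manifest. privateKeySecretRef names a secret that
# cert-manager creates for itself; serviceAccountSecretRef names the secret
# holding the Cloud DNS service account key.
cat > certmanager.issuer.yaml <<EOF
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: ${EMAIL}
    privateKeySecretRef:
      name: letsencrypt-staging
    dns01:
      providers:
      - name: clouddns
        clouddns:
          project: ${DNS}
          serviceAccountSecretRef:
            name: clouddns
            key: clouddns.key.json
EOF
```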

cert-manager uses a service account to add|delete entries to DNS to prove your ownership of the domain to the Let’s Encrypt service. To do this, we must create a service account, assign it appropriate permissions and expose the service account to the cert-manager Deployment (as a Kubernetes Secret).

ROBOT=clouddns
DNS=[[YOUR-CLOUD-DNS-PROJECT]]
gcloud iam service-accounts create ${ROBOT} \
--display-name=${ROBOT} \
--project=${DNS}
gcloud iam service-accounts keys create ./${ROBOT}.key.json \
--iam-account=${ROBOT}@${DNS}.iam.gserviceaccount.com \
--project=${DNS}
gcloud projects add-iam-policy-binding ${DNS} \
--member=serviceAccount:${ROBOT}@${DNS}.iam.gserviceaccount.com \
--role=roles/dns.admin
kubectl create secret generic clouddns \
--from-file=./clouddns.key.json \
--namespace=cert-manager
NB In the above the value of DNS may well be the same value as PROJECT; if ${DNS}==${PROJECT}, that’s fine. If your DNS project is different, ensure you set the values correctly.

Now that the service account is accessible to cert-manager, you may deploy the Issuer:

kubectl apply --filename=certmanager.issuer.yaml

You can use the Kubernetes Workloads browser in Google Cloud Console to view log entries for the Pod (!) that underlies the cert-manager Deployment. If you check the logs, you should see something similar to the following:

[Image: cert-manager’s Pod (success) logs]

Now we may apply the Certificate request to the cert-manager Issuer. This is where the rubber hits the road and — all being well — our certificate will not only be created but it will also be configured as a Kubernetes Secret, ready for us to use. Here’s the Certificate spec:

NB In my case, the value of [[YOUR-DOMAIN]] is dazwilkin.com. This file repeats the [[YOUR-DOMAIN]] entry multiple times and not all of them may be needed. This is the configuration that worked for me.
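
As a rough sketch, a Certificate spec along these lines could be written as follows. The field values mirror the kubectl describe output further down; the resource name, the certmanager.certificate.yaml filename and the example.com default are assumptions for illustration:

```shell
# Placeholder default; set DOMAIN to your own domain.
DOMAIN="${DOMAIN:-example.com}"
# Resource name derived from the host name, e.g. securego-example-com.
CERT_NAME="securego-$(echo "${DOMAIN}" | tr '.' '-')"

# Write the Certificate manifest. secretName is the Kubernetes Secret that
# cert-manager populates once the certificate is issued.
cat > certmanager.certificate.yaml <<EOF
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: ${CERT_NAME}
  namespace: default
spec:
  secretName: securego
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-staging
  commonName: securego.${DOMAIN}
  dnsNames:
  - securego.${DOMAIN}
  acme:
    config:
    - dns01:
        provider: clouddns
      domains:
      - securego.${DOMAIN}
EOF
```

It would then be applied with kubectl apply --filename=certmanager.certificate.yaml.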

All being well, you should now have a Kubernetes Secret called securego.

kubectl get secrets/securego
NAME       TYPE                DATA   AGE
securego   kubernetes.io/tls   2      2h

You may dig deeper by describing the Certificate. Here’s my result with redactions:

kubectl describe certificate/securego-dazwilkin-com
Name:         securego
Namespace:    default
API Version:  certmanager.k8s.io/v1alpha1
Kind:         Certificate
Metadata:
  Cluster Name:
  Creation Timestamp:  2018-05-16T21:54:22Z
  Generation:          0
  Resource Version:    49010
Spec:
  Acme:
    Config:
      Dns 01:
        Provider:  clouddns
      Domains:
        securego.dazwilkin.com
  Common Name:  securego.dazwilkin.com
  Dns Names:
    securego.dazwilkin.com
  Issuer Ref:
    Kind:       ClusterIssuer
    Name:       letsencrypt-staging
  Secret Name:  securego
Status:
  Acme:
    Order:
  Conditions:
    Last Transition Time:  2018-05-16T21:56:56Z
    Message:               Certificate issued successfully
    Reason:                CertIssued
    Status:                True
    Type:                  Ready
    Last Transition Time:  <nil>
    Message:               Order validated
    Reason:                OrderValidated
    Status:                False
    Type:                  ValidateFailed
Events:  <none>

And, you may revise the securego Deployment to reflect this new certificate:

NB: The only change is the secretName in line 27, from selfsigned to securego.

You won’t immediately notice any difference because the service is already secured by a certificate. So, let’s check the certificate that we’re being provided *before* we swap to the Let’s Encrypt provided certificate that was obtained for us by cert-manager:

POD=$(\
kubectl get pods \
--selector=app=securego \
--output=jsonpath="{.items[0].metadata.name}")
kubectl port-forward ${POD} 8443:8443

and then, from another shell:

openssl s_client \
-servername securego.[[YOUR-DOMAIN]] \
-connect localhost:8443 \
| less

and you should see something similar to:

verify error:num=18:self signed certificate
verify return:1
depth=0 CN = securego.[[YOUR-DOMAIN]]
verify return:1
CONNECTED(00000003)
---
Certificate chain
0 s:/CN=securego.[[YOUR-DOMAIN]]
i:/CN=securego.[[YOUR-DOMAIN]]
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=securego.[[YOUR-DOMAIN]]
issuer=/CN=securego.[[YOUR-DOMAIN]]

NB It reports self signed certificate. You should also see your domain prefixed with the securego host that we used at the beginning.

OK, now you may apply the Let’s Encrypt based deployment:

kubectl apply --filename=securego.deployment.yaml

The Deployment will replace the Pod and so, once this stabilizes, repeat the command that grabs the Pod and does the port-forwarding and then repeat the openssl lookup. This time, you should see something similar to:

CONNECTED(00000003)
---
Certificate chain
0 s:/CN=securego.[[YOUR-DOMAIN]]
i:/CN=Fake LE Intermediate X1
1 s:/CN=Fake LE Intermediate X1
i:/CN=Fake LE Root X1
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=securego.[[YOUR-DOMAIN]]
issuer=/CN=Fake LE Intermediate X1
---

NB The chain now includes Fake LE Intermediate X1 and Fake LE Root X1. These are legitimate certs used by Let’s Encrypt’s staging (!) environment. Note also that the issuer is Fake LE Intermediate X1.
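
To compare the before and after without scrolling through the full s_client output, a small helper can print just the subject, issuer and expiry of the served certificate. This is a sketch; cert_summary is a hypothetical name of my choosing, and it relies on openssl x509 reading the first PEM certificate it finds on stdin:

```shell
# cert_summary: read PEM (or s_client output that contains a PEM certificate)
# on stdin and print the certificate's subject, issuer and expiry date.
cert_summary() {
  openssl x509 -noout -subject -issuer -enddate
}

# Example use against the port-forwarded Pod (requires the port-forward to be
# running; redirecting stdin from /dev/null makes s_client exit after the
# handshake):
#   openssl s_client -servername securego.[[YOUR-DOMAIN]] \
#     -connect localhost:8443 </dev/null 2>/dev/null | cert_summary
```

With the self-signed cert, subject and issuer are the same name; with the staging cert, the issuer becomes Fake LE Intermediate X1.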

Network LB & Cloud DNS

Okay, I mentioned at the top of the post that I’d not tested this through a Kubernetes Service and public Load-Balancer. So, let’s do that.

NB As you may realize, the Docker containers backing our Kubernetes service expect traffic over TLS. An L7|HTTPS Load-Balancer terminates a TLS connection. It could then re-encrypt traffic for its backends but this is not what we’ve built. Instead we must use a Network Load-Balancer as this will simply route traffic to the backends without terminating TLS. Network LBs are represented by --type=LoadBalancer in Kubernetes.

There are 3 steps. First we must expose the Service as --type=LoadBalancer. Second we must configure our DNS with the endpoint of the Load-Balancer that’s configured for us. Third we curl and openssl lookup the service’s certificate.

kubectl expose deployment/securego \
--port=443 \
--target-port=8443 \
--type=LoadBalancer
NB We can remap the ports with the service. In the above, I’m exposing the service on port 443 even though the container(s) are on port 8443.

This results in a Network LB

And an endpoint (public IP) to which securego.[[YOUR-DOMAIN]] must be mapped in your DNS service. In my case, securego.dazwilkin.com is mapped to 35.233.171.159 in Cloud DNS:

Once the DNS entry is changed, we must await its propagation. You can periodically check for it to update with:

nslookup securego.[[YOUR-DOMAIN]] 8.8.8.8

Relatively quickly, I receive:

Server:         8.8.8.8
Address: 8.8.8.8#53
Non-authoritative answer:
Name: securego.dazwilkin.com
Address: 35.233.171.159 # Your IP will differ
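
The periodic check can be scripted. The following is a minimal sketch that polls until the name resolves to the expected IP; it uses dig +short rather than nslookup because its output is easier to match, and the function names, the RESOLVE override and the default attempt count and delay are all my own illustrative choices:

```shell
# Default resolver: ask Google's public DNS (8.8.8.8) for the address of $1.
resolve_with_dig() {
  dig +short "$1" @8.8.8.8
}

# wait_for_dns NAME EXPECTED_IP [ATTEMPTS] [DELAY_SECONDS]
# Polls until NAME resolves to EXPECTED_IP; returns 1 if it never does.
# Set RESOLVE to the name of another command or function to change how
# lookups are performed.
wait_for_dns() {
  local name="$1" expected="$2" attempts="${3:-30}" delay="${4:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "${RESOLVE:-resolve_with_dig}" "$name" | grep -qxF "$expected"; then
      echo "${name} resolves to ${expected}"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up waiting for ${name}" >&2
  return 1
}
```

For example: wait_for_dns securego.[[YOUR-DOMAIN]] 35.233.171.159 (your IP will differ).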

And then, you may:

curl --insecure https://securego.[[YOUR-DOMAIN]]
Hello Henry!

And, more usefully:

openssl s_client \
-servername securego.dazwilkin.com \
-connect 35.233.171.159:443 # Your IP will differ

Logs

You may wish to review logs from the command-line:

FILTER="resource.type=\"container\" resource.labels.cluster_name=\"${CLUSTER}\" resource.labels.namespace_id=\"cert-manager\""
gcloud beta logging read "$FILTER" \
--order=asc \
--project=$PROJECT \
--format="json" \
| jq --raw-output '.[].textPayload'

Here’s a successful run using my configuration:

Calling GetOrder
Calling GetAuthorization
Calling DNS01ChallengeRecord
Cleaning up old/expired challenges for Certificate default/securego-dazwilkin-com
Calling GetChallenge
Checking DNS propagation for "securego.dazwilkin.com" using name servers: [10.19.240.10:53]
Waiting DNS record TTL (60s) to allow propagation of DNS record for domain "_acme-challenge.securego.dazwilkin.com."
ACME DNS01 validation record propagated for "_acme-challenge.securego.dazwilkin.com."
Accepting challenge for domain "securego.dazwilkin.com"
Calling AcceptChallenge
Waiting for authorization for domain "securego.dazwilkin.com"
Calling WaitAuthorization
Successfully authorized domain "securego.dazwilkin.com"
Cleaning up challenge for domain "securego.dazwilkin.com" as part of Certificate default/securego-dazwilkin-com
Issuing certificate...
getting private key (letsencrypt-staging->tls.key) for acme issuer cert-manager/letsencrypt-staging
Calling GetOrder
Calling FinalizeOrder
successfully obtained certificate: cn="securego.dazwilkin.com" altNames=[securego.dazwilkin.com] url="https://acme-staging-v02.api.letsencrypt.org/acme/order/6100788/1013320"
Certificate issued successfully
Found status change for Certificate "securego-dazwilkin-com" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2018-05-16 21:56:56.842909838 +0000 UTC m=+7058.901958015
Certificate default/securego-dazwilkin-com scheduled for renewal in 1438 hours
certificates controller: Finished processing work item "default/securego-dazwilkin-com"
certificates controller: syncing item 'default/securego-dazwilkin-com'
Certificate default/securego-dazwilkin-com scheduled for renewal in 1438 hours
certificates controller: Finished processing work item "default/securego-dazwilkin-com"

Conclusion

cert-manager warns that this functionality is not yet ready for production deployments but combining Kubernetes with mechanisms to auto-generate and manage certificates is terrific! You’ve no excuse not to use legitimate (non-staging Let’s Encrypt or your preferred CA) certs for all your workloads.

Feedback is always sought

That’s all!

18–05–17 Update:

HTTPS Load-Balancer w/ non-TLS backend

In this variant, we remove TLS from the Kubernetes Deployment and apply the certificate generated by Let’s Encrypt to the Google Cloud HTTPS Load-Balancer.

The simpler main.go is included below along with a Dockerfile that assumes you’ve built the binary of it. I’ve pushed the image to DockerHub as dazwilkin/nontlsgo so you may just reference that directly.

Apply the Deployment nontlsgo.deployment.yaml to your cluster. Assuming you’ve retained the cert-manager Deployment from the previous steps, you may reuse the Issuer but you must apply the new certmanager.certificate.yaml provided here. This Certificate request is for nontlsgo. This is the Certificate we’ll use in the Ingress to provision the HTTPS Load-Balancer. Once you’re able to confirm that the certificate is created, apply the ingress.yaml to provision the GCLB.

I was able to capture the Let’s Encrypt challenge being provisioned by cert-manager into Cloud DNS so that the Let’s Encrypt service is able to confirm my ownership of dazwilkin.com:

[Image: Cloud DNS showing Let’s Encrypt’s ACME Challenge]

Here’s the log output of the successful Certificate provisioning:

And we’re able to query the Certificate:

kubectl get certificate/nontlsgo
NAME       AGE
nontlsgo   7m

We need to expose our Deployment as a Service of --type=NodePort:

kubectl expose deployment/nontlsgo \
--port=8080 \
--target-port=8080 \
--type=NodePort

And, this time, we’ll provision an Ingress which, in Kubernetes Engine, will provision a Google Cloud HTTPS Load-Balancer.
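
A minimal sketch of what such an ingress.yaml might contain, assuming the extensions/v1beta1 Ingress API that was current for clusters of this vintage (newer clusters use networking.k8s.io/v1, which has a different backend syntax). The Ingress name is an assumption; the secret, Service name and port come from the steps above:

```shell
# Write a single-backend Ingress that serves the nontlsgo certificate and
# forwards everything to the nontlsgo Service on port 8080.
cat > ingress.yaml <<'EOF'
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nontlsgo
spec:
  tls:
  - secretName: nontlsgo
  backend:
    serviceName: nontlsgo
    servicePort: 8080
EOF
```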

kubectl apply --filename=ingress.yaml

Here’s the view from the Services column of Cloud Console as the Ingress is provisioning the Google Cloud HTTPS Load-Balancer (GCLB):

We can see this through the Console’s Load Balancer column too:

NB Both HTTP (on port 80) and HTTPS (on port 443) are enabled on a VIP 35.190.27.1. For HTTPS, Kubernetes has also provisioned the nontlsgo Secret as a certificate for the LB.

Once the GCLB is provisioned, we should be able to hit the endpoint:

curl --insecure https://35.190.27.1
Hello Henry!

And, as before, we can inspect the certificate it’s presenting:

openssl s_client \
-servername nontlsgo.[[YOUR-DOMAIN]] \
-connect 35.190.27.1:443 \
| less

And, as before:

CONNECTED(00000003)
---
Certificate chain
0 s:/CN=nontlsgo.[[YOUR-DOMAIN]]
i:/CN=Fake LE Intermediate X1
1 s:/CN=Fake LE Intermediate X1
i:/CN=Fake LE Root X1
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=nontlsgo.[[YOUR-DOMAIN]]
issuer=/CN=Fake LE Intermediate X1

In this case, our backend service is not secured by TLS but we front-end the service with a GCLB that does require TLS.

HTTPS Load-Balancer w/ TLS backend

Aha! There’s some Alpha work to enable support for TLS backends with the GCLB running HTTPS. Let’s explore:

https://github.com/kubernetes/ingress-gce/blob/master/README.md#backend-https

It works!

Apologies that my naming makes this appear more confusing than it needs to be, but what we’re going to do is reuse the configuration of the Ingress from above (with the nontlsgo cert) and this time use the securego Service (using the securego cert) as its backend. So, the GCLB terminates the external TLS connection (using cert nontlsgo) and then, acting as the client, re-encrypts traffic to the TLS-based (!) securego Service, which presents cert securego.

Reuse everything that came before and use the following service.yaml and ingress.yaml:

NB I remap the ports in the service from the container’s 8443 to 443 for the service. This is because the Alpha Ingress references port 443 and I didn’t want to change too much and break something. I’ll try using 8443 too.
NB The Alpha annotation that enables this use of TLS for the backend service is service.alpha.kubernetes.io/app-protocols, set on the Service. Its value maps the port that we’ve named https-port to HTTPS.
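
As a sketch of the Service side of this (the annotation, port name and port mapping follow the notes above; the NodePort type and the app=securego selector are assumptions carried over from the earlier steps), a service.yaml could be written with:

```shell
# Write a Service that names its port https-port and tells the GCE Ingress
# controller, via the Alpha annotation, to speak HTTPS to that port.
cat > service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: securego
  annotations:
    service.alpha.kubernetes.io/app-protocols: '{"https-port":"HTTPS"}'
spec:
  type: NodePort
  selector:
    app: securego
  ports:
  - name: https-port
    port: 443
    targetPort: 8443
EOF
```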

Here’s the GCLB that’s generated by the Ingress:

Secure all the way down!

And, as expected:

curl --insecure https://35.186.202.30
Hello Henry!

And, as expected we’re served the nontlsgo certificate because our interaction is with the HTTPS Load-Balancer. If we could peek behind the green curtain, we’d see the Load-Balancer itself receiving the securego cert when talking to the Kubernetes backend Service.

openssl s_client \
-servername nontlsgo.dazwilkin.com \
-connect 35.186.202.30:443 \
| less

And:

CONNECTED(00000003)
---
Certificate chain
0 s:/CN=nontlsgo.[[YOUR-DOMAIN]]
i:/CN=Fake LE Intermediate X1
1 s:/CN=Fake LE Intermediate X1
i:/CN=Fake LE Root X1
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=nontlsgo.[[YOUR-DOMAIN]]
issuer=/CN=Fake LE Intermediate X1
---

Production

For completeness, let’s remove the need to curl --insecure.

Here are the scripts:

NB The securego Service is running on port 8443. This proves that it’s possible to do this (and not use 443) as questioned previously.

NB The Ingress reuses|assumes the existence of the TLS-based Service (securego on port 8443)

Once the endpoint is available, you will need to update your DNS recordset to reflect it and then:

curl https://tlsprdgo.[[YOUR-DOMAIN]]
Hello Henry!

That’s all!