Kubernetes w/ Let’s Encrypt & Cloud DNS
This topic may well have been done to death, but I'd not used an automated approach to generating certs for Kubernetes Services, and cert-manager acknowledges that its documentation needs refinement:
https://cert-manager.readthedocs.io/en/latest/tutorials/acme/dns-validation.html
I happened upon Ross Kukulinski’s post “Let’s Encrypt Kubernetes” but the long-promised “Part 2” using Let’s Encrypt has not been published. So, this post is (a chunk of) that.
In full disclosure, I did not (possibly mistakenly) use Kelsey Hightower’s Kubernetes Cert Manager. If I can find the time, I’ll use that too and update this post.
So, here’s how I was able to use Jetstack’s cert-manager to generate certs using Let’s Encrypt, using Cloud DNS and Kubernetes Engine to secure (w/ caveats) a trivial Golang http service.
Caveats: I used the Let’s Encrypt staging not production service; I port-forward the service locally rather than expose it publicly to test the result. When I have time, I’ll address both of these limitations.
Update 18–05–17
The post now describes the solution 4-ways:
- Self-signed certs
- Network Load-Balancer w/ TLS backend
- HTTPS Load-Balancer w/ non-TLS backend
- HTTPS Load-Balancer w/ TLS backend.
Setup
Grab yourself a fresh Kubernetes (Engine) and authenticate to it. Give yourself too-broad (**don’t do this for anything other than personal development/testing**) RBAC creds:
PROJECT=[[YOUR-PROJECT]]
REGION=[[YOUR-REGION]]
CLUSTER=[[YOUR-CLUSTER]]
BILLING=[[YOUR-BILLING]]
LATEST=[[SEE-NOTE-BELOW]]

gcloud projects create $PROJECT

gcloud beta billing projects link $PROJECT \
--billing-account=$BILLING

gcloud services enable container.googleapis.com \
--project=$PROJECT

gcloud beta container clusters create $CLUSTER \
--username="" \
--cluster-version=${LATEST} \
--machine-type=custom-1-4096 \
--image-type=COS \
--num-nodes=1 \
--enable-autorepair \
--enable-autoscaling \
--enable-autoupgrade \
--enable-cloud-logging \
--enable-cloud-monitoring \
--min-nodes=1 \
--max-nodes=2 \
--region=${REGION} \
--project=${PROJECT} \
--preemptible \
--scopes="https://www.googleapis.com/auth/cloud-platform"

gcloud beta container clusters get-credentials $CLUSTER \
--project=${PROJECT} \
--region=${REGION}

kubectl create clusterrolebinding $(whoami)-cluster-admin-binding \
--clusterrole=cluster-admin \
--user=$(gcloud config get-value account)
NB I wrote recently about ways to automate the determination of the most recent (or latest) master|node versions. Use your preferred version or check out the 3 ways described in that post to determine the value of ${LATEST}.
Make a copy of Ross' Golang code or — if you're comfortable doing so — use his Docker image (rosskukulinski/secure-go-app); I've not used his image and cannot attest to it. Alternatively, grab my Docker image (dazwilkin/securego).
NB Ross' image uses port 443 whereas mine uses 8443, which is just a preference.
Self-signed Cert
I like Ross’ approach. He builds incrementally. Customarily, I would run the code locally, then locally but containerized, then deployed to Kubernetes. But, let’s cut to the chase and leave that for your homework. I recommend, if you do, that you consider parameterizing the location of the cert and the key.
So, we'll create a self-signed cert first, deploy this to Kubernetes as a secret, and then deploy the image mounting the secret as a volume (on /secure/):
PREFIX=securego
DOMAIN=[[YOUR-DOMAIN]]
NAME=${PREFIX}.${DOMAIN}
openssl req \
-x509 \
-nodes \
-days 365 \
-newkey rsa:2048 \
-keyout ${NAME}.key \
-out ${NAME}.crt \
-subj "/CN=${NAME}"
NB You'll need access to a domain for all of this to work. There are many ways to acquire one. I use Google Domains and — last time I checked — Google (my employer) didn't actually make it trivial to use these domains in conjunction with Google Cloud DNS. This post assumes you're using Cloud DNS to manage your domain, but this is not a requirement of the tooling.
I'm making work for myself, but I like to keep my X509 certs and keys clearly named by their domain so I can differentiate them. When we create the Kubernetes secret for the cert and key, it expects the files to be called tls.crt and tls.key, so note the --from-file mappings in the command below:
kubectl create secret generic selfsigned \
--from-file=tls.key=./${NAME}.key \
--from-file=tls.crt=./${NAME}.crt
Now we just need a deployment spec that combines the image, its port and our certificate and key:
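The deployment spec was embedded as a gist that is missing from this copy of the post. A minimal sketch of what securego.deployment.yaml might contain — assuming the image, port, label, and secret names used elsewhere in this post — is:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: securego
spec:
  replicas: 1
  selector:
    matchLabels:
      app: securego
  template:
    metadata:
      labels:
        app: securego
    spec:
      containers:
      - name: securego
        image: dazwilkin/securego
        ports:
        - containerPort: 8443
        volumeMounts:
        # The service expects its cert and key under /secure/
        - name: tls
          mountPath: /secure/
          readOnly: true
      volumes:
      - name: tls
        secret:
          secretName: selfsigned
```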
and you may:
kubectl apply --filename=securego.deployment.yaml
Rather than expose the Deployment as a Service, we can use kubectl to port-forward to the pod that was created by the deployment. We may then curl against the forward port locally:
POD=$(\
kubectl get pods \
--selector=app=securego \
--output=jsonpath="{.items[0].metadata.name}")

kubectl port-forward ${POD} 8443:8443
and, from a separate shell:
curl --insecure https://localhost:8443
If you’re using my image, you should receive:
Hello Henry!
cert-manager
You may deploy cert-manager
using Helm. Helm is the de facto “package” manager for Kubernetes. It is excellent and, if you’ve not done so already, I recommend you evaluate it. I’m lazy and so I deployed cert-manager directly using the Kubernetes specs:
go get github.com/jetstack/cert-manager
cd ../jetstack/cert-manager/
cd contrib/manifests/cert-manager/rbac

kubectl apply --filename=.
NB: If you relied on the Docker images in the previous step i.e. you did not use the Golang code previously, you’ll need to set up a Golang workspace.
All being well, cert-manager should have created a Namespace called cert-manager containing a Deployment called cert-manager:
We must now configure cert-manager to automate the creation of certificates for us. This requires configuring a [Cluster]Issuer and a Certificate. This is the part that took the most 'twiddling' in my case, so I'll try to explain what you need and where.
I’m using a ClusterIssuer which will provision Certificates for any Namespace in the cluster. I think you should use this too.
NB In my case, the project containing my Cloud DNS zones is different than the one containing the Kubernetes Engine cluster. This is an entirely possible scenario.
NB Provide a legitimate email where Let’s Encrypt may reach you.
NB privateKeySecretRef is used by cert-manager. You do not need to create the secret called letsencrypt-staging yourself. I recommend you leave this as-is.
NB You do need to create the clouddns serviceAccountSecretRef entries (see below).
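The ClusterIssuer spec itself was embedded as a gist that is missing from this copy. Based on the cert-manager v1alpha1 API used elsewhere in this post (and the letsencrypt-staging, clouddns, and secret names referenced in the notes above), it would look roughly like this — the email placeholder is mine:

```yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # Let's Encrypt *staging* endpoint (see the caveats at the top of the post)
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: [[YOUR-EMAIL]]
    # Managed by cert-manager; do not create this secret yourself
    privateKeySecretRef:
      name: letsencrypt-staging
    dns01:
      providers:
      - name: clouddns
        clouddns:
          project: [[YOUR-CLOUD-DNS-PROJECT]]
          # The secret created from clouddns.key.json below
          serviceAccountSecretRef:
            name: clouddns
            key: clouddns.key.json
```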
cert-manager uses a service account to add|delete DNS entries to prove your ownership of the domain to the Let's Encrypt service. To do this, we must create a service account, assign it appropriate permissions, and expose the service account to the cert-manager Deployment (as a Kubernetes Secret).
ROBOT=clouddns
DNS=[[YOUR-CLOUD-DNS-PROJECT]]

gcloud iam service-accounts create ${ROBOT} \
--display-name=${ROBOT} \
--project=${DNS}

gcloud iam service-accounts keys create ./${ROBOT}.key.json \
--iam-account=${ROBOT}@${DNS}.iam.gserviceaccount.com \
--project=${DNS}

gcloud projects add-iam-policy-binding ${DNS} \
--member=serviceAccount:${ROBOT}@${DNS}.iam.gserviceaccount.com \
--role=roles/dns.admin

kubectl create secret generic clouddns \
--from-file=./clouddns.key.json \
--namespace=cert-manager
NB In the above, the value of DNS may well be the same as PROJECT; if ${DNS} == ${PROJECT}, that's fine. If your DNS project is different, ensure you set the values correctly.
Now that the service account is accessible to cert-manager, you may deploy the Issuer:
kubectl apply --filename=certmanager.issuer.yaml
You can use the Kubernetes Workloads browser in Google Cloud Console to view log entries for the Pod (!) that underlies the cert-manager Deployment. If you check the logs, you should see something similar to the following:
Now we may apply the Certificate request to the cert-manager Issuer. This is where the rubber hits the road and — all being well — our certificate will not only be created but it will also be configured as a Kubernetes Secret, ready for us to use. Here’s the Certificate spec:
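The Certificate spec was embedded as a gist that is missing here. Reconstructed from the kubectl describe output later in this section (same names, provider, and issuer reference), it would look roughly like:

```yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  # The author names the resource after the domain, e.g. securego-dazwilkin-com
  name: securego-[[YOUR-DOMAIN-DASHED]]
spec:
  # cert-manager stores the issued cert and key in this Secret
  secretName: securego
  commonName: securego.[[YOUR-DOMAIN]]
  dnsNames:
  - securego.[[YOUR-DOMAIN]]
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-staging
  acme:
    config:
    - dns01:
        provider: clouddns
      domains:
      - securego.[[YOUR-DOMAIN]]
```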
NB In my case, the value of [[YOUR-DOMAIN]] is dazwilkin.com. This file repeats the [[YOUR-DOMAIN]] entry multiple times and not all of them may be needed; this is the configuration that worked for me.
All being well, you should now have a Kubernetes Secret called securego:
kubectl get secrets/securego
NAME TYPE DATA AGE
securego kubernetes.io/tls 2 2h
You may dig deeper by describing the Certificate. Here's my result with redactions:

kubectl describe certificate/securego-dazwilkin-com

Name:         securego
Namespace: default
API Version: certmanager.k8s.io/v1alpha1
Kind: Certificate
Metadata:
Cluster Name:
Creation Timestamp: 2018-05-16T21:54:22Z
Generation: 0
Resource Version: 49010
Spec:
Acme:
Config:
Dns 01:
Provider: clouddns
Domains:
securego.dazwilkin.com
Common Name: securego.dazwilkin.com
Dns Names:
securego.dazwilkin.com
Issuer Ref:
Kind: ClusterIssuer
Name: letsencrypt-staging
Secret Name: securego
Status:
Acme:
Order:
Conditions:
Last Transition Time: 2018-05-16T21:56:56Z
Message: Certificate issued successfully
Reason: CertIssued
Status: True
Type: Ready
Last Transition Time: <nil>
Message: Order validated
Reason: OrderValidated
Status: False
Type: ValidateFailed
Events: <none>
And, you may revise the securego Deployment to reflect this new certificate:
NB The singular change is the secretName: from selfsigned to securego.
You won't immediately notice any difference because the service is already secured by a certificate. So, let's check the certificate that we're being provided *before* we swap to the Let's Encrypt certificate that was obtained for us by cert-manager:
POD=$(\
kubectl get pods \
--selector=app=securego \
--output=jsonpath="{.items[0].metadata.name}")

kubectl port-forward ${POD} 8443:8443
and then, from another shell:
openssl s_client \
-servername securego.[[YOUR-DOMAIN]] \
-connect localhost:8443 \
| less
and you should see something similar to:
verify error:num=18:self signed certificate
verify return:1
depth=0 CN = securego.[[YOUR-DOMAIN]]
verify return:1
CONNECTED(00000003)
---
Certificate chain
0 s:/CN=securego.[[YOUR-DOMAIN]]
i:/CN=securego.[[YOUR-DOMAIN]]
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=securego.[[YOUR-DOMAIN]]
issuer=/CN=securego.[[YOUR-DOMAIN]]
NB It reports self signed certificate. You should also see your domain prefixed with the securego host that we used at the beginning.
OK, now you may apply the Let’s Encrypt based deployment:
kubectl apply --filename=securego.deployment.yaml
The Deployment will replace the Pod and so, once this stabilizes, repeat the command that grabs the Pod and does the port-forwarding and then repeat the openssl
lookup. This time, you should see something similar to:
CONNECTED(00000003)
---
Certificate chain
0 s:/CN=securego.[[YOUR-DOMAIN]]
i:/CN=Fake LE Intermediate X1
1 s:/CN=Fake LE Intermediate X1
i:/CN=Fake LE Root X1
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=securego.[[YOUR-DOMAIN]]
issuer=/CN=Fake LE Intermediate X1
---
NB The chain now includes Fake LE Intermediate X1 and Fake LE Root X1. These are legitimate certs used by Let's Encrypt's staging (!) environment. Note also that the issuer is Fake LE Intermediate X1.
Network LB & Cloud DNS
Okay, I mentioned at the top of the post that I’d not tested this through a Kubernetes Service and public Load-Balancer. So, let’s do that.
NB As you may realize, the Docker containers backing our Kubernetes Service expect traffic over TLS. A L7|HTTPS Load-Balancer terminates a TLS connection. It could then re-encrypt traffic for its backends, but this is not what we've built. Instead we must use a Network Load-Balancer, as this will simply route traffic to the backends without terminating TLS. Network LBs are represented by --type=LoadBalancer in Kubernetes.
There are 3 steps. First, we must expose the Service as --type=LoadBalancer. Second, we must configure our DNS with the endpoint of the Load-Balancer that's created for us. Third, we curl and use openssl to inspect the Service's certificate.
kubectl expose deployment/securego \
--port=443 \
--target-port=8443 \
--type=LoadBalancer
NB We can remap the ports with the service. In the above, I’m exposing the service on port 443 even though the container(s) are on port 8443.
This results in a Network LB and an endpoint (public IP) that must be configured in the DNS service to alias to securego.[[YOUR-DOMAIN]]. In my case, 35.233.171.159 is mapped to securego.dazwilkin.com in Cloud DNS:
Once the DNS entry is changed, we must await its propagation. You can periodically check for the update with:
nslookup securego.[[YOUR-DOMAIN]] 8.8.8.8
Relatively quickly, I receive:
Server: 8.8.8.8
Address: 8.8.8.8#53

Non-authoritative answer:
Name: securego.dazwilkin.com
Address: 35.233.171.159 # Your IP will differ
And then, you may:
curl --insecure https://securego.[[YOUR-DOMAIN]]
Hello Henry!
And, more usefully:
openssl s_client \
-servername securego.dazwilkin.com \
-connect 35.233.171.159:443 # Your IP will differ
Logs
You may wish to review logs from the command-line:
FILTER="resource.type=\"container\" resource.labels.cluster_name=\"letsencrypt-01\" resource.labels.namespace_id=\"cert-manager\""

gcloud beta logging read "$FILTER" \
--order=asc \
--project=$PROJECT \
--format="json" \
| jq --raw-output .[].textPayload
Here’s a successful run using my configuration:
Calling GetOrder
Calling GetAuthorization
Calling DNS01ChallengeRecord
Cleaning up old/expired challenges for Certificate default/securego-dazwilkin-com
Calling GetChallenge
Checking DNS propagation for "securego.dazwilkin.com" using name servers: [10.19.240.10:53]
Waiting DNS record TTL (60s) to allow propagation of DNS record for domain "_acme-challenge.securego.dazwilkin.com."
ACME DNS01 validation record propagated for "_acme-challenge.securego.dazwilkin.com."
Accepting challenge for domain "securego.dazwilkin.com"
Calling AcceptChallenge
Waiting for authorization for domain "securego.dazwilkin.com"
Calling WaitAuthorization
Successfully authorized domain "securego.dazwilkin.com"
Cleaning up challenge for domain "securego.dazwilkin.com" as part of Certificate default/securego-dazwilkin-com
Issuing certificate...
getting private key (letsencrypt-staging->tls.key) for acme issuer cert-manager/letsencrypt-staging
Calling GetOrder
Calling FinalizeOrder
successfully obtained certificate: cn="securego.dazwilkin.com" altNames=[securego.dazwilkin.com] url="https://acme-staging-v02.api.letsencrypt.org/acme/order/6100788/1013320"
Certificate issued successfully
Found status change for Certificate "securego-dazwilkin-com" condition "Ready": "False" -> "True"; setting lastTransitionTime to 2018-05-16 21:56:56.842909838 +0000 UTC m=+7058.901958015
Certificate default/securego-dazwilkin-com scheduled for renewal in 1438 hours
certificates controller: Finished processing work item "default/securego-dazwilkin-com"
certificates controller: syncing item 'default/securego-dazwilkin-com'
Certificate default/securego-dazwilkin-com scheduled for renewal in 1438 hours
certificates controller: Finished processing work item "default/securego-dazwilkin-com"
Conclusion
cert-manager warns that this functionality is not yet ready for production deployments, but combining Kubernetes with mechanisms to auto-generate and manage certificates is terrific! You've no excuse not to be using legitimate (non-staging Let's Encrypt or your preferred CA) certs for all your workloads.
Feedback is always sought
That’s all!
18–05–17 Update:
HTTPS Load-Balancer w/ non-TLS backend
In this variant, we remove TLS from the Kubernetes Deployment and apply the certificate generated by Let’s Encrypt to the Google Cloud HTTPS Load-Balancer.
The simpler main.go is included below along with a Dockerfile that assumes you've built the binary. I've pushed the image to DockerHub as dazwilkin/nontlsgo, so you may just reference that directly.
Apply the Deployment nontlsgo.deployment.yaml to your cluster. Assuming you've retained the cert-manager Deployment from the previous steps, you may reuse the Issuer, but you must apply the new certmanager.certificate.yaml provided here. This Certificate request is for nontlsgo. This is the Certificate we'll use in the Ingress to provision the HTTPS Load-Balancer. Once you're able to confirm that the certificate is created, apply the ingress.yaml to provision the GCLB.
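The ingress.yaml gist is missing from this copy. A minimal sketch, assuming the nontlsgo Service, Secret, and host names used in this section:

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nontlsgo
spec:
  tls:
  # The Secret created by cert-manager for the nontlsgo Certificate;
  # GKE provisions it as the GCLB's certificate
  - secretName: nontlsgo
    hosts:
    - nontlsgo.[[YOUR-DOMAIN]]
  backend:
    serviceName: nontlsgo
    servicePort: 8080
```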
I was able to capture the Let's Encrypt challenge being provisioned by cert-manager into Cloud DNS, so that the Let's Encrypt service could confirm my ownership of dazwilkin.com:
Here's the log output of the successful Certificate provisioning:
And we’re able to query the Certificate:
kubectl get certificate/nontlsgo

NAME       AGE
nontlsgo   7m
We need to expose our Deployment as a Service of --type=NodePort:
kubectl expose deployment/nontlsgo \
--port=8080 \
--target-port=8080 \
--type=NodePort
And, this time, we’ll provision an Ingress which, in Kubernetes Engine, will provision a Google Cloud HTTPS Load-Balancer.
kubectl apply --filename=ingress.yaml
Here’s the view from the Services column of Cloud Console as the Ingress is provisioning the Google Cloud HTTPS Load-Balancer (GCLB):
We can see this through the Console’s Load Balancer column too:
NB Both HTTP (on port 80) and HTTPS (on port 443) are enabled on a VIP 35.190.27.1. For HTTPS, Kubernetes has also provisioned the nontlsgo Secret as a certificate for the LB.
Once the GCLB is provisioned, we should be able to hit the endpoint:
curl --insecure https://35.190.27.1
Hello Henry!
And, as before, we can inspect the certificate it’s presenting:
openssl s_client \
-servername nontlsgo.[[YOUR-DOMAIN]] \
-connect 35.190.27.1:443 \
| less
And, as before:
CONNECTED(00000003)
---
Certificate chain
0 s:/CN=nontlsgo.[[YOUR-DOMAIN]]
i:/CN=Fake LE Intermediate X1
1 s:/CN=Fake LE Intermediate X1
i:/CN=Fake LE Root X1
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=nontlsgo.[[YOUR-DOMAIN]]
issuer=/CN=Fake LE Intermediate X1
In this case, our backend service is not secured by TLS but we front-end the service with a GCLB that does require TLS.
HTTPS Load-Balancer w/ TLS backend
Aha! There's some Alpha work to enable support for TLS backends with the GCLB running HTTPS. Let's explore:
https://github.com/kubernetes/ingress-gce/blob/master/README.md#backend-https
It works!
Apologies that my naming will make this appear more confusing than it needs to be, but what we're going to do is reuse the configuration of the Ingress from above (with the nontlsgo cert) and, this time, use the securego Service (with the securego cert) as its backend. So, the GCLB terminates the external TLS connection (using the nontlsgo cert) and then re-encrypts traffic (as the client) using the securego cert to route to the TLS (!) based securego Service.
Reuse everything that came before and use the following service.yaml and ingress.yaml:
NB I remap the ports in the Service from the container's 8443 to 443. This is because the Alpha Ingress references port 443 and I didn't want to change too much and break something. I'll try using 8443 too.
NB The Ingress includes the Alpha annotation that enables this use of TLS for the backend service: service.alpha.kubernetes.io/app-protocols. The value takes the port that we've named https-port and we specify HTTPS.
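The service.yaml and ingress.yaml gists are missing from this copy. A sketch consistent with the notes above — the port name https-port and the app-protocols annotation format follow the ingress-gce README linked earlier; the resource names are my assumptions:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: securego
  annotations:
    # Alpha annotation: tells the GCLB to speak HTTPS to this named port
    service.alpha.kubernetes.io/app-protocols: '{"https-port":"HTTPS"}'
spec:
  type: NodePort
  selector:
    app: securego
  ports:
  - name: https-port
    port: 443
    targetPort: 8443
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: securego
spec:
  tls:
  # External TLS terminated with the nontlsgo cert, as described above
  - secretName: nontlsgo
  backend:
    serviceName: securego
    servicePort: 443
```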
Here’s the GCLB that’s generated by the Ingress:
And, as expected:
curl --insecure https://35.186.202.30
Hello Henry!
And, as expected, we're served the nontlsgo certificate because our interaction is with the HTTPS Load-Balancer. If we could peek behind the green curtain, we'd see the Load-Balancer itself receiving the securego cert when talking to the Kubernetes backend Service.
openssl s_client \
-servername nontlsgo.dazwilkin.com \
-connect 35.186.202.30:443 \
| less
And:
CONNECTED(00000003)
---
Certificate chain
0 s:/CN=nontlsgo.[[YOUR-DOMAIN]]
i:/CN=Fake LE Intermediate X1
1 s:/CN=Fake LE Intermediate X1
i:/CN=Fake LE Root X1
---
Server certificate
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
subject=/CN=nontlsgo.[[YOUR-DOMAIN]]
issuer=/CN=Fake LE Intermediate X1
---
Production
For completeness, let's remove the need to curl --insecure.
Here are the scripts:
NB The securego Service is running on port 8443. This proves that it's possible to do this (and not use 443), as questioned previously.
NB The Ingress reuses|assumes the existence of the TLS-based Service (securego on port 8443).
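The scripts themselves were embedded as gists that are missing from this copy. Given the tlsprdgo hostname used below, they presumably included a production ClusterIssuer and a new Certificate; a hedged sketch (all names here are my guesses, patterned on the staging configuration earlier in the post):

```yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    # Production endpoint this time, so browsers and curl trust the chain
    server: https://acme-v02.api.letsencrypt.org/directory
    email: [[YOUR-EMAIL]]
    privateKeySecretRef:
      name: letsencrypt-production
    dns01:
      providers:
      - name: clouddns
        clouddns:
          project: [[YOUR-CLOUD-DNS-PROJECT]]
          serviceAccountSecretRef:
            name: clouddns
            key: clouddns.key.json
---
apiVersion: certmanager.k8s.io/v1alpha1
kind: Certificate
metadata:
  name: tlsprdgo
spec:
  secretName: tlsprdgo
  commonName: tlsprdgo.[[YOUR-DOMAIN]]
  dnsNames:
  - tlsprdgo.[[YOUR-DOMAIN]]
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-production
  acme:
    config:
    - dns01:
        provider: clouddns
      domains:
      - tlsprdgo.[[YOUR-DOMAIN]]
```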
Once the endpoint is available, you will need to update your DNS recordset to reflect it and then:
curl https://tlsprdgo.[[YOUR-DOMAIN]]
Hello Henry!
That’s all!