Automated Kubernetes deployments with GitLab, Helm and Traefik
We all know this one issue: our app is ready, versioned with git using different branches for features, bugfixes and so on, and it’s all nicely packed up into Docker images. Now we would like to deploy these changes to our Kubernetes cluster, but building the images, pushing them to a Docker registry and then deploying them to the cluster manually kind of feels boring. After all, that’s what CI/CD is for, right? And wouldn’t it be great if we had different environments for different git branches, so we could test things easily and in isolation, and ideally get Let’s Encrypt SSL certificates for every branch URL automatically?
I’ve searched the Internet for a blog entry like this and found many similar things, but none of them really fit my needs. Some used awkward proxy setups based on HAProxy which, in my opinion, needed way too much configuration to run on Kubernetes; others focused too much on a certain app structure; and then there are those blog entries where I read five paragraphs and was already lost.
So, this is my attempt to document this topic and because I know your time is precious and you don’t want to read a whole blog entry only to realize that this was another blog post that didn’t help you with your task, here’s a short list of what I’m going to cover:
- I’ll cover how to get your code from GitLab (though this should work with any CI/CD system) to a Google Kubernetes Engine (GKE) cluster.
- We’ll have a production system which is our `master` branch and an arbitrary `feature1` branch that represents our feature environment, but you’ll be able to have as many feature environments as you like. You’ll be able to access your `master` branch using `your.domain.com` and the `feature/feature1` branch using `feature-feature1.domain.com`.
- We’re using GitLab’s private Docker registry and will thus make sure GKE can access it.
- We’re using helm to deploy our application. Your application must thus be prepared to be deployed as a chart! We are not using GitLab’s Auto DevOps feature!
- We’re using Traefik as our reverse proxy, simply because it’s super easy to use, integrates natively into Kubernetes, has out-of-the-box Let’s Encrypt support, is blazingly fast and because I like French cheese (which is a strong statement if you’re Swiss like me!).
I would like to make one thing clear before I start: If you haven’t used Kubernetes before, never used the CLI tools `helm`, `kubectl` or `gcloud`, or you’re kind of new to Docker, then I’m sorry, but I think you should start with the basics. This will likely be too advanced for you. I’m trying to describe the steps in as much detail as I can, but I won’t be teaching you fundamental knowledge about all these tools. You have been warned!
Prerequisites
So to start off, I expect you to have your GKE cluster ready and to have `helm`, `kubectl` and `gcloud` installed locally. Also, you should have your application dockerized and ready to be deployed as a helm chart. I personally tend to just add a folder named `k8s-chart` to my projects where I put all my chart logic, so I can then deploy the application using `$ helm install -n my-super-app ./k8s-chart`. If you haven’t packed up your app in such a way yet, you should probably stop here and do that first.
#1: Ensuring GitLab CI/CD can access your GKE cluster
During our GitLab CI process we want GitLab CI to execute commands on our GKE cluster, so we have to make sure it can access it. We can achieve this by creating a Google Service Account. For this, go to `IAM & admin > Service accounts` (or just visit https://console.cloud.google.com/iam-admin/serviceaccounts). A service account is a special type of Google account that belongs to your application or a virtual machine rather than an individual user, so it’s perfect for our purposes. We’ll give it a name and a description, say, `gitlab-ci` and something like “Service account for GitLab CI to automate deployments”, but of course you’re free to choose whatever you like.
Then we click on `Create` to get to step two, where we can grant this Service Account access to alter our Kubernetes resources. For this, we have to assign the role `Kubernetes Engine Developer` to the Service Account.

Almost done! We only need to generate a key in the last step and download it in the `JSON` format. You’ll end up downloading a JSON file called something like `<your-project-name>-<hash>.json`. Got this? Excellent!
Now, I have to admit I’m a bit paranoid when it comes to encoding issues; more than ten years in IT made me like that. I don’t like to pass things that contain quotes and special characters (like a JSON file might) on the command line just like that. But we have to pass this data to our GitLab CI/CD job, so we’ll have to place it into a GitLab CI/CD environment variable. Because I’m paranoid, we’ll base64-encode the data first and decode it again during the CI job. Whether that makes sense to you or not doesn’t really matter; it’s just my way of staying away from escaping issues with environment variables :-)
Anyway, to encode that JSON key, just run `$ cat <your-project-name>-<hash>.json | base64`, and if you’re a Mac user like me you might pipe it once more so the result ends up in your clipboard right away, like so: `$ cat <your-project-name>-<hash>.json | base64 | pbcopy`.
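If you want to convince yourself that the round trip is lossless, here’s a small sketch you can run locally (the `sa.json` file here is just a stand-in for your real key file):

```shell
# Create a stand-in for the downloaded service account key:
printf '{"type":"service_account"}' > sa.json

# Encode it -- this is the value you paste into the GitLab CI/CD variable:
base64 < sa.json > sa.json.b64

# Decode it again, exactly as the CI job will do later:
base64 --decode < sa.json.b64 > sa.decoded.json

# Both files should be identical:
diff sa.json sa.decoded.json && echo "round-trip OK"
```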
We then go to our project on GitLab and navigate to `Settings > CI / CD`, where we can add the encoded key as an environment variable. I give it the name `GKE_SERVICE_ACCOUNT`, but as long as it matches the `.gitlab-ci.yml` you’re creating later on, you can name it whatever you like best.

Alright, cool! Now we have an environment variable named `GKE_SERVICE_ACCOUNT` which contains our base64 encoded JSON Service Account key, which we can use during our CI/CD jobs to access our GKE cluster!
#2: Ensuring GKE can access our private GitLab Docker registry
Basically, we need to do something similar to what we just did, but the other way around. We’ll be pushing our Docker images to our private GitLab registry, so during deployment, GKE must have access to our GitLab Docker registry to be able to pull the latest versions of our images. For this to work, we have to create a Kubernetes secret of the type `docker-registry` with the contents of a GitLab personal access token. So go to your GitLab profile, create a new personal access token, name it e.g. `gke-gitlab-docker-registry` and give it the `read_registry` scope:
Cool, now let’s store this access token as a Kubernetes secret on your GKE cluster; we’ll be using this secret later on. As you can see, the first argument of `kubectl create secret` is the secret type (`docker-registry`) and the second one defines the name of the secret. I’m going to name it `gitlab-registry`, so the command you need to execute looks like this, whereas `YOUR_PERSONAL_GITLAB_ACCESS_TOKEN_HERE` should be replaced by the token we’ve just generated (and I hope you also remember your GitLab username and e-mail address):
$ kubectl create secret docker-registry gitlab-registry \
--docker-server=registry.gitlab.com \
--docker-username=YOUR_GITLAB_USERNAME \
--docker-password=YOUR_PERSONAL_GITLAB_ACCESS_TOKEN_HERE \
--docker-email=YOUR_GITLAB_EMAIL_ADDRESS
Luckily for me, the token only contains unproblematic characters so we don’t need to do any base64 encoding and decoding magic here :-)
#3: Allowing helm (tiller) to manage your GKE cluster
Because we want to be able to deploy our application using helm, we need to make sure the server-side component, called “tiller”, has the correct permissions to do so. This is actually out of the scope of this blog post, but at the same time it’s pretty easy, because all you have to do is create a `ClusterRoleBinding` so tiller gets the cluster role `cluster-admin` assigned. How to do that is already described in the documentation of helm itself.
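For completeness, this is roughly what the manifest from the helm documentation looks like; you could save it as e.g. `rbac-tiller.yml` (the filename is my choice) and apply it with `kubectl apply -f rbac-tiller.yml`:

```yaml
# A ServiceAccount for tiller plus a ClusterRoleBinding that grants it
# the cluster-admin role, as described in the helm documentation.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
```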
If you haven’t installed tiller yet, you should do so by running `$ helm init --service-account tiller` as described in the docs.
#4: Assigning a static IP to your GKE cluster
To make our setup publicly accessible, we need to register a static IP for our cluster. This can be done in `VPC network > External IP addresses` (or just visit https://console.cloud.google.com/networking/addresses/list).

If you haven’t assigned a static IP to your cluster yet, it should have an “ephemeral” IP assigned. Just click the dropdown and make it a static one. You will then get a static external IP in the column “External Address”.
#5: Take a break!
So far all we did was preparation work, but I promise you, we’re getting there! We only need to set up Traefik and then we can start working on our `.gitlab-ci.yml` and finally benefit from all our preparation work. Time for a coffee!
#6: Setting up Traefik
Traefik is a super-fast proxy that has native support for Let’s Encrypt SSL certificates, integrates natively with Kubernetes and is able to route incoming requests to the correct Kubernetes services just by using Kubernetes labels. This comes in super handy for us because it means we can push any branch, make sure we deploy the services to Kubernetes with the correct Docker image and labels, and Traefik will then find our services, make them available to the world and create SSL certificates for us automatically!
And because we have helm, installing Traefik on your cluster is almost too simple:
$ helm upgrade \
--install \
--namespace kube-system \
--set rbac.enabled=true \
--set imageTag=1.7 \
--set ssl.enabled=true \
--set ssl.enforced=true \
--set acme.enabled=true \
--set acme.email=<your-email-here> \
--set acme.staging=false \
--set acme.challengeType=tls-alpn-01 \
--set acme.persistence.enabled=false \
--set loadBalancerIP=<your-public-static-ip-from-the-gke-cluster> \
traefik \
stable/traefik
Now there are a few things to explain here:
- I’m using `helm upgrade` with the `--install` flag instead of `helm install` because it is easier for me. It will install the release if it is not there yet and otherwise just upgrade it. This way I only need to remember one command if I later want to e.g. add another `--set` flag, but of course, do it the way you prefer!
- We’re enabling role based access; I recommend you always do that. There’s a reason why there are permissions in Kubernetes :-)
- `imageTag` defines the version. I’m using `1.7`, which is the latest version at the time of writing this blog post.
- Replace `acme.email` with your e-mail address and configure `loadBalancerIP` so it is set to the static IP we’ve prepared in step #4.
- You might want to use different settings, such as enabling persistence for certificates or enabling the dashboard, debug mode, logging etc. You’ll find more options in the Chart documentation itself. I encourage you to explore Traefik further, it really is an awesome piece of software!
Finally, let’s create our `.gitlab-ci.yml`!
#7: Let’s start working!
Okay ladies and gents, it’s time to put it all together and finally work on our project! We want to be able to use GitLab CI/CD to deploy our very own helm chart of our application to our Kubernetes cluster, and we’ve done quite some preparation work up until here, so we need to get some results now! The thing is, for your `.gitlab-ci.yml`, and of course also for your helm chart in `./k8s-chart`, the sky is the limit! Because you can combine them with loads of different environment variables and pass them on from your CI job to your chart using `helm --set` etc., you can really go crazy now! I cannot cover everything in this blog post. Of course you should run unit tests before you deploy, of course you might want to scan your Docker images for issues etc., but really, this should be the starting point from which you continue working on your very own version of your `.gitlab-ci.yml`. It definitely does not end here!
So what I’ll do is outline a very basic helm chart which consists of a Kubernetes `Deployment`, a `Service` and an `Ingress` based on Traefik, which is everything you need to make your application available to the world. It’s going to be our `web` deployment. Alongside the basic chart, I’ll post the most simplistic `.gitlab-ci.yml` you could imagine that works together with this chart. I hope you can then continue from here :-)
So this is how our `./k8s-chart` directory looks:

/k8s-chart
  /templates
    web-deployment.yml
    web-ingress.yml
    web-service.yml
  Chart.yaml
Of course, your chart directory might have `requirements.yaml` and `requirements.lock`, `values.yaml` and so on, but again, we stick with the very simplistic approach here!
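A minimal chart metadata file could look like this; note that helm expects it to be named `Chart.yaml` exactly, and the name, version and description here are just placeholders:

```yaml
# k8s-chart/Chart.yaml -- minimal metadata for a Helm 2 chart
apiVersion: v1
name: my-super-app
version: 0.1.0
description: A basic chart with a web deployment, service and ingress
```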
Okay, so let’s start by looking at the three templates one by one, beginning with the `web-deployment.yml`:
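The original post embeds this template as a gist, so here is a sketch of how such a `templates/web-deployment.yml` could look; the `app` label key, the container name and the container port are my assumptions, adapt them to your application:

```yaml
# templates/web-deployment.yml -- a sketch; label keys and the container
# port are assumptions, not part of the original post.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Values.web.name }}
  labels:
    app: {{ .Values.web.name }}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: {{ .Values.web.name }}
  template:
    metadata:
      labels:
        app: {{ .Values.web.name }}
    spec:
      imagePullSecrets:
        - name: gitlab-registry
      containers:
        - name: web
          image: {{ .Values.web.image }}
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 80
```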
As you can see, we use two values here, `web.name` and `web.image`. That’s because we want to have different images (one per branch) and also different names when we deploy; otherwise, how would you know which deployment belongs to which environment, right? Other than that, there’s really nothing special here except for `spec.imagePullSecrets`, where you can see that I’ve told Kubernetes to use the `gitlab-registry` Kubernetes Secret we configured in step #2. That’s how it’s going to be able to access our GitLab Docker registry. Also, I’ve set `imagePullPolicy` to `Always` so that it really always pulls the latest images. Otherwise, if your image tag does not contain e.g. the SHA commit reference but only the branch name, the image tag would stay the same and never get pulled again.
Now on to the `web-service.yml`, which makes our deployment available to the cluster:
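Again, the original embeds this as a gist; a sketch of `templates/web-service.yml` could look like this, where the `app` selector key is my assumption and must match your deployment’s pod labels:

```yaml
# templates/web-service.yml -- exposes port 80 under the name "http".
# The "app" selector key is an assumption; it must match the pod labels.
apiVersion: v1
kind: Service
metadata:
  name: {{ .Values.web.name }}
spec:
  selector:
    app: {{ .Values.web.name }}
  ports:
    - name: http
      port: 80
      targetPort: 80
```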
This is really super simple: Again, we have to make sure `web.name` is used, and of course we also have to do that in the `selector` section so our service finds the correct pods. Other than that, this service just exposes port 80 and that port is named `http`, that’s it!
Now to maybe the most important one, the `web-ingress.yml`:
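Here too, the original embeds a gist; a sketch of `templates/web-ingress.yml` for the Traefik 1.7 era could look like this (the `extensions/v1beta1` API version matches Kubernetes clusters of that time):

```yaml
# templates/web-ingress.yml -- the ingress.class annotation tells
# Kubernetes that Traefik, not the native controller, handles this ingress.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: {{ .Values.web.name }}
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  rules:
    - host: {{ .Values.web.host }}
      http:
        paths:
          - backend:
              serviceName: {{ .Values.web.name }}
              servicePort: http
```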
The special thing here is the `metadata.annotations` entry `kubernetes.io/ingress.class`, which is set to `traefik`. As I’ve mentioned before, Traefik integrates natively with Kubernetes, so what you’re doing here is basically telling Kubernetes that Traefik shall handle the ingress, not the native Kubernetes implementation. This is also why the `spec` might look alien to you, because this is how Traefik is configured. You can configure loads of stuff, but we really just want to tell it to match a specific host, which we pass again using values (`web.host`), and forward this to a backend, which is our service named `web.name`, on the port named `http`. So now I hope you can kind of see how `web-ingress.yml` goes together with `web-service.yml` :-)
Nice! What we can do now is, by setting the right values for `web.name`, `web.image` and `web.host`, tell Kubernetes to pull the correct image, deploy it with the correct name and assign the right labels to the Ingress so Traefik will be able to find it based on the host! Half way there!!
So we’re left with our `.gitlab-ci.yml`, which needs to do the following:
- Build the image and include the branch name somehow.
- Push the image to the GitLab Docker Registry.
- Deploy to the GKE cluster using helm, setting the correct values for `web.name`, `web.image` and `web.host`.
Again, I’ll just paste how it could look and explain it afterwards:
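Since the original post embeds this file as a gist, here is a sketch of how such a `.gitlab-ci.yml` could look, assembled from the explanations that follow; the cluster name, zone, project and domains are placeholders you would adjust, and the exact job layout is my assumption:

```yaml
image: docker:stable

services:
  - docker:dind

stages:
  - build
  - deploy

variables:
  IMAGE: ${CI_REGISTRY_IMAGE}/web:${CI_COMMIT_REF_SLUG}
  GKE_CLUSTER_NAME: my-cluster   # placeholder, override in CI / CD settings
  GKE_ZONE: europe-west6-a       # placeholder
  GKE_PROJECT: my-project        # placeholder
  URL_REVIEW: ${CI_COMMIT_REF_SLUG}.your.domain.com
  URL_PRODUCTION: your.domain.com

# Authenticate against GKE and make sure tiller is ready.
.init_helm: &init_helm |
  mkdir -p /etc/deploy
  echo ${GKE_SERVICE_ACCOUNT} | base64 -d > /etc/deploy/sa.json
  gcloud auth activate-service-account --key-file=/etc/deploy/sa.json
  gcloud container clusters get-credentials ${GKE_CLUSTER_NAME} --zone ${GKE_ZONE} --project ${GKE_PROJECT}
  helm init --service-account tiller --wait --upgrade

build:
  stage: build
  script:
    - docker login -u gitlab-ci-token -p ${CI_BUILD_TOKEN} registry.gitlab.com
    - docker pull ${IMAGE} || true
    - docker build --cache-from ${IMAGE} -t ${IMAGE} .
    - docker push ${IMAGE}

deployment_review:
  stage: deploy
  image: devth/helm
  environment:
    name: review/${CI_COMMIT_REF_SLUG}
    url: https://${URL_REVIEW}
  except:
    refs:
      - master
  script:
    - *init_helm
    - helm upgrade --install --wait --set web.name=web-${CI_COMMIT_REF_SLUG} --set web.image=${IMAGE} --set web.host=${URL_REVIEW} my-super-app-${CI_COMMIT_REF_SLUG} ./k8s-chart

deployment_production:
  stage: deploy
  image: devth/helm
  environment:
    name: production
    url: https://${URL_PRODUCTION}
  only:
    refs:
      - master
  script:
    - *init_helm
    - helm upgrade --install --wait --set web.name=web-master --set web.image=${IMAGE} --set web.host=${URL_PRODUCTION} my-super-app-${CI_COMMIT_REF_SLUG} ./k8s-chart
```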
So starting at the very top, I like to simply use `docker:stable` as the base image for my CI/CD jobs and then include the Docker-in-Docker service (`docker:dind`), because that allows me to basically do anything at every stage, as I can re-use Docker again and e.g. run `$ docker run` in one of my steps. But again, you’re free to choose your own setup.
Then we define two stages, `build` and `deploy`, which should be pretty self-explanatory.
In `variables` we can define our environment variables, which are then passed on to the jobs. You do not necessarily have to list them here; you could also just add them to your `CI / CD Settings` like we did for `GKE_SERVICE_ACCOUNT`. In fact, the ones listed here just operate as fallbacks: if you did not define them in the project settings, the job will take what you defined in `variables`. I kind of like adding everything that actually is configurable here because it serves as a sort of documentation for me.
- `IMAGE` is likely the most important variable. It’s a combination of two variables that GitLab provides. `${CI_REGISTRY_IMAGE}` contains the whole path to your GitLab private registry including the project path, e.g. `registry.gitlab.com/vendor/my-project`. This plus `/web` plus another very useful GitLab environment variable named `${CI_COMMIT_REF_SLUG}` gives us our final image name. `${CI_COMMIT_REF_SLUG}` is super useful because it contains the git branch name (or tag name) and sanitizes it so that it can be safely used in URLs, names etc. (check out more here). So if we push a branch named `feature/feature1` we’ll get this: `registry.gitlab.com/vendor/my-project/web:feature-feature1`, perfect!
- `GKE_SERVICE_ACCOUNT` should be clear to you by now :-)
- `GKE_CLUSTER_NAME`, `GKE_ZONE` and `GKE_PROJECT` can be used to adjust the GKE cluster name, zone and project name via the CI / CD settings if you like. But again, you can add defaults here and they’ll be used; they absolutely don’t need to be here. We could also hard-code the values in the `init_helm` script (we’ll get to that), but I just like to have things configurable :-)
- `URL_REVIEW` contains the URL for non-production branches (so e.g. `feature-feature1.your.domain.com`).
- `URL_PRODUCTION` contains the production URL.
Okay, let’s move on to the CI steps then!
Our `build` job is not too complicated either. It logs in to our GitLab Docker Registry using the GitLab-provided environment variable `$CI_BUILD_TOKEN`, then builds our image from our `./Dockerfile` and pushes it. The important thing here is that we use our environment variable `IMAGE`, which contains our image name and the branch! Given we pushed to our branch named `feature/feature1`, it will thus build a `web:feature-feature1` image and push it to our registry. The `docker pull` and `--cache-from` combination is just a CI/CD optimization: it pulls a probably existing image first and re-uses what Docker layers it can from there. It’s generally a good idea because it’ll speed up your builds dramatically!
So we have the correct image built, tagged with our git branch and pushed to our Docker registry. The only thing left is to deploy it to our GKE cluster. For that, we use two different jobs, `deployment_review` and `deployment_production`. The main difference is that the GitLab `environment` is hard-coded to `production` for our `master` branch, and of course the host is the production URL. GitLab will automatically use `deployment_production` every time we push to our `master` branch because we specified `only.refs: [master]`, and for every other branch it will take `deployment_review`, as we specified `except.refs: [master]`.
The possibilities here are endless. You could for example restrict a job to only tags, only branches, or even a branch regex in case you only want to deploy branches starting with `feature/`. Time to get creative!
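As an illustration, restricting a deploy job to branches starting with `feature/` could look like this (the job name and script are hypothetical):

```yaml
# Only run this job for refs matching the feature/ prefix.
deploy_feature:
  stage: deploy
  only:
    refs:
      - /^feature\/.*$/
  script:
    - echo "deploying a feature branch"
```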
Okay, the last bit here are the two scripts, `init_helm` and then `helm upgrade [...]`.
Our first “problem” is that to be able to deploy the application we need the command line tool `helm`, and because helm requires `kubectl`, we also need that one. And then we also need access to our GKE cluster, so we need `gcloud` as well to be able to authenticate against our cluster.
If you’re as paranoid as I am, you would go on and build your own Docker image that contains these three tools. If you’re not, there are a ton of images out there that contain exactly those :-) For demonstration purposes I’ve used the `devth/helm` image.
Okay, the tools are ready, now on to the small `init_helm` script I wrote! It takes our beautifully base64-encoded `GKE_SERVICE_ACCOUNT` environment variable, decodes it and writes it to `/etc/deploy/sa.json`, which we use immediately afterwards to activate the service account using `gcloud` and get the credentials for the correct cluster, zone and project (which can be configured using environment variables, remember?). This configures our `kubectl` in such a way that it uses our service account for any commands that follow.
To be sure that helm is correctly initialized and runs the latest version, we also run `helm init --service-account tiller --wait --upgrade` here. By the way, if you wonder why I’m using `--wait` for all helm commands: I just want the command to really wait until the tiller pod reports whether an action was successful or not. Otherwise our build would not fail if there was an issue with this command, which is not desirable during a CI job, is it?
And that’s really most of the magic, because what follows is just a regular `helm upgrade --install` command. It again uses our environment variables to create a deployment named after our git branch and sets the correct values for our helm chart. So given our `master` and `feature/feature1` branches, what you’ll end up with is two helm releases:
- `my-super-app-master`
- `my-super-app-feature-feature1`
Both run their own deployment, service and ingress and correctly route their public domains to your pods/containers, with Let’s Encrypt certificates for free.
DONE!
Pretty awesome, if you ask me!
I hope you enjoyed this blog post, let me know if it helped you or if you’re doing things differently so I can continue to learn new things myself :-)