GitOps for Kubernetes with Jenkins
While GitOps outlines several principles, constraints of time or resources may prevent you from following all of them. As with everything, you can choose to fulfill the essence, or core principles, first and complete a full implementation later down the road. In this blog we’ll first take a quick look at the what and why of GitOps, how to fit it into your DevOps processes, and finally an approach to implementing GitOps in a light manner. There are tools such as ArgoCD and Weave Flux that are purpose-built for GitOps. There is also Jenkins X which, while technically purpose-built for CI/CD on Kubernetes, can also be used to implement GitOps processes. In today’s blog, however, we are looking at a GitOps implementation using good old Jenkins. Yes, that may seem a few steps backward, especially after mentioning the tools I just did. However, we understand it is sometimes difficult to change your CI/CD tool, and with Jenkins being one of the most popular ones, it’s a good idea to look at how we can implement GitOps using Jenkins or a comparable CI/CD tool. We will mainly look at a particular set of tools that we use at Stakater, but the concepts should translate easily to other tools of the same nature.
GitOps is a methodology of Continuous Delivery, at the heart of which is the principle that Git be used as the source of truth for declarative infrastructure and applications.
Declarative description of system
Any and all changes made to the application, environment, deployment and infrastructure should be handled via git. This means that we should declare all of these as code and maintain them in a git repository. Code for the whole system will include the following:
Infrastructure can be declared in the form of Terraform modules, CloudFormation templates, or similar. At Stakater, we also use Kops for AWS, which, in addition to setting up the Kubernetes cluster, first creates the underlying AWS infrastructure. This is declared using a yaml file.
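As a rough sketch of such a declaration (the cluster name, region, and CIDR ranges here are placeholders, not values from an actual Stakater setup), a Kops cluster spec looks something like:

```yaml
# Hypothetical Kops cluster declaration; name, zone, and CIDRs are placeholders
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: demo.k8s.example.com
spec:
  cloudProvider: aws
  networkCIDR: 10.0.0.0/16
  subnets:
    - name: eu-west-1a
      type: Private
      zone: eu-west-1a
      cidr: 10.0.32.0/19
```

Kops reads this spec and provisions both the AWS resources (VPC, subnets, instance groups) and the cluster itself, so the whole layer lives in git as code.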
Kubernetes configuration includes details of deployments, replicas, services, container images to deploy, etc. At its simplest it can be declared via vanilla yaml manifests. A better approach, which we use at Stakater, is the Helm package manager for Kubernetes. Helm encapsulates the configuration in the form of reusable Charts, which are groups of yaml template and value files.
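For example, a minimal vanilla Deployment manifest (the app name and image here are placeholders) declares the container image and replica count that the cluster should converge to:

```yaml
# Hypothetical Deployment manifest; names and image are illustrative
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: myorg/my-app:1.0.0
          ports:
            - containerPort: 8080
```

A Helm chart templatizes manifests like this one, so the replica count or image tag becomes a value that each environment can override.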
Environment configuration is also handled within the Kubernetes configuration, via the Kubernetes API object ConfigMap. The ConfigMap is likewise declared in yaml, and can be expressed inside the Helm chart of a particular deployment, or in an independent chart of its own.
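A ConfigMap declaration is plain yaml as well; the keys below are hypothetical, just to show the shape:

```yaml
# Hypothetical ConfigMap holding environment configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config
data:
  LOG_LEVEL: info
  DB_HOST: postgres.default.svc
```

Pods then consume these entries as environment variables or mounted files, so changing the environment is just another git commit.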
Application code is of course traditionally maintained in git already, and needs to be packaged within a docker image, which is declaratively expressed in the form of a Dockerfile. The Dockerfile should be included within the application code repository.
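As an illustration (assuming, hypothetically, a Go application), a multi-stage Dockerfile keeps both the build steps and the runtime packaging declarative:

```dockerfile
# Hypothetical multi-stage Dockerfile for a Go application
FROM golang:1.12 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

# Minimal runtime image containing only the built binary
FROM alpine:3.9
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```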
Changes through Pull Request
Since our system is declared as code and maintained in git, the consequent principle to follow is that all changes must be made through a pull request. Using trunk-based development, a master git branch is used to reflect the current state of the system. Any updates that may be needed can simply be made by opening a new pull request (PR) on the master branch. Once the PR is merged, the CD/GitOps pipeline is triggered, and deploys the required changes to the Kubernetes cluster. If any change needs to be rolled back, to a recent or even an older state, that too is performed by a pull request. We will explore this in more detail in a subsequent section.
Since the git repository is the source of truth for the system, at any time we can see the intended state of the system by scanning the git repository. If any application pod crashes, or the system is unintentionally modified manually, the GitOps pipeline will rectify it to keep the system state in sync with the declaration in the repository. This principle is built into the purpose-built GitOps solutions; in our case, however, we will see that it is not implemented with Jenkins due to its complexity.
GitOps with Jenkins
You may see discussions arguing that implementing deployment with a CI server is an anti-pattern. However, with some tweaks we can work around some of the reasons behind that argument, even if the result falls short of a full GitOps implementation.
Firstly, it is argued that the CI pipeline should not directly deploy, and indeed we do not take this route. We maintain separate repositories for the application code and the cluster configuration, and in doing so we also have separate pipelines for each, the latter being responsible for the deployment.
And secondly, a clear security boundary between the code-build pipeline and the config-deploy pipeline can be achieved by using two instances of Jenkins.
The first thing to do is to keep your code and configuration repositories separate. To illustrate the points being discussed, we will look at one of Stakater’s open-source projects, Xposer, as an example. You can follow the GitHub link to get a better view of the individual files.
The code repo will have the application code, including a Dockerfile and Helm chart. The reason is that the Dockerfile and Helm chart at this point serve only as packaging tools: the responsibility and knowledge of how the application code should be packaged belong within the application code repository. In the following screenshot we can see the code repository for Xposer. As visible, apart from the application code, we also have the Dockerfile, Kubernetes manifests (for completeness’ sake), Helm chart, and Jenkinsfile.
Next we need a configuration repository. In most cases this can serve a group of deployments that are related by namespace or other dependencies. For this we use the umbrella chart pattern of Helm. The repository holds a list of required charts in yaml, and for each chart, the custom values to be used for its deployment. The required charts are listed in a yaml file as below:
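Along these lines (the versions and repository URL below are illustrative, not the actual Stakater values), the umbrella chart’s requirements.yaml might look like:

```yaml
# Illustrative requirements.yaml for an umbrella chart
dependencies:
  - name: xposer
    version: 1.0.3
    repository: https://chartmuseum.example.com
  - name: another-service
    version: 0.2.1
    repository: https://chartmuseum.example.com
```

Each entry pins a chart version, so the file itself records exactly which artifact versions are deployed.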
The custom values of the charts are defined in individual value yaml files as below:
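In an umbrella chart, values for a subchart are nested under that chart’s name; a hypothetical value file for Xposer (keys here are placeholders mirroring whatever the subchart exposes) could look like:

```yaml
# Hypothetical xposer.yaml value file; keys mirror the subchart's own values
xposer:
  image:
    repository: stakater/xposer
    tag: 1.0.3
  replicaCount: 1
```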
The following example is from a different project, and shows how other deployment configuration, where parameterized, can be customized in an umbrella chart value file. This way we customize the number of replicas, the health checks, the tolerations, etc. for a particular deployment in our configuration repository.
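A sketch of such a value file (the service name, probe path, and toleration key are all hypothetical) shows how these knobs are overridden per deployment:

```yaml
# Illustrative umbrella-chart values overriding a parameterized deployment
my-service:
  replicaCount: 3
  livenessProbe:
    httpGet:
      path: /healthz      # hypothetical health-check endpoint
      port: 8080
    initialDelaySeconds: 10
  tolerations:
    - key: dedicated      # hypothetical taint on a dedicated node pool
      operator: Equal
      value: ci
      effect: NoSchedule
```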
When a change is merged into the code repository, the code pipeline runs in Jenkins, and upon success it pushes the packaged docker image and Helm chart to the artifact repositories, Nexus and ChartMuseum respectively in our case. This pipeline uses a pre-baked builder docker image on the Jenkins slave to build the artifacts. This builder image may include tools such as the JDK, npm, GoReleaser, etc., depending on the project. At this point, while our new project version is packaged into our artifact repository, nothing yet indicates that it should also be deployed. That part is handled by the configuration repo pipeline.
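A skeletal Jenkinsfile for this code pipeline could look as follows; the agent label, registry hostnames, build command, and chart path are all placeholders, not the actual Xposer pipeline:

```groovy
// Sketch of a code-repo pipeline; hostnames, labels, and paths are placeholders
pipeline {
    agent { label 'builder' }   // slave running the pre-baked builder image
    stages {
        stage('Build & Test') {
            steps { sh 'make build test' }
        }
        stage('Package & Push Image') {
            steps {
                sh 'docker build -t nexus.example.com/xposer:${GIT_COMMIT} .'
                sh 'docker push nexus.example.com/xposer:${GIT_COMMIT}'
            }
        }
        stage('Package & Push Chart') {
            steps {
                // Package the chart and upload it to ChartMuseum's upload API
                sh 'helm package deployments/kubernetes/chart/xposer'
                sh 'curl --data-binary "@xposer-${CHART_VERSION}.tgz" https://chartmuseum.example.com/api/charts'
            }
        }
    }
}
```

Note that the pipeline ends at the artifact repositories; no stage touches the cluster.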
As discussed, the configuration repository has a separate pipeline, ideally run in a separate Jenkins instance. To deploy our new project version, we create a pull request and merge changes into our configuration repository. In the case of our Xposer project, we can create a pull request to update the chart version in the requirements.yaml. If any configuration values also need to be updated for the new deployment, we can do that in the values file, xposer.yaml, and create a single pull request for these changes. Once the pull request is merged, the Jenkins pipeline runs and, on success, updates the deployment on the cluster. This pipeline uses a pre-baked deployer docker image on the Jenkins slave to deploy the charts. This deployer image may include tools such as the kubectl or helm clients.
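The configuration pipeline is correspondingly small; a sketch (release name, namespace, and agent label are hypothetical) would be:

```groovy
// Sketch of a configuration-repo pipeline; the deployer image provides
// the helm and kubectl clients. Names below are placeholders.
pipeline {
    agent { label 'deployer' }
    stages {
        stage('Fetch Dependencies') {
            // Pulls the chart versions pinned in requirements.yaml
            steps { sh 'helm dependency update .' }
        }
        stage('Deploy') {
            // Applies the umbrella chart with its value files to the cluster
            steps { sh 'helm upgrade --install my-release . --namespace my-namespace' }
        }
    }
}
```

Because `helm upgrade --install` is idempotent, rerunning the pipeline converges the cluster back to whatever the repository declares.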
The image below illustrates the flow discussed.
Looking at the diagram we get a bird’s-eye view of the implementation. As discussed, we want to maintain everything in git as per GitOps principles, and all updates are done with git operations, such as pull requests. Our code and configuration are defined in separate repositories since the scope of each is different. The code repository merely defines the software artifact; it is the configuration repository that declares where and how the deployment of this artifact should take place.
Additionally, the CI and CD pipelines of the two repositories are ideally handled by independent Jenkins servers, which allows each server to receive only the access it requires. The Jenkins instance for the code repository obviously needs access to the code repository, plus write access to the artifact repositories; it needs no access to the configuration repository and no deployment-related access to the cluster. The Jenkins instance for the configuration repository needs no access to the code repository, but has access to the configuration repository, read-only access to the artifact repositories, and deployment access on the cluster.
This way we satisfy the core principles of GitOps. One thing that remains missing is for the deployment configuration to be self-healing. If someone manually updates the deployment on the cluster, using the kubectl client for instance, our Jenkins instance will not be aware of it, since it is not watching the cluster state. The deployment will only be fixed in such a case when the Jenkins pipeline for the configuration repo is rerun manually, or runs automatically, triggered by a new update merged to the configuration repository.