A Journey to GitOps

By Vadim Gedz
Lead DevOps Engineer, Dyninno Technologies
Published in Dyninno · 5 min read · Jan 19, 2023

I work for a company that carries out full-stack software development, providing hi-tech infrastructure, testing tools, and helpdesk support to several businesses.

Dyninno Group makes strategic business decisions based on digital data analysis; we maintain and support a real-time flight search engine; we manage a casting social network and provide new payment solutions. Our group of companies is large — we work in 3 business sectors and have more than 4,000 employees worldwide, so we need to build infrastructure and make changes quickly and correctly. Along the way, in Dyninno Technologies — the IT hub of Dyninno — we’ve faced some challenges with the deployment of our system.

The problem

When I joined Dyninno, we had a rather unusual approach to deploying applications to the Kubernetes cluster: it required rendering Helm charts, adjusting them for our needs with Kustomize, and applying them with a huge Makefile. Although this method worked, it was hard to call it optimal.

Why? Because such an approach requires local actions, and people tend to forget to commit local changes back to the git repository. This can, and most likely will, lead to a situation where new changes revert previous ones that were never committed. Having come from a company that used Weave Flux in all of its Kubernetes clusters, it was a no-brainer to me that GitOps would improve the workflow here. While we had such a setup configured in a few clusters, it was far from standard, and the percentage of configuration managed by it was so low that it would be easier to say we didn't have it at all.

Using GitOps, we could define the desired infrastructure state declaratively in a git repository. A controller in the Kubernetes cluster treats that repository as the source of truth and applies changes to the cluster itself. But before we started integrating it, we had to answer the following questions:
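The desired state itself is nothing more than plain manifests committed to git, which the controller continuously reconciles against the cluster. As a minimal illustration (the path, application name, and image are all hypothetical):

```yaml
# clusters/production/apps/billing/deployment.yaml (hypothetical path)
# Stored in git; the GitOps controller applies it to the cluster.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: billing
spec:
  replicas: 3
  selector:
    matchLabels:
      app: billing
  template:
    metadata:
      labels:
        app: billing
    spec:
      containers:
        - name: billing
          image: registry.example.com/billing:1.4.2
```

Changing the image tag here and committing is the entire deployment action; no one runs `kubectl apply` by hand.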

  • Which GitOps solution would be the best for our needs?
  • How can developers see what happens with their code when a CI/CD pipeline is finished?

Why ArgoCD?

One of the crucial questions was how easy it would be to understand what was happening during the sync process. While the logs provided by alternative solutions are usually more than enough to get the whole picture, it can be hard to find the message related to a specific issue, especially in a Helm chart with dozens, if not hundreds, of resources. ArgoCD offers an excellent web UI with per-resource status that reduces the time needed to pinpoint any potential problem. This alone was a huge plus compared to Weave Flux.

ArgoCD Interface

We are moving towards standardization of all Kubernetes environments. Still, since we have many teams with different visions of which technologies should be used, owing to project specifics, we had to support multiple use cases. One of the critical aspects was secrets management. As we had decided that GitOps was what we wanted to achieve, manual secrets creation was a no-go. We already had a SealedSecrets controller introduced into multiple internal projects, but it required direct access to the cluster and at least a basic understanding of how to work with it. Do all developers need to know how to do that? No.

Does ArgoCD provide a solution for secrets management out of the box? No.

But ArgoCD allows you to extend its functionality with external plugins. One of them, argocd-vault-plugin (https://github.com/argoproj-labs/argocd-vault-plugin), makes it possible to inject secrets from HashiCorp Vault. This approach lets developers and non-IT people manage credentials through an easy-to-use web UI.
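To give a sense of how this works, argocd-vault-plugin supports inline placeholders that are substituted with Vault values when ArgoCD renders the manifests. A sketch, with a hypothetical Vault path and secret name:

```yaml
# Secret template as stored in git; argocd-vault-plugin replaces the
# <path:...> placeholder with the value from Vault at render time.
apiVersion: v1
kind: Secret
metadata:
  name: billing-credentials
type: Opaque
stringData:
  db-password: <path:secret/data/billing#db-password>
```

The actual secret value never touches the git repository; only the reference does.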

A question about visibility

Although we prefer to follow the least privileged principle, sometimes it is necessary to grant developers some visibility into what is happening with their code after it reaches the actual environment. With the UI mentioned above and proper RBAC configuration, it is possible to give very restricted per-application access that is more than enough to provide a general understanding of what is happening.
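As an illustration of such a restricted setup, ArgoCD's RBAC is configured via `policy.csv` in the `argocd-rbac-cm` ConfigMap. A sketch granting a team read-only access to a single application (project, application, and group names are hypothetical):

```yaml
# argocd-rbac-cm excerpt: read-only role scoped to one application
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd
data:
  policy.csv: |
    p, role:billing-readonly, applications, get, billing-project/billing, allow
    g, billing-developers, role:billing-readonly
```

Members of `billing-developers` can open the application in the UI and watch its sync and health status, but cannot change anything.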

CI/CD and Argo Watcher

How does this fit into CI/CD? How do we deliver images to the actual environment?

We’ve decided to go the following way:

  • The image is built and tagged using the pre-defined naming convention.
  • The ArgoCD Image Updater detects that an image matching the configured update strategy has become available and commits the change to the related git repository.
  • ArgoCD syncs the changes to the cluster.
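The Image Updater step above is driven by annotations on the ArgoCD Application. A sketch, with a hypothetical image name and alias:

```yaml
# Application annotations enabling Image Updater with git write-back
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: billing
  annotations:
    argocd-image-updater.argoproj.io/image-list: billing=registry.example.com/billing
    argocd-image-updater.argoproj.io/billing.update-strategy: semver
    argocd-image-updater.argoproj.io/write-back-method: git
```

With `write-back-method: git`, the Image Updater commits the new tag back to the repository instead of patching the live Application, which keeps git as the single source of truth.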

But does it make the process transparent? Not really, as there is no feedback after the built image is published to the Docker registry. We looked into a number of options and concluded that none of them could be of assistance to us. After some internal discussion, we decided to write our own tool in our spare time, which led to Argo Watcher (https://github.com/shini4i/argo-watcher) being born.

What Does Argo Watcher Look Like?

What does it do? It stands between pipelines and ArgoCD.

  • The pipeline sends a task to the Argo-Watcher and then asks it for an update.
  • The Watcher calls ArgoCD API and checks if the required application runs the expected image tag.
  • If the expected image is detected and the app is healthy and synced, we consider the deployment successful.
  • If, after the pre-defined timeout, the app is still running the old image or is not synced and healthy, we consider the deployment to have failed.

Additionally, Argo Watcher acts as a centralized dashboard that tracks deployments across all projects. As not everyone has access to all the related git repositories, it helps a lot.

The Simplified Diagram of the Deployment Process
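The success criterion described above can be sketched as a small function over the application state that ArgoCD's API returns (the field names follow the ArgoCD Application status schema; this is an illustration of the logic, not Argo Watcher's actual code):

```python
def deployment_succeeded(app: dict, expected_image: str) -> bool:
    """Return True if the app is Synced, Healthy, and runs the expected image.

    `app` is the JSON body returned by ArgoCD's GET /api/v1/applications/{name}.
    """
    status = app.get("status", {})
    synced = status.get("sync", {}).get("status") == "Synced"
    healthy = status.get("health", {}).get("status") == "Healthy"
    images = status.get("summary", {}).get("images", [])
    return synced and healthy and expected_image in images


# Hypothetical application state, trimmed to the relevant fields
app_state = {
    "status": {
        "sync": {"status": "Synced"},
        "health": {"status": "Healthy"},
        "summary": {"images": ["registry.example.com/billing:1.4.2"]},
    }
}
print(deployment_succeeded(app_state, "registry.example.com/billing:1.4.2"))  # True
```

The Watcher simply re-evaluates this check in a loop until it passes or the timeout expires.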

How to manage the tool that should manage everything else

But how do you manage a tool that should manage everything? We’ve tried multiple approaches here, but in the end, we agreed on the following procedure:

  • We are using Terraform to bootstrap everything infrastructure related.
  • We create initial namespace/secrets for ArgoCD using Terraform.
  • We deploy ArgoCD itself using the Terraform Helm provider.

And starting from this point, everything else is done via commits to the git repository.
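The bootstrap steps above can be sketched in Terraform roughly as follows (resource names and chart values are illustrative, not our actual configuration):

```hcl
# Create the namespace, then install ArgoCD from the official Helm chart
resource "kubernetes_namespace" "argocd" {
  metadata {
    name = "argocd"
  }
}

resource "helm_release" "argocd" {
  name       = "argocd"
  namespace  = kubernetes_namespace.argocd.metadata[0].name
  repository = "https://argoproj.github.io/argo-helm"
  chart      = "argo-cd"
}
```

Once this release is up, ArgoCD takes over: every subsequent change, including changes to ArgoCD's own configuration, arrives as a git commit.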

Results

All these steps resulted in the following:

First, it has become much simpler to track changes. Now that everything is centralized, there is no need to search through dozens of repositories, which saves time.

Second, and most important, it has almost eliminated the chance of configuration drift (when the configuration in git differs from what is actually running in the cluster). ArgoCD automatically reverts any manual change made without a git commit. This establishes a proper workflow that helps to get rid of unwanted surprises.

Is there room for improvement? Of course! As a next step, we can start managing Terraform via Atlantis, introducing GitOps into the provisioning. The initial steps have already been taken; let’s continue this adventure in upcoming articles.

Vadim Gedz, Lead DevOps Engineer, Dyninno Technologies
https://github.com/shini4i
