The History of DevOps: From the Beginning of Time to ArgoCD and IaC

sermalenk · Published in Lonto · May 24, 2023

Hi, my name is Sergey, I am the head of the DevOps unit at Lonto.

We regularly partner with interesting people and organize events together, such as entrepreneur meetups, online house parties, and product management days. This series of three articles is a text version of the talks from the CTO Day event. The talk has two speakers:

  • Sergei Malenko, Head of DevOps unit at Lonto
  • Sergey Bondarev, Architect at Southbridge

The talk covered the history of app deployment, the main models used for it, and how they compare. We discuss the pull model in detail and show how to use “advanced” tools to manage the infrastructure of large projects and let developers request infrastructure components on their own for the needs of their apps.

All in all, we have prepared three articles based on the talk, in this order:

The history and how things evolved ←you are here 🚩
To understand the context, we will check out how the approach to iterative app deployment changed over time.

What is ArgoCD and why do we need it, with use cases
We’ll talk about a relatively new turn in app deployment methods and see what problems this tool solves.

How to manage infrastructure in GitOps using Crossplane
A new approach to IaC and how it can be combined with Argo CD.

This part was prepared by Sergey Bondarev. Here, we will look at how coders used to work and how their tooling evolved:

  • FTP protocol
  • Git
  • Build automation
  • Deployment automation
  • Kubernetes
  • Helm
  • Dynamic environments
  • Classic push model
  • Disadvantages of Helm
  • Pull models and GitOps
  • What’s next

FTP protocol

FTP is an old protocol for transferring files over the network. It appeared in 1971, long before HTTP and even TCP/IP. Coders would write code on their computer, FTP it to the server, and that was the website. Hardcore coders wrote the code directly on the server, usually in production.
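
In shell terms, the upload step looked roughly like this (a sketch with a hypothetical server and file names, using the classic command-line ftp client):

# upload the freshly edited files straight to the web root
ftp -n ftp.example.com <<'EOF'
user webmaster secret-password
cd /var/www/html
put index.html
put app.php
quit
EOF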

Git

When there was more code, Git was brought in: a place to store the code.

The programmers write code on their machine and push it to Git via GitHub/GitLab. Then they connect to the server, run “git pull” there to get all the code saved in the repository, restart the services, and the production environment gets updated.
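
A sketch of that workflow, assuming a hypothetical repository and a systemd-managed service:

# on the developer's machine
git add . && git commit -m "Add new feature"
git push origin master

# on the server
ssh deploy@prod.example.com
cd /var/www/app && git pull
sudo systemctl restart app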

Build automation

Later, putting code directly on the server came to be seen as a bad idea: for example, someone might forget to do a manual step.

So they started to build operating system packages, such as RPM or Deb, store them in separate repositories and artifact stores, and automate this process.

That’s how the first CI appeared: automatic jobs triggered by a push to the repository. The jobs built the packages and uploaded them to the repository. The coder would then go to the server and install the package manually.
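
What such a CI job did can be sketched roughly like this (hypothetical package and server names; rpmbuild and createrepo are the standard RPM tooling):

# CI job triggered by a push: build the RPM and publish it to the repository
rpmbuild -ba myapp.spec
scp ~/rpmbuild/RPMS/x86_64/myapp-1.0-1.x86_64.rpm repo.example.com:/srv/repo/x86_64/
ssh repo.example.com createrepo /srv/repo/x86_64

# later, a human on the target server
sudo yum install myapp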

Deployment automation

Then the person running “yum install” was replaced with a robot. When the package is built, the robot connects to the server and runs a script, such as an Ansible playbook. The playbook performs the “yum install”, and the production environment gets updated and restarted.
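
With Ansible, for example, the robot’s job might boil down to ad-hoc commands like these (app_servers and myapp are hypothetical names from an inventory):

# update the package and restart the service on every app server
ansible app_servers -m yum -a "name=myapp state=latest"
ansible app_servers -m service -a "name=myapp state=restarted"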

Later, Docker came along. It became possible to pack a project into an image, which is much easier to run on any OS. Unlike the spec for an RPM package, which takes a long time to learn, a Dockerfile is a simple set of commands. With Docker, the user essentially gets a snapshot of an operating system, and the app works just fine inside it. The package build is replaced with a Docker image build, the image is pushed to an image registry, and then the CD job runs something like “docker compose up”. All services run on the same server.
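
The whole chain can be sketched like this (hypothetical image name, registry, and server):

# CI: build and publish the image
docker build -t registry.example.com/myapp:1.2.3 .
docker push registry.example.com/myapp:1.2.3

# CD: deliver it to the single server
ssh deploy@prod.example.com "cd /opt/myapp && docker compose pull && docker compose up -d"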

Kubernetes

When a project grows, the issues of fault tolerance and lack of space on a single virtual machine arise. To address these issues, orchestrators were invented, the most popular of them being Kubernetes. Orchestrators also brought a new approach: microservice architecture, which means splitting huge services into smaller ones, each responsible for its own area only.
Now we can replace the docker-compose deployment by using kubectl in a CI job and keep the automation.
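
The CD step then looks roughly like this (a sketch, assuming the manifests live in a k8s/ directory of the repository):

kubectl apply -f k8s/
kubectl rollout status deployment/myapp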

You can check out our article about Kubernetes and other orchestrators.

Helm

Let’s imagine that you have many projects, and their YAML manifests for Kubernetes are similar to each other. They can be templated, and that’s where Helm comes in. It helps manage the apps installed in Kubernetes.

In the CI job, the “kubectl apply” deployment is replaced with “helm upgrade --install”, which installs the apps in Kubernetes.
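
A sketch of such a CD step (chart path, release name, and values file are hypothetical):

helm upgrade --install myapp ./chart \
  --namespace production \
  --values values-production.yaml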

At some point, the coders are no longer satisfied with having a production cluster only. They want a full-sized environment that is like production, but not quite, to test new versions and run integration tests. The DevOps engineers set up a second Kubernetes cluster, point a domain like test.example.com at it, and hand it over to the coders. It goes like this:

The coders try it out, but it turns out there are too many development teams: they need many environments so that everyone can work with their own test setup. The DevOps engineers create five environments manually. So they end up with one production cluster and one test cluster with five environments in different namespaces:

It turns out that five environments are not enough. The development team grows and works on many more features in parallel, so coders have to wait for each other to get access to a freed-up environment.

Dynamic environments

Environments become dynamic: they are created for each feature automatically. We add a “build environment” step to the CI/CD job. Now, when a coder starts a new feature and creates a new branch in the repository, a testing and development environment is created for it.
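
The “build environment” step can be sketched like this (a hypothetical job; the branch name comes from a CI variable such as CI_COMMIT_REF_SLUG in GitLab CI, and ingress.host is a value the chart is assumed to support):

# create a namespace per feature branch and deploy the app into it
kubectl create namespace "feature-${CI_COMMIT_REF_SLUG}" --dry-run=client -o yaml | kubectl apply -f -
helm upgrade --install myapp ./chart \
  --namespace "feature-${CI_COMMIT_REF_SLUG}" \
  --set ingress.host="${CI_COMMIT_REF_SLUG}.test.example.com"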

Classic push model

If we generalize the pattern of dynamic environments, we get the classic push model.

The “push the code to the repository” event starts the process. It builds the environment for testing/production, assembles the project and its artifacts, deploys the project to this environment, and tests it. As a result, the app is automatically deployed to different environments:

Advantages

Easy to implement. It uses a traditional approach with well-known utilities and is the easiest model to introduce within a team.

Versatile. You can build deployment pipelines for any workload and vary the CI/CD approach as needed. The push model is easy to adapt and is often used for CD. It is convenient because the CD is just a script in Bash or another automation language that runs the commands you would otherwise run manually: “kubectl apply”, “helm upgrade”, and so on…

Easy to expand. In real life, the pattern develops to this scale:

Disadvantages of Helm

Helm is intended to template YAML manifests and deploy them to Kubernetes clusters. It is good at rendering YAML manifests from templates, but there may be problems when delivering them to a Kubernetes cluster.

Cryptic errors. Anyone who has worked with Helm has probably seen errors like this:

Error converting YAML to JSON: yaml: line 12: did not find expected key

It is not clear which file and which template this line 12 is in. A rendered YAML template can sprawl over two hundred lines. The only option is trial and error: go to the templates, cut half of them out, check whether the error remains, and narrow it down that way.

You can’t check the logs of the failed pods. Those who have used a Helm chart with the “wait” option have probably run into a similar case. The coders wait for the pods in the chart to come up and for the chart installation to complete. They usually also set the “atomic” option, which rolls the release back if the installation fails. Say the coders set a “wait” timeout of 500 seconds together with “atomic”. The chart starts deploying, 500 seconds go by, suddenly the release fails to install, and everything rolls back:

kubectl logs admin-687fcb89ff-knk93
Error from server (NotFound): pods "admin-687fcb89ff-knk93" not found

It turns out that log collection is centralized, so you have to search for the logs in Kibana: by time, namespace, and pod name. Maybe you’ll find something…

You can’t check the pod’s logs with a simple kubectl logs command, because by this point Helm has already deleted all the pods that were stuck in CrashLoopBackOff and never became Ready. Helm does this in order to roll back to the previous release.
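
The setup described above corresponds roughly to an invocation like this (release, chart, and namespace names are illustrative; the flags are standard Helm 3 flags):

helm upgrade --install admin ./admin \
  --namespace dev23 \
  --wait --atomic --timeout 500s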

This problem can be solved with Helm hooks that are triggered when a release fails and pull the logs from the failed pods. But here comes another problem: timeouts.

Pending-upgrade status. We set a timeout of 500 seconds. If the hook does not finish within those 500 seconds, the chart gets the “pending-upgrade” or “pending-install” status:

helm ls -a

NAME NAMESPACE REVISION UPDATED STATUS CHART
admin dev23 3 2022-07-25 pending-upgrade admin-3.3.1

This is the worst thing that can happen to a Helm release. After that, Helm refuses to work with the chart and hangs. The CI runs “helm upgrade” a second time, and it fails with an error saying that someone is already working with the chart.

It turns out that there is a process that keeps track of what is going on in the cluster. It finishes without cleaning up after itself and remains in a suspended state. On the next run it essentially says: “Someone is working there, I don’t know who, but I won’t do anything just in case.”

To solve this problem, you need to delete the whole chart or do a Helm rollback: manually roll the release back to the previous one. But these are manual operations that defeat the whole point of automation.
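
In shell terms, that manual fix is one of these (release and namespace names match the example above):

helm rollback admin -n dev23     # roll the release back to the previous revision
helm uninstall admin -n dev23    # or delete the release entirely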

Pull models and GitOps

We have a CI system: developers push to it, and the pipelines then go into the infrastructure and change something, for example, the database or ingress-controller connection settings:

In the pull model, the situation is somewhat different. A repository appears in which developers or admins store the configuration, and an additional element is added: the agent. Its task is to synchronize the state described in the repository with the infrastructure.

GitOps is one of the implementations of the pull model. Git acts as the store for all the configuration: it is the source of truth and the only point through which changes to the infrastructure are made. The agent then applies the changes and maintains the desired state.

Features of the GitOps model:

  • The source of truth is Git
  • All actions are done through Git only
  • An agent that synchronizes the state is added

Let’s say we changed something in the infrastructure manually, for example, the number of replicas of an app. The agent will notice that this does not match the settings in the source of truth and will return everything to its original state.
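
For example, a manual change like this (myapp is a hypothetical deployment) is treated as drift:

kubectl scale deployment/myapp --replicas=5
# shortly afterwards the agent syncs the cluster back to the replica count declared in Git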

Advantages of the pull model:

  • All changes go through Git, so you can find out who made them and check out the commit text to learn what they were made for (if you’re tracking that, of course)
  • A guarantee that what is deployed in your infrastructure is exactly what is described in the repository. With the push model you cannot be sure of the state, because everything is applied from different CD jobs
  • Easy to make changes to multiple projects at once: everything you need is in the repository

Tools:

  • ArgoCD
  • Flux
  • Weaveworks

In the next article, we will take a closer look at ArgoCD, since it is the most popular tool that implements the GitOps approach.

What’s next

In this part, we described how approaches to deploying apps evolved and took a brief look at the GitOps model, an alternative to the classic push model.

In the second part, we will explain how Argo CD works and take a deeper look at the GitOps approach.

In the third part, we’ll show how to manage infrastructure in GitOps using Crossplane.
