Explaining Gitlab-Runner: research

Konstantin Makarov
Published in Scum-Gazeta
5 min read, Feb 18, 2021

Hi everyone!
While working on a task, I needed to reproduce the logic of the Gitlab-Runner. I was glad of the excuse, as I had long wanted to understand how it works.

When I first met the runner, I was delighted: it dutifully followed all my elaborate instructions. I could log in and make myself at home.
Magic? As it turned out, there is no magic.
The only thing that bothered me was seeing my jobs stuck in “pending” on shared runners :)

Gitlab-runner is the workhorse we use for CI/CD processes.
Building our Go project, running linters and tests, and the subsequent deployment are all classic jobs for a runner. You can read about them in the Gitlab documentation.

Here you can see their role in the overall CI/CD ecosystem.

Runners come in different types to solve different problems. Specifically, they differ in the executor, that is, the execution environment.

  • SSH
  • Shell
  • Parallels
  • VirtualBox
  • Docker
  • Kubernetes
  • Custom

These are the executor types; some of them can help us solve quite specific problems.
Today we will look at the simplest of them: the shell executor.

The Shell executor is a simple executor that you use to execute builds locally on the machine where GitLab Runner is installed.

The runner is simply an intermediary between Gitlab and an execution environment (the executor). It receives a job from Gitlab and hands it to the executor (here, the shell) to run.

Gitlab needs to know which runner it can delegate a job to. Jobs can be marked with tags, and runners are given tags during registration. Matching the two helps Gitlab choose the right runner for a specific job.
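To make the tag idea concrete, here is a minimal sketch of the matching rule: a runner qualifies for a job only if it carries every tag the job asks for. The Runner type, its fields and the CanRun method are hypothetical names made up for this illustration; the real selection happens on the Gitlab server and takes more rules into account.

```go
package main

import "fmt"

// Runner is a hypothetical, simplified view of a registered runner.
type Runner struct {
	Name        string
	Tags        []string
	RunUntagged bool // assumption: whether the runner may take jobs without tags
}

// CanRun reports whether the runner carries every tag the job asks for.
func (r Runner) CanRun(jobTags []string) bool {
	if len(jobTags) == 0 {
		return r.RunUntagged
	}
	have := make(map[string]struct{}, len(r.Tags))
	for _, t := range r.Tags {
		have[t] = struct{}{}
	}
	for _, t := range jobTags {
		if _, ok := have[t]; !ok {
			return false
		}
	}
	return true
}

func main() {
	r := Runner{Name: "shell-runner", Tags: []string{"shell", "go"}}
	fmt.Println(r.CanRun([]string{"go"}))           // the runner has this tag
	fmt.Println(r.CanRun([]string{"go", "docker"})) // the "docker" tag is missing
	fmt.Println(r.CanRun(nil))                      // untagged job, RunUntagged is false
}
```

So a job tagged "go" can land on this runner, while a job that also needs "docker" has to wait for a different one.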

Runners can also have different scopes: shared, for a group, or for a specific project.

The first thing we need to do to work with the runner is install and register it.
Since we are Gophers and the runner is, happily, written in Go, we will not install it the usual way but will run it in debug mode instead.
Here’s our test subject: https://gitlab.com/gitlab-org/gitlab-runner

Let’s go!

Registration (command: register)

We launch the project in registration mode. The application asks us to answer a few questions, and the answers form the configuration file that will be used at startup. Registration also notifies Gitlab about the new runner.
The result of the registration stage is a TOML configuration file (~/.gitlab-runner/config.toml) similar to mine:

The main settings are your Gitlab URL and the registration token, which you can find on the CI/CD settings page of your repository.
This concludes the registration of a new runner.

We go to our repository and see that the runner has appeared in the repository settings. It has not been online yet, though.

Run (command: run)
Let’s run the runner and see what happens:

Good enough — we are now online!
From now on, we are ready to accept jobs from Gitlab.
We started in concurrent mode, which means we can run several jobs at the same time. You can also run in run-single mode.

Let’s take a look at the runner’s algorithm:

  • we launch the application as a service and set up the infrastructure (logger, monitoring, profiling, and so on)
  • we warn the user if we are not running as root, since jobs will be executed as the current user (I was running as root)
  • using a channel, we feed the configured runners (feedRunner) to a pool of workers whose size is limited by the number from the config (Config.Concurrent)
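The last point is a classic Go pattern: a channel feeds work to a fixed number of goroutines. Below is a minimal sketch of that idea, not the runner's actual code. The function processAll and the runner names are made up for the example, and the concurrent parameter only plays the role of Config.Concurrent.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll feeds runner names through a channel to a pool of workers
// and returns how many of them were picked up.
func processAll(runners []string, concurrent int) int {
	if concurrent < 1 {
		concurrent = 1 // assumption: always keep at least one worker
	}
	feed := make(chan string)
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		handled int
	)

	// Start a fixed number of workers; each one reads from the same channel.
	for id := 1; id <= concurrent; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for name := range feed {
				// A real worker would request and execute a job here.
				fmt.Printf("worker %d picked up runner %q\n", id, name)
				mu.Lock()
				handled++
				mu.Unlock()
			}
		}(id)
	}

	// Feed the runners and then signal that no more work is coming.
	for _, name := range runners {
		feed <- name
	}
	close(feed)
	wg.Wait()
	return handled
}

func main() {
	n := processAll([]string{"shell-1", "shell-2", "shell-3"}, 2)
	fmt.Println("handled:", n)
}
```

The channel is unbuffered on purpose in this sketch: a runner is handed over only when some worker is free, so the pool size is the real limit on parallel work.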

Status: pending

Then each runner prepares for a job: it gets an executor from the global list of executor providers and checks what that executor is capable of in the current infrastructure.
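The "global list of providers" can be pictured as a simple registry keyed by executor name. The sketch below is only an illustration of that pattern: Executor, Provider, Features, Register and Lookup are hypothetical names chosen for this example and are not the runner's real types.

```go
package main

import "fmt"

// Executor is a hypothetical minimal interface for something that runs scripts.
type Executor interface {
	Run(script string) error
}

// Features describes what an executor can do (illustrative fields only).
type Features struct {
	Shell bool
	Image bool
}

// Provider creates executors and reports their capabilities.
type Provider interface {
	Create() Executor
	Features() Features
}

// providers is the global registry, keyed by executor name.
var providers = map[string]Provider{}

// Register adds a provider under the given name.
func Register(name string, p Provider) { providers[name] = p }

// Lookup returns the provider for a name or an error if none is registered.
func Lookup(name string) (Provider, error) {
	p, ok := providers[name]
	if !ok {
		return nil, fmt.Errorf("executor %q is not registered", name)
	}
	return p, nil
}

// shellExecutor only prints the script; a real one would start a shell.
type shellExecutor struct{}

func (shellExecutor) Run(script string) error {
	fmt.Println("would run:", script)
	return nil
}

type shellProvider struct{}

func (shellProvider) Create() Executor   { return shellExecutor{} }
func (shellProvider) Features() Features { return Features{Shell: true} }

func main() {
	Register("shell", shellProvider{})

	p, err := Lookup("shell")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("features: %+v\n", p.Features())
	_ = p.Create().Run("go test ./...")
}
```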

Now we are ready to perform one of the main functions of the runner — we receive a task from Gitlab using its API.

The task is issued for us according to a certain algorithm, which is defined by the Gitlab server.

If there are no jobs, we return to the loop (processRunners) and everything repeats: we wait for a job again.
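The "ask, get nothing, ask again" cycle can be sketched as a small polling loop. Everything here is an assumption made for illustration: JobSource stands in for the Gitlab API client, fakeSource pretends that a job appears on the third request, and waitForJob gives up after a fixed number of attempts so that the example terminates.

```go
package main

import (
	"fmt"
	"time"
)

// Job is a hypothetical, stripped-down job description.
type Job struct {
	ID     int
	Script []string
}

// JobSource is a stand-in for the client that asks Gitlab for work.
type JobSource interface {
	RequestJob() (*Job, bool)
}

// fakeSource returns a job only on the third request.
type fakeSource struct{ calls int }

func (f *fakeSource) RequestJob() (*Job, bool) {
	f.calls++
	if f.calls < 3 {
		return nil, false // nothing to do yet
	}
	return &Job{ID: 1, Script: []string{"go build ./..."}}, true
}

// waitForJob polls the source until a job arrives or maxAttempts is reached.
// It returns the job (or nil) and the number of requests made.
func waitForJob(src JobSource, interval time.Duration, maxAttempts int) (*Job, int) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if job, ok := src.RequestJob(); ok {
			return job, attempt
		}
		time.Sleep(interval) // wait before asking again
	}
	return nil, maxAttempts
}

func main() {
	job, attempts := waitForJob(&fakeSource{}, 10*time.Millisecond, 5)
	if job == nil {
		fmt.Println("no job received")
		return
	}
	fmt.Printf("got job %d after %d requests\n", job.ID, attempts)
}
```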

Here is my .gitlab-ci.yml file:

The job arrives as part of a certain stage (build), whose scripts are executed sequentially. This is what our runner will do.


We create a Build and return the runner to the channel, so it is ready to receive builds again.

Status: running

Run the task and wait for one of the signals to arrive:

  • SystemInterrupt
  • buildFinish
  • buildPanic
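Waiting for "one of several signals" maps naturally onto a Go select statement. Here is a minimal sketch under my own assumptions: the three channels mirror the signal names above, while the function waitForBuild and the way each signal maps to a status string are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// waitForBuild blocks until one of the three signals arrives
// and translates it into a status string.
func waitForBuild(interrupt <-chan struct{}, finished <-chan error, panicked <-chan error) string {
	select {
	case <-interrupt:
		// The process was asked to stop, for example by Ctrl+C.
		return "terminated"
	case err := <-finished:
		// The build ended on its own; an error means a script failed.
		if err != nil {
			return "failed"
		}
		return "success"
	case <-panicked:
		// The build goroutine recovered from a panic and reported it.
		return "failed"
	}
}

func main() {
	interrupt := make(chan struct{})
	panicked := make(chan error, 1)

	ok := make(chan error, 1)
	ok <- nil
	fmt.Println(waitForBuild(interrupt, ok, panicked))

	bad := make(chan error, 1)
	bad <- errors.New("exit status 1")
	fmt.Println(waitForBuild(interrupt, bad, panicked))
}
```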

We will not review the execution of the job in detail, so as not to complicate the article.
I would rather abstract away from it, since any type of executor could be used, but in simple terms the runner will:

  • get repository data according to Git Strategy.
  • use the cache
  • load artifacts from previous stages
  • run the before_script section
  • run our task (scripts)
  • run after_script section
  • depending on which completion signal was received, set the status (failed, success, terminated)
  • mark the build execution as complete

Epilogue

Studying the runner project, I concluded that it was written by someone with extensive experience in another language (maybe I’m wrong). It seemed to me that the code was overloaded with unnecessary abstractions. It should be possible to get along without them, so let’s try it ourselves.
It also feels dated. That happens to me too: when my experience reaches the next level, looking at my past projects can be rather unpleasant.

In such cases, I have simply rewritten my old projects.
But remember: from the point of view of the business (and of a wife), this is usually not the best idea! :)

In the next article we will try to write a Gitlab-runner with minimal functionality, using proven design approaches such as SOLID and Clean Architecture.
