A practical guide to GitLab Runner Custom Executor drivers

Hints and tips to create your own driver from scratch

Ricardo Mendes
CI&T
10 min read · Jul 27, 2020



GitLab is rich and scalable when it comes to software architecture, and Runner is one of the key components of this ecosystem. It is responsible for running CI/CD jobs and is deployed decoupled from the GitLab instance.

Runners come in multiple flavors: Docker, Kubernetes, Shell, SSH, VirtualBox, and others. Technically speaking, each “flavor” is implemented as an Executor. But what if you want to run CI/CD jobs in an infrastructure that is not supported by the native executors, or to scale your Runner fleet with a custom strategy that better fits your needs? There is a special type of executor that helps to tackle this: the Custom Executor.

This gives you the control to create your own executor by configuring GitLab Runner to use some executable to provision, run, and clean up your environment.

The Custom Executor works as a bridge between GitLab Runner and a set of binaries or scripts (aka executables) you must develop to set up and use your CI/CD environment. This set of executables is called a Driver.

GitLab Runner and the Custom Executor are fairly well documented, as you can see in the links provided so far, which is pretty useful for those who want to develop their own drivers. But I believe there is room for complementary documentation that takes a practical approach and presents details to be taken into account when developing drivers for the Custom Executor, and this is the goal of this blog post.

Disclaimer: all opinions expressed are my own, and represent no one but myself. They come from the experience of participating in the development of a driver for GitLab Runner’s Custom Executor, which implements support for AWS Fargate.

I’ve created a didactic driver that will help me illustrate the reasoning behind the text: the std err-out driver for GitLab Runner Custom Executor. It is intended to provide us with friendly output, clarifying as many details as possible about CI job execution, as shown in the screenshot below:

Didactic output from the std err-out driver for GitLab Runner Custom Executor

This example driver actually supports executing CI commands, although (believe me!) this is not its main goal. It was developed in Go and the source code is simple enough to be understood even by those who are not familiar with the language. There is also a companion repository intended to demonstrate the driver in action: gitlab.com/ricardomendes/stderrout-demo.

That said, let’s dig into some of the Custom Executor drivers’ characteristics!

CI/CD job stages as a command-line interface

To get started, bear in mind GitLab CI/CD jobs are executed in stages:

  • Config: allows us to change specific job settings at runtime;
  • Prepare: to set up the environment — create a virtual machine or container, for example. After this is done, Runner expects the environment to be ready to run the jobs;
  • Run: executed multiple times in order to process the “operational” steps of a given job — e.g. get the sources, manage cache, build and test the code (using instructions from .gitlab-ci.yml), and upload artifacts;
  • Cleanup: to clean up any of the environments that might have been set up.

A job-executing workflow through the Custom Executor is split into three major components, as shown in the below diagram:

Components of the GitLab CI/CD job executing workflow for the Custom Executor

Communication between a Runner and the Custom Executor is under GitLab’s responsibility, which means it is out of the scope of the present article. — Nice, but how does the Custom Executor communicate with the driver? — Through a command-line interface!

After registering a Runner with the Custom Executor, we need to edit /etc/gitlab-runner/config.toml (or the equivalent file, depending on how GitLab Runner was installed) and provide the paths and arguments to the executables that compose the driver. Please take a look at the following Runner configuration snippet; it is the std err-out driver’s sample configuration:
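
The original embedded snippet is not reproduced here, so what follows is a minimal sketch of what such a configuration could look like, assuming the driver binary lives at /opt/gitlab-runner/std-err-out-driver; the URL, token, and directory paths are placeholders:

```toml
# Other global settings omitted for brevity.
concurrent = 1

[[runners]]
  name = "std-err-out-runner"
  url = "https://gitlab.com"
  token = "RUNNER_TOKEN"        # placeholder
  executor = "custom"
  builds_dir = "/tmp/builds"    # required by the Custom Executor
  cache_dir = "/tmp/cache"      # required by the Custom Executor
  [runners.custom]
    config_exec = "/opt/gitlab-runner/std-err-out-driver"
    config_args = [ "config" ]
    prepare_exec = "/opt/gitlab-runner/std-err-out-driver"
    prepare_args = [ "prepare" ]
    run_exec = "/opt/gitlab-runner/std-err-out-driver"
    run_args = [ "run" ]
    cleanup_exec = "/opt/gitlab-runner/std-err-out-driver"
    cleanup_args = [ "cleanup" ]
```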

As you can see in the [runners.custom] section, the stages of the job-execution workflow are mapped to command-line instructions, e.g. /opt/gitlab-runner/std-err-out-driver run is responsible for executing the Run stage. I decided to use a single binary that accepts the config, prepare, run, and cleanup arguments to distinguish between the stages, but it could be replaced with four distinct binaries or scripts depending on the actual use case. The source code for each CLI option is available here.

The Custom Executor may provide additional input arguments at runtime; currently, this only happens in the Run stage. Its executable receives two extra strings: the path to the script to be executed and the name of the sub stage it represents. You will find more information about scripts in the GitLab Runner generated scripts section below.
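
To make the mapping concrete, here is a hypothetical sketch of such a single-binary driver; it is illustrative only, not the actual std err-out driver code. Note how the Run stage reads the two extra arguments appended by the Custom Executor:

```go
// main.go: a minimal single-binary driver dispatching on the first CLI argument.
package main

import (
	"fmt"
	"os"
	"os/exec"
)

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: std-err-out-driver <stage> [args...]")
		os.Exit(1)
	}

	switch stage := os.Args[1]; stage {
	case "config", "prepare", "cleanup":
		fmt.Fprintf(os.Stderr, "running the %s stage\n", stage)
	case "run":
		// The Custom Executor appends the generated script path and the
		// sub stage name after the args configured in config.toml.
		scriptPath, subStage := os.Args[2], os.Args[3]
		fmt.Fprintf(os.Stderr, "sub stage %q, script %s\n", subStage, scriptPath)

		cmd := exec.Command("/bin/bash", scriptPath)
		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr
		if err := cmd.Run(); err != nil {
			os.Exit(1)
		}
	default:
		fmt.Fprintf(os.Stderr, "unknown stage: %s\n", stage)
		os.Exit(1)
	}
}
```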

Each stage is executed in its own isolated process, which means no state is shared between them unless you implement some persistence mechanism — using a temporary file, for example.
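
As an example of such a persistence mechanism, here is a hypothetical sketch that keys a temporary file by the job ID (read from CUSTOM_ENV_CI_JOB_ID); the file naming and the stored value are illustrative, not taken from the std err-out driver:

```go
// state.go: sharing state between stages through a per-job temporary file.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// statePath builds a per-job file path under the system temp directory.
func statePath() string {
	return filepath.Join(os.TempDir(), "std-err-out-driver-"+os.Getenv("CUSTOM_ENV_CI_JOB_ID"))
}

// saveState would be called by the Prepare stage to persist, e.g., an environment ID.
func saveState(value string) error {
	return os.WriteFile(statePath(), []byte(value), 0o600)
}

// loadState would be called by the Run and Cleanup stages to read it back.
func loadState() (string, error) {
	data, err := os.ReadFile(statePath())
	return string(data), err
}

func main() {
	if err := saveState("environment-1234"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	value, _ := loadState()
	fmt.Println("restored state:", value)
}
```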

stderr and stdout handling in each stage

One of the trickiest things I found when I started working with drivers for the Custom Executor was understanding how stderr and stdout are handled in each stage. Yes, it changes depending on the stage!

Sidenote: If you want to check the source code of all stages implemented in the std err-out driver, please refer to this folder. Notice I used two libraries to write content: fmt and logrus. fmt’s Printf() and Println() always write to stdout; Logrus’ Info() writes to stderr by default, but can also write to stdout, as I do in the Cleanup stage.

CONFIG

  • stderr is printed to the job execution log;
  • stdout usage is optional, but if used it must contain a valid JSON string with specific keys. Remember the goal of this stage is to allow us to change some of the driver’s settings at runtime.

The std err-out driver prints the content below to stdout:
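
The embedded snippet is not reproduced here; below is a minimal sketch of how such output could be produced, assuming the driver only reports its name and version (the version value is made up):

```go
// config.go: emits the Config stage JSON to stdout. The "driver" key is
// documented by the Custom Executor; the version string here is illustrative.
package main

import (
	"encoding/json"
	"os"
)

type configOutput struct {
	Driver struct {
		Name    string `json:"name"`
		Version string `json:"version"`
	} `json:"driver"`
}

func main() {
	var out configOutput
	out.Driver.Name = "std err-out driver"
	out.Driver.Version = "0.1.0" // illustrative
	// Produces: {"driver":{"name":"std err-out driver","version":"0.1.0"}}
	json.NewEncoder(os.Stdout).Encode(out)
}
```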

Which results in the highlighted line in the job execution log:

Sample output for the Config stage using the std err-out driver

Please refer to the Config stage docs to get more information on the available settings.

PREPARE

  • stderr and stdout are printed to the job execution log.

RUN

  • stderr and stdout are printed to the job execution log.

CLEANUP

  • stderr is printed to GitLab Runner logs at a WARN level;
  • stdout is printed to GitLab Runner logs at a DEBUG level;
  • this means stderr and stdout are never printed to the job execution log.

Sorry for the long paragraph, but there is something curious here… The Cleanup stage is executed even if one of the previous stages failed, which makes perfect sense since the job may fail, for example, at the Run stage, but the resources created in the Prepare stage still need to be released. Regardless of previous success or failure, I believe that, in most cases, the code for releasing such resources is pretty much the same. And I don’t think it makes sense to use stderr and add WARN messages to GitLab Runner’s logs in order to record expected behavior when disposing of resources. The remaining alternative is to print non-error logs to stdout, right? But stdout from this stage is printed to GitLab Runner logs at the DEBUG level, and GitLab Runner’s default log level is INFO, so all information from this stage might be lost unless you set GitLab Runner’s log level to DEBUG.

In case you want to set GitLab Runner’s log level to DEBUG, you can edit its configuration file (usually /etc/gitlab-runner/config.toml) and add log_level = "debug" to the global section. Don’t forget to restart the Runner, e.g. sudo gitlab-runner restart. Setting the log level to DEBUG makes the whole output more verbose.
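
For reference, this is roughly what the relevant setting could look like; everything else is omitted, and the placement (before any [[runners]] block) is the important part:

```toml
# /etc/gitlab-runner/config.toml — global section
log_level = "debug"

[[runners]]
  # runner-specific settings go here, unchanged
```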

GitLab Runner generated scripts

Generated scripts are another key concept of GitLab Runner and hence of Custom Executor drivers. Users describe their CI jobs as lists of commands, and GitLab Runner adds some others when executing the Run stage. get_sources, restore_cache, and download_artifacts are good examples of sub stages managed by GitLab Runner: if you’re used to GitLab CI/CD, you may have noticed there’s no need to provide commands to make such things happen, although we can fine-tune them through configuration.

Commands are encapsulated in scripts at runtime and the Custom Executor provides them to the driver. Let me use the Run/prepare_script sub stage, which uses a simple GitLab Runner generated script, to illustrate them:

Sample output for the Run/prepare_script sub stage using the std err-out driver

According to the official docs, the prepare_script consists of simple debug information regarding the machine a given CI job is running on. Let’s take a look at specific lines in the above screenshot:

  • 20: presents the path to the generated script;
  • 22–26: shows its commands;
  • 28: shows the script execution output (this was the only script actually executed in the scope of this demo).

In the complete job execution log, you can see that other scripts are generated to fully process the Run stage. They were not printed because sensitive information might be exposed — more on this in the next paragraph. They weren’t executed either, to make the demonstration as clean and simple as possible. Instead of executing the actual commands, I created a function to simulate long-running processes, which is responsible for generating output similar to:
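
The sample output itself is not reproduced here; a hypothetical sketch of such a simulation function could look like the following, with the step count and messages made up for illustration:

```go
// simulate.go: stands in for executing the real generated script, printing
// progress lines so the job log shows something useful (illustrative only).
package main

import (
	"fmt"
	"time"
)

func simulateLongRunningProcess(subStage string, steps int) {
	for i := 1; i <= steps; i++ {
		fmt.Printf("[%s] simulating a long-running process: step %d of %d\n", subStage, i, steps)
		time.Sleep(2 * time.Second)
	}
	fmt.Printf("[%s] done\n", subStage)
}

func main() {
	simulateLongRunningProcess("build_script", 5)
}
```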

GitLab Runner leverages environment variables to add a variety of context data to the generated scripts. They carry fields that are not only useful but also mandatory for secure and successful job execution. Emails, passwords, tokens, and certificates can be among them, and this is why I avoided printing most of the scripts. GitLab takes security seriously and those values are never printed to the job execution log in common scenarios! It would be my fault if I exposed them by printing the scripts.

Sidenote: you can fork the std err-out driver repository, change a few lines in the Exec() function of the Run stage, and see the actual scripts. I have done this for learning purposes, running the pipelines in a private repo, and deleted the execution logs after getting the information I was looking for.

Let me share a piece of information about the scripts’ length before finishing this section: commands provided by the user through the .gitlab-ci.yml file are executed in the Run/build_script sub stage, and don’t be surprised if a simple command such as npm run test turns into a 6,000-8,000 character script. — Really? — Yes, you read it correctly! Scripts become “hugely big” when the CI/CD environment variables are appended to their contents. This should not be a major issue in most use cases, but it may be worth considering when making design decisions.

CUSTOM_ENV_ prefixed environment variables

The environment variables appended to the scripts come from GitLab CI/CD predefined fields and from user-provided data at the project and job levels, with the same names they were declared with.

They are also exposed to the driver’s executables at runtime. However, you might need to add the CUSTOM_ENV_ prefix to the names of some variables if you want to read them from your code. The reason behind this is explained in the Custom Executor docs:

Both CI/CD environment variables and predefined variables are prefixed with CUSTOM_ENV_ to prevent conflicts with system environment variables.

Please notice that all environment variables provided by the Custom Executor host system are also available, but they are not prefixed with CUSTOM_ENV_. This means, for example (see the sketch after this list):

  • CI_PROJECT_URL, a CI/CD predefined variable, will be available as CUSTOM_ENV_CI_PROJECT_URL;
  • BUILD_FAILURE_EXIT_CODE, an environment variable to be used by the executable as an exit code to inform GitLab Runner that there is a failure in the user’s job, is available with its original name since it is not a CI/CD variable.
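
Here is a minimal sketch of how a driver could read both kinds of variables; falling back to exit code 1 when BUILD_FAILURE_EXIT_CODE is absent is my own choice, not something mandated by the docs:

```go
// env.go: reads a CI/CD variable (CUSTOM_ENV_ prefixed) and a Runner-provided
// variable (unprefixed), using the latter as the exit code on failure.
package main

import (
	"fmt"
	"os"
	"strconv"
)

func main() {
	// CI/CD variables are exposed to the driver with the CUSTOM_ENV_ prefix.
	projectURL := os.Getenv("CUSTOM_ENV_CI_PROJECT_URL")

	// Variables defined by the Custom Executor itself keep their original names.
	failureCode, err := strconv.Atoi(os.Getenv("BUILD_FAILURE_EXIT_CODE"))
	if err != nil {
		failureCode = 1 // fallback if the variable is absent or not a number
	}

	if projectURL == "" {
		fmt.Fprintln(os.Stderr, "CUSTOM_ENV_CI_PROJECT_URL is not set")
		os.Exit(failureCode) // tells GitLab Runner the user's job failed
	}
	fmt.Println("project URL:", projectURL)
}
```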

I’ve run a slightly modified version of the std err-out driver to demonstrate this behavior. Take a look at lines 5, 6, 8, and 9 in the below screenshot:

Checking environment variable values with the std err-out driver

Termination signals

CI/CD jobs may take a long time to run and they might be interrupted for a myriad of reasons, either by the user or by GitLab (under certain conditions). Based on these assumptions, I encourage you to instrument your driver with a termination signal handler from the early days of development. This feature allows the driver to gracefully shut down when specific events happen.

I’ve added a simple handler to the std err-out driver in order to demonstrate how it works. It listens for SIGINT and SIGTERM and cancels long-running jobs if any of these signals is received (context, a Go standard library package, was used to tackle this).
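
A minimal sketch of that pattern follows, assuming the long-running work checks the context for cancellation; the actual std err-out handler may differ in its details:

```go
// signals.go: graceful shutdown on SIGINT/SIGTERM via context cancellation.
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// Cancel the context as soon as a termination signal is received.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		sig := <-sigs
		fmt.Fprintf(os.Stderr, "received %s, cancelling the job\n", sig)
		cancel()
	}()

	// Hypothetical long-running work that stops when the context is cancelled.
	for i := 1; i <= 30; i++ {
		select {
		case <-ctx.Done():
			os.Exit(1)
		case <-time.After(2 * time.Second):
			fmt.Printf("still working... step %d/30\n", i)
		}
	}
}
```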

The log available here shows the result of clicking the Cancel button when a job is running.

Closing thoughts

What a long text, right? As I mentioned at the beginning, many things written here are intended to be complementary to GitLab Runner and the Custom Executor’s official documentation, which is indeed deep.

I tried to share experiences on concepts and technical details my team and I wish we had known when we started to develop a driver, explained from developer to developer, in a less formal language and structure than the official docs :).

Actual drivers require much more sophisticated design, but I hope this text and the sample std err-out driver help other people to get started with the basics.

Thanks for reading!
