Goodbye Cronjobs; Welcome Gitlab pipeline

or how I run automatic database backups with Gitlab CI/CD

Peter Kracik
The Startup
5 min readJun 26, 2020



Gitlab, as I’ve already mentioned in my other articles, is a really powerful tool. It is not only a Git repository manager anymore; you can do much more with it. And by much more I mean really much more. One of its features I want to talk about today is scheduled jobs.

Of course, this is not a feature unique to Gitlab, but it fits really nicely into the whole DevOps ecosystem, and you don’t need to use other services or set anything up on your server.

I’ve been testing it for a while and I’ve decided to move all my backup cron jobs from my servers directly to Gitlab. And I’d like to show you how easy it is with a simple example: backing up a database.

In my example I have a basic LAMP stack running in docker, and I want to back up its database. Until recently I had another docker container*, which ran a cron job every day to connect to the database and run mysqldump.

*A separate docker container was required because I host multiple websites; this job didn’t run for only one DB container, so every DB container had its own backup cron job.

It worked well, but checking the status of the job was quite laborious: connecting to the server, parsing the log, etc. And in case of failure I didn’t get any notification. So I decided to give scheduled jobs a try.

Here’s how my gitlab-ci.yml is put together, section by section:

variables

I need the following variables to successfully run my script:

  • DB_EXPORT_PATH — set in gitlab-ci.yml; path where the backup will be exported to
  • DB_CONTAINER — set in gitlab-ci.yml; name of the container serving as my database server
  • HOST_PRODUCTION — set in gitlab-ci.yml; user@server of the destination server
  • PROJECT_NAME — set in gitlab-ci.yml; name of the project, used to build the backup file name
  • DB_ROOT_PASSWORD — set only for the job, in the scheduled pipeline’s settings (more on that later)
  • SSH_PRIVATE_KEY — private SSH key with access to the production server, needed for the SSH connection; best stored in a global CI/CD variable

There are multiple options where to set them:

  1. directly in the config file — not suitable for secrets or other sensitive variables
  2. Gitlab CI/CD variables — general variables, good and secure solution for project wide variables
  3. Scheduled job variables — variables required only by the scheduled job
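Put together, the non-secret variables from the list above can live directly in the config file. A sketch with placeholder values, not my real ones:

```yaml
variables:
  DB_EXPORT_PATH: "/var/backups/mysite"   # where the dump ends up on the server
  DB_CONTAINER: "mysite_db"               # docker container running the database
  HOST_PRODUCTION: "deploy@example.com"   # user@server of the destination
  PROJECT_NAME: "mysite"                  # used to build the backup file name
  # DB_ROOT_PASSWORD comes from the scheduled pipeline's variables and
  # SSH_PRIVATE_KEY from the project's CI/CD settings; never commit them here.
```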

Job settings

image — a basic alpine is sufficient; in case the gitlab-runner is configured as docker, you won’t need image at all

stage — it is not very important which one, but build seems like the most logical one

only — schedules, so the job runs in scheduled pipelines; web, so I can also create a pipeline manually from the UI, though that is not necessary for scheduled runs

before_script — installing the SSH client and adding the private key, so the job can connect to the server via SSH.
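The before_script can follow the usual ssh-agent pattern from the Gitlab documentation. A minimal sketch, assuming an alpine-based image (the job name db-backup is a placeholder):

```yaml
db-backup:
  before_script:
    - apk add --no-cache openssh-client                 # install the SSH client
    - eval $(ssh-agent -s)                              # start the agent
    - echo "$SSH_PRIVATE_KEY" | tr -d '\r' | ssh-add -  # load the key from the CI variable
    - mkdir -p ~/.ssh && chmod 700 ~/.ssh
```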

script

  1. removing old backups; in my case I keep the last 10 files
  2. creating the backup file path in the format
    /path/to/folder/project_name-20200626-030000.tar.gz
  3. running mysqldump on the MySQL server in the docker container via SSH
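The three steps above can be sketched roughly like this. The ssh and mysqldump lines are illustrative and commented out, and all host, container and path names are placeholders, not my actual values:

```shell
# Placeholder values; in the pipeline these come from the CI/CD variables.
DB_EXPORT_PATH="/var/backups/mysite"
PROJECT_NAME="mysite"

# 1. Remove old backups on the destination server, keeping the 10 newest:
# ssh "$HOST_PRODUCTION" "cd $DB_EXPORT_PATH && ls -1t *.tar.gz | tail -n +11 | xargs -r rm -f"

# 2. Build the backup file path, e.g. /var/backups/mysite/mysite-20200626-030000.tar.gz:
BACKUP_FILE="$DB_EXPORT_PATH/$PROJECT_NAME-$(date +%Y%m%d-%H%M%S).tar.gz"
echo "$BACKUP_FILE"

# 3. Run mysqldump inside the DB container over SSH and compress the result:
# ssh "$HOST_PRODUCTION" "docker exec $DB_CONTAINER mysqldump -u root -p\"$DB_ROOT_PASSWORD\" --all-databases | gzip > $BACKUP_FILE"
```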

Optional: disabling other jobs

To avoid running the other jobs (build, deployment, test) in the same pipeline, we need to add the except parameter inside each of those jobs.
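For example, assuming a job named build, the relevant fragment looks like this:

```yaml
build:
  except:
    - schedules   # skip this job when the pipeline is triggered by a schedule
```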

Configuring the scheduled job

Configuring a new scheduled job in Gitlab is pretty straightforward. In your project you have a Schedules option under the CI/CD menu.

You click on create new, give it a name, set when to run it (in my case every day at 3:00 AM), choose which branch to use and add the necessary variables. In my case I store the target database root password here. Then just click save.

And that’s it. From now on the job will run every day at 3 AM, or I can launch it manually whenever I need to.

Triggering job manually
Job has been successfully executed

What is the advantage of using it?

1. Specifically in this case, the scheduled job replaced another docker container on the server that existed only to run this cron job. Now my server has more resources for other tasks.

2. It is centralised — all my tasks like backups, deployments, etc. are now centralised, with the full log inside Gitlab. I don’t need to read through log files to see the output; I can see it in a nice GUI.

3. I could create one single project for all backup tasks across all my projects, making them easier to manage. Every cron job could live in one place, regardless of the project, the type of task or the server where it should run.

4. I can trigger it manually — manually triggering a cron job, especially when you need to run several of them, can be quite time-consuming. In Gitlab it is as easy as one click.

5. Notifications — Gitlab supports multiple ways of notifying you, and in case of an error I receive a Slack message or an email.

Notification via Slack about the execution of job

Conclusion

My example is only a basic job to back up a database, but you can replace literally any of your cron jobs this way.

As you’ve probably already gathered, Gitlab CI/CD has become quite an important part of my workflow, and thanks to these scheduled jobs I no longer use cron jobs on my servers.

