Automatic semantic versioning using Gitlab merge request labels

Published in inganalytics.com/inganalytics · 6 min read · Jan 30, 2019

A few weeks ago, I was investigating how to do automatic semantic versioning. The solution I’m proposing here is inspired by a blog post from Three Dots Labs. It showed how to set up the pipeline and the basic Python script required to automatically bump a repository version based on Git tags.

For the sake of brevity, that post left out a way to determine whether a major, minor or patch update should be performed. This is the actual scope of this post: “What is the lowest-effort and least error-prone way to automatically determine what the next version should be?”

In search of the holy grail

The Three Dots Labs article offers several different options for determining the right version: either based on a specific commit message format or based on Gitlab merge request labels.

I quickly decided that the commit message format was my least favorite option. It relies heavily on all engineers agreeing on, and sticking to, the format to use. Besides that, a lot of us tend to use the ‘squash commits when merge request is accepted’ feature in Gitlab. If you squash the commits, all individual commit messages are lost and Gitlab merges the branch with a pre-formatted commit message. Since that pre-formatted message doesn’t contain any information on what kind of changes are in the branch, the engineer would need to manually change the commit message of the merge request to adhere to the agreed format.

But what about merge request labels? Gitlab allows for adding labels to a merge request. I could thus decide to have 2 labels: bump-minor and bump-major. If one of those labels is set, I bump the project accordingly. If none is set, I default to a patch upgrade. This sounds easy enough, but how do I define labels? Gitlab allows for defining labels on a few different levels. A Gitlab administrator can create labels on a global level, but unfortunately, these labels do not propagate to existing groups. A group administrator can also set labels at a group level, as well as project-specific labels.

From a user perspective, only requiring someone to add a label to a merge request seems a lot easier than manually changing a commit message to a specific format.
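The label rule described above can be sketched in a few lines of Python. This is a minimal sketch, not the script from the repository: the function names are hypothetical, the version is assumed to be a plain MAJOR.MINOR.PATCH string, and letting bump-major win when both labels are set is my own assumption.

```python
from typing import Iterable

# Label names taken from the article.
MAJOR_LABEL = "bump-major"
MINOR_LABEL = "bump-minor"


def bump_type(labels: Iterable[str]) -> str:
    """Map merge request labels to 'major', 'minor' or 'patch'."""
    labels = set(labels)
    # Assumption: if both labels are present, the major bump takes precedence.
    if MAJOR_LABEL in labels:
        return "major"
    if MINOR_LABEL in labels:
        return "minor"
    # No versioning label set: default to a patch upgrade.
    return "patch"


def bump(version: str, labels: Iterable[str]) -> str:
    """Return the next version for a plain MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    kind = bump_type(labels)
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

Calling bump("1.2.3", ["bump-minor"]) gives "1.3.0", while a merge request without either label moves 1.2.3 to 1.2.4.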

So far, I’ve found the least-effort way to tag a merge request for a specific versioning strategy. However, how can our script, which will run on master builds, detect which label was present on the merge request?

Detecting the versioning strategy for a merge request

As I said earlier, the actual version detecting and bumping will happen on every master build. When a merge request gets merged, either the entire merge request history or a single commit is added to the Git history. Both will trigger a new pipeline for the latest commit on master.

Having Gitlab trigger a new pipeline on changes is great, but how can our pipeline automatically detect that a commit belongs to a merge request, and last but not least, whether or not that merge request had specific labels?

Unfortunately, there’s no way yet to detect whether a specific commit belonged to a merge request. Remember how I mentioned the ‘squash commits when merge request is accepted’ option and how Gitlab will use a pre-defined commit message format?

Let’s assume that I have a merge request, and I’ve checked the squash option. Gitlab will create a commit message that looks like this:

Merge branch 'some-feature' into 'master'

Great feature being added

See merge request group/repository!1

Apart from showing the origin branch and merge request title, it also includes the group and repository name plus the merge request number. Using this, I could utilize the Gitlab API to determine the merge request labels.

There is a caveat though. If I depend on this commit message format, I am implicitly depending on engineers checking the ‘squash commits when merge request is accepted’ option. Unfortunately, Gitlab doesn’t support checking the checkbox by default yet.

This means that this semantic versioning workflow would always require 2 manual steps:

The engineer must determine what kind of changes are included in this merge request and add the appropriate label accordingly
The engineer must manually check the ‘squash commits when merge request is accepted’ checkbox upon creating the merge request

Putting the pipeline script together

Extending upon the pipeline and Python script offered by the Three Dots Labs article, I need to add a few features:

I need to extract the merge request number from the commit message
I need to integrate the Gitlab API to retrieve the labels for that merge request

Using regular expressions, I can easily extract the merge request number from the commit message. The script retrieves the last commit message and extracts the group/repository combination and the merge request number as separate groups from the last line in the commit message: group/repository!1
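A minimal sketch of this extraction, assuming the commit message follows the squash format shown earlier. It is an illustration rather than the script from the repository; the function names and the exact pattern are my own.

```python
import re
import subprocess
from typing import Tuple

# Matches "group/repository!1" at the end of a line and captures the
# project path and the merge request number as separate groups.
MERGE_REQUEST_PATTERN = re.compile(r"(\S+/\S+)!(\d+)\s*$")


def last_commit_message() -> str:
    """Return the full message of the most recent commit."""
    result = subprocess.run(
        ["git", "log", "-1", "--pretty=%B"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def extract_merge_request(message: str) -> Tuple[str, int]:
    """Return (project path, merge request number) from the last line."""
    lines = message.strip().splitlines()
    last_line = lines[-1] if lines else ""
    match = MERGE_REQUEST_PATTERN.search(last_line)
    if match is None:
        raise ValueError(
            f"Unable to extract a merge request from: {last_line!r}"
        )
    return match.group(1), int(match.group(2))
```

In a pipeline, extract_merge_request(last_commit_message()) would yield the number to look up. Raising an error when the pattern is missing is a design choice: a commit that was pushed directly to master has no merge request to inspect, and falling back to a patch bump would be an equally reasonable option.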

To retrieve the labels for a merge request, I need to use the Gitlab API. With Python, there’s a Gitlab API package available which makes this quite trivial.

I create a new Gitlab instance by defining our base URL and Gitlab token. Afterward, I retrieve the project, which I identify based on the environment variable set by Gitlab in every pipeline: CI_PROJECT_ID.

Last but not least, I use the extracted merge request id to retrieve the merge request.
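These three steps could look roughly like this with the python-gitlab package. Treat it as a sketch under assumptions: GITLAB_URL and GITLAB_TOKEN are hypothetical variable names that you would have to define yourself, and only CI_PROJECT_ID is provided by Gitlab.

```python
import os
from typing import List


def build_client(url: str, token: str):
    """Create a Gitlab API client from a base URL and an access token."""
    # Imported lazily so the lookup below can be exercised with a stub
    # client when python-gitlab is not installed.
    import gitlab  # pip install python-gitlab

    return gitlab.Gitlab(url, private_token=token)


def merge_request_labels(client, project_id, merge_request_iid: int) -> List[str]:
    """Return the labels set on one merge request of a project."""
    project = client.projects.get(project_id)
    # The number in "group/repository!1" is the project-level IID,
    # which is what mergerequests.get() expects.
    merge_request = project.mergerequests.get(merge_request_iid)
    return list(merge_request.labels)


def labels_for_pipeline(merge_request_iid: int) -> List[str]:
    """Look up labels using variables available in the pipeline."""
    client = build_client(
        os.environ["GITLAB_URL"],    # hypothetical variable name
        os.environ["GITLAB_TOKEN"],  # hypothetical variable name
    )
    return merge_request_labels(
        client, os.environ["CI_PROJECT_ID"], merge_request_iid
    )
```

Keeping the client construction separate from the lookup makes it possible to test the label retrieval without touching a real Gitlab instance.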

Authentication

Unlike the Three Dots Labs article, I am not relying upon a Gitlab Deploy Key to push the git tag. The ideal solution for us would be to have a non-user specific access token to do the tagging and access the Gitlab API. Unfortunately, the authentication model used by Gitlab always requires a user object when performing an action. This prevents Gitlab from easily adding support for non-personal accounts.

To get around this, I generally create a user account per team and Gitlab group to use as a non-personal account (NPA). This single account can be used to fulfill both of our authentication requirements: pushing the Git tag and accessing the Gitlab API. To share the credentials with the pipeline, I tend to set the username and password as group-level environment variables. You can configure these by going to the group variable settings in Settings -> CI / CD -> Variables.

Configuring the required authentication variables in Gitlab

The pipeline

So what does the end result look like? A 3 step process is required to set up automatic semantic versioning in any Gitlab repository:

Generate a unique version
Bump the version
Tag the latest tag build as latest

stages:
  - generate-env-vars
  - version
  - tag-latest

variables:
  IMAGE_NAME: $CI_REGISTRY/$CI_PROJECT_NAMESPACE/$CI_PROJECT_NAME

generate-env-vars:
  stage: generate-env-vars
  tags: 
    - shell-tag
  script:
    - TAG=$(git describe --tags --always)
    - echo "export TAG=$TAG" > .variables
    - echo "export IMAGE=$IMAGE_NAME:$TAG" >> .variables
    - cat .variables
  artifacts:
    paths:
    - .variables

version:
  stage: version
  image: mrooding/gitlab-semantic-versioning:1.0.0
  script:
    - python3 /version-update/version-update.py
  only:
    - master

tag-latest:
  stage: tag-latest
  image: docker:18.06.1-ce
  before_script:
    - source .variables
  script:
    - docker pull $IMAGE
    - docker tag $IMAGE $IMAGE_NAME:latest
    - docker push $IMAGE_NAME:latest
  only:
    - tags

The generate-env-vars step uses git describe --tags --always to create a unique tag for every build. On tag builds, it resolves to the tag itself. For any branch build, it results in a combination of the last available Git tag, the number of commits since that tag and the short hash of the commit. I write the tag and image name to a dot file which I can inject into any subsequent stage.
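To make the shape of that output concrete, here is a small sketch that splits it into its parts. It is not part of the pipeline; the names are hypothetical, and it assumes that release tags are plain MAJOR.MINOR.PATCH strings so that a bare hash (what --always prints when no tag is reachable) can be told apart from a tag.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    tag: Optional[str]
    commits_since_tag: Optional[int]
    short_hash: Optional[str]


# "<tag>-<commits since tag>-g<abbreviated hash>", e.g. 1.2.3-4-gabc1234
_WITH_SUFFIX = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")
# Assumption: release tags look like 1.2.3 without a prefix.
_VERSION_TAG = re.compile(r"^\d+\.\d+\.\d+$")


def parse_describe(output: str) -> Described:
    """Split the output of `git describe --tags --always` into parts."""
    text = output.strip()
    match = _WITH_SUFFIX.match(text)
    if match:
        # Branch build: some commits exist on top of the last tag.
        return Described(match["tag"], int(match["count"]), match["sha"])
    if _VERSION_TAG.match(text):
        # Tag build: HEAD is exactly the tagged commit.
        return Described(text, 0, None)
    # No tag reachable: --always falls back to the abbreviated hash.
    return Described(None, None, text)
```

For a branch build that is four commits past 1.2.3, the input would be something like 1.2.3-4-gabc1234, which is also the value that ends up as the image tag.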

Wrapping up

I’ve discussed the advantages and disadvantages of using commit message formatting versus merge request labels to automatically update projects using semantic versioning.

Although the merge request label option requires 2 manual steps from an engineer, it is quite easy to get used to. The entire workflow will become even easier once the Gitlab issues related to the limitations discussed above (#706, #12707, #27956) are resolved.

The entire codebase including an extensive manual and example pipeline can be found on Github. The Docker image can be found on Docker Hub.

Hit me up with questions, feedback or comments if you’d like to discuss this further!