Spring cleanup: Outdated Git tags

Nils Diekmann
6 min read · Jul 31, 2024


Imagine your repository is flooded with auto-generated Git tags. Most of them come from feature branches you no longer remember. Just as you strive to keep your code clean, you should also regularly clean up these Git tags.

Photo by Tuan Cao on Unsplash

Motivation

Recently I started programming with Go. I have written some modules and in order to use them in other projects, I need to publish them. Go uses Git tags to mark a specific version. The tags should use semantic versioning. I am already familiar with semantic versioning. In C#, I used this simple approach to give my services and NuGet packages a descriptive version.

Photo by Kari Shea on Unsplash

I am a lazy person and I like to automate all my work. So I invested blood and tears in automatically tagging my code with corresponding semantic versions. Having even managed to handle multiple modules in one repository, I am now facing the next challenge.

I use feature branches to develop and pre-release my code. The pipeline generates a new Git tag with each check-in, which is useful during development. But how do I get rid of these tags after I merge the feature branch? I don’t want to waste my life manually deleting things.

The full code for the article is available on GitHub. The repository also has an open feature branch for demonstration purposes.

Step 0: Producing semantic versioning tags

Semantic versioning is a popular versioning scheme that uses a format like “major.minor.patch” to indicate the version of the code base. My GitHub workflow has an environment variable ‘MAJOR_MINOR_PATCH’ which you must manually update according to the rules of semantic versioning when you develop the code.

Semantic version git tag

For the main branch, the semantic version is taken directly from the environment variable. For feature branches, a pre-release version is formed by appending a suffix with the name of the branch and the build run number: v0.1.0-feature-branch.1. The workflow then pushes the semantic version as a Git tag with actions/github-script. Adding the build run number makes the Git tag unique without manual intervention.
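As a rough illustration of this naming rule, the following JavaScript sketch composes a tag name from the version, the Git ref, and the build run number. The function name `buildTagName`, the stripping of the `feature/` prefix, and the replacement of characters that are not allowed in a pre-release identifier are assumptions made for illustration; the real workflow may compose the name differently.

```javascript
// Hypothetical helper: composes a Git tag name from the pieces described above.
// Assumptions: the main branch is "main" and feature branches live under "feature/".
function buildTagName(majorMinorPatch, ref, runNumber, mainRef = 'refs/heads/main') {
  const version = `v${majorMinorPatch}`;
  if (ref === mainRef) {
    // Release version: no pre-release suffix
    return version;
  }
  const branch = ref.replace(/^refs\/heads\//, '').replace(/^feature\//, '');
  // Semantic versioning only allows [0-9A-Za-z-] in a pre-release identifier
  const identifier = branch.replace(/[^0-9A-Za-z-]/g, '-');
  return `${version}-${identifier}.${runNumber}`;
}

console.log(buildTagName('0.1.0', 'refs/heads/main', 7));
console.log(buildTagName('0.1.0', 'refs/heads/feature/feature-branch', 1));
```

The two calls print v0.1.0 and v0.1.0-feature-branch.1. Inside an actions/github-script step, the inputs could plausibly come from `process.env.MAJOR_MINOR_PATCH`, `context.ref`, and `context.runNumber`.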

Step 1: Delete outdated tags with a script

When I need to automate a task, I start by writing a script that I can run and debug locally. The first step is a good design. I want to delete outdated Git tags from the remote repository. A Git tag is outdated if it contains a semantic pre-release version and the corresponding feature branch has already been deleted.

  1. I need to update my local version to the remote version.
  2. I have to fetch all git tags.
  3. I have to fetch all branches.
  4. Determine for each git tag if it is outdated.
  5. If so, delete the git tag on the remote repository.

Deleting outdated git tags

git fetch --tags --prune --prune-tags fetches branches and tags from the remote repository and deletes any local tags and remote-tracking branches that no longer exist on the remote.

git branch -r lists all remote-tracking branches. The names are then normalized by trimming leading and trailing whitespace with sed 's/^ *//;s/ *$//'. Finally, I filter out my main branch with egrep -v "^main".

git tag --list 'v[0-9]*\.[0-9]*\.[0-9]*-*' gives me a list of all Git tags whose name starts with a v followed by the major, minor, and patch version numbers. The version must also be followed by a hyphen and further characters, which marks a pre-release version.

By convention, all my feature branches start with feature/ followed by the name of the branch, but the Git tag only contains the name of the branch. For example, the feature branch feature/branchname produces Git tags such as v1.2.0-branchname.1.

# Capture group 1 is the branch name, group 2 the build run number
[[ $tag =~ ^v[0-9]*\.[0-9]*\.[0-9]*-(.*)\.([0-9]*)$ ]]
featurebranchname=${BASH_REMATCH[1]}

For each Git tag, I cut out the feature branch name (see the code block above) and check whether a feature branch with a name following my convention exists (see the code block below).

if [[ $existingfeaturebranches =~ "feature/$featurebranchname" ]]

If no such branch exists, the Git tag is deleted first on the remote with git push origin --delete $tag and then locally with git tag -d $tag.

Step 2: Integrate the script into a CI pipeline

Automating tasks with scripts is a valuable first step, but manual execution is time-consuming. Integrating the script into my CI pipeline, triggered on every code push, would be a more convenient approach.

To create this step for my workflow, I asked GitHub Copilot to translate the Bash script into a step for my GitHub workflow. Like me, GitHub Copilot prefers to use actions/github-script. Unexpectedly, GitHub Copilot generated JavaScript code instead of Bash. It seems that JavaScript is the preferred language for GitHub Actions.

The logic for fetching the branches has changed. GitHub Copilot missed that I was filtering out the main branch. My Git tags for the main branch have the format v1.2.0 and do not include the name of the main branch itself. So, apart from the main branch now being logged as an existing feature branch, the change does not affect the result.

const existingFeatureBranches = (await github.rest.repos.listBranches({
  owner: context.repo.owner,
  repo: context.repo.repo,
})).data.map(branch => branch.name);
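If the main branch should be left out again, as in the Bash version, a small filter is enough. This is only a sketch: it assumes the main branch is called `main`, and `featureBranchNames` is a name made up for illustration.

```javascript
// Hypothetical helper: takes the `data` array returned by listBranches and
// returns the branch names without the main branch.
function featureBranchNames(branches, mainBranch = 'main') {
  return branches
    .map(branch => branch.name)
    .filter(name => name !== mainBranch);
}

// Sample data shaped like the relevant part of the API response
const sampleBranches = [{ name: 'main' }, { name: 'feature/branchname' }];
console.log(featureBranchNames(sampleBranches)); // [ 'feature/branchname' ]
```

Note that listBranches returns a single page of results (30 entries by default). For repositories with many open branches, `github.paginate(github.rest.repos.listBranches, { owner, repo })` may be the safer way to collect the input for this helper.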

Another change is that the Git tags are no longer filtered down to feature branch tags. Aside from a longer runtime, this does not change the logic, since the other tags do not match the feature branch regular expression anyway. I keep the code as it is and consider this a limitation of the GitHub REST API.

const tags = await github.rest.git.listMatchingRefs({
  owner: context.repo.owner,
  repo: context.repo.repo,
  ref: 'tags/v',
});
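The decision whether a tag is outdated can be sketched as a pure function, which makes it easy to try out without calling the API. The sketch assumes the tag format and the feature/ convention described earlier in this article; `findOutdatedTags` and the sample data are hypothetical and not taken from the repository.

```javascript
// Matches pre-release tags such as v1.2.0-branchname.7 and captures the branch name.
const PRERELEASE_TAG = /^v\d+\.\d+\.\d+-(.+)\.(\d+)$/;

// refs: objects with a `ref` property such as "refs/tags/v1.2.0-branchname.7"
// branchNames: names of the branches that still exist
function findOutdatedTags(refs, branchNames) {
  const existing = new Set(branchNames);
  return refs
    .map(item => item.ref.replace(/^refs\/tags\//, ''))
    .filter(tag => {
      const match = PRERELEASE_TAG.exec(tag);
      // Release tags like v1.2.0 do not match and are always kept
      if (!match) return false;
      // Outdated: no branch named feature/<captured name> exists anymore
      return !existing.has(`feature/${match[1]}`);
    });
}

const sampleRefs = [
  { ref: 'refs/tags/v1.2.0' },
  { ref: 'refs/tags/v1.2.0-branchname.7' },
  { ref: 'refs/tags/v1.1.0-merged.3' },
];
console.log(findOutdatedTags(sampleRefs, ['main', 'feature/branchname']));
```

With the sample data, only v1.1.0-merged.3 is reported. Unlike the substring match in the Bash script, this sketch compares complete branch names, so a tag for feature/foo is not kept alive by an open branch feature/foobar. In the workflow step, the function could be fed with `tags.data` and the branch list, and each returned tag could then be removed with `github.rest.git.deleteRef({ owner, repo, ref: 'tags/' + tag })`.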

The workflow with the script is triggered every time I push code. I push often to break problems into smaller steps. Since deleting Git tags takes time, I do not want to run the script on every push. My repository is configured to automatically delete the feature branch when a pull request completes. After that, the workflow is triggered for the main branch. Running the step only when the workflow runs for the main branch fits my needs perfectly.

      - name: Delete Git Tags
        uses: actions/github-script@v7
        if: github.ref == 'refs/heads/main'

To achieve this, I add a conditional check to the step using the if keyword. The ref that triggered the current workflow run is available as github.ref. In my case, the ref of my main branch is refs/heads/main. The step is now only executed if the two values are equal.

Finally, you can also take a look at my first GitHub action, which replicates the functionality of the step in this sample pipeline. The GitHub action is implemented in TypeScript. Note that it is tailored specifically to my conventions and lacks flexibility. Given these limitations, I recommend understanding the underlying concepts before using my GitHub action directly in your projects. Alternatively, feel free to copy and modify the code to suit your needs.

I’m excited to share my experience and encourage others to explore the potential of GitHub Actions for their own automation needs. Thank you for reading my story until the end. ❤
