Repository Cleanup (Merged Branch Deletion)
This post is for anyone who maintains a large source code repository with hundreds or thousands of branches and wants to avoid manageability or disk-space problems on the server later.
We use Bitbucket to store all our repositories at large scale. We have tens of projects, each project has hundreds of repositories, and each repository has thousands of branches. Some are active, but most were merged years ago. Each repository had grown so large that we had to page through multiple screens to find anything. That pain pushed us to find a way to clean up all merged branches, but only those older than 30 days, to avoid unnecessary issues.
The Issue -
As a DevOps engineer, I have often seen developers create new branches and merge them into the target branch according to their branching model, but forget to delete the branches once they are merged. These branches pile up over time, consume disk space, slow down repository access, force people to browse multiple pages to find specific commits, and make the repository harder to manage.
The Solution/s -
There are three solutions to this issue. I have also listed the pros and cons of each one. Let's go through them below.
Solution-1 (Use checkbox)
The first solution, available with most repository providers, is a checkbox that deletes the merged branch. You can select it either when creating a new pull request or when merging the request from the repository console or command line.
Pros: This option avoids the pileup later. Your repository stays neat and clean with only active branches.
Cons: It is a pain to remember to select the delete-branch checkbox every time you approve a PR.
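If the merge is done from the command line rather than through a pull request, the same habit can be sketched as a small helper that deletes the source branch right after merging it. This is only an illustration: the function name merge_and_delete and the remote name origin are assumptions, not part of any provider's tooling.

```shell
# Hypothetical helper: merge a source branch into a target branch,
# then delete the source branch locally and on the remote.
# Usage: merge_and_delete <source-branch> <target-branch>
merge_and_delete() {
  src="$1"
  target="$2"
  git checkout "$target" &&
    git merge --no-ff --no-edit "$src" &&   # keep an explicit merge commit
    git push origin "$target" &&
    git branch -d "$src" &&                 # -d refuses to delete unmerged work
    git push origin --delete "$src"         # remove the branch on the remote
}
```

For example, merge_and_delete feature-x master merges feature-x into master and removes feature-x in both places. Because the steps are chained with &&, the branch is deleted only if the merge and the push succeed.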
Solution-2 (Use Delete option)
Another way to achieve the same result is to delete the merged branches from the dashboard (the details depend on the repository provider, such as GitHub, GitLab, or Bitbucket) by selecting each branch manually. This is not feasible if you have a large number of branches spread across many pages.
Pros: This option lets you delete all the merged branches in one go, but you have to repeat the task at periodic intervals, which is manual and time consuming. Still, it is better than Solution-1.
Cons: You have to remember a schedule for the cleanup task, and the task itself is still manual.
Solution-3 (Use automation)
The solution we developed is an automation that deletes merged branches periodically.
Here is the script developed to handle deletion of all merged branches older than 30 days -
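As a rough illustration of the core deletion logic, here is a minimal sketch for a single local clone. It is not the full Bitbucket script: only the function name delete_branch_if_merge_longer_than_30_days comes from this post, while TARGET_BRANCH, MAX_AGE_DAYS and process_clone are hypothetical names. It assumes that "merged" means the branch tip is reachable from the target branch, and that the age is measured from the last commit on the branch.

```shell
# Assumptions: the target branch is "master" and the threshold is 30 days.
TARGET_BRANCH="${TARGET_BRANCH:-master}"
MAX_AGE_DAYS="${MAX_AGE_DAYS:-30}"

# Run inside a local clone. Reports a remote branch that is fully merged
# into the target branch and whose last commit is older than the threshold.
delete_branch_if_merge_longer_than_30_days() {
  branch="$1"

  # Never touch the target branch itself or the symbolic origin/HEAD.
  case "$branch" in
    "$TARGET_BRANCH" | HEAD) return 0 ;;
  esac

  # Merged check: the branch tip must be an ancestor of the target branch.
  if ! git merge-base --is-ancestor \
      "refs/remotes/origin/$branch" "refs/remotes/origin/$TARGET_BRANCH"; then
    return 0
  fi

  # Age check: committer timestamp of the branch tip, in whole days.
  now=$(date +%s)
  tip_time=$(git log -1 --format=%ct "refs/remotes/origin/$branch")
  age_days=$(( (now - tip_time) / 86400 ))

  if [ "$age_days" -gt "$MAX_AGE_DAYS" ]; then
    echo "would delete: $branch (last commit $age_days days ago)"
    # Deletion stays commented out until the dry-run output has been reviewed:
    # git push origin --delete "$branch"
  fi
}

# Apply the check to every remote-tracking branch of the current clone.
# %(refname:lstrip=3) turns refs/remotes/origin/feature/x into feature/x.
process_clone() {
  git for-each-ref --format='%(refname:lstrip=3)' refs/remotes/origin |
    while read -r branch; do
      delete_branch_if_merge_longer_than_30_days "$branch"
    done
}
```

Run process_clone inside a clone after git fetch --prune origin and review the "would delete" lines before enabling the push. Note that the ancestor check does not recognise branches that were squash-merged or rebased, since their tips never become part of the target branch history.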
Pros: This option avoids manual execution and having to remember a cleanup schedule. Run the script as a cron job so that it executes periodically.
Cons: You may have to provision a system to run the script, with enough disk space, before the first run.
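To make the cron idea concrete, the sketch below shows a hypothetical wrapper that a cron entry could call. Since a single run can take hours, it uses a lock directory so that a new run is skipped while the previous one is still going. The paths, the schedule in the comment, and the names run_cleanup, LOCK_DIR and LOG_FILE are all assumptions for illustration.

```shell
# Hypothetical crontab entry (every Sunday at 01:00), calling a script
# that sources this function and runs the cleanup through it:
#   0 1 * * 0 /opt/scripts/run_cleanup.sh

LOCK_DIR="${LOCK_DIR:-/tmp/branch-cleanup.lock}"
LOG_FILE="${LOG_FILE:-/tmp/branch-cleanup.log}"

# Usage: run_cleanup <command> [args...]
run_cleanup() {
  # mkdir either creates the directory or fails, so it works as a simple lock.
  if ! mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "previous cleanup still running, skipping" >>"$LOG_FILE"
    return 0
  fi

  echo "cleanup started: $(date)" >>"$LOG_FILE"
  status=0
  "$@" >>"$LOG_FILE" 2>&1 || status=$?
  echo "cleanup finished with status $status: $(date)" >>"$LOG_FILE"

  rmdir "$LOCK_DIR"   # release the lock
  return "$status"
}
```

The wrapper script would then call something like run_cleanup /opt/scripts/delete_merged_branches.sh, where the script path is again only a placeholder.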
Our script has been tested only on Bitbucket projects and their repositories. If you use another repository manager such as GitHub or GitLab, you can still use the actual branch deletion logic. For that, please refer to point #4 below.
- It may take a long time if your repositories are big (in our case the process took more than 5 hours to finish).
- It consumes local disk space, since it clones each repository locally and deletes the clone afterwards. Make sure you have enough local space for this.
- The branch deletion is commented out intentionally to avoid any accidental deletion call (# git push origin --delete $branch). Remove the # from this line to enable it.
- If you use another repository manager such as GitHub or GitLab instead of Bitbucket, you can still refer to the actual deletion code marked in bold in the script (see the function delete_branch_if_merge_longer_than_30_days()). You can tune the number of days and exclude other branches as per your requirement.
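Excluding branches, as mentioned in the last point, can be sketched with a small pattern check that runs before any deletion. The function name is_protected_branch and the branch patterns below are assumptions, so adjust them to your own branching model.

```shell
# Hypothetical exclusion list: succeeds (exit status 0) for branches
# that must never be deleted, fails for everything else.
is_protected_branch() {
  case "$1" in
    master | main | develop | release/* | hotfix/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A cleanup loop could then skip a branch with a line such as: is_protected_branch "$branch" && continue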
It is better to implement an automated solution or process that makes sure branches are deleted after merging than to rely on Solution-1 or Solution-2 above.