A Modest Git Branching Proposal
Working on a large software project using Git, which is now all but the only choice for source control, means a lot of branches are in simultaneous use. Look at the graphic below and try to imagine maintaining a process that keeps all of those correctly synchronized. This is a relatively standard approach to branch management, regarded as simple, and it has five types of branches.
We can’t work without branches anymore. They can be confusing, and we have to be very attentive to which one we’re currently working in; it’s very easy to start coding in whichever branch happens to be loaded and only then realize that it’s not the intended one. But even with the greatest diligence and care, it’s possible to lose work. With a lot of branches, it’s imperative to follow the exact procedure, but even so there will be mistakes. We need to go one step further: keep the branching as simple as we can manage.
The reader is expected to have some familiarity with Git and its most basic operations. This is not an operating manual for Git or its interfaces (command line, TFS, Git Desktop) and there will be no command-line statements or instructions.
First Principle: Simplicity
Our work is complex. We do not merely assemble pieces like Lego blocks, though we endeavor to create and use such pieces; were there nothing more to creating software it would not be “development,” it would be “assembly.” And to do this we need tools.
We began by writing binary code, then we wrote assemblers, then compilers, then editors, debuggers, integrated environments. Now we have vast arrays of tools for coding and team coordination, and each of them demands some effort before it becomes a benefit and not an impediment. Some never do.
And as our palette of tools grew, the tools too became complex.
Our tools must solve problems for us, not become projects (problems) in themselves. That emphatically includes Git: if you are following Git procedures that offer opportunities for error, then those procedures are making your work harder and distracting you from your creativity and your focus.
There are numerous standard branch strategies in common use, and some groups simply manage branches in an ad hoc fashion. The diagram below is difficult even to follow, and it is hard to imagine it working well for very long in practice. Every additional branch is another opportunity for costly errors. With many extant branches in a large project, something is certain to go wrong, requiring hours or days to reconcile and retest.
Git is very good but it is not perfect and there are some bugs, if not in Git itself then in some of its tools. It uses a peculiar nomenclature different from other source control tools.
Certain presumptions of operation are woven deeply into Git but are never explicitly stated, and fighting them will cause pain.
Branch Again and Again
Git documentation suggests branching for all edits but never counsels prudence or limits. Some will create a branch to fix a spelling error in a comment and go through all the rigmarole of naming it, reviewing it, merging it. Comments don’t compile.
I’ve seen a development lead merge in a feature by first branching from it (why?), then branching from that, and again, and again, until what is finally checked in is six utterly unnecessary branches away from the feature.
One Branch at a Time
While in theory one can have any number of Git folders each holding a different branch, this creates such opportunities for error and so violates the expectations of the accompanying tools that to do so is mad as Caligula. Don’t.
Git’s merge operation is not as good as some dedicated merge tools. Often you will see a “conflict” between two edits, absolutely identical even unto their whitespace. Other times an inserted piece of new code will be treated as a change rather than an insertion; for this reason, you should always put new work at the end of the file (more on this below). This is a primary consideration for the operation sequence recommended in this article.
I know this is not supposed to happen but there is no disputing that it does: a merge produces a file with the new work discarded. This happens silently, usually without any report of a conflict, and can be detected only by careful scrutiny on the part of the developer whose work was lost.
When we saw this happening we reviewed our procedures and made certain that everything was correctly done, that the latest master branch was merged in before committing. It kept happening. One developer who experienced this far more than the others insisted on using Git Desktop instead of the command line. Later we switched to TFS and while these wrong-way merges happened less often, they never stopped completely. Over the 30 months I worked at this company we lost weeks of progress to this despite all the rigor we could muster.
This is the purpose of this article: a minimally complex branching strategy and operation sequence that minimizes the possibility of merge conflicts. For this proposal, I am going to use the nomenclature of a web server as an example, but nothing here is notably different for any other kind of project.
I am going to avoid jargon and loosely-defined terms like “refactor” here. I am also avoiding the increasingly imprecise and bewildering Agile Newspeak of “stories,” “technical debt,” etc.
More complex Git operations like rebasing are not part of this design.
Think of a branch as more like buying a yacht than buying a cup of coffee. Don’t create a new branch without a clear need: a bug to fix or new functionality to write. Never branch from a non-master branch except to return to a tagged checkpoint (see Tag Branch Checkpoints below).
Nothing is more disheartening in this work than committing some work and seeing a list of conflicting merges in files. This can consume days. And most of it is easily avoided.
The goal of this proposal is to make Git as much a useful tool and as little a project in itself as possible. Too many branching strategies end up creating problems that consume time and energy better spent writing products.
The Branch Types
Every project using Git, even one done at home as a hobby project, has at least one branch representing the current state of the project, a master branch. On a group project, the master is almost always the code behind the user-facing version of your work.
There is one other kind of branch, usually thought of as two distinct types. In reality, they are the same, branched from master and upon completion merged into master, becoming master, and, at the time of the merge, differing from master only in the intended changes made.
The lifetimes of the two are different enough to treat them as distinct types, but this is in-ci-den-tal. They are both “side branches.” They both have limited lifetimes. Only the master is immortal.
The HotFix branch is created in response to an emergency in the version of the master that is currently “live.” Although this is an exceptional condition I want to present it first because a HotFix branch is a degenerate form of a Feature branch with a strict set of rules.
A bug turns up on the server. It must be fixed in all haste, tested on a developer’s machine and hopefully on a staging server, then merged into master (replacing master) and propagated to production.
A HotFix branch
- is a branch taken from the tip (“head”) of the master branch
- takes precedence over all other work
- suspends all other merges into master
- includes only changes essential to the emergency fix and as few others as possible; its review should consist entirely of the emergency fix
A HotFix has priority over all other work, and while it is in progress no other branches should be merged into master. Should this prove impossible, for example because the HotFix takes more than a few hours, be scrupulously careful to keep it updated with every change to master before merging it back, but try very hard not to get into that position.
I’m using the term “Feature” to include both the development of new functionality and the fixing of bugs, since so far as branch management is concerned they are indistinguishable.
A Feature branch may be the work of days, weeks, even months. While it is in progress it must be updated with every change to master, so when it finally becomes part of the master and goes live the only differences visible are those related to the new feature. These changes will include deletions as well as additions and editing.
During the Feature branch work, which may span months, there will be large numbers of HotFix and other Feature branches, and as each one becomes part of the master the updated master must be merged into the Feature branch. This is vital.
A Feature branch
- is a branch taken from the tip (“head”) of the master branch
- is maintained as a clone of the current tip of the master branch plus the feature-specific changes
- has every commit to master merged into it
In the diagram above
- A HotFix and a Feature are both in progress having been branched from master earlier (not shown)
- The HotFix is finished and tested; it is merged into master and merged into the Feature at the same time (and any other Feature in progress)
- Now the master is identical to the final state of the HotFix and the HotFix is deleted
- Later the Feature is completed, tested, and merged into master
- Now the master is identical to the final state of the Feature and also includes the HotFix
- Delete the Feature branch
This is a maximally simple example with a single feature, but note that no other branches are created.
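The sequence in the diagram can be mimicked with a toy model. This is only a sketch of the bookkeeping, not of how Git works internally: each branch is assumed to be nothing more than an ordered list of change labels, and the `merge` helper and the labels themselves are invented for the illustration.

```python
# A toy model only: each branch is an ordered list of change labels,
# and "merge" copies over whatever the target does not yet have.
# Real Git merges work on file contents and history, not on labels.

def merge(source, target):
    """Return target plus any changes from source that it is missing."""
    return target + [change for change in source if change not in target]

master = ["base"]
hotfix = master + ["hotfix-change"]      # branched from the tip of master
feature = master + ["feature-change"]    # branched from the tip of master

# The HotFix is finished: merge it into master and into the Feature.
master = merge(hotfix, master)
feature = merge(master, feature)
assert master == hotfix                  # master is identical to the HotFix
del hotfix                               # the HotFix branch is deleted

# Later the Feature is finished: merge it into master.
master = merge(feature, master)
assert set(master) == set(feature)       # same changes on both branches
assert "hotfix-change" in master         # the HotFix is still included
del feature                              # the Feature branch is deleted

print(master)
```

At no point does the model need a third kind of branch, and the Feature never falls behind master, which is the whole point of merging master into it after every change.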
Other Branch Types
Branching procedures are always good for an argument but they simply offer opportunities for confusion. And repairing merge conflicts is one of the most tedious and frustrating things we can do, an enormous and risky diversion from software development into the software process.
Don’t use the development or release branches in the diagram above. Don’t ever create a branch from any but the master branch except to return to a tagged point in a side branch to discard subsequent work.
Most bug fixes and new features are assigned and tracked in a project management system such as Team Foundation Services, which assigns a number to each task. Use that number and a descriptive name for each branch. This not only makes branches easier to track but also discourages, by making it just a little tedious, the practice this article seeks to stop: creating unneeded branches.
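One possible naming scheme, sketched here as an assumption rather than a prescription, is a branch type, the task number, and a short slug made from the task title. The `branch_name` helper, the `feature/` and `hotfix/` prefixes, and the sample task numbers are all hypothetical.

```python
import re

def branch_name(task_number, description, kind="feature"):
    """Build a name such as 'feature/1234-fix-login-timeout' (hypothetical scheme)."""
    # Lowercase the title and collapse every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{kind}/{task_number}-{slug}"

print(branch_name(1234, "Fix login timeout"))
print(branch_name(1250, "Checkout page: crash on empty cart", kind="hotfix"))
```

Because the task number has to be looked up before the branch can be named, a branch with no task behind it becomes slightly awkward to create, which is the intended friction.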
Tag Branch Checkpoints
If you are implementing a feature in stages and have completed a milestone and want a safe point to return to if the next stage needs to be discarded, then you should tag your HotFix/Feature branch at that point.
If your next work past that safe point needs to be discarded:
- create a new branch from the tagged checkpoint. This will be a copy of the branch with your work before the tag
- merge master into the new branch
- delete the previously tagged branch
- update your task-tracking software to reflect that the work is now being done in the new branch
This is the only time you should ever create a branch not taken directly from the master. That this operation violates the central imperative of this proposal is mitigated by the immediate deletion of the previous branch.
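This article deliberately stays away from operating instructions, so take the following only as a rough sketch of how the first three steps might map onto Git commands. The `checkpoint_commands` helper and the branch and tag names are hypothetical, the commands are printed rather than run, and the fourth step (updating the task tracker) has no Git equivalent.

```python
import shlex

def checkpoint_commands(old_branch, tag, new_branch):
    """Return the Git commands (as argument lists) for abandoning work past a tag."""
    return [
        ["git", "checkout", "-b", new_branch, tag],  # new branch starting at the checkpoint
        ["git", "merge", "master"],                  # bring the new branch up to date
        ["git", "branch", "-D", old_branch],         # discard the old branch and its later work
    ]

# Hypothetical names, purely for illustration.
for command in checkpoint_commands("feature/1234-search",
                                   "search-stage-1",
                                   "feature/1234-search-retry"):
    print(shlex.join(command))
```

The capital `-D` forces the deletion, which is needed here because the old branch was never merged. If the old branch was also pushed to a shared repository, it would have to be deleted there as well.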
Manually Reconcile Two Branches
If you need to diff two branches and copy from one to another:
- sync your Git directory to the source branch
- copy it en masse to a non-Git directory
- sync the Git directory to the target branch
- using an external diff tool, copy changes into the Git directory
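Before opening the diff tool it can help to know which files differ at all. The sketch below is one way to list them, assuming the copy of the source branch and the Git directory are two ordinary folders; the `changed_files` helper and the file names in the demonstration are invented for the example.

```python
import filecmp
import os
import tempfile

def changed_files(snapshot_dir, working_dir):
    """List files that differ between the two trees or exist in only one of them."""
    def relative_files(root):
        found = set()
        for folder, subdirs, files in os.walk(root):
            subdirs[:] = [d for d in subdirs if d != ".git"]  # skip Git's own data
            for name in files:
                found.add(os.path.relpath(os.path.join(folder, name), root))
        return found

    source = relative_files(snapshot_dir)
    target = relative_files(working_dir)
    differing = [
        path for path in source & target
        # shallow=False compares file contents, not just size and timestamp
        if not filecmp.cmp(os.path.join(snapshot_dir, path),
                           os.path.join(working_dir, path), shallow=False)
    ]
    return sorted(differing + list(source ^ target))

# Small demonstration with two temporary folders standing in for the real ones.
with tempfile.TemporaryDirectory() as snapshot, tempfile.TemporaryDirectory() as working:
    for root, text in ((snapshot, "old"), (working, "new")):
        with open(os.path.join(root, "app.txt"), "w") as handle:
            handle.write(text)
    with open(os.path.join(snapshot, "only_in_source.txt"), "w") as handle:
        handle.write("x")
    result = changed_files(snapshot, working)

print(result)
```

The list only says where to look. Deciding what to copy is still a job for the external diff tool and the person driving it.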
Inserting At End
The merge facility in Git is imperfect and does not work as well as some dedicated merge tools such as BeyondCompare. In particular, if you add a new function in the middle of a file, grouping it with functions related to it (this is a perfectly fine practice, but don’t do it yet), then sometimes Git will see it as a change to the code that was already at that location rather than as an insertion, leading to an enormously complicated manual merge.
You can avoid this by putting the new functions and any other insertions at the end of the file so it unambiguously shows as new work. This saves a lot of headaches.
Suppose you have done as above, inserted new code at the end, and now want to group it properly with related work. You may also want to do some other work that is not a functional change: fix some poorly named variables, change a var declaration to an honest type, make whitespace changes. If the merge of this file shows conflicts you can simply Accept Target and ignore the conflicts.
Do such changes as a separate commit of a new branch rather than leave your files in a disordered state because of the potential for merge conflict.
No Branch Nostalgia
Don’t keep old branches around. As soon as they are merged and verified, discard them, locally and in the repository. Accumulated branches are tiresome and sloppy.
This Modest Proposal goes against the prevailing unrestrained proliferation of branch after branch after branch. Merge conflicts and discarded work cost an enormous amount of time and tedium, and following this stripped-down and maximally simple procedure should save a lot of that.
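Git can report which local branches are already contained in master with `git branch --merged master`. As a sketch only, the hypothetical helper below picks the candidates for deletion out of that report. The sample text stands in for real output, and the branch names are invented.

```python
def deletable_branches(merged_output, keep=("master",)):
    """Pick branch names out of the text printed by `git branch --merged master`."""
    names = []
    for line in merged_output.splitlines():
        marker, name = line[:1], line[2:].strip()
        if not name or marker in ("*", "+"):
            continue  # skip blanks, the current branch, and branches open in another worktree
        if name not in keep:
            names.append(name)
    return names

# Sample text in the two-column layout Git uses; '*' marks the current branch.
sample = (
    "  feature/1234-fix-login-timeout\n"
    "* master\n"
    "  hotfix/1250-empty-cart\n"
)
for name in deletable_branches(sample):
    print("git branch -d", name)
```

The lowercase `-d` refuses to delete a branch that has not been merged, so it is the safer choice here. Whether the merge has also been verified is something only the team can judge.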
Marcus Aurelius, Meditations