How Square writes commit messages
Explain the change.
Commit messages are an important form of communication for us at Square, so writing them well is a great way for us to give time back to each other. Often we wish we’d done a better job earlier. The commits from those days are still in our history, and many of them are still very important.
We’ve collected our thoughts on how we invest in great commit messages that save us time not just immediately, but down the road as well. We’re sharing them here in the hope they’ll also be helpful to your team.
TL;DR: Great commit messages explain the change
Here at Square, we write a lot of code. At its best our code is fairly self-explanatory. Where our programming language is insufficiently expressive, there are comments in the code explaining what the code does, or even better, why it does what it does. If a developer checks out a SHA and looks at the code, she should be able to understand what it does and why it works the way that it does.
Commit messages aren’t about the code, they’re about the change itself. This is a crucial distinction that is often lost.
When we make a commit, we’re changing some ideally very clear, self-describing code to some other very clear, self-describing code. No matter how clear the before-code and the after-code, there’s a contextual gap between them. The commit message bridges that gap.
Commit messages have two audiences
Consider the readers of a commit message. They basically fall into two camps: reviewers and archaeologists.
The immediate audience is likely code reviewers, who must decide whether they approve of the change, and whether the actual change in code correctly reflects its author’s intent. Reviewers need context on the change itself in order to evaluate it, and it’s much easier for them if the change comes with a message providing that context than if they have to deduce it from the before and after.
Longer term, the audience is likely developers attempting to fix a problem in the software, remove code that’s no longer needed, build atop existing code, just make similar changes, or some combination of these. Sometimes, it’s a developer just trying to understand a past decision (often the author!). This is archaeology, and it’s dramatically easier if each commit comes with a helpful note explaining the change. Otherwise the archaeologist is left trying to reconstruct a commit author’s thought process from nothing more than the code before and the code after– often an impossible task.
Really, they have three audiences
Our great commit messages also help us to promote our best engineers. They provide context and detail that help us evaluate the technical acuity in an engineer’s body of work. Equally important, our best commit messages show our skill and empathy as communicators. Clearly-written, insightful, helpful commit messages are a clear sign that an engineer is building with foresight and consideration for other people. Evidence of technical ability, empathy for others, and effective communication are all key to earning a promotion at Square.
How we do it
Are we fixing a bug? We’ll briefly explain the expected and actual behavior, and how this change is related. Are we introducing a new feature? We’ll summarize it.
It’s helpful to assume that the reader has no idea why we’re making a change. For many readers, this will be true — for even the smallest, most “obvious” change. Writing down the motivation is especially crucial.
This would seem to go without saying, but in the course of responding to reviews or otherwise iterating on a change, it’s easy to overlook updating the commit message. To guard against inaccuracy, we authors and reviewers double-check our messages before merging.
We explain why the change is safe
Especially when we’re touching infrastructure code or doing something a little unusual, it’s worthwhile to record why it’s ok to do what we’re doing. It reassures readers and puts our considerations and assumptions on the record in case there are problems later.
We point to resources with more context
Jira for sure. Design docs, google groups threads, Slack threads as appropriate. The more context we can put around a change, the easier it’ll be for archaeologists to understand it.
Of course, simply linking to a Jira ticket or discussion thread isn’t very helpful. Why make readers (who may not even be reading in a browser) go somewhere else to get any understanding at all of the change? They might not bother, and might miss crucial detail. They also might not be able to! For example, once upon a time we used Github Enterprise. Now all of those PRs on GitHub are lost. Tools change over time, but the commit log is forever.
Jira and Stash are helpful for capturing the conversation context, to understand the back and forth that happened while we decided on a change. Our commit messages capture the end result of that back and forth, and should stand on their own if links go dead.
We know the medium
Our messages will be read in a variety of tools and contexts — the pull request, IDEs, `git log` command lines that may show abbreviated or full messages, etc. There are some important things to know about writing in this medium.
- Commit messages have subject lines. The first 50 characters of the first line of a git message are treated specially by most tools. In many cases, such as reading through`git log` output, this is the only text that a reader will see without taking an additional step. Make every effort to summarize your commit in 50 characters. Include a JIRA issue ID here, if at all possible.
- Most tools want to see an empty line between the subject and body of the message.
- Commit messages are often read from command line tools, and often with git-log branch lines (“railroad tracks”) to the left of them. It’s helpful to wrap lines, typically around 72 characters (to leave room for “tracks”) but certainly within 78.
We don’t cross the streams
We’re careful to choose the right medium for our messages.
- Architecture and broad decisions are written up in design docs.
- When explaining what some code does and why, we write code comments.
- For changes in the code, we write commit messages. (You knew this one!)
- Finally, for anything reviewers care about but archaeologists won’t we write comments on the pull request.
Note: Cross-referencing among the above is encouraged!
Sometimes we get it wrong
As you might be able to tell from above, a lot of thought goes into a good commit message, and like most things it takes practice and help from each other to build the habit. Here are some common ways we go wrong.
We sometimes write messages like “Fixes <bug>” or “Adds <functionality>”. Sometimes those messages don’t even describe the bug or functionality. These are just hints and clues, really, exercises left to the reader to complete.
Explaining the code
Describing new code is different from explaining the change. Sometimes we’ll try to be helpful by describing the mechanics of the new code, how it’s supposed to operate. Code comments are better suited for that, though. Sometimes we’re tempted to explain high-level design or architecture, which better lives in a design doc that we could just link to. At any level, explanations of code have better homes than the commit message.
Transliterating the diff into English
This one’s really common, unfortunately. In our haste, we’ll write a message that just describes the change tersely and literally. “Deleted foo”, or “Renamed bar to baz”. This is at least the right idea, but messages like this are too literal. They lack the context, motivation, and assurance of safety that readers are looking for.
Here’s an example of a really unhelpful commit message I wrote. Sorry, team!
You can see that it provides no context, provides no motivation for the change, sort of just describes the code. I have no idea why I wanted to change that request from a POST to a GET. Was there a bug? Probably, but what was it? Was it with the request itself? Was I working around a bug in our REST library, which may now be fixed? Should this change be reverted at some point? What exactly was wrong with POST, and why did I think GET would be better? So many questions! Not even I know the answers now.
I think I did a much better job here:
This time, we have a lot more answers. There’s some history, including a link to where the problem was introduced for further reading. I explained the bug that I was fixing in enough detail for the reader to understand it (at least I do, eight months later). Finally, with that supporting context, I explained briefly what I was changing, and why it would fix the problem. This is much more helpful to me today, and I hope to others.