Rewriting history — an adventure with Git

At Photocrowd, we have been working hard on a big new feature launch. After 10 weeks of iteration, we’re nearing the launch. The feature requires a certain amount of curated data to be created in advance of the public release, the nature of which wasn’t known until very late in the process. In fact, the whole scope of the project, what we are and are not including, has changed regularly throughout the process.

The mission

We want to add the models, migrations and admin files for some applications to the master branch. The migrations have been through several iterations, so we may as well reduce them to a single migration at the same time.

Simplification

The feature branch started out with 187 separate commits, including at least a dozen “fix tests” commits. The first step was to squash these down into logical chunks of work. This was a pretty simple use of rebase, if a little time-consuming. I didn’t worry too much about making every new intermediate commit fully functional; just grouping the commits logically by area helped a lot. I ended up with rebase plans like this:

Starting point:

pick 1aaaaa Prototype feature A
pick 2aaaaa Improve feature A
pick 3aaaaa Prototype feature B
pick 4aaaaa Fix tests for feature A
pick 5aaaaa Improve feature B
pick 6aaaaa Feature B front end
pick 7aaaaa Feature A front end

Plan:

edit 1aaaaa Prototype feature A
fixup 2aaaaa Improve feature A
fixup 4aaaaa Fix tests for feature A
fixup 7aaaaa Feature A front end
edit 3aaaaa Prototype feature B
fixup 5aaaaa Improve feature B
fixup 6aaaaa Feature B front end
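To try this kind of plan safely, it can help to run it against a throwaway repository first. The sketch below is only an illustration: the repository, file names and commit messages are made up, it uses pick rather than edit so that it runs unattended, and it feeds the plan to git through the GIT_SEQUENCE_EDITOR environment variable instead of an interactive editor.

```shell
#!/bin/sh
# Throwaway demo: the repo, files and commit messages are all hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

# Helper: append a line to a file and commit it.
commit() {  # usage: commit <file> <message>
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit README "Initial commit"
commit a.txt "Prototype feature A"
commit b.txt "Prototype feature B"
commit a.txt "Fix tests for feature A"
commit b.txt "Improve feature B"

# Build the plan: group commits by feature, keep the first commit of each
# group as "pick" and fold the others into it with "fixup".
plan="$repo/.git/plan"
{
    git log --reverse --format='%h %s' HEAD~4..HEAD | grep 'feature A'
    git log --reverse --format='%h %s' HEAD~4..HEAD | grep 'feature B'
} | awk '{ f = $NF; verb = (f in seen) ? "fixup" : "pick"; seen[f] = 1; print verb, $0 }' > "$plan"
cat "$plan"

# git appends the path of its todo file to this command, so cp replaces
# the default plan with ours.
GIT_SEQUENCE_EDITOR="cp '$plan'" git rebase -q -i HEAD~4

git log --format='%h %s'
```

The two feature commits that remain each contain the work of their whole group, and the messages come from the first commit in each group, which is what fixup does.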

Split it up again

With the work in logical chunks, we could see where all the pieces we needed were. They were generally spread across two or three of our new commits: the addition of an app first, then some models, then later changes to the models and improvements to the admin. The next step was to split those commits into the part we wanted to ship early and the rest.

$ git rebase -i
edit 1aaaaa Add feature A
$ git reset HEAD~
$ git add feature/models.py
$ git commit -m "EARLY: Add feature A models"
$ git add -A
$ git commit -m "Add feature A"
$ git rebase --continue
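One detail worth checking when splitting: after git reset HEAD~, any file that the commit introduced is untracked again, so it has to be staged explicitly before the second commit. Here is a minimal sketch of the split on a throwaway repository, with made-up file contents, that also confirms the final tree is unchanged. It splits the most recent commit directly, so no rebase is needed for the demo.

```shell
#!/bin/sh
# Throwaway demo: repo and file contents are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

echo "base" > README
git add README
git commit -q -m "Initial commit"

# One commit that mixes the part we want early with the rest.
mkdir feature
echo "class Thing: pass" > feature/models.py
echo "def index(): pass" > feature/views.py
git add feature
git commit -q -m "Add feature A"
before=$(git rev-parse HEAD)

# Undo the commit but keep its changes in the working tree.
git reset -q HEAD~

# First commit: only the part to ship early.
git add feature/models.py
git commit -q -m "EARLY: Add feature A models"

# Second commit: everything else, including files that are now untracked.
git add -A
git commit -q -m "Add feature A"

# The split should change history only, not content.
git diff --quiet "$before" HEAD && echo "tree unchanged"
git log --format='%h %s'
```

Comparing the old commit with the new HEAD is a cheap way to confirm that nothing was left behind in the working tree.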

Pulling it together

Next, we pulled the new EARLY commits up to the top of a new rebase plan and squashed them together.

$ git rebase -i
edit 3aaaaa EARLY: Add feature A shell (__init__.py, settings.py)
fixup 7aaaaa EARLY: Add feature A models
fixup 9aaaaa EARLY: Improve feature A models
... other commits

Rewriting migrations

Last step, we’re nearly there! We went back to our new first commit and edited its migrations to squash them into a single migration. The changes were pretty simple so we did this by hand, but you could just as well delete the lot and recreate the initial migrations. Be careful: if later migrations depend on the ones you removed, you also need to edit the commits that add them and update their dependencies.
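Finding those dependent migrations can be done with a plain text search. The sketch below is a rough illustration on a scratch directory: the app names (feature_a, feature_b) and migration names are invented, and the pattern assumes dependencies are written as two-element tuples in either quote style.

```shell
#!/bin/sh
# Scratch demo: app and migration names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p feature_a/migrations feature_b/migrations

# feature_a has been squashed to a single 0001_initial, but feature_b
# still points at a migration that no longer exists.
echo "# squashed initial migration" > feature_a/migrations/0001_initial.py
cat > feature_b/migrations/0001_initial.py <<'EOF'
from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [
        ("feature_a", "0003_add_slug"),
    ]
    operations = []
EOF

# Names of the migrations removed by the squash (assumed for this demo).
removed="0002_add_title 0003_add_slug"

for old in $removed; do
    # Matches ('feature_a', '0003_add_slug') and the double-quoted form.
    pattern="[\"']feature_a[\"'], *[\"']${old}[\"']"
    find . -path '*/migrations/*.py' -exec grep -l "$pattern" {} + || true
done > "$work/stale.txt"

echo "Files with stale dependencies:"
cat "$work/stale.txt"
```

In a real repository, git log --diff-filter=A -- followed by the path of each listed file reports the commit that added it, which is the commit to mark as edit in the next rebase.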

Some advice

It’s really important to get two things clear in your mind before you start. You need to know exactly what code you want to pull into the new initial commits, and you need to know roughly when in the long commit history that work was done.

You should also not be afraid to do many, many rebases, doing a single thing each time. Push the results up as new branches if you’re unsure about what you’ve got, and let your CI tell you if the end result is still valid. You will likely hit some issues along the way, and small steps make it easier to see which rebase caused them.
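A local safety net works in the same spirit: create a branch before each rewrite and compare against it afterwards. The following is a small sketch on a throwaway repository. The branch name is made up, and a soft reset stands in for the interactive rebase so the demo runs unattended.

```shell
#!/bin/sh
# Throwaway demo: repo, file and branch names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
git config commit.gpgsign false

for n in 1 2 3; do
    echo "change $n" >> notes.txt
    git add notes.txt
    git commit -q -m "Change $n"
done

# Keep a pointer to the old history before rewriting anything.
git branch backup-before-squash

# One small rewrite: squash the last two commits into one.
git reset -q --soft HEAD~2
git commit -q -m "Changes 2 and 3"

# Same content, different history: the diff is empty, the commit count is not.
git diff --quiet backup-before-squash HEAD && echo "content identical"
echo "before: $(git rev-list --count backup-before-squash) commits"
echo "after:  $(git rev-list --count HEAD) commits"

# If the rewrite had gone wrong, this would restore the old state:
#   git reset --hard backup-before-squash
```

An empty diff against the backup only holds for pure squashes, splits and reorders. Once you deliberately change content, as with the migrations, the diff should show exactly that change and nothing else.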

Squashing and splitting are easier than reordering, as they reduce the chance of conflicts. They’re also closer to what you do on a daily basis.

A note on feature flags

We could have feature flagged everything, merged to master regularly and then just amended the feature flags to keep everything invisible apart from the bits we needed. We decided this would be time-consuming and error-prone to maintain, and would also require extra work to tidy up afterwards. As the whole team has been working on this feature most of the time, keeping the branch up to date with the smaller changes on master has been pretty easy. The large number of year-old feature flags still not tidied up in the code also made this plan somewhat off-putting.

Written by Marc Tamlyn.