Organize your development to avoid big pull requests

BenjO
Published in swile-engineering
5 min read · May 13, 2022

aka “How to efficiently split my work into smaller chunks?” or “I was told that my PR is too big, what can I do?”

Don’t smile, we’ve all seen this!

As a developer, you kind of immediately “feel” when a PR is doing too much. Your team might even have set a size limit. Among other things, big PRs tend to increase technical debt and hurt reviewers’ morale.

This article won’t tell you what ideal size a PR should be. Instead it gives you clues to organize and split code if you need or want to.

Knowing how to efficiently split your work into smaller units is a valuable tool to have in your arsenal. It gives you a clear understanding of the changes made and allows reviewers to focus on one scope at a time.

This is a process that is easy to learn and hard to master. Here is what works for me 👇

You can use this workflow at different stages of your feature lifecycle. If it works for you, the sooner the better, because splitting early avoids conflicts:

  • before/while coding the feature
    🤩 (ideally what you should aim for)
  • after coding a PR and before submitting to review
    🙂 (nice but takes you a bit of time)
  • during PR review, if a teammate asks for a split or you get no reviews at all
    😬 (suboptimal because you and teammates spend time)

Now let’s detail each step. I encourage you to write everything down the first couple of times: it helps you visualize the work.

I’ll use the same example to guide us.

Let’s say you have to get a bunch of reports from the network, store them in a database and display them onscreen.

Step 1: List

First you want to write down all the changes you made (or will make), even the obvious ones.

At first you may come up with really big changes. Keep iterating on them so they become atomic. Look for keywords such as “and”, “then”, “also”. They often indicate that you can split.

Let’s try again 💪

Here you go, an exhaustive list of atomic changes. And there are quite a few!

Step 2: Analyze

From this list, you can now extract an oriented tree (who doesn’t like graphs?! 🥰). Arrows show the dependency between two changes.

Still, if a box appears to be doing too many things, feel free to go back to step 1.
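If you prefer text over drawings, the same analysis can be sketched in a shell. This is only an illustration: the change names and the dependencies between them are hypothetical (they are not the ones from the graph above), and each line reads "prerequisite dependent". The standard tsort utility prints one valid coding order, and a small awk filter lists the changes that have no prerequisite, i.e. the ones you can start with.

```shell
# Hypothetical changes and dependencies, one "<prerequisite> <dependent>" pair per line.
edges='network-client sync
report-model db-schema
db-schema sync
db-schema read-from-db
report-screen read-from-db'

# tsort prints an order in which every change comes after its prerequisites.
printf '%s\n' "$edges" | tsort

# Changes that never appear in the second column depend on nothing.
ready=$(printf '%s\n' "$edges" |
  awk '{ seen[$1]; seen[$2]; blocked[$2] }
       END { for (n in seen) if (!(n in blocked)) print n }' | sort)
printf 'Can start now:\n%s\n' "$ready"
```

Writing the pairs down also forces you to name each change, which is a good check that it is really atomic.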

Step 3: Split

Now comes the “fun” part, the split 🤸‍♂️. Here is one proposal, but there are many others.
Two things to keep in mind to boost productivity:

  • minimize coupling between PRs. Fewer dependencies between them means fewer merge conflicts
  • send PRs to review as soon as possible. Time is precious, so take advantage of parallelization: while one PR is being reviewed, you can code the next one. It’s like passive income, but for code!

First iteration

Start by focusing on the leaves 🍁

The tree leaves indicate the first things you can code without any dependency. Depending on the granularity you want, your project requirements or what your teammates ask for, you may decide to group certain changes.

Here is a possible first split: PR#1 contains the refactoring, PR#2 some data/domain-related changes and PR#3 only the UI part. Note that PR#2 contains multiple changes.

The great thing with this approach is that you can code any of these 3 PRs in the order you want, then submit and merge them in parallel. You make the most of your time.

Tips

Splitting an already existing PR can sometimes be painful as you mixed soooo many changes. Here’s what you can do to avoid pulling your hair out.

From your initial PR branch, create another branch.

Use git reset --soft in order to remove all commits while keeping all the changes in your working copy.

Discard any change obviously unrelated to the scope of this PR. It can be a whole file, a function or just a single line. Don’t bother with the details for now, the idea of this first pass is to quickly strip out unwanted code.
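Here is a minimal sketch of these first steps, run in a throwaway repository so you can try it safely. The branch names (big-feature, split-ui) and file names (ui.txt, db.txt) are made up for the demo, and resetting to the merge base with the main branch is my assumption of a sensible reset target.

```shell
# Sandbox: a throwaway repository, nothing here touches your real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)   # "main" or "master", depending on your git config

# Simulate the big PR: two commits mixing UI and database work.
git checkout -q -b big-feature
echo "screen" > ui.txt
echo "schema" > db.txt
git add . && git commit -q -m "wip: ui and db"
echo "more screen" >> ui.txt
git add . && git commit -q -m "wip: more ui"

# 1. From the initial PR branch, create another branch.
git checkout -q -b split-ui

# 2. Remove the commits but keep all the changes (they stay staged).
git reset --soft "$(git merge-base "$base" HEAD)"

# 3. Discard what is unrelated to this PR (here, the database file).
git reset -q -- db.txt   # unstage it
rm db.txt                # it is a new file, so deleting it discards it

# 4. Commit what is left as the smaller PR.
git commit -q -m "Add report screen"
```

The original big-feature branch is left untouched, so nothing is lost: you can go back to it for the next PR. To discard only part of a file, git reset -p and git checkout -p let you pick individual hunks.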

Then comes the second pass. You need to make sure your project compiles and your existing tests pass. Sometimes it’s easier said than done.
You may depend on a class that will be implemented later in your split: you can add interfaces with dumb implementations.
You may be embarrassed to merge code that is not fully ready: you can introduce a feature flag first in order to disable the whole feature while it’s being developed.

Finally make sure to add tests if you missed that part.

While CI is busy working and teammates are reviewing, you can move on to the second iteration.

Second iteration

Let’s update our graph and see what’s remaining after the first iteration.

Here are our 4 remaining changes. Notice there is still a dependency.
PR#4 contains the database schema migration, PR#5 the sync between network and database, and PR#6 fetches from the database to connect the UI.

Step 4: Submit

During our first iteration, we coded PR#1, #2 and #3. These were isolated features. Once one was ready, you were able to send it to review while working on the others independently.

But what if you need to start working on a PR that depends on several others still in review?

Because each PR has its own branch, you can easily git cherry-pick any commit you need. Once a PR is reviewed, validated and merged, you only need to git rebase on your main development branch.

PR#C needs specific commits from PR#A and PR#B. A simple cherry-pick can do the trick temporarily.
Once PR#A and PR#B are merged, you can rebase PR#C. Depending on the modifications made during #A and #B review, you may have conflicts on #C. It’s normal.
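Here is a rough sketch of that cherry-pick then rebase dance, again in a throwaway repository. The branch names (pr-a, pr-b, pr-c) and files are invented for the demo, and the review and merge of PR#A and PR#B are simulated with local merges. In a real project the merge happens on your hosting platform, and you would fetch before rebasing.

```shell
# Sandbox: a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
git commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)   # "main" or "master", depending on your git config

# PR#A and PR#B: two independent branches, both "in review".
git checkout -q -b pr-a "$base"
echo "a" > a.txt && git add a.txt && git commit -q -m "A: add a"
git checkout -q -b pr-b "$base"
echo "b" > b.txt && git add b.txt && git commit -q -m "B: add b"

# PR#C starts from the main branch and temporarily borrows the commits it needs.
git checkout -q -b pr-c "$base"
git cherry-pick pr-a pr-b   # here: the tip commit of each branch
echo "c" > c.txt && git add c.txt && git commit -q -m "C: add c"

# Later, PR#A and PR#B get merged (simulated locally).
git checkout -q "$base"
git merge -q --no-edit pr-a
git merge -q --no-edit pr-b

# Rebase PR#C: the borrowed commits are already upstream, so only C's own work remains.
git checkout -q pr-c
git rebase -q "$base"
git log --oneline "$base"..pr-c
```

In this demo the rebase is clean because #A and #B were merged exactly as they were cherry-picked. If they changed during review, this is the moment where conflicts on #C show up.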

So that was it! I would love to hear your feedback, whether you have a whole different process or just do some steps differently to organize your work.

Further reading

https://github.com/google/eng-practices/blob/master/review/developer/small-cls.md
