How I managed not to mess up my Git history thanks to `git pull --rebase …`

Week 4 GSoC 2020 — First Coding Phase with AnitaB.Org

Maya Treacy
AnitaB.org Open Source
8 min read · Jun 29, 2020

Source: https://www.slideshare.net/GregoryBataille/git-vizualiation-understand-what-you-do

This 4th week of GSoC passed smoothly without too many hiccups, perhaps because the basic code I wrote during the previous weeks helped set the “style” for the code that followed.

But today I’m going to tell you about my experience managing Git in my project repositories. My workflow in this GSoC project is nothing special (I guess, since I’m not sure how it is done in the “outside” world). I just need to pick up one of the GSoC-designated tasks, work on that issue by opening an issue-specific branch, push the code changes by opening a PR, then move on to the next issue. That’s simple, right? (….🤥…😈)

There’s a hidden challenge here that you might not be aware of. In reality, most of the issues in the BridgeInTech project depend on each other, since this project is an original idea where the code has to be written from scratch. The project also follows the agile approach, where the backend and frontend are developed in parallel and tasks are divided into small sprints.

To give you a brief example, the basic setup was the first task (~ task 1) in line, as it laid the foundation for the project’s backend development. The task that followed (~ task 2) was the Register functionality, which was then followed by the unit tests task (~ task 3) for this Register functionality.

Ideally, mentors’ approval is needed on a pull request before I can move on from one task to another. It isn’t much of an issue if I have a parallel task that I can switch to between backend and frontend, since those are not normally highly dependent on each other. However, when the parallel frontend/backend tasks are done and still waiting for mentors’ approval, I have no choice but to continue with the next task on the list. The challenge then comes from the interdependencies of these tasks: the Register functionality (task 2) depends on the code written for the basic setup (task 1), and the test cases for the Register functionality (task 3) depend on the Register code (task 2). In order to stay productive and within the GSoC schedule, I need to move forward and work on the next task on the list regardless of whether the PR under review has been approved. You can see now why this becomes an issue, right?

For simplicity, from this point onward I’ll refer to the tasks by number, and in the following hypothetical cases every task depends on the task before it.

Potential issue number 1:

Case:

Let’s pretend task 1 is currently under review and I have already moved on to the next task (task 2). While I’m working on task 2, mentors provide their feedback on task 1 and I make the requested changes on that branch (the task 1 branch).

Problem:

How do I get the code changes made on the task 1 branch reflected on the task 2 branch, so that I have the desired/approved code to continue working on task 2? Most importantly, can I do this without having to write the changes twice (on both the task 1 and task 2 branches)?

Potential issue number 2:

Case:

This time task 1 and task 2 are both under review and I am currently working on task 3. While I’m working on task 3, mentors provide their feedback on task 2 and I make the changes on the task 2 branch.

Problem:

What if mentors approve and merge task 2 first, then review task 1 and merge it afterwards? Remember that task 1 doesn’t have the additional code that task 2 has, since it was the preceding task. To make more sense of it, think of task 2 as the Register functionality and task 1 as the basic setup. The basic setup branch won’t have the Register files, since I didn’t write that functionality in task 1. If this task 1 branch is merged after task 2, wouldn’t it cancel out (delete) the register folder/files that the first merge brought in from task 2? How can we prevent this? Do I need to apply the task 2 changes backwards to task 1? But this would lead to code duplication, which is not desirable.

Let’s discuss these one case at a time.

Case one: Syncing code changes across interdependent branches

To start with, to guard against future “Git” trouble, whenever I open a new branch I always make sure my local, remote and upstream “develop” branches (develop is our main branch for development) are in sync. I do this by running the following:

$ git checkout develop

$ git pull --rebase upstream develop

$ git push origin develop
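To see what these three commands do without touching a real project, here is a minimal sandbox sketch. Everything in it is a hypothetical stand-in: two bare repositories in a temporary directory play the roles of `upstream` and `origin`, and the file name, commit messages and identity are made up for illustration.

```shell
#!/bin/sh
# Sandbox sketch: every name and path below is a hypothetical stand-in.
set -e
export GIT_AUTHOR_NAME="Sandbox" GIT_AUTHOR_EMAIL="sandbox@example.com"
export GIT_COMMITTER_NAME="Sandbox" GIT_COMMITTER_EMAIL="sandbox@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"

# Two bare repositories play the shared project ("upstream") and my fork ("origin").
git init -q --bare upstream.git
git init -q --bare origin.git

# Seed both with the same first commit on develop.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/develop
echo "basic setup" > README
git add README
git commit -q -m "Basic setup"
git push -q "$sandbox/upstream.git" develop
git push -q "$sandbox/origin.git" develop

# Someone else's change lands on upstream only, so the fork falls behind.
echo "merged PR" >> README
git commit -q -am "Another contributor's change"
git push -q "$sandbox/upstream.git" develop
cd "$sandbox"

# My local clone of the fork, with upstream added as a second remote.
git clone -q -b develop "$sandbox/origin.git" local
cd local
git remote add upstream "$sandbox/upstream.git"

# The three commands from the article.
git checkout -q develop
git pull -q --rebase upstream develop
git push -q origin develop

# All three copies of develop should now point at the same commit.
local_head=$(git rev-parse develop)
upstream_head=$(git -C "$sandbox/upstream.git" rev-parse develop)
origin_head=$(git -C "$sandbox/origin.git" rev-parse develop)
echo "local:    $local_head"
echo "upstream: $upstream_head"
echo "origin:   $origin_head"
```

Because my local develop has no commits of its own in this sketch, the pull simply moves it forward to upstream's latest commit, and the push brings my fork up to date.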

Skipping forward, I now have Case one, where I need to sync changes across all interdependent branches. I had tried git pull before, but it gives me a new merge commit in my Git history that I don’t want to see, because it’s an unnecessary extra commit on the branch I’m pulling into. From my mentor Meena I learnt that it is recommended to use git pull --rebase whenever I want to bring code changes from one branch into another. So I gave this a try… and it works: I could see my changes from the other branch pulled into my current working PR branch without an extra merge commit in the history 🎉.
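The difference between the two commands can be reproduced in a throwaway repository. This is only a sketch under assumed names: the branches `task-1`, `task-2-merge` and `task-2-rebase`, the file names and the commit messages are all hypothetical. It pulls the same task 1 update into two identical copies of task 2, once with a plain merge pull and once with `--rebase`, then counts the merge commits on each.

```shell
#!/bin/sh
# Sandbox sketch: branch names, files and messages are hypothetical.
set -e
export GIT_AUTHOR_NAME="Sandbox" GIT_AUTHOR_EMAIL="sandbox@example.com"
export GIT_COMMITTER_NAME="Sandbox" GIT_COMMITTER_EMAIL="sandbox@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare origin.git
git init -q work
cd work
git symbolic-ref HEAD refs/heads/task-1
git remote add origin "$sandbox/origin.git"

# Task 1, then task 2 branched off it.
echo "setup" > setup.txt
git add setup.txt
git commit -q -m "Task 1: basic setup"
git checkout -q -b task-2
echo "register" > register.txt
git add register.txt
git commit -q -m "Task 2: register"

# Review feedback arrives on task 1 after task 2 was started.
git checkout -q task-1
echo "review fix" >> setup.txt
git commit -q -am "Task 1: address review feedback"
git push -q origin task-1

# Two identical copies of task 2, one per strategy.
git branch task-2-merge task-2
git branch task-2-rebase task-2

# Plain pull (merge strategy made explicit): adds a merge commit.
git checkout -q task-2-merge
git pull -q --no-rebase --no-edit origin task-1
merges_after_pull=$(git rev-list --merges --count HEAD)

# Pull with rebase: replays the task 2 commit on top of task 1.
git checkout -q task-2-rebase
git pull -q --rebase origin task-1
merges_after_rebase=$(git rev-list --merges --count HEAD)

echo "merge commits after plain pull:    $merges_after_pull"
echo "merge commits after pull --rebase: $merges_after_rebase"
git log --oneline --graph task-2-merge
git log --oneline --graph task-2-rebase
```

Both branches end up with the same files. Only the shape of the history differs: the rebased copy stays a straight line, with the task 2 commit sitting on top of the task 1 fix.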

Warning!! Don’t rush to try git pull --rebase before you take the following precautionary steps!!

Important!! You need to know which branch you want to be on top of the history ladder.

In my Case one example, since I have task 1 <= task 2 <= task 3 in order of their dependencies, naturally I want task 3 to have all changes from task 2 and that task 2 should have all changes from task 1.

With this in mind, I first have to check out the task 2 branch and git pull --rebase task 1 into it. Once that’s done, I do the same on the task 3 branch, pulling in task 2. Confused? Let’s visualize this… 👀

(Note 1: Keep in mind that I always sync my origin and local working branch every time I make changes to my local branch, so for the purpose of simplicity, I will not mention it in the steps below.)

1. check out the Task 2 branch

2. check the git log by running git log --oneline --graph

In the image below, my Task 1 branch is named “issue47-create-login-api” and it has one squashed commit named “Feat: User Login API”. My Task 2 branch is called “test-namedtuple” and has 3 separate commits (the first 3 from the top of the list).

3. run git pull --rebase origin task-1-branch

4. re-run step 2 to see the effect on your git history

5. once done with the pull rebase on this Task 2 branch, push the changes to the remote by running git push -f origin task-2-branch (a force push is needed because the rebase rewrote this branch’s commits)

6. move on to the Task 3 branch (branch name: “issue66-login-user-tests”, commit name: “Test: Issue66 test cases for User Login API”) and repeat the same steps, pull rebasing each preceding branch and starting from the bottom of the line (the Task 1 branch first, then Task 2)

Image before pull rebase on Task 3 branch

Image after pull rebase all preceding branches on Task 3 branch
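Since the screenshots can’t be re-run, the six steps above can also be sketched as a script in a throwaway repository. This is an illustration under assumptions, not my actual project: the branch names `task-1`, `task-2` and `task-3`, the `commit_file` helper, the file names and the commit messages are all hypothetical.

```shell
#!/bin/sh
# Sandbox sketch of the chained pull-rebase; all names are hypothetical.
set -e
export GIT_AUTHOR_NAME="Sandbox" GIT_AUTHOR_EMAIL="sandbox@example.com"
export GIT_COMMITTER_NAME="Sandbox" GIT_COMMITTER_EMAIL="sandbox@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare origin.git
git init -q work
cd work
git symbolic-ref HEAD refs/heads/task-1
git remote add origin "$sandbox/origin.git"

# Hypothetical helper: write one file and commit it.
commit_file() {
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

# Three dependent branches, each started from the previous one.
commit_file setup.txt "setup" "Task 1: basic setup"
git checkout -q -b task-2
commit_file register.txt "register" "Task 2: register"
git checkout -q -b task-3
commit_file test_register.txt "tests" "Task 3: register tests"
git push -q origin task-1 task-2 task-3

# Review feedback is addressed on task 1 and pushed.
git checkout -q task-1
commit_file setup.txt "setup, reviewed" "Task 1: address review feedback"
git push -q origin task-1

# Steps 1 to 5: bring task 1 into task 2, then force-push task 2.
git checkout -q task-2
git log --oneline --graph
git pull -q --rebase origin task-1
git log --oneline --graph
git push -q -f origin task-2

# Step 6: on task 3, pull-rebase task 1 first, then task 2.
git checkout -q task-3
git pull -q --rebase origin task-1
git pull -q --rebase origin task-2
git push -q -f origin task-3

# Commit subjects on task 3, oldest first.
history=$(git log --format=%s --reverse task-3)
echo "$history"
```

At the end, task 3 should hold a single straight line of four commits: the two task 1 commits, then task 2, then task 3. If other people also push to these branches, `git push --force-with-lease` is a more cautious alternative to `-f`, since it refuses to overwrite remote commits you have not fetched yet.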

(Note 2: In the example given here I didn’t have any merge conflicts when I ran git pull --rebase. This is because it was not the first time I had run git pull --rebase on these branches, and I had already resolved the merge conflicts on the initial pull. So if you come across a merge conflict, that’s normal. Just resolve it manually and continue. I’m not going to explain the details of resolving merge conflicts in this blog, but you can read about it here.)

Tip: if the branch you’re going to pull rebase from has lots of mini commits, squash them into one commit before you pull. This will save you from rebase hell (from merge conflicts), since with separate little commits you’ll have to resolve each commit’s conflicts individually before the rebase can complete. This article explains it better.
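One non-interactive way to squash a branch, sketched below, is a soft reset back to the point where the branch left `develop`, followed by a single new commit. This is just one option (an interactive `git rebase -i` is another), and the branch name `task-1`, the file names and the commit messages are hypothetical.

```shell
#!/bin/sh
# Sandbox sketch: squash a branch's small commits into one; names are hypothetical.
set -e
export GIT_AUTHOR_NAME="Sandbox" GIT_AUTHOR_EMAIL="sandbox@example.com"
export GIT_COMMITTER_NAME="Sandbox" GIT_COMMITTER_EMAIL="sandbox@example.com"

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/develop
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# A task branch with three small work-in-progress commits.
git checkout -q -b task-1
for part in one two three; do
    echo "$part" >> feature.txt
    git add feature.txt
    git commit -q -m "WIP: part $part"
done
commits_before=$(git rev-list --count develop..task-1)
tree_before=$(git rev-parse 'HEAD^{tree}')

# Move the branch pointer back to where it left develop, keeping every
# change staged, then record the staged changes as a single commit.
git reset -q --soft "$(git merge-base develop HEAD)"
git commit -q -m "Feat: task 1 (squashed)"
commits_after=$(git rev-list --count develop..task-1)
tree_after=$(git rev-parse 'HEAD^{tree}')

echo "commits before squash: $commits_before"
echo "commits after squash:  $commits_after"
```

The files on the branch are unchanged by the squash; only the number of commits drops. If the branch was already pushed, the squashed version has to be force-pushed, just like after a rebase.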

With that, we solved Case one.

Case two: Ensuring interdependent branches are merged in an “orderly” fashion

Ok, so we’ve synced the branches. Next, we want to make sure that the branches will be merged in the order we want; otherwise, there’s a risk that the code from the earlier task will cancel out the code from the later task if the later task gets merged before its preceding task.

In our example, we want Task 1 to be merged first, followed by Task 2 then finally Task 3. Is there a way to do this easily?

The answer is yes. Luckily, GitHub has a feature called “add dependency”, where we can declare that an issue has a blocking issue that needs to be resolved first.

As you can see, my Task 3 issue above is blocked by the Task 2 issue below, and the Task 2 issue shows that it is blocked by Task 1.

That’s all for Case two.

Join me in my next blog… 👋

Profile links:

LinkedIn || Github || Online resume

Master of IT student / Software Developer / GSoC20 Student with AnitaB.org / Passionate learner