Forks and Branches for open-source

Git Ninja Training

Naveen Sundar
NITRR Open Source
5 min read · Apr 13, 2020


This story is part 4 of Open Source 101 Series

Previous story: Git for Open-Source

If you are new to Git, I highly recommend reading our previous story, where we laid down the basics. It covers repositories, remotes, commit, push, pull and clone.

In this story we will explore forks and branches, fine-tuned for open-source.

Fork — creating your own copy of a repository

Figure 1: the fork button on GitHub

Open-source orgs don't allow people to commit code directly to their repos, for security reasons. Contributions are traditionally welcomed via pull requests: the contributor commits code to their forked repo, and those commits later become a pull request.

Branches — are independent lines of development.

Figure 2: Feature Branches & Figure 3: Isolated work places

Branches mainly serve two purposes. They provide separate lines of development for:

  1. different contributors: imagine your full day of work being messed up by another contributor’s commit!
  2. different features: a branch acts as an isolated place to develop a feature, fix a bug, store a version of the software, etc.

A repository can have one or more branches. The default branch has traditionally been called “master” (newer repositories often use “main”).

Open-source orgs usually have a branch called “dev” (development). This is the branch you open your pull request against, from the feature branch in your forked repository.

Git commands for branches — cheatsheet:
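
As a rough cheatsheet, here are the everyday branch commands. This is a minimal sketch, assuming a throwaway repository in a temporary folder; the identity "Demo User" and the branch name "hotfix" are placeholders made up for the demo, and the empty commit only exists so there is something to branch from.

```shell
# Throwaway repository, so the commands below are safe to try.
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b dev                     # name the first branch "dev"
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial commit"

git branch new_feature         # create a branch from the current branch
git branch                     # list local branches (* marks the current one)
git checkout new_feature       # switch to an existing branch
git checkout -b bug_fix dev    # create a branch from dev and switch to it
git branch -m bug_fix hotfix   # rename a branch
git checkout dev               # go back to dev
git merge new_feature          # merge new_feature into the current branch (dev)
git branch -d new_feature      # delete a branch that is already merged
```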

How to work with branches?

Figure 4: Merging in branches

Creation:

  1. A branch is created from another branch.
  2. When a branch is created, it shares the past history (previous commits) with the branch it is created from. More precisely, it shares the history up to the last commit made before the branch was created.

Checkout:

  1. It means switching to another branch.
  2. In this operation, HEAD moves from the current branch/commit to another branch/commit, for example from dev/23dve2 to bug_fix/23dve2.
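
Creation and checkout can be tried out like this. It is a small sketch in a throwaway repository; the placeholder identity and the empty commits are assumptions made only to keep the demo short.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b dev
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first commit on dev"
git commit -q --allow-empty -m "second commit on dev"

# Creation: bug_fix starts at dev's latest commit and shares its history.
git branch bug_fix
git log --oneline bug_fix          # lists the same two commits as dev

# Checkout: HEAD moves from dev to bug_fix (same commit, different branch).
git rev-parse --abbrev-ref HEAD    # prints: dev
git checkout -q bug_fix
git rev-parse --abbrev-ref HEAD    # prints: bug_fix
```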

Divergence of branches:

  1. It means the branches being compared have different histories.
  2. In figure 4, part 2, we can see that “dev” and “new_feature” have different commits, so we say they have diverged.
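
One way to check whether two branches have diverged is to count the commits that exist on only one side. The sketch below assumes a throwaway repository with one commit on each of “dev” and “new_feature” after they split; the identity and the empty commits are placeholders.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b dev
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "shared commit"

git checkout -q -b new_feature
git commit -q --allow-empty -m "work on new_feature"
git checkout -q dev
git commit -q --allow-empty -m "work on dev"

# Two counts: commits only on dev, then commits only on new_feature (here 1 and 1).
git rev-list --left-right --count dev...new_feature

# The last commit both branches share, i.e. the point where they diverged.
git merge-base dev new_feature
```

If both counts are greater than zero, the branches have diverged.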

Merging:

So different people and features have their own branches to commit code to. If that’s the case, how do isolated features end up as “one piece” in the final production code?

The answer is we combine the branches together to get the final code. When we combine the branches it is called a merge.

If commits are to be merged “from B” => “to A”, then:

  1. If both A and B have new commits, the result is a 3-way merge. In addition to bringing the commits from B into A, a merge commit is added on top. Figure 4, part 2: “new_feature” is merged into “dev”.
  2. If A has no new commits since B branched off, the result is a fast-forward merge: A simply moves forward to B's latest commit, with no merge commit this time. (If B has no new commits, there is nothing to merge.) Figure 4, part 3: “bug_fix” is merged into “dev”.
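
Both kinds of merge can be reproduced in a throwaway repository. This is only a sketch: the branch names follow figure 4, while the identity and the empty commits are assumptions made to keep it short.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b dev
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "shared commit"

# Case 1: dev and new_feature both get new commits -> 3-way merge.
git checkout -q -b new_feature
git commit -q --allow-empty -m "work on new_feature"
git checkout -q dev
git commit -q --allow-empty -m "other work on dev"
git merge -q --no-edit new_feature   # adds a merge commit on dev
git rev-list --count --merges dev    # prints: 1

# Case 2: only bug_fix gets a new commit -> fast-forward merge.
git checkout -q -b bug_fix
git commit -q --allow-empty -m "fix on bug_fix"
git checkout -q dev
git merge -q bug_fix                 # dev just moves forward to bug_fix
git rev-list --count --merges dev    # still prints: 1 (no new merge commit)

git log --oneline --graph dev        # shows the resulting history
```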

Merge conflicts:

When another contributor's commits in “branch A” change the same part of the code as your commits in “branch B”, Git won't know how to perform the merge. This results in a merge conflict. We will go into detail about how to resolve merge conflicts in the upcoming article.

Forks and branches are cool, now let us see how everything comes together!

Figure 5: each circle is a commit

When you develop software in open-source you are required to manage 3 git trees. You may think that’s quite a lot, but it is actually pretty simple! (Also remember that the fellow members of the org are willing to help you.) Let's break down the complexity into small consumable chunks.

  1. upstream remote: the original repo of the project you are contributing to.
  2. origin remote: the forked repo.
  3. local: the cloned git tree on your local machine.

Project Setup

  1. Fork the repo
  2. Clone the forked repo
  3. Add the upstream remote to the existing project
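
The three setup steps can be rehearsed offline. In this sketch, local bare repositories stand in for GitHub: "upstream.git" plays the original repo and "fork.git" plays your fork. Those names, the "seed" and "project" folders, and the identity are all assumptions for the demo; on GitHub you would press the fork button and use the real repository URLs instead.

```shell
set -e
work="$(mktemp -d)"
cd "$work"

# Stand-in for the project's original repo, with one commit on dev.
git init -q seed
cd seed
git checkout -q -b dev
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial commit"
cd "$work"
git clone -q --bare seed upstream.git

# 1. Fork the repo (on GitHub this is the fork button).
git clone -q --bare upstream.git fork.git

# 2. Clone the forked repo; the fork automatically becomes "origin".
git clone -q "$work/fork.git" project

# 3. Add the upstream remote to the existing project.
cd project
git remote add upstream "$work/upstream.git"

git remote -v      # lists both origin (the fork) and upstream (the original)
```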

Why do we need to manage 3 trees?

  1. We need a git tree locally for committing code. Open-source orgs don't give direct access to push code into their original repo, so we also need a fork to push to. That is why there are 3 trees!
  2. Other people will be making PRs to upstream, and each merged PR updates “upstream/dev”.
  3. The question then is how to get the updated code into “local” and “origin”.
  4. The issue becomes more complicated when the updated code has merge conflicts with your code.

How to get the updated code to the local and fork (origin)?

The answer is quite simple. Most probably the updated code will be in “upstream/dev”. All you need to do is check out “dev” in your local repo and pull from “upstream/dev”. After that, you can push the same code to your fork (origin).
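
Here is what that sync looks like, sketched offline. As an assumption for the demo, local bare repositories "upstream.git" and "fork.git" stand in for the original repo and your fork on GitHub, and a commit pushed from the "seed" folder plays the role of someone else's merged PR.

```shell
set -e
work="$(mktemp -d)"
cd "$work"

# Demo setup: original repo, fork, and local clone with both remotes.
git init -q seed
cd seed
git checkout -q -b dev
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial commit"
cd "$work"
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git
git clone -q "$work/fork.git" project
git -C project remote add upstream "$work/upstream.git"

# Someone else's PR gets merged, so upstream's dev moves ahead.
git -C seed commit -q --allow-empty -m "merged PR from another contributor"
git -C seed push -q "$work/upstream.git" dev

# The actual sync, run inside your local clone.
cd "$work/project"
git checkout -q dev
git pull -q upstream dev     # bring the updated code into local dev
git push -q origin dev       # send the same code to your fork
```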

What if you are already working on a feature in your personal branch?

Pull “upstream/dev” into “personal_branch”. Beware: if both branches have new commits, this will create a three-way merge commit.
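
A sketch of that situation, again with local bare repositories standing in for GitHub (an assumption for the demo, as are the "seed" and "project" folders and the identity). The --no-rebase flag asks for a merge explicitly, since newer Git versions may refuse to pull diverged branches until you say how to reconcile them, and --no-edit accepts the default merge message.

```shell
set -e
work="$(mktemp -d)"
cd "$work"

# Demo setup: original repo, fork, and local clone with both remotes.
git init -q seed
cd seed
git checkout -q -b dev
git config user.name "Demo User"           # placeholder identity for the demo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "initial commit"
cd "$work"
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git
git clone -q "$work/fork.git" project
git -C project remote add upstream "$work/upstream.git"

# You are working on a feature in your personal branch...
cd "$work/project"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b personal_branch
git commit -q --allow-empty -m "my feature work"

# ...while someone else's PR gets merged upstream.
git -C "$work/seed" commit -q --allow-empty -m "merged PR from another contributor"
git -C "$work/seed" push -q "$work/upstream.git" dev

# Pull upstream's dev into personal_branch: both sides have new commits,
# so Git creates a three-way merge commit.
git pull -q --no-rebase --no-edit upstream dev
git log --oneline --graph personal_branch
```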

Please follow our publication for more git ninja training. Our next article will be on merge conflicts and how to handle them.

ding ding!!!


Naveen Sundar
NITRR Open Source

Full Stack Developer. Open Source Contributor. Space Enthusiast. Viva la SpaceX.