Forks and Branches for open-source
Git Ninja Training
This story is part 4 of Open Source 101 Series
Previous story: Git for Open-Source
If you are new to git, I highly recommend reading our previous story, where we lay down the basics. It covers repository, remote, commit, push, pull and clone.
In this story we will explore forks and branches, fine tuned for open-source.
Fork — creating your own copy of a repository
Open-source orgs don't allow people to commit code directly to their repos, for security reasons. Contributions are traditionally welcomed via pull requests. The contributor commits code to a forked repo, and those commits later become a pull request.
Branches — are independent lines of development.
Branches mainly serve two purposes. They provide separate lines of development for:
- different contributors: imagine a full day of your work being messed up by another contributor's commit!
- different features: a branch acts as an isolated place to develop a feature, fix a bug, store a version of the software, and so on.
A repository can have one or more branches. The default branch is traditionally called “master” (newer repositories often use “main”).
Open-source orgs usually have a branch called “dev” (development). This is the branch you open your pull request against, from the feature branch in your forked repository.
Git commands for branches — cheatsheet:
create new_branch from old_branch
git checkout -b new_branch old_branch
Ex: git checkout -b login_ui dev

check current branch name
git branch

visit other_branch
git checkout other_branch
Ex: git checkout signup_ui

push new_branch to some_remote
git push -u some_remote new_branch
Ex: git push -u origin login_ui

pull some_branch from some_remote into current_branch
1. git checkout current_branch
2. git pull some_remote some_branch
Ex:
git checkout dev
git pull upstream dev

delete some_branch locally
git branch -D some_branch
Ex: git branch -D login_ui

delete some_branch in some_remote
git push --delete some_remote some_branch
Ex: git push --delete origin login_ui

merge second_branch into first_branch
1. git checkout first_branch
2. git merge second_branch
Ex:
git checkout dev
git merge bug_fix

add upstream remote to the existing project
git remote add upstream <original-project-remote-url>
How to work with branches?
Creation:
- A branch is created from another branch.
- When a branch is created, it shares its past history (previous commits) with the branch it was created from. More precisely, it shares the history up to and including the last commit before the creation.
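To see the shared history for yourself, here is a minimal sketch you can run in a throwaway directory. The demo identity, the empty commits and the temporary directory are assumptions made only for this illustration; the branch names “dev” and “login_ui” come from the cheatsheet above.

```shell
set -e
repo=$(mktemp -d)            # throwaway directory, safe to delete afterwards
cd "$repo"
git init -q
git config user.name "Demo User"           # demo identity (assumption)
git config user.email "demo@example.com"
git checkout -q -b dev       # name the first branch "dev"

# Two commits on dev (empty commits keep the demo short)
git commit -q --allow-empty -m "first commit on dev"
git commit -q --allow-empty -m "second commit on dev"

# Create login_ui from dev
git checkout -q -b login_ui dev

dev_tip=$(git rev-parse dev)
login_tip=$(git rev-parse login_ui)
shared_commits=$(git rev-list --count login_ui)

echo "dev tip:      $dev_tip"
echo "login_ui tip: $login_tip"
echo "commits visible from login_ui: $shared_commits"
```

Both tips print the same commit hash, and “login_ui” already sees both commits made on “dev” before it was created.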
Checkout:
- It means switching to (visiting) another branch.
- In this operation, HEAD is moved from the current branch/commit to another branch/commit, for example from dev/23dve2 to bug_fix/23dve2.
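A small sketch of what checkout does to HEAD, assuming a scratch repo with a single commit and a demo identity (both made up for this example):

```shell
set -e
repo=$(mktemp -d)            # throwaway directory
cd "$repo"
git init -q
git config user.name "Demo User"           # demo identity (assumption)
git config user.email "demo@example.com"
git checkout -q -b dev
git commit -q --allow-empty -m "initial commit"
git branch bug_fix           # create bug_fix without switching to it

before=$(git rev-parse --abbrev-ref HEAD)  # branch HEAD points to now
git checkout -q bug_fix
after=$(git rev-parse --abbrev-ref HEAD)   # branch HEAD points to after checkout

echo "HEAD moved from $before to $after"
```

The commit itself does not change here. Only the branch that HEAD points to changes, which matches the dev/23dve2 to bug_fix/23dve2 example.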
Divergence of branches:
- It means the branches being compared have different histories.
- In figure 4, part 2, “dev” and “new_feature” each have commits the other lacks, so we say they have diverged.
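One way to check whether two branches have diverged is to count the commits that exist on one side only. The sketch below assumes a scratch repo and a demo identity; the branch names follow the figure.

```shell
set -e
repo=$(mktemp -d)            # throwaway directory
cd "$repo"
git init -q
git config user.name "Demo User"           # demo identity (assumption)
git config user.email "demo@example.com"
git checkout -q -b dev
git commit -q --allow-empty -m "base commit"

# new_feature gets one commit of its own
git checkout -q -b new_feature dev
git commit -q --allow-empty -m "feature work"

# dev also moves on independently
git checkout -q dev
git commit -q --allow-empty -m "another change on dev"

# X..Y means "commits reachable from Y but not from X"
only_dev=$(git rev-list --count new_feature..dev)
only_feature=$(git rev-list --count dev..new_feature)

echo "commits only on dev:         $only_dev"
echo "commits only on new_feature: $only_feature"
```

When both counts are greater than zero, the branches have diverged. If one of them is zero, one branch is simply ahead of the other.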
Merging:
So different people and features have their own branches to commit code to. If that's the case, how do isolated features end up as “one piece” in the final production code?
The answer is that we combine the branches to get the final code. Combining branches is called a merge.
If commits are to be merged “from B” => “to A”, then:
- If both A and B have new commits, the result is a 3-way merge. In addition to bringing the commits from B into A, git adds a merge commit on top. Figure 4, part 2: “new_feature” is merged with “dev”.
- If only B has new commits, the result is a fast-forward merge: A simply moves forward to B's latest commit, with no merge commit. (If B has no new commits, there is nothing to merge.) Figure 4, part 3: “bug_fix” is merged with “dev”.
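The two kinds of merge can be reproduced in a scratch repo. This is only a sketch: the file names and the demo identity are invented for the example, and the branch names follow the figure.

```shell
set -e
repo=$(mktemp -d)            # throwaway directory
cd "$repo"
git init -q
git config user.name "Demo User"           # demo identity (assumption)
git config user.email "demo@example.com"
git checkout -q -b dev
echo "base" > app.txt
git add app.txt
git commit -q -m "base commit"

# Case 1: only bug_fix has new commits, so dev can fast-forward
git checkout -q -b bug_fix dev
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix a bug"
git checkout -q dev
git merge -q bug_fix
merges_after_ff=$(git rev-list --count --merges dev)

# Case 2: dev and new_feature both have new commits, so git does a 3-way merge
git checkout -q -b new_feature dev
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"
git checkout -q dev
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "more work on dev"
git merge -q --no-edit new_feature
merges_after_3way=$(git rev-list --count --merges dev)

echo "merge commits after fast-forward: $merges_after_ff"
echo "merge commits after 3-way merge:  $merges_after_3way"
```

The count of merge commits on “dev” stays at 0 after the fast-forward and becomes 1 after the 3-way merge.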
Merge conflicts:
When other contributors' commits in “branch A” change the same part of the code as your commits in “branch B”, git won't know how to perform the merge. This results in a merge conflict. We will go into detail about how to resolve merge conflicts in an upcoming article.
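If you want to see a conflict happen without risking real work, the sketch below provokes one in a scratch repo and then backs out with git merge --abort. The file config.txt, its contents and the demo identity are assumptions for illustration only.

```shell
set -e
repo=$(mktemp -d)            # throwaway directory
cd "$repo"
git init -q
git config user.name "Demo User"           # demo identity (assumption)
git config user.email "demo@example.com"
git checkout -q -b branch_a
echo "greeting = hello" > config.txt
git add config.txt
git commit -q -m "base commit"

# branch_b changes the line one way...
git checkout -q -b branch_b branch_a
echo "greeting = hi" > config.txt
git commit -q -am "branch_b changes the greeting"

# ...and branch_a changes the same line another way
git checkout -q branch_a
echo "greeting = hey" > config.txt
git commit -q -am "branch_a changes the same line"

conflicted=""
if git merge --no-edit branch_b >/dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
    # List the files git could not merge on its own
    conflicted=$(git diff --name-only --diff-filter=U)
    git merge --abort        # go back to the state before the merge
fi

echo "merge result: $merge_result"
echo "conflicted files: $conflicted"
```

After the abort, “branch_a” is back to exactly where it was before the merge attempt, so nothing is lost.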
Forks and branches are cool. Now let us see how everything comes together!
When you develop software in open-source, you need to manage 3 git trees. You may think that's quite a lot, but it is actually pretty simple! (Also remember that fellow members of the org are willing to help you.) Let's break the complexity down into small, digestible chunks.
- upstream remote: the original repo of the project you are contributing to.
- origin remote: the forked repo.
- local: the cloned git tree on your local machine.
Project Setup
- Fork the repo
- Clone the forked repo
- Add the upstream remote to the existing project
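The three setup steps can be rehearsed entirely on your own machine. In the sketch below, local bare repositories stand in for the hosted ones: upstream.git plays the original project and fork.git plays your fork (on a real hosting site you would press the Fork button instead of running git clone --bare). All paths, file names and the demo identity are assumptions for this illustration.

```shell
set -e
work=$(mktemp -d)            # throwaway directory holding all three trees
cd "$work"

# Stand-in for the original project: a bare repo with one commit on dev
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/dev
git init -q seed
cd seed
git config user.name "Demo User"           # demo identity (assumption)
git config user.email "demo@example.com"
git checkout -q -b dev
echo "hello" > README.txt
git add README.txt
git commit -q -m "initial commit"
git push -q "$work/upstream.git" dev
cd "$work"

# 1. Fork the repo (simulated with a bare clone)
git clone -q --bare upstream.git fork.git

# 2. Clone the forked repo; the fork automatically becomes "origin"
git clone -q fork.git local

# 3. Add the upstream remote to the existing project
cd local
git remote add upstream "$work/upstream.git"

git remote -v                # lists origin (the fork) and upstream
```

With a real project, only steps 2 and 3 are typed in the terminal, using the URLs of your fork and of the original repo.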
Why do we need to manage 3 trees?
- We need a local git tree for committing code. Open-source orgs don't provide direct push access to their original repo, so we push to our fork instead. That makes 3 trees.
- Other people will be making PRs to upstream, and each merged PR updates “upstream/dev”.
- The question then is how to get the updated code into “local” and “origin”.
- It becomes more complicated when the updated code has merge conflicts with your code.
How to get the updated code to the local and fork (origin)?
The answer is quite simple. Most likely, the updated code will be in “upstream/dev”. All you need to do is check out “dev” in your local git and pull from “upstream/dev”. After that, you can push the same code to your fork (origin).
git checkout dev
git pull upstream dev
git push origin dev
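To watch these three commands bring all three trees in line, here is a self-contained rehearsal. As an assumption for the demo, local bare repos stand in for the hosted upstream and fork, and a scratch repo named seed plays the other contributor whose PR got merged.

```shell
set -e
work=$(mktemp -d)            # throwaway directory
cd "$work"

# Simulated upstream with one commit on dev
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/dev
git init -q seed
git -C seed config user.name "Demo User"   # demo identity (assumption)
git -C seed config user.email "demo@example.com"
cd seed
git checkout -q -b dev
echo "v1" > app.txt
git add app.txt
git commit -q -m "initial commit"
git push -q "$work/upstream.git" dev
cd "$work"

# Simulated fork, local clone, and upstream remote
git clone -q --bare upstream.git fork.git
git clone -q fork.git local
git -C local remote add upstream "$work/upstream.git"

# Someone else's PR gets merged: upstream/dev moves ahead
cd seed
echo "v2" > app.txt
git commit -q -am "change merged from another contributor"
git push -q "$work/upstream.git" dev

# The three commands from the story
cd "$work/local"
git checkout -q dev
git pull -q upstream dev
git push -q origin dev

local_tip=$(git rev-parse dev)
fork_tip=$(git --git-dir="$work/fork.git" rev-parse dev)
upstream_tip=$(git --git-dir="$work/upstream.git" rev-parse dev)
echo "local:    $local_tip"
echo "origin:   $fork_tip"
echo "upstream: $upstream_tip"
```

All three lines print the same commit hash, which means local, origin and upstream now agree on “dev”.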
What if you are already working on a feature in your personal branch?
Pull “upstream/dev” into “personal_branch”. Beware: if both branches have new commits, this will create a three-way merge commit.
git checkout personal_branch
git pull upstream dev
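Here is a rehearsal of that situation in a scratch setup. Some assumptions: a local bare repo stands in for upstream, the fork is left out to keep the sketch short, and the file names are invented. Note also that recent git versions refuse to pull into a diverged branch until you say how to reconcile it, so the sketch passes --no-rebase to ask for a merge explicitly (and --no-edit to accept the default merge message).

```shell
set -e
work=$(mktemp -d)            # throwaway directory
cd "$work"

# Simulated upstream with one commit on dev
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/dev
git init -q seed
cd seed
git config user.name "Demo User"           # demo identity (assumption)
git config user.email "demo@example.com"
git checkout -q -b dev
echo "v1" > app.txt
git add app.txt
git commit -q -m "initial commit"
git push -q "$work/upstream.git" dev

# Local clone whose remote is named "upstream" (fork omitted for brevity)
git clone -q -o upstream "$work/upstream.git" "$work/local"
cd "$work/local"
git config user.name "Demo User"
git config user.email "demo@example.com"

# You start working on your own branch
git checkout -q -b personal_branch dev
echo "my feature" > feature.txt
git add feature.txt
git commit -q -m "work on my feature"

# Meanwhile upstream/dev moves ahead
cd "$work/seed"
echo "v2" > app.txt
git commit -q -am "change merged from another contributor"
git push -q "$work/upstream.git" dev

# Pull upstream/dev into personal_branch
cd "$work/local"
git checkout -q personal_branch
git pull -q --no-rebase --no-edit upstream dev

merge_commits=$(git rev-list --count --merges personal_branch)
echo "merge commits on personal_branch: $merge_commits"
```

Because both sides had new commits, “personal_branch” ends up with one merge commit, and it contains both your feature.txt and the updated app.txt.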