The main idea behind the git merge and git rebase commands, and how they work behind the scenes: all you need to know to use them with confidence.
In Git, there are two main ways to integrate changes from one branch into another: git merge and git rebase.
Before we talk about the differences between the merge and rebase commands, we need to understand what branches are:
A branch is just a reference to a commit.
Let’s imagine that we have initialized a new git project with three branches:
If we run the git cat-file -p <commit-sha> command on each commit, we will see that commits have a parent (except the one on the master branch, obviously), and that our “branches” are just references to those commits.
git cat-file -t <SHA>   # type
git cat-file -p <SHA>   # pretty printing
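To see the parent links for yourself, here is a minimal sketch. It assumes a throwaway repository in a temporary directory, a placeholder commit identity, and two illustrative commits; none of these come from the project shown in the article.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

echo "one" > SomeFile.txt
git add SomeFile.txt
git commit -q -m "first commit"
echo "two" >> SomeFile.txt
git commit -q -am "second commit"

root=$(git rev-parse HEAD~1)   # the very first commit
tip=$(git rev-parse HEAD)      # the newest commit

git cat-file -t "$tip"         # prints the object type
git cat-file -p "$tip"         # includes a "parent" line that names $root
git cat-file -p "$root"        # the first commit has no "parent" line

# Count the "parent" header lines of each commit.
tip_parents=$(git cat-file -p "$tip" | grep -c '^parent ' || true)
root_parents=$(git cat-file -p "$root" | grep -c '^parent ' || true)
echo "tip has $tip_parents parent(s), root has $root_parents"
```

The only commit without a parent line is the very first one in the repository.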
Git stores branches in the .git/refs folder, in its heads sub-directory.
Command to list branches:
git branch
Command to create a new branch:
git branch third
Because branches are just references to commits, after creating a new branch from the second branch you will see a newly created file in the /.git/refs/heads folder. It contains only a commit id, nothing more:
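A small sketch of this, assuming a throwaway repository, a placeholder identity, and Git's default file-based ref storage. Refs that have been packed live in .git/packed-refs instead, so this only holds for freshly created branches.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

echo "hello" > SomeFile.txt
git add SomeFile.txt
git commit -q -m "initial commit"

git branch third                           # create a branch at the current commit

ls .git/refs/heads                         # one file per branch: master, third
branch_file=$(cat .git/refs/heads/third)   # the file content is a commit id
tip=$(git rev-parse HEAD)                  # the commit we branched from

echo "file content: $branch_file"
echo "current tip:  $tip"
```

Creating the branch wrote no new commit and copied no files; it only added one small text file.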
You can get information about the current branch from the HEAD file, which is located in the .git folder.
HEAD is just a reference to a branch: a pointer to a pointer.
When you switch branches using the git checkout branchName command, two things happen:
- HEAD is moved to the branch pointer;
- Git replaces the files and folders in the working directory with the files and folders from that commit.
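Both effects can be observed with a short sketch. The repository, the identity, and the file contents are illustrative assumptions; the branch names follow the article.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

echo "from master" > SomeFile.txt
git add SomeFile.txt
git commit -q -m "commit on master"

git checkout -q -b first                   # create "first" and switch to it
echo "from first" > SomeFile.txt
git commit -q -am "commit on first"
head_on_first=$(cat .git/HEAD)             # HEAD names the branch, not a commit

git checkout -q master                     # switch back
head_on_master=$(cat .git/HEAD)
content_on_master=$(cat SomeFile.txt)      # working directory now matches master

git checkout -q first
content_on_first=$(cat SomeFile.txt)       # and now it matches first again

echo "$head_on_first / $head_on_master"
echo "$content_on_master / $content_on_first"
```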
Now that we are on the same page, we can dive deep into the merge command.
I will be using a simple project with a text file:
- Initialize a Git project (git init, git add, etc. I hope you know how to do that);
- Create SomeFile.txt with text content in it;
- Commit everything in the master branch.
Our next step is to make small changes that we will want to merge:
- Create the first branch, change line 4 in SomeFile.txt, and commit the change;
- Switch to the master branch, change line 4 in the SomeFile.txt file, and commit the changes:
After those changes, I want to merge the first branch (source) into master (target). Let’s see how it looks on a simple diagram:
Let’s double-check our current position with the git branch command:
OK, we are currently on the target branch, so now we can merge our changes using the git merge first command:
We will get a conflict:
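The whole scenario, from the initial commit to the conflicting merge, can be reproduced with a sketch like the one below. The five-line file content, the commit messages, and the placeholder identity are assumptions made for illustration; only the branch names and SomeFile.txt come from the article.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

# Initial commit on master: a five-line text file.
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > SomeFile.txt
git add SomeFile.txt
git commit -q -m "initial commit"

# Branch "first": change line 4 and commit.
git checkout -q -b first
printf 'line 1\nline 2\nline 3\nline 4 from first\nline 5\n' > SomeFile.txt
git commit -q -am "change line 4 on first"

# Back on master: change the same line differently and commit.
git checkout -q master
printf 'line 1\nline 2\nline 3\nline 4 from master\nline 5\n' > SomeFile.txt
git commit -q -am "change line 4 on master"

git branch                                 # the asterisk marks the current branch

# Merge first (source) into master (target); git exits non-zero on a conflict.
if git merge first; then
  merge_status="clean"
else
  merge_status="conflict"
fi

git status --short                         # "UU" marks a file with unresolved conflicts
cat SomeFile.txt                           # both versions, separated by conflict markers
```

Both branches changed the same line, so Git cannot pick a winner on its own and leaves both versions in the file for you to choose from.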
How can we solve this? We can:
- abort the merge;
- resolve the conflicts and complete the merge.
Let’s take a look at how it works in action:
1. Abort merge action
git merge --abort
After executing this command, your files will be restored to their previous state, as if nothing had happened:
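A sketch of the abort path, under the same illustrative assumptions as before (temporary repository, placeholder identity, made-up file content): it records the file before the merge, triggers the conflict, aborts, and compares.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > SomeFile.txt
git add SomeFile.txt
git commit -q -m "initial commit"
git checkout -q -b first
printf 'line 1\nline 2\nline 3\nline 4 from first\nline 5\n' > SomeFile.txt
git commit -q -am "change line 4 on first"
git checkout -q master
printf 'line 1\nline 2\nline 3\nline 4 from master\nline 5\n' > SomeFile.txt
git commit -q -am "change line 4 on master"

before=$(cat SomeFile.txt)                 # file content before the merge attempt
git merge first || true                    # stops with a conflict
during=$(git status --short)               # the file is reported as unmerged

git merge --abort                          # give up and restore the pre-merge state

after=$(cat SomeFile.txt)
status_after=$(git status --short)         # empty: nothing left to resolve
echo "during: $during"
echo "after:  ${status_after:-clean}"
```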
2. Solve conflicts and Merge
Our first step is to resolve the conflicts in our file manually:
- next, we need to run git add SomeFile.txt. This is how we tell Git that the conflict has been resolved.
- next, we need to commit our merge changes using git commit without the message flag -m. Git knows that we are in the middle of a merge, so it will create a suitable message automatically.
To accept the merge message, you can type :wq in the editor (if your editor is Vim).
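The resolve path can be sketched as follows. To keep the script non-interactive, it sets GIT_EDITOR=true so that the "editor" exits immediately and the generated message is accepted unchanged; that substitution, the resolved line of text, and the temporary repository are assumptions for illustration, not part of the article's session.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > SomeFile.txt
git add SomeFile.txt
git commit -q -m "initial commit"
git checkout -q -b first
printf 'line 1\nline 2\nline 3\nline 4 from first\nline 5\n' > SomeFile.txt
git commit -q -am "change line 4 on first"
git checkout -q master
printf 'line 1\nline 2\nline 3\nline 4 from master\nline 5\n' > SomeFile.txt
git commit -q -am "change line 4 on master"

git merge first || true                    # stops with a conflict

# 1. Resolve by hand: here we simply write the content we want to keep.
printf 'line 1\nline 2\nline 3\nline 4 from both branches\nline 5\n' > SomeFile.txt

# 2. Mark the conflict as resolved.
git add SomeFile.txt

# 3. Commit without -m; "true" stands in for an editor that saves and quits.
GIT_EDITOR=true git commit -q

git log -1 --format=%s                     # the message Git generated for the merge
status_after=$(git status --short)         # empty once the merge is complete
```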
Let’s investigate what a merge is under the hood:
A merge is a regular commit with one exception: it has two parents.
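The two parents are visible in the commit object itself. The sketch below uses a conflict-free merge to keep things short; the extra files first.txt and master.txt, the repository, and the identity are hypothetical.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

echo "base" > SomeFile.txt
git add SomeFile.txt
git commit -q -m "initial commit"

git checkout -q -b first
echo "from first" > first.txt              # hypothetical file, avoids a conflict
git add first.txt
git commit -q -m "work on first"

git checkout -q master
echo "from master" > master.txt            # hypothetical file, avoids a conflict
git add master.txt
git commit -q -m "work on master"

master_tip=$(git rev-parse master)         # tips of both branches before merging
first_tip=$(git rev-parse first)

git merge -q --no-edit first               # histories diverged: a merge commit is created

git cat-file -p HEAD                       # shows two "parent" lines
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
first_parent=$(git rev-parse HEAD^1)       # the branch we were on (master)
second_parent=$(git rev-parse HEAD^2)      # the branch we merged in (first)
```

The first parent is the old tip of the target branch, and the second is the tip of the source branch.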
What will happen if we now want to merge master into first?
You might expect Git to do exactly the same as it did when merging first into master.
But that would be a real waste of data, because we already have the commit “0684” with the data we want and with the conflicts already resolved.
So what will Git do in this case? Let’s check it out:
Git will simply move the reference of the first branch to the commit that master points to, which is called a “fast-forward”:
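A sketch of the fast-forward, again with a conflict-free setup and hypothetical file names: after first has been merged into master, merging master back into first creates no new commit and only moves the branch.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

echo "base" > SomeFile.txt
git add SomeFile.txt
git commit -q -m "initial commit"
git checkout -q -b first
echo "from first" > first.txt              # hypothetical file
git add first.txt
git commit -q -m "work on first"
git checkout -q master
echo "from master" > master.txt            # hypothetical file
git add master.txt
git commit -q -m "work on master"
git merge -q --no-edit first               # creates the merge commit on master

merge_commit=$(git rev-parse master)
commits_before=$(git rev-list --count --all)

git checkout -q first
git merge master                           # reports a fast-forward

commits_after=$(git rev-list --count --all)
first_after=$(git rev-parse first)
echo "commits before: $commits_before, after: $commits_after"
```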
When HEAD points directly to a commit instead of a branch, it is called a detached HEAD.
- The current branch tracks new commits;
- When you move to another commit, Git updates your working directory;
- Unreachable objects are garbage collected.
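A detached HEAD is easy to produce and inspect. In this sketch (temporary repository and placeholder identity assumed) we check out a commit id rather than a branch name and look at what the HEAD file contains.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

echo "one" > SomeFile.txt
git add SomeFile.txt
git commit -q -m "first commit"
echo "two" >> SomeFile.txt
git commit -q -am "second commit"

first_commit=$(git rev-parse HEAD~1)

git checkout -q "$first_commit"            # check out a commit, not a branch
head_content=$(cat .git/HEAD)              # a raw commit id, no "ref:" prefix

# symbolic-ref only succeeds while HEAD points at a branch.
if git symbolic-ref -q HEAD >/dev/null; then
  attached="yes"
else
  attached="no"
fi

git checkout -q master                     # re-attach HEAD to a branch
reattached=$(cat .git/HEAD)
echo "detached: $head_content"
echo "attached: $reattached"
```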
Let’s imagine that we have two branches: master and first.
We don’t want to merge the first and master branches together and create a merge commit “6”; we want to put first on top of master.
We can do that by:
- checking out first;
- running the git rebase master command.
If, after rebasing, you want the master branch to reference commit “5” as well, you can:
- check out master;
- run git rebase first or git merge first. It doesn’t matter which, because Git will just move the master branch reference to commit “5”. This is called a “fast-forward”.
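Both steps, the rebase and the follow-up fast-forward, can be sketched like this. The file names, commit messages, and repository are assumptions; --ff-only is added so that the merge would fail rather than silently create a merge commit if a fast-forward were not possible.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

echo "base" > SomeFile.txt
git add SomeFile.txt
git commit -q -m "initial commit"
git checkout -q -b first
echo "from first" > first.txt              # hypothetical file, avoids a conflict
git add first.txt
git commit -q -m "work on first"
git checkout -q master
echo "from master" > master.txt            # hypothetical file, avoids a conflict
git add master.txt
git commit -q -m "work on master"
master_tip=$(git rev-parse master)

# Step 1: put first on top of master.
git checkout -q first
git rebase -q master
base_after=$(git merge-base master first)  # first now starts at master's tip

# Step 2: move master forward to the rebased commit.
git checkout -q master
git merge -q --ff-only first

merges=$(git rev-list --count --merges master)
git log --oneline --graph                  # a single straight line of commits
```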
What is happening under the hood?
“GIT object is literally just specific contents of the file… Objects are immutable.”
- git documentation
What does it mean for us? It means that you cannot change the state of a commit (a commit is an object) once it has been created; you can only create a new one.
Imagine you have a commit “2” in the master branch, and some commit “3” in another branch. Commit “3” already has its own parent, an earlier commit.
Because objects are immutable, you cannot change a commit’s parent. You can only create a copy of commit “3” whose parent is “2”. Because the copied commit has a different parent, it gets a different SHA id (in our simple case → “4”).
Commit “3” will be garbage-collected by Git sooner or later, because Git garbage-collects unreachable objects.
Rebase creates new commits and leaves behind the existing commits, which might get garbage-collected.
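You can watch the copy being made by recording the commit id before and after a rebase. This is a minimal sketch with a hypothetical two-branch setup; it checks that the rebased commit has a new id and a new parent, and that the old commit still exists in the object store but is no longer on any branch.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Example"             # placeholder identity for commits
git config user.email "example@example.com"

echo "base" > SomeFile.txt
git add SomeFile.txt
git commit -q -m "initial commit"
git checkout -q -b first
echo "from first" > first.txt              # hypothetical file, avoids a conflict
git add first.txt
git commit -q -m "work on first"
git checkout -q master
echo "from master" > master.txt            # hypothetical file, avoids a conflict
git add master.txt
git commit -q -m "work on master"

old_tip=$(git rev-parse first)             # the original commit on first
git checkout -q first
git rebase -q master
new_tip=$(git rev-parse first)             # the copy created by the rebase
new_parent=$(git rev-parse first^)         # its parent is now master's tip

old_subject=$(git log -1 --format=%s "$old_tip")
new_subject=$(git log -1 --format=%s "$new_tip")

# The old commit has not been deleted yet...
if git cat-file -e "$old_tip"; then still_there="yes"; else still_there="no"; fi
# ...but no branch contains it any more.
branches_with_old=$(git branch --contains "$old_tip")

echo "old: $old_tip ($old_subject)"
echo "new: $new_tip ($new_subject)"
```

Until garbage collection removes it, the old commit can still be reached by its id, for example through the reflog.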
So a rebased history looks cleaner, but it is a lie, in its own way.
One single recommendation: when in doubt, just merge.