Git commit -m “welcome to git world”
This article explains from scratch what git is and how it works.
What is git and what is it for?
Git is a free, open-source, distributed (as opposed to centralized) version control system used to manage source code. Git tracks the changes and the history of the source code and enables several developers to work together. It was created by Linus Torvalds in 2005 for the development of the Linux kernel.
Distributed vs Centralized system
Centralized version control systems (CVCS): the source code is stored in a central server maintained by a single authority. Developers check out the code they need to work on and make changes directly to that codebase. Once changes are made, the code is committed back to the central server.
Distributed version control systems (DVCS): the source code is usually still stored in a central server, but every developer works on a local copy of the code repository that contains the complete history.
How does git work?
Git states
As a distributed version control system, git has a remote repository. We can clone or fetch the repository to our own local repository. Inside our local environment, we have the following states:
- Modified: means that you have changed the file in your working directory but have not committed it to your database yet.
- Staged: means that you have marked a modified file in its current version to go into your next commit snapshot (the file is in the staging area).
- Committed: means that the data is safely stored in your local database (local repository).
The git checkout command brings content from your local repository back into your working directory, and the git push command sends commits from your local repository to the remote repository.
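The three states above can be observed with git status. The following is a minimal sketch, assuming a throwaway repository created in a temporary directory; the file name notes.txt and the demo identity are hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git config user.name "Demo User"         # hypothetical identity, local to this repo
git config user.email "demo@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "add notes"

echo "v2" >> notes.txt
modified=$(git status --porcelain)       # changed in the working directory only

git add notes.txt
staged=$(git status --porcelain)         # marked for the next commit

git commit -q -m "update notes"
committed=$(git status --porcelain)      # nothing left to report

printf 'modified:  [%s]\nstaged:    [%s]\ncommitted: [%s]\n' "$modified" "$staged" "$committed"
```

In the short status format, the first column describes the staging area and the second one the working directory, which is why the M moves from the second column to the first after git add.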
Git workflow
The following picture explains the git workflow, with the git commands used to go from one state to another.
Commands
Git has many commands; the main ones are shown in the following picture. Let’s review most of them:
Setting up a repository
- git init: creates a new Git repository
git init
- git clone: clones a repository into a new local repository
git clone https://github.com/libgit2/libgit2
- git config: sets configuration options for the Git installation
git config --global user.name "John Doe"
git config --global user.email johndoe@example.com
git config --list
=> will list all the git settings
Saving changes
- git add: adds a change in the working directory to the staging area
git add filename1 ./directory2/*
- git commit: records changes to the local repository
git commit -m "relevant commit message"
git commit --amend -m "better commit message"
=> the --amend option replaces the most recent commit instead of creating an additional one
- git stash: temporarily saves (or stashes) changes made to your working copy so you can work on something else, and then come back and reapply them later on
git stash
=> takes the uncommitted changes (both staged and unstaged), saves them away for later use, and then reverts them from the local working copy
git stash pop
=> reapplies the previously stashed changes and removes them from the stash
git stash apply
=> reapplies the changes to your working copy and keeps them in your stash
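The stash round trip described above can be tried safely in a scratch repository. This is a minimal sketch under that assumption; app.txt and the demo identity are hypothetical names.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "initial version"

echo "work in progress" >> app.txt       # uncommitted change
git stash -q                             # put it aside
clean=$(git status --porcelain)          # working copy is clean again
stashed=$(git stash list | wc -l | tr -d ' ')

git stash pop -q                         # bring the change back and drop the stash entry
restored=$(git status --porcelain)
remaining=$(git stash list | wc -l | tr -d ' ')

echo "stash entries while stashed: $stashed, after pop: $remaining"
```

Replacing git stash pop with git stash apply would restore the change in the same way but leave the entry in git stash list.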
Undoing changes
- git clean: removes untracked files from the working directory (logical counterpart to git reset)
git clean -n
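=> performs a dry run: lists what would be removed without deleting anything
git clean -f -d ./currentPath
=> removes untracked files and directories under the specified path (-d recurses into untracked directories; -f is required unless clean.requireForce is set to false)
- git reset: moves the current branch to a given commit and, depending on the mode, resets the staging area and the working directory
git reset 9a9add8 => moves the current branch to commit 9a9add8; with the default (mixed) mode, later changes are kept in the working directory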
- git revert: undoes a committed snapshot
git revert --no-edit eae84e7
=> reverts the commit with hash eae84e7
=> With this no-edit option, git revert will not start the commit message editor.
Git revert is a command used to create a new commit that undoes the changes introduced by a specific commit. Unlike git reset, which modifies the project history, git revert creates a new commit that effectively cancels out the changes made in the target commit.
Note: HEAD is defined later in the document in the To go further section
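To see that git revert adds a commit rather than removing one, here is a minimal sketch in a scratch repository. The file name and commit messages are hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "good" > file.txt
git add file.txt
git commit -q -m "good change"

echo "bad" >> file.txt
git commit -q -am "bad change"

git revert --no-edit HEAD > /dev/null    # new commit that cancels "bad change"

count=$(git rev-list --count HEAD)       # history grew: 3 commits
content=$(cat file.txt)                  # file content is back to "good"
subject=$(git log -1 --format=%s)
echo "$count commits, latest: $subject"
```

The "bad change" commit is still present in the history, which is what makes git revert a safe choice for commits that have already been pushed.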
Syncing and using branches
- git push: uploads local repository content to a remote repository
git push <remote> <branch>
git push origin main
=> In Git, "origin" is a shorthand name for the remote repository that a project was originally cloned from.
More precisely, it is used instead of that original repository's URL
and thereby makes referencing much easier.
- git fetch: downloads commits, files, and refs from a remote repository into your local repository
git fetch <remote> <branch>
- git merge: joins two or more lines of development (branches) together into a single one
git merge <branch>
=> Will merge specified branch name into the current branch
- git pull: fetches content from a remote repository and immediately updates the current local branch to match that content
git pull <remote> <branch>
- git checkout: switches branches
git checkout feature2
- git branch: lets you create, list, rename, and delete branches
git branch feature1 => creates feature1 branch
git branch -d feature1 => deletes feature1 branch
git branch -m new-branch-name => renames the current branch to new-branch-name
git branch -m old-branch new-branch => renames old-branch to new-branch
git branch => will list all of your branches
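The difference between push, fetch and pull is easier to see with two clones of the same remote. The sketch below assumes a local bare repository standing in for a hosted remote; the directory names remote.git, alice and bob are hypothetical.

```shell
set -e
work=$(mktemp -d) && cd "$work"

git init -q --bare remote.git                        # plays the role of the remote
git -C remote.git symbolic-ref HEAD refs/heads/main  # make main its default branch

git clone -q remote.git alice 2>/dev/null            # cloning an empty repo prints a warning
cd alice
git symbolic-ref HEAD refs/heads/main                # name the initial branch main
git config user.name "Alice"
git config user.email "alice@example.com"
echo "hello" > readme.txt
git add readme.txt
git commit -q -m "add readme"
git push -q origin main                              # upload to the remote

cd "$work"
git clone -q remote.git bob                          # bob starts from alice's first commit

cd "$work/alice"
echo "more" >> readme.txt
git commit -q -am "extend readme"
git push -q origin main

cd "$work/bob"
git fetch -q origin                                  # download only
local_before=$(git log -1 --format=%s main)          # local branch has not moved
remote_tracking=$(git log -1 --format=%s origin/main)
git pull -q --ff-only origin main                    # fetch and update the local branch
local_after=$(git log -1 --format=%s main)
echo "before pull: $local_before / after pull: $local_after"
```

After git fetch, only origin/main knows about the new commit; the local main branch catches up once git pull (or a merge) is run.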
Branches
Git branches are effectively a pointer to a snapshot of your changes. A branch represents an independent line of development. That’s why different strategies can apply when we want to merge the different lines of development together.
The default branch is called main (or master).
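That a branch is only a pointer can be checked with git rev-parse. This is a minimal sketch in a scratch repository; the branch name feature1 is reused from the examples above, the file names are hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git symbolic-ref HEAD refs/heads/main    # name the initial branch main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "a" > a.txt
git add a.txt
git commit -q -m "first commit"

git branch feature1                      # new pointer to the same commit
main_before=$(git rev-parse main)
feature_before=$(git rev-parse feature1)

git checkout -q feature1
echo "b" > b.txt
git add b.txt
git commit -q -m "work on feature1"      # only the feature1 pointer moves

main_after=$(git rev-parse main)
feature_after=$(git rev-parse feature1)
echo "main:     $main_after"
echo "feature1: $feature_after"
```

Creating the branch copies nothing: both names resolve to the same commit until a new commit is made on one of them.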
Branching strategies
There are different sorts of branching strategies that can be applied in a project. Some are quite simple, others can be more complex to put in place.
Here are the most popular git branching strategies:
- No flow: everyone just commits directly to main (or master) without using branches
- Trunk-based development: all branches other than the main branch are short-lived and merged in within a defined timeframe. Large features are built incrementally and hidden behind feature flags
- Feature branching: a commonly used workflow that involves creating a new branch for a specific feature or change in the codebase. This allows developers to work on the feature independently without affecting the main branch
- GitHub flow: every unit of work, whether it be a bugfix or a feature, is done through a branch that is created from main. The main branch is always production-ready, and changes to this branch trigger the CI/CD process (this isn’t the case with feature branching)
- GitLab flow: branches are created for features, releases and environments. The main branch is always production-ready
- Git flow: a branching strategy that uses two long-lived branches (main and develop) that remain in the project during its entire lifetime. Additionally, there are short-lived branches (feature, release, hotfix) that are created as needed to manage the development process and deleted once merged into main or develop.
Detailed branches in git flow:
- main branch is the stable production-ready code
- develop branch is where all development takes place
- feature branches are used to develop new features or changes
- release branches are used to prepare for a new release
- hotfix branches are used to quickly fix critical production issues
These strategies all have pros and cons; they are presented here from the simplest to the most complex.
Apply branch strategy with git merge and git rebase
Once a branching strategy is chosen, one must know how to merge features and how to resolve conflicts. This is where the git merge and git rebase commands come in. Let’s review them:
- git merge: allows developers to merge Git branches while the logs of commits on branches remain intact
git merge <branch>
- git rebase: lets users integrate changes from one branch into another by replaying commits, so the commit history is rewritten once the action is complete
git rebase <targetBranch>
=> will automatically take the commits in the current working branch
and apply them to the head of the passed branch.
Note: In the To go further section, we go deeper into the different ways to rebase
The main difference between git merge and git rebase is that git merge is a way of combining changes from one branch (source branch) into another branch (target branch), whereas git rebase is a way of moving the changes from one branch onto another branch.
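The difference shows up in the shape of the resulting history. The following sketch integrates the same main branch into two feature branches, once with merge and once with rebase. It assumes a scratch repository; make_commit is a hypothetical helper and the branch names are made up for the demo.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

make_commit() {                          # hypothetical helper: one file per commit
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

make_commit A
git branch feature-merge
git branch feature-rebase
make_commit B                            # main moves ahead of both features

git checkout -q feature-merge
make_commit C
git merge -q --no-edit main              # creates a merge commit
merge_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

git checkout -q feature-rebase
make_commit D
git rebase -q main                       # replays D on top of B
rebase_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
rebase_history=$(git log --format=%s | xargs)

echo "merge tip has $merge_parents parents, rebase tip has $rebase_parents"
echo "rebased history (newest first): $rebase_history"
```

The merged branch keeps both lines of development joined by a commit with two parents, while the rebased branch ends up with a single linear sequence of commits.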
The following comparative table shows the differences between git merge and git rebase:
Other useful commands: review changes
- git status: displays the state of the working directory and the staged snapshot
git status
- git diff: multi-use Git command that when executed runs a diff function on Git data sources (commits, branches, files and more)
git diff
- git log: lets you explore the previous revisions of a project
git log -n 3
=> shows the 3 most recent commits
- git tag: tags are refs that point to specific points in Git history. Tagging is generally used to capture a point in history that is used for a marked version release (e.g. v1.0.1)
git tag <tagname>
git push origin --tags
=> tags are not included by default when pushing to the remote repository
- git blame: displays the author metadata attached to specific committed lines in a file. This is used to examine specific points of a file’s history and get context as to who last modified a line
git blame <file>
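The review commands above can be combined in a small scratch session. This is a minimal sketch, assuming a temporary repository; story.txt and the commit messages are hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

for i in 1 2 3 4; do
  echo "line $i" >> story.txt
  git add story.txt
  git commit -q -m "commit $i"
done

last_three=$(git log -n 3 --format=%s)   # subjects of the 3 most recent commits
git tag v1.0.1                           # mark the current commit
described=$(git describe --tags)         # name of the tag on the current commit
blame_lines=$(git blame story.txt | wc -l | tr -d ' ')   # one annotated line per file line

echo "draft" >> story.txt                # uncommitted change
added=$(git diff | grep -c '^+draft')    # the new line shows up in the diff

echo "$last_three"
```

git status would additionally report story.txt as modified at the end of this session, since the last change was never staged.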
To go further
HEAD pointer with ^ and ~
In Git, HEAD is a special pointer/reference to the commit that is currently checked out, which is normally the latest commit of the current branch. When a new commit is made, HEAD moves forward to point to that new commit.
The characters ^ and ~ are used to make the pointer navigate to previous commits. In short, the ^ and ~ symbols allow you to traverse backwards in the project’s history, while the numerical value selects which parent to follow (for ^) or how many generations to go back (for ~).
Here are some examples to illustrate the differences between ^ and ~ with a commit tree:
G   H   I   J
 \ /     \ /
  D   E   F
   \  |  / \
    \ | /   |
     \|/    |
      B     C
       \   /
        \ /
         A
A = A^0
B = A^ = A^1 = A~1
C = A^2
D = A^^ = A^1^1 = A~2
E = B^2 = A^^2
F = B^3 = A^^3
G = A^^^ = A^1^1^1 = A~3
H = D^2 = B^^2 = A^^^2 = A~2^2
I = F^ = B^3^ = A^^3^
J = F^2 = B^3^2 = A^^3^2
- Use ~ to go back a number of generations in a linear way
- Use ^ on merge commits, because they have at least 2 immediate parents
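The two notations can be compared on a small merge. The sketch below builds a merge commit whose first parent is B and second parent is C, both children of A. It assumes a scratch repository; make_commit and the branch name side are hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

make_commit() {                          # hypothetical helper: one file per commit
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

make_commit A
git branch side
make_commit B                            # on main
git checkout -q side
make_commit C                            # on side
git checkout -q main
git merge -q --no-edit side              # HEAD is now a merge commit

first_parent=$(git log -1 --format=%s HEAD^1)    # parent on the branch we merged into
second_parent=$(git log -1 --format=%s HEAD^2)   # parent brought in by the merge
one_back=$(git log -1 --format=%s HEAD~1)        # same as HEAD^1
two_back=$(git log -1 --format=%s HEAD~2)        # first parent of the first parent

echo "HEAD^1=$first_parent HEAD^2=$second_parent HEAD~1=$one_back HEAD~2=$two_back"
```

HEAD~2 and HEAD^2 look similar but are unrelated: the first goes two generations back along first parents, the second picks the second parent of HEAD.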
Soft reset vs Hard reset vs Mixed reset
Soft reset
git reset --soft HEAD^
The soft reset undoes the last commit while preserving all of its changes in the staging area. Use it when the last commit needs revising, for example to add more changes before committing again.
Mixed reset
git reset --mixed HEAD^
The mixed reset is the default mode. It undoes the last commit and removes its changes from the staging area, but keeps them in the working directory. It is helpful when the last commit should be undone while keeping the changes locally, so that they can be reworked before re-committing.
Hard reset
git reset --hard HEAD^
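It removes the last commit from the branch history and discards all the associated changes from the staging area and the working directory. Uncommitted changes discarded by the --hard flag cannot be recovered; the removed commit itself can usually still be found through git reflog for some time.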
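The three modes can be compared by undoing the same commit three times and looking at git status after each reset. This is a minimal sketch in a scratch repository; the tag named second is a hypothetical bookmark used only to restore the commit between runs.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first"
echo "two" >> file.txt
git commit -q -am "second"
git tag second                           # bookmark so the commit can be restored

git reset -q --soft HEAD^
soft=$(git status --porcelain)           # change is still staged

git reset -q --hard second               # back to the starting point
git reset -q --mixed HEAD^
mixed=$(git status --porcelain)          # change is only in the working directory

git reset -q --hard second
git reset -q --hard HEAD^
hard=$(git status --porcelain)           # nothing left
content=$(cat file.txt)

printf 'soft:  [%s]\nmixed: [%s]\nhard:  [%s]\n' "$soft" "$mixed" "$hard"
```

In this sketch the commit survives the hard reset only because the tag still points to it; without such a reference it would have to be looked up in git reflog.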
Git rebase
There are basically 3 types of rebase. Let’s review them:
Git rebase
Rebases the current branch, referenced by HEAD, on top of the latest commit that is reachable from <branch> but not from HEAD
git rebase <branch>
Before                              After
A---B---C---F---G (branch)          A---B---C---F---G (branch)
         \                                           \
          D---E (HEAD)                                D---E (HEAD)
In this example, F and G are commits that are reachable from branch but not from HEAD. Saying git rebase branch will take D, that is the first commit after the branching point, and rebase it (i.e. change its parent) on top of the latest commit reachable from branch but not from HEAD, that is G.
Git rebase --onto with 2 arguments
Lets you rebase starting from a specific commit. It gives exact control over what is being rebased and where.
git rebase --onto <newparent> <oldparent>
Before                              After
A---B---C---F---G (branch)          A---B---C---F---G (branch)
         \                                       \
          D---E---H---I (HEAD)                    E---H---I (HEAD)
In this example, we need to rebase HEAD precisely on top of F, starting from E. F should be brought into our working branch while, at the same time, D has to be dropped because it contains some incompatible changes.
In this case, we would say git rebase --onto F D, meaning: rebase the commit reachable from HEAD whose parent is D on top of F.
Before                              After
A---B---C---E---F (HEAD)            A---B---F (HEAD)
This is a specific use case where git rebase --onto B E can be useful. It rebases HEAD on top of B, where the old parent was E, in order to remove C and E from the commit sequence.
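The last scenario (A---B---C---E---F becoming A---B---F) can be reproduced in a scratch repository. In this sketch each commit is also tagged with its letter so that the command reads exactly like the one in the text; the tags and the make_commit helper are assumptions made for the demo.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

make_commit() {                          # hypothetical helper: one file, one commit, one tag
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
  git tag "$1"
}

for name in A B C E F; do
  make_commit "$name"
done
before=$(git log --format=%s | xargs)    # newest first

git rebase -q --onto B E                 # replay what comes after E on top of B
after=$(git log --format=%s | xargs)

echo "before: $before"
echo "after:  $after"
```

Only F comes after E, so it is the only commit replayed; C and E are no longer part of the branch, although the demo tags still point to them.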
Git rebase --onto with 3 arguments
This rebase method goes one step further in terms of precision. It lets you rebase an arbitrary range of commits on top of another one.
git rebase --onto <newparent> <oldparent> <until>
Before                              After
A---B---C---F---G (branch)          A---B---C---F---G (branch)
         \                                       \
          D---E---H---I (HEAD)                    E---H (HEAD)
In this example, we want to rebase the exact range E---H on top of F, ignoring where HEAD is currently pointing to. It can be done by saying git rebase --onto F D H, meaning: rebase the range of commits whose parent is D up to and including H on top of F.
The trick here is to remember that the commit referenced by <until> is included in the range and will become the new HEAD after the rebase is completed.
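Here is a sketch of the same scenario in a scratch repository. As an assumption for the demo, every commit is tagged with its letter, and the working branch is called topic. Because H is passed as a tag and not as a branch name, HEAD ends up detached on the rebased copy of H while topic itself is left untouched.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

make_commit() {                          # hypothetical helper: one file, one commit, one tag
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
  git tag "$1"
}

for name in A B C; do make_commit "$name"; done
git branch topic                         # branching point is C
for name in F G; do make_commit "$name"; done

git checkout -q topic
for name in D E H I; do make_commit "$name"; done

git rebase -q --onto F D H               # replay E and H (after D, up to H) on top of F

history=$(git log --format=%s | xargs)   # newest first, from the new HEAD
head_state=$(git symbolic-ref -q --short HEAD || echo detached)
topic_tip=$(git log -1 --format=%s topic)

echo "HEAD history: $history ($head_state)"
echo "topic still ends at: $topic_tip"
```

To keep the result, a branch can be created from the new HEAD, for example with git branch followed by a new name.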
Cherry-Pick
Cherry-picking in Git means choosing a commit from one branch and applying it to another.
Cherry-picking contrasts with other ways such as merge and rebase, which normally apply many commits to another branch.
Although it is possible to cherry-pick several commits, it’s a better practice to use merge in this case.
The following git command will perform a cherry-pick. Make sure to run this command while you’re on the branch you want to apply the commit to.
git cherry-pick <commit-hash>
The cherry-pick should be applied in specific scenarios like:
- Team collaboration between backend and frontend developers: when the backend developer creates a new data structure that is needed on the frontend side. In this case the cherry-pick enables the frontend developer to carry on with their part while the backend developer is still coding
- Bug hotfixes: when a bug is discovered and you want to deliver a fix as soon as possible. The fix can be handled in one commit and immediately cherry-picked onto the branch used for deliveries
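The hotfix scenario can be sketched as follows: a fix is committed on main and then copied onto a release branch without bringing the unfinished work along. The branch name release, the commit names and the make_commit helper are hypothetical.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

make_commit() {                          # hypothetical helper: one file per commit
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

make_commit base
git branch release                       # delivery branch starts here
make_commit feature-work                 # not ready for delivery
make_commit hotfix                       # the fix we want to ship
fix=$(git rev-parse HEAD)

git checkout -q release                  # be on the branch that receives the commit
git cherry-pick "$fix" > /dev/null
picked=$(git rev-parse HEAD)
release_history=$(git log --format=%s | xargs)

echo "release history (newest first): $release_history"
```

The cherry-picked commit carries the same change and message as the original but is a new commit with its own hash, since its parent is different.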
Miscellaneous
- git aliases: shortcuts for git commands, defined with git config
git config --global alias.co checkout
git config --global alias.br branch
git config --global alias.ci commit
git config --global alias.st status
- git hooks: scripts that are executed automatically whenever a specific event occurs in a Git repository.
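Both items above can be tried in a scratch repository. In this sketch the aliases are set without --global so the user configuration is left untouched, and the commit-msg hook with its 10-character rule is a made-up policy used only to show how a hook can reject a commit.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repository (assumption for the demo)
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

git config alias.st status               # repository-local aliases
git config alias.ci commit

mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the file that holds the proposed commit message
msg=$(head -n 1 "$1")
if [ "${#msg}" -lt 10 ]; then
  echo "commit message too short" >&2
  exit 1
fi
EOF
chmod +x .git/hooks/commit-msg           # hooks must be executable to run

echo "data" > data.txt
git add data.txt

if git ci -q -m "wip" 2>/dev/null; then short_result=accepted; else short_result=rejected; fi
if git ci -q -m "add the first data file"; then long_result=accepted; else long_result=rejected; fi

status_after=$(git st --porcelain)       # the st alias expands to git status
commits=$(git rev-list --count HEAD)
echo "short message: $short_result, long message: $long_result"
```

Hooks live in the .git/hooks directory of each repository and are not copied by git clone, so a team that relies on them needs its own way of distributing them.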
Thanks for reading this article. I hope it serves as a fairly complete guide to git, for beginners as well as for more senior developers.
Feel free to comment on this article, for instance by telling which git commands you use the most or which branching strategy you use in your project.