Working on Software with Git
Enable better team software development with Git.
This article was written as part of the Individual Review competency for the Software Projects course 2020 at the Faculty of Computer Science, University of Indonesia.
Introduction
Source code is a precious resource for every software developer, precious enough that it must be protected. Because software is usually built by teams, every team needs a way to preserve the integrity of the source code throughout the software’s lifetime and a reliable way to return to a working state after a mistake. That’s where a Version Control System comes in.
Version Control
What is it?
A version control system (VCS) is a tool that manages changes to our source code. It records every change as part of a “revision”, so a team can return to an earlier revision when a mistake slips into a newer one, while minimizing disruption to other team members’ work. A VCS also tracks the changes made to each file, including who made each individual change. Sometimes changes made by one team member are incompatible with changes made by another. A VCS surfaces these conflicts early, when the changes are integrated, which makes building software in a team easier.
Benefits
Version Control Systems are widely used because they let developer teams work concurrently and efficiently. Whether the software is small or large-scale, a Version Control System provides these benefits:
- Long-term history: Every change made by team members is kept. We don’t have to worry about any change we make because the VCS records all of them, whether it’s creating a file, deleting a file, or even changing a single character in a line. The history includes who made the change, when it was made, and a written note about it. This information makes it easy to “rewind” to previous revisions, which is particularly helpful when dealing with bugs.
- Branching and merging: Simply put, branching lets every team member work in their own “workspace”. Each branch has its own history and is independent of other branches, which helps maintain parallel streams of independent work. Individual branches are usually “merged” into a single branch to verify the integrity of the software as a whole. Branches can be made per feature, per release, or both; how a team practices branching depends on the workflow it adopts.
- Traceability: Each change can be annotated with a message describing its purpose and intent, which helps team members understand how and why the change was made. Changes can also be linked to project management and bug tracking software, which makes analysis easier.
Introducing Git
Many Version Control Systems are available, such as Git, Apache Subversion, and Mercurial, along with hosting services built around them such as AWS CodeCommit and Bitbucket. One of the most widely used is Git, a distributed VCS designed for coordinating work among team members.
The key term here is “distributed”. Unlike centralized VCSs such as Subversion, Git gives each team member their own local repository, complete with the full history of changes. In a centralized VCS, by contrast, each team member has a working copy that points back to a single central repository. One benefit is that no network connection is needed to create commits (changes), inspect previous versions of a file, or compare commits. Another is that Git avoids the situation where a broken production branch halts the progress of every team member (which can happen in Subversion).
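To make the “distributed” point concrete, here is a minimal sketch that creates commits and inspects history with no remote and no network connection. The file name, commit messages, and demo identity are made up for illustration.

```shell
set -eu
# Scratch repository in a temporary directory (no remote configured).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity for the demo
git config user.email "demo@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes file"

echo "second line" >> notes.txt
git commit -q -am "Extend notes file"

# Everything below reads only the local .git directory.
git log --oneline              # history of the repository
git show HEAD~1:notes.txt      # the file as it was one commit ago
git diff HEAD~1 HEAD           # what changed between the two commits

commit_count=$(git rev-list --count HEAD)
```

Each commit also records its author and date, which `git log` (without `--oneline`) displays.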
Git Essential Commands
Git has a plethora of commands, each with its own set of parameters. This section lists ten commands that every developer using Git should know.
Two terms are used here: a local repository resides on a developer’s machine, and a remote repository is one “uploaded” to an online Git platform (such as GitLab or GitHub). Usually, both repositories contain the same codebase and are kept in sync throughout the development process.
Remote
This command is used to manage remote repositories. A local Git repository can have more than one remote repository. The remotes that are managed are tracked, meaning we can fetch changes from them and push changes to them.
To see a list of managed remote repositories:
git remote
git remote -v  # Adding -v displays more detailed (verbose) information
To add a new remote repository:
git remote add <remote_name> <remote_repository_url>
To rename an existing remote repository:
git remote rename <old_remote_name> <new_remote_name>  # Provided old_remote_name already exists
To remove an existing remote repository:
git remote remove <remote_name>  # Provided remote_name already exists
To change the URL of an existing remote repository:
git remote set-url <remote_name> <new_url>  # Provided remote_name already exists
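The remote commands above can be tried safely in a scratch repository. This is a small sketch; the URLs and the remote names "backup" and "mirror" are placeholders, and none of these commands contact the server.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder URLs: "git remote" only records them in the repository config.
git remote add origin https://example.com/team/project.git
git remote add backup https://example.com/backup/project.git
git remote -v

git remote rename backup mirror
git remote set-url mirror https://example.com/mirror/project.git
git remote remove origin

remotes=$(git remote)
mirror_url=$(git remote get-url mirror)
echo "$remotes -> $mirror_url"
```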
Pull
A “pull” means fetching changes from another repository and integrating them (by merging) into the current branch of our local repository. The “pulled” repository does not necessarily have to be a remote repository; it can even be another branch of the same local repository.
Typical usage:
git pull
git pull origin  # Equivalent to the command above when the current branch tracks a branch on origin
We can also specify which branch to pull from, e.g.:
git pull origin <branch_name>
Push
A “push” means updating the remote repository with the commits from our local repository. In plain English, it’s like “uploading” our changes from the local to the remote repository. To use this command, we first need to set up the remote location that points to the remote repository (e.g. a GitHub or GitLab repository) with the remote command (explained above).
Typical usage:
git push <remote_name> <branch_name>
The above command pushes our changes to the branch designated by branch_name on the remote repository designated by remote_name.
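The push and pull commands can be seen working together in the following sketch. It assumes a bare repository on the local disk as a stand-in for a hosted remote, and two hypothetical developers, Alice and Bob, each with their own local repository.

```shell
set -eu
work=$(mktemp -d)

# A bare repository on disk stands in for the hosted remote (demo assumption).
git init -q --bare "$work/remote.git"

# Alice creates a commit and pushes it to the remote.
git init -q "$work/alice"
cd "$work/alice"
git config user.name "Alice"
git config user.email "alice@example.com"
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"
git branch -M main
git remote add origin "$work/remote.git"
git push -q origin main

# Bob gets his own copy of the remote's main branch.
git clone -q -b main "$work/remote.git" "$work/bob"

# Alice pushes a second change...
echo "hello again" >> greeting.txt
git commit -q -am "Extend greeting"
git push -q origin main

# ...and Bob integrates it into his local branch with pull.
cd "$work/bob"
git pull -q origin main

alice_tip=$(git -C "$work/alice" rev-parse HEAD)
bob_tip=$(git -C "$work/bob" rev-parse HEAD)
```

After the pull, both local repositories point at the same commit.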
Clone
This command clones an existing repository into a new directory. “Cloning” means “downloading” all the files, together with the full history, into the designated directory. It also creates a remote named “origin” that points to the repository the clone was made from.
Typical usage:
git clone <repository_url> [<directory_to_clone>]
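As a rough illustration of what a clone sets up, the sketch below clones a local repository that stands in for a remote one (git clone accepts a local path as well as a URL). The directory names "source" and "copy" are made up for the demo.

```shell
set -eu
work=$(mktemp -d)

# A local repository plays the role of the remote repository (demo assumption).
git init -q "$work/source"
cd "$work/source"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
git branch -M main

# Clone it into a new directory named "copy".
git clone -q "$work/source" "$work/copy"
cd "$work/copy"

git remote -v        # "origin" points at the repository we cloned from
git branch -r        # remote-tracking branches such as origin/main
git log --oneline    # the history came along with the files
```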
Merge
This process joins two or more development histories together. In practice, it is usually used to integrate changes from a specific branch.
Typical usage, to incorporate changes from a specific branch into the current branch:
git merge <branch_name>  # Merge changes from branch_name into the current branch
To merge two branches into the current branch at once:
git merge <branch_name_1> <branch_name_2>
Usually, a merge ends with the creation of a “merge commit”, that is, a commit with two or more parents that ties the merged histories together. (When the current branch has no new commits of its own, Git simply fast-forwards it and no merge commit is created.)
The merge command has many use cases, variations, and behaviours. For more detailed information, please refer to the Git reference documentation for the merge command.
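Here is a minimal sketch of a merge that produces a merge commit. The branch name "feature" and the file names are hypothetical; both branches get a commit of their own so that the histories actually diverge.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch -M main

# Work on a feature branch.
git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

# Meanwhile main moves forward too, so the histories diverge.
git checkout -q main
echo "main work" > main.txt
git add main.txt
git commit -q -m "Add main file"

# Integrate the feature branch into main.
git merge -q --no-edit feature
git log --oneline --graph

# Count the parents of the new commit: a merge commit has more than one.
parent_count=$(git rev-list --parents -n 1 HEAD | wc -w)
parent_count=$((parent_count - 1))
echo "parents: $parent_count"
```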
Rebase
Similar to merge, rebase is also used to integrate changes between branches, but there are differences:
- merge only affects the history of the destination branch, while the source branch remains untouched. In contrast, rebase moves the commits of the source branch onto the end of the destination branch’s history, which rewrites the source branch’s history.
- merge adds one new commit to the destination branch (the merge commit), while rebase rewrites the project history by creating a new commit for each commit in the source branch.
Typical usage:
git rebase <branch_name>
The above command moves the entire current branch so that it begins at the tip of the branch_name branch.
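The effect of a rebase can be observed with a sketch like the one below. It sets up two diverged branches (the names "main" and "feature" are hypothetical) and rebases one onto the other; the commit on the rebased branch gets a new hash because it is recreated.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch -M main

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

git checkout -q main
echo "main work" > main.txt
git add main.txt
git commit -q -m "Add main file"

# Replay the feature branch on top of the tip of main.
git checkout -q feature
old_tip=$(git rev-parse HEAD)
git rebase -q main
new_tip=$(git rev-parse HEAD)

git log --oneline --graph    # a straight line, no merge commit

# After the rebase, feature starts exactly where main ends.
base=$(git merge-base main feature)
main_tip=$(git rev-parse main)
```

Because the commits are recreated, rebasing a branch that other people have already pulled forces them to reconcile the rewritten history, so it is usually kept for branches that are still private.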
Revert
As its name suggests, this command is used to revert existing commits.
If we want to revert the changes from a specific commit, we need to know the commit’s hash. Every commit has a unique hash that is used to identify it.
To revert changes from a specific commit:
git revert <commit_hash>
The above command creates a new commit that records the revert once the reverting is done. To avoid this and leave the reverted changes staged but uncommitted, we can pass the --no-commit option.
git revert --no-commit <commit_hash>
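A short sketch of a revert in action follows. The file name and its contents are invented for the demo; the point is that the "bad" commit stays in the history and a new commit undoes its changes.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "timeout=30" > config.txt
git add config.txt
git commit -q -m "Add config"

echo "timeout=0" > config.txt
git commit -q -am "Break the timeout"
bad_commit=$(git rev-parse HEAD)

# Undo the bad commit by adding a new commit; --no-edit keeps the default message.
git revert --no-edit "$bad_commit"
git log --oneline

restored=$(cat config.txt)
commit_count=$(git rev-list --count HEAD)
```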
Stash
“Stashing” means storing our uncommitted changes in a separate storage area and returning the working directory to the state of the latest (HEAD) commit.
To “stash” changes:
git stash
To see the list of stashes:
git stash list
To inspect an existing stash:
git stash show <stash_name>  # Provided stash_name exists
To apply a stash on top of the current working directory:
git stash apply <stash_name>
To apply a stash on top of the current working directory and remove said stash from the stash list:
git stash pop <stash_name>
To remove a single stash from the list of stashes:
git stash drop <stash_name>
To remove all the stash entries:
git stash clear
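A typical stash round trip might look like the sketch below. The file name and contents are hypothetical; when no stash name is given, pop applies the most recent stash.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "stable" > page.txt
git add page.txt
git commit -q -m "Add page"

# Start some work that is not ready to be committed.
echo "work in progress" >> page.txt

git stash                                    # set the changes aside
status_after_stash=$(git status --porcelain) # empty: the working directory is clean
git stash list                               # the stash appears as stash@{0}

git stash pop                                # bring the changes back and drop the stash
restored=$(tail -n 1 page.txt)
stash_count=$(git stash list | wc -l)
```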
Checkout
This command is typically used to switch branches.
To create a new branch and switch to it:
git checkout -b <new_branch_name>
To switch to an existing branch:
git checkout <branch_name>  # Provided a branch named branch_name exists
Branch
This command is used to list, create, or delete branches.
To list all existing branches:
git branch --list
To create a new branch:
git branch <new_branch_name>  # Unlike "git checkout -b <new_branch_name>", this creates the branch without switching to it
To delete a branch:
git branch --delete <existing_branch_name>  # Provided existing_branch_name exists
Besides the above ten commands, Git offers many other commands suited to specific use cases. For detailed information, please refer to the Git reference documentation.
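The difference between creating a branch with branch and with checkout -b is easy to check in a scratch repository. In this sketch the branch names "draft" and "experiment" are made up.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch -M main

# "git branch" creates the branch but stays on the current one.
git branch draft
after_branch=$(git rev-parse --abbrev-ref HEAD)

# "git checkout -b" creates the branch and switches to it.
git checkout -q -b experiment
after_checkout=$(git rev-parse --abbrev-ref HEAD)

# Switch back and clean up.
git checkout -q main
git branch --delete draft
git branch --list

echo "after branch: $after_branch, after checkout -b: $after_checkout"
```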
Extra: Git Implementation in Software Projects Course 2020
This section discusses how Git is used (at the time of writing) by the author and team in the Software Projects course 2020 at the Faculty of Computer Science, University of Indonesia.
We use our faculty’s self-hosted GitLab as the remote Git platform that stores all our project repositories.
For the workflow, we use a Git Flow that is adjusted for the course, as shown in the illustration below.
To start, there are several branches that are used throughout the project lifetime:
- Master: the one branch that stores the source code in a production, deploy-ready state. Commits to this branch are limited to merge commits from the staging branch.
- Staging: stores the source code that is used in the “staging” environment. Commits to this branch are usually limited to merge commits from the PBI branches.
- PBI[1…n]: each stores the implementation of the corresponding Product Backlog Item of a Sprint.
- Hotfix: a branch from master that is used for bug fixes. This branch is optional as it is situational.
- Coldfix: whenever one or more Product Backlog Items are rejected, this branch is used to revert the changes related to the rejected PBIs.
In short, here is how we implement the Git Flow:
- Master and staging branches initialization.
- PBI branches initialization, then implement the product requirements to said branches.
- While implementing the features, we are encouraged to practice Test-Driven Development.
- Before a Sprint Review, the PBI branches will be merged to staging. During the event, staging branch will be used to present the product to the client.
- After the Sprint Review:
  - If there are PBIs that get rejected, we perform a rollback on the coldfix branch, then merge it to staging.
  - If there are no more issues, merge the staging branch to the master branch.
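The steps above can be sketched with plain Git commands. This is only an approximation under assumptions: the PBI branch name "pbi-1-login" and the file names are hypothetical, the use of --no-ff to force merge commits is my own choice rather than a documented course rule, and only the path where the PBI is accepted is shown.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# 1. Initialize the master and staging branches.
echo "project" > README.txt
git add README.txt
git commit -q -m "Initial commit"
git branch -M master
git branch staging

# 2. Implement a Product Backlog Item on its own branch (hypothetical name).
git checkout -q -b pbi-1-login staging
echo "login" > login.txt
git add login.txt
git commit -q -m "Implement login"

# 3. Before the Sprint Review, merge the PBI branch into staging.
git checkout -q staging
git merge -q --no-ff --no-edit pbi-1-login

# 4. After the review, if nothing was rejected, merge staging into master.
git checkout -q master
git merge -q --no-ff --no-edit staging

git log --oneline --graph
current_branch=$(git rev-parse --abbrev-ref HEAD)
```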
Conclusion
A Version Control System like Git is a must-have tool when building software, even more so in collaboration. With Git, every change is recorded, and whenever a disaster happens the team can confidently roll back to the last known “safe” state. Git also enables branching, which supports continuous independent work, and merging, which integrates two or more branches into one. All in all, Git helps developers do effective software development.
This wraps up the article. If you have any suggestions, don’t hesitate to comment :)
References
PPL 2020 Git Flow Reference