Git all the things
This post gives a high-level overview of Git. It touches on the background of Git, the basic commands that developers use every day, and more advanced topics like branching and rewriting history. Enjoy. :)
What is Git?
Before we dive into Git, let’s talk about what Version Control is. Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later.
A Version Control System (VCS) allows you to revert files back to a previous state, revert the entire project back to a previous state, compare changes over time, see who last modified something that might be causing a problem, who introduced an issue and when, and more. Using a VCS also generally means that if you screw things up or lose files, you can easily recover them.
In a Distributed Version Control System (DVCS) such as Git, clients don’t just check out the latest snapshot of the files: they fully mirror the repository. So if any server dies, any of the client repositories can be copied back up to the server to restore it. Every clone is really a full backup of all the data.
On top of that, many of these systems have several remote repositories they can work with, which makes collaborating with different groups of people in different ways at the same time on the same project much easier.
Since its birth in 2005, Git has evolved and matured to be easy to use and yet retain these initial qualities. It’s incredibly fast, it’s very efficient with large projects, and it has an incredible branching system for non-linear development.
So what is Git in a nutshell? Git stores and thinks about information much differently than other version control systems.
Git does NOT think of the information it keeps as a set of files and the changes made to each file over time.
Instead, Git thinks of its data more like a set of snapshots of a miniature filesystem. Every time you commit, or save the state of your project in Git, it basically takes a picture of what all your files look like at that moment and stores a reference to that snapshot. To be efficient, if files have not changed, Git doesn’t store the file again, just a link to the previous identical file it has already stored. In other words, Git treats its data as a stream of snapshots.
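To make the snapshot idea a little more concrete, here is a minimal sketch. It assumes a throwaway repository in a temp directory, a placeholder identity, and two made-up files (`unchanged.txt` and `changing.txt`). It commits twice and compares the object IDs Git recorded for each file in the two snapshots.

```shell
set -e
# Throwaway repository; the file names and identity below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > unchanged.txt
echo "v1" > changing.txt
git add -A
git commit -q -m "first snapshot"

echo "v2" > changing.txt
git add -A
git commit -q -m "second snapshot"

# <commit>:<path> resolves to the object ID of that file in that snapshot.
unchanged_before=$(git rev-parse HEAD~1:unchanged.txt)
unchanged_after=$(git rev-parse HEAD:unchanged.txt)
changing_before=$(git rev-parse HEAD~1:changing.txt)
changing_after=$(git rev-parse HEAD:changing.txt)

echo "unchanged.txt: $unchanged_before -> $unchanged_after"
echo "changing.txt:  $changing_before -> $changing_after"
```

The file that did not change keeps the same object ID in both snapshots, so the second commit simply points at the object Git already stored.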
The Three States
Git has three main states that your files can reside in: committed, modified, and staged. Committed means that the data is safely stored in your local database. Modified means that you have changed the file but have not committed it to your database yet. Staged means that you have marked a modified file in its current version to go into your next commit snapshot.
This leads us to the three main sections of a Git project: the Git directory, the working directory, and the staging area.
The basic Git workflow goes something like this:
- You modify files in your working directory.
- You stage the files, adding snapshots of them to your staging area.
- You do a commit, which takes the files as they are in the staging area and stores that snapshot permanently to your Git directory.
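The three steps above can be sketched roughly like this. It assumes a throwaway repository with a placeholder identity and a hypothetical file called `notes.txt`, and it assumes default Git settings. The `git status --short` output is captured after each step so you can see the file move from modified to staged to committed.

```shell
set -e
# Throwaway repository; notes.txt and the identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# 1. Modify a file in the working directory.
echo "more" >> notes.txt
state_modified=$(git status --short)

# 2. Stage the file, adding a snapshot of it to the staging area.
git add notes.txt
state_staged=$(git status --short)

# 3. Commit what is in the staging area.
git commit -q -m "extend notes"
state_committed=$(git status --short)

echo "modified:  '$state_modified'"
echo "staged:    '$state_staged'"
echo "committed: '$state_committed'"
```

In the short format the left column describes the staging area and the right column describes the working directory, which is why the `M` shifts one column to the left after `git add`.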
The first thing you should do when you install Git is to set your user name and email address. Every Git commit uses this information:
git config --global user.name "Keala Lusk"
git config --global user.email email@example.com
If you want to check that your information is correct, use `git config --list` or `git config <key>`:
git config user.name
If you need help while using Git, there are three ways to get the manual page (manpage) help for any of the Git commands:
git help <verb>
git <verb> --help
man git-<verb>
If you’re starting to track an existing project in Git, you need to go to the project’s directory and type:
git init
This creates a new subdirectory named .git that contains all of your repository files — a Git repository skeleton. At this point, nothing in your project is tracked yet. You can start tracking changes with a few git add commands that specify the files you want to track, followed by a git commit:
git add -A
git commit -m 'initial commit'
It is extremely important that you commit often. This allows you to easily revert your changes if you see something broken, or make a quick change to a small diff. It’s also beneficial for code reviews. The reviewer is more likely to catch something if they are looking at a 10-line diff than at a 500-line diff.
The main tool you use to determine which files are in which state is the git status command. This command also tells you which branch you’re on and whether it has diverged from the same branch on the server.
If you run `git status --short` or `git status -s`, this will show you the status of your files in a more compact way.
Most of the time you will have files that you don’t want Git to automatically add or track. Setting up a .gitignore file before you get going is best practice so you don’t accidentally commit files that you don’t want in your Git repository. The Pro Git book, written by Scott Chacon and Ben Straub, discusses the rules for gitignore files:
- Blank lines or lines starting with # are ignored.
- Standard glob patterns work.
- You can start patterns with a forward slash (/) to avoid recursivity.
- You can end patterns with a forward slash (/) to specify a directory.
- You can negate a pattern by starting it with an exclamation point (!).
Glob patterns are like simplified regular expressions that shells use. An asterisk (*) matches zero or more characters; [abc] matches any character inside the brackets (in this case a, b, or c); a question mark (?) matches a single character; and brackets enclosing characters separated by a hyphen ([0-9]) match any character between them (in this case 0 through 9). You can also use two asterisks to match nested directories; a/**/z would match a/z, a/b/z, a/b/c/z, and so on.
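Here is a small sketch of a few of these rules in action. The file names and the `.gitignore` contents are invented for illustration, and it assumes you have no global ignore file that would interfere. It writes a `.gitignore`, creates some empty files, and asks Git which untracked files it still sees.

```shell
set -e
# Throwaway repository; every file name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q

cat > .gitignore <<'EOF'
# Lines starting with # are ignored
*.log
!important.log
/TODO
build/
EOF

mkdir -p build src sub
touch debug.log important.log TODO sub/TODO build/out.o src/main.c

# List untracked files that are NOT ignored, in a stable order.
visible=$(git ls-files --others --exclude-standard | LC_ALL=C sort)
echo "$visible"
```

With these patterns, `debug.log` is ignored by the glob, `important.log` is rescued by the negation, only the top-level `TODO` is ignored because of the leading slash, and everything under `build/` is skipped.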
Running `git diff` with no arguments is helpful for seeing what you have changed but not yet staged. It compares what is in your working directory with what is in your staging area.
To see the changes you have staged that will go into your next commit, you can use `git diff --staged`.
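A quick sketch of the difference between the two, assuming a throwaway repository and a hypothetical `notes.txt`. It uses `--name-only` so the output is just the file name instead of a full patch.

```shell
set -e
# Throwaway repository; notes.txt and the identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Change the file without staging it.
echo "change" >> notes.txt
unstaged_before=$(git diff --name-only)
staged_before=$(git diff --staged --name-only)

# Stage the change: it moves from one diff to the other.
git add notes.txt
unstaged_after=$(git diff --name-only)
staged_after=$(git diff --staged --name-only)

echo "before add: unstaged='$unstaged_before' staged='$staged_before'"
echo "after add:  unstaged='$unstaged_after' staged='$staged_after'"
```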
The best way to think of a branch is as an independent line of development. Branches are an abstraction over the edit/stage/commit process.
Branching is a way of continuing to work on a feature without messing up another line of development, like master. You can easily switch back and forth between branches, making development a pretty fast process.
Running `git branch` will list all of the branches in your repository.
git branch -d <branch>
This command says to delete the specified branch. It prevents you from deleting the branch if it has unmerged changes.
git branch -D <branch>
This command tells Git to force delete the specified branch, even if it has unmerged changes.
git branch -m <branch>
This will rename the current branch to <branch>.
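Here is a rough sketch that strings these branch commands together. The branch names (`experiment`, `prototype`) and the file are made up, and because the default branch might be called master or main depending on your setup, the script reads its name instead of assuming it.

```shell
set -e
# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Create a branch, commit on it, then rename it while it is checked out.
git branch experiment
git checkout -q experiment
echo "idea" > idea.txt
git add idea.txt
git commit -q -m "try an idea"
git branch -m prototype

# Back on the main branch, -d refuses because prototype has unmerged work.
git checkout -q "$main_branch"
if git branch -d prototype 2>/dev/null; then
  safe_delete="deleted"
else
  safe_delete="refused"
fi

# -D deletes it regardless.
git branch -D prototype >/dev/null
remaining=$(git branch --list prototype)

echo "git branch -d: $safe_delete"
```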
The git checkout command lets you navigate between the branches created by git branch.
git checkout <existing-branch>
This checks out the specified branch, which should have already been created with git branch. It makes <existing-branch> the current branch, and updates the working directory to match.
git checkout -b <new-branch>
Create and check out <new-branch>. The -b option is a convenience flag that tells Git to run git branch <new-branch> before running git checkout <new-branch>.
git checkout -b <new-branch> <existing-branch>
Same as the above invocation, but base the new branch off of <existing-branch> instead of the current branch.
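As a small sketch of the last two forms, assume a throwaway repository with hypothetical branches named `feature` and `hotfix`. The second `git checkout -b` starts `hotfix` from the main branch even though `feature` is the branch that is checked out at that moment.

```shell
set -e
# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Create and check out a new branch in one step, then commit on it.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"
on_feature=$(git rev-parse --abbrev-ref HEAD)

# Base a second branch on the main branch instead of the current one.
git checkout -q -b hotfix "$main_branch"
on_hotfix=$(git rev-parse --abbrev-ref HEAD)

echo "was on: $on_feature, now on: $on_hotfix"
```

Because `hotfix` starts from the main branch, `feature.txt` is not in the working directory after the second checkout.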
Anything that is committed in Git can almost always be recovered. Git’s main job is to make sure you never lose a committed change. So, commit often.
Let’s say you’ve changed two files and want to commit them as two separate changes, but you accidentally ran `git add *` and staged them both. How can you unstage one of the two?
git reset HEAD <filename>
Running this command will leave the <filename> file modified but once again unstaged.
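The scenario above might look something like this in practice. It is a sketch that assumes a throwaway repository with two hypothetical files, `a.txt` and `b.txt`.

```shell
set -e
# Throwaway repository; a.txt and b.txt are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "add both files"

# Change both files and (accidentally) stage them together.
echo "edit" >> a.txt
echo "edit" >> b.txt
git add a.txt b.txt

# Unstage only b.txt; its modification stays in the working directory.
git reset -q HEAD b.txt
status=$(git status --short)
echo "$status"
```

After the reset, `a.txt` is still staged and `b.txt` is modified but unstaged, so the two can go into separate commits.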
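git checkout -- <filename>
If you realize that you don’t want to keep your changes to a file, running this command will discard the unstaged changes in your working directory. Be careful: because those changes were never committed, Git cannot bring them back.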
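A minimal sketch of what that looks like, again assuming a throwaway repository and a hypothetical `notes.txt`:

```shell
set -e
# Throwaway repository; notes.txt and the identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "original" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Make an edit we decide we do not want.
echo "scratch" >> notes.txt

# Throw the edit away and restore the file.
git checkout -- notes.txt
content=$(cat notes.txt)
status=$(git status --short)

echo "notes.txt now contains: $content"
```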
git commit --amend
If you run this when nothing is staged, it lets you edit the previous commit message. This is how I use it most frequently, thanks to my mindless typing and spelling errors, but it can also be used to combine staged changes with the previous commit instead of committing them as an entirely new snapshot.
`git commit --amend` seems to be the least scary of the history rewriting commands, because if you screw anything up with this one it’ll only be your most recent changes. Only. Git sees the amended commit as a brand new commit, so it replaces the most recent commit entirely. This is useful when you commit prematurely, or forget to stage a file.
Be careful with this command, though: when you amend a commit, the previous commit is removed from your project history.
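Here is a sketch of the "forgot to stage a file" case. It assumes a throwaway repository with hypothetical files `one.txt` and `two.txt`, and it passes `--no-edit` so the script does not stop to open an editor for the commit message.

```shell
set -e
# Throwaway repository; file names and identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > one.txt
echo "two" > two.txt

# Oops: the message mentions both files, but only one was staged.
git add one.txt
git commit -q -m "add one and two"
old_hash=$(git rev-parse HEAD)

# Stage the forgotten file and fold it into the previous commit.
git add two.txt
git commit -q --amend --no-edit
new_hash=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)
files=$(git ls-tree --name-only HEAD)

echo "commits: $count"
echo "old: $old_hash"
echo "new: $new_hash"
```

There is still only one commit, but it has a different hash, which is what "replaces the most recent commit entirely" means in practice.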
When you need to move a branch to a new base commit, rebasing is what you need. It rewrites your project history by creating new commits for each of your commits on top of the new base.
Rebasing can be helpful when your branch is behind the master branch. It is like saying, “I want to base my changes on what everybody has already done.”
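As a rough sketch of that situation, assume a throwaway repository where a hypothetical `feature` branch falls behind the main branch (whose name is read from the repository, since it may be master or main on your machine).

```shell
set -e
# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Start a feature branch and commit on it.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"

# Meanwhile, the main branch moves ahead.
git checkout -q "$main_branch"
echo "others" > others.txt
git add others.txt
git commit -q -m "work by everybody else"

# Replay the feature commit on top of the latest main branch.
git checkout -q feature
git rebase -q "$main_branch"

base=$(git merge-base feature "$main_branch")
tip=$(git rev-parse "$main_branch")
echo "feature now starts from: $base"
```

After the rebase, the point where `feature` forks off is the current tip of the main branch, so the feature work sits on top of what everybody else has done.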
git rebase -i
When you run the command with the -i flag, it begins an interactive rebasing session. Instead of moving all of the commits to the new base, this lets you alter individual commits in the process by removing, splitting, and altering an existing series of commits.
I use an interactive rebase to polish a feature branch before adding it into the main code base. This lets me squash insignificant commits, delete obsolete ones, and make sure everything else is in order before committing to the “official” project history.
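Normally `git rebase -i` opens your editor with a to-do list of commits that you edit by hand. So that the sketch below can run unattended, it points `GIT_SEQUENCE_EDITOR` at a `sed` one-liner that changes the second line of that list from `pick` to `fixup`. That substitution is just a stand-in for what I would type in the editor, and the commits and files are made up.

```shell
set -e
# Throwaway repository; file names, messages, and identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > a.txt
git add a.txt
git commit -q -m "Add a"

echo "b" > b.txt
git add b.txt
git commit -q -m "Add b"

echo "b fixed" > b.txt
git add b.txt
git commit -q -m "fix typo in b"

# The to-do list for HEAD~2 has two lines, oldest commit first:
#   pick <hash> Add b
#   pick <hash> fix typo in b
# Turning the second "pick" into "fixup" melds that commit into "Add b".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase -q -i HEAD~2

count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
content=$(cat b.txt)
echo "commits: $count, latest: $subject"
```

The insignificant "fix typo" commit disappears from the history, but its change is kept inside "Add b".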
Git keeps track of updates to the tip of branches using a mechanism called reflog. After rewriting history, the reflog contains information about the old state of branches and allows you to go back to that state if necessary.
Every time the current HEAD gets updated by switching branches, pulling in new changes, rewriting history or simply by adding new commits, a new entry will be added to the reflog.
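Here is a small sketch of using the reflog as a safety net. It assumes a throwaway repository with made-up commits. It throws away the latest commit with a hard reset and then uses the reflog entry `HEAD@{1}` (where HEAD was one move ago) to get it back.

```shell
set -e
# Throwaway repository; file names and identity are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first" > first.txt
git add first.txt
git commit -q -m "first"

echo "second" > second.txt
git add second.txt
git commit -q -m "second"

# "Lose" the second commit by moving the branch back one commit.
git reset -q --hard HEAD~1
if [ -e second.txt ]; then after_reset="present"; else after_reset="missing"; fi

# The reflog still remembers where HEAD was before the reset.
git reflog

# Go back to that previous position.
git reset -q --hard 'HEAD@{1}'
recovered=$(git log -1 --format=%s)
echo "after reset: second.txt is $after_reset; recovered commit: $recovered"
```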
Cherry picking in Git means choosing a commit from one branch and applying it onto another. This is in contrast with other approaches such as merge and rebase, which normally apply many commits onto another branch.
You need to be on the branch you want to apply the commit to, so first check it out (for example, `git checkout master`), and then run `git cherry-pick <commit-hash>`.
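A sketch of that flow, assuming a throwaway repository where a hypothetical `feature` branch has two commits and only the first is wanted on the main branch. The main branch name is read from the repository rather than assumed to be master.

```shell
set -e
# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base commit"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Two commits on a feature branch; remember the hash of the first one.
git checkout -q -b feature
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "important fix"
fix_hash=$(git rev-parse HEAD)

echo "wip" > wip.txt
git add wip.txt
git commit -q -m "work in progress"

# Switch to the branch that should receive the commit, then pick it.
git checkout -q "$main_branch"
git cherry-pick "$fix_hash" >/dev/null

picked=$(git log -1 --format=%s)
echo "latest commit on $main_branch: $picked"
```

Only the fix lands on the main branch; the work-in-progress commit stays behind on `feature`.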
The basics of Git, branching, and undoing things are only a few concepts I have found to be particularly useful in my development experience, and they don’t even begin to skim the surface of the Git world.
I hope this high-level overview of Git has been helpful for you, because researching Git has definitely been helpful for me. I used a variety of resources to spin this article up, namely Pro Git, Atlassian’s Git Tutorial, and StackOverflow. I couldn’t have done it without them.
Thanks for reading!