On The Road To Professional Web Development | Git — Version Control

Alexander S. Augenstein
16 min read · Apr 27, 2020


*The objective of this article is to prepare for a technical interview. To succeed (1) we must be technically proficient, and (2) we must be capable of communicating our proficiency effectively.

In this post, we’ll first become conversationally fluent on the topic of git. Next we’ll run drills to get comfortable using the tool in practical ways. Finally we’ll finish the discussion with a curated list of (reportedly) common interview questions.

This post is part of a series on becoming a professional web developer — click here to navigate to the table of contents*

Part I: Git — Version Control

How Difficult Is It To Master?

This post explores each topic listed in the official git reference. After laying some groundwork, we’ll dig into slightly more than 100 unique concepts as we practice actually using git.

Official Git Logo

Git Logo by Jason Long is licensed under the Creative Commons Attribution 3.0 Unported License.

Git: What Is It?

Git is a piece of software, one among many flavors of “version control software.” It has many potential uses, but most notably it helps coders collaboratively contribute to a codebase. Git keeps the latest and greatest files in an obvious location, and maintains a detailed history of changes over time (including who did what, when, and why). This makes rolling back to a previous working version of the code very simple, and for better or worse helps point the finger of blame when something breaks.

What Is Git To Developers?

  • Git is the most popular version control software by a wide margin. It’s open-source and distributed under the terms of the GPLv2 license, requiring that any derivative work is licensed equivalently (a practice known as copyleft licensing).
  • Git is written primarily in C, as well as Shell, Perl, Tcl, and Python.
  • Git supports major operating systems, including Windows, macOS, Linux, BSD, and Solaris.
  • Git powers popular hosting services including GitHub (by Microsoft), GitLab (by GitLab), and Bitbucket (by Atlassian).
  • Other well-known VC software exists including Mercurial (similar to git), SVN (most popular 10 years ago), and CVS (still in use but loathed by many, most notably Linus Torvalds).

Where Did Git Come From?

Git was invented in 2005 by Linus Torvalds to assist Linux kernel development efforts. Prior to git, kernel devs had access to a free version of BitKeeper, a then-popular proprietary VC software. In 2005, following allegations that a developer had reverse-engineered its protocols, BitKeeper’s vendor pulled the free version. Both git and Mercurial were developed in response to this event. CVS existed, among other free tools, but none were a good fit for what Torvalds wanted.

The term git is derived from British slang with a negative connotation meaning an “unpleasant person”. Just like Linux is indirectly named after its creator, so too is git.

Since mid-2005, git has been maintained by Junio Hamano together with other open-source contributors.

What If We Didn’t Have Git? How Could We Get By Without Version Control Software?

Note: even though I’ve worked with teams that continue to operate this way, these solutions are fragile and not recommended.

  • Configuration control: different revs of a file can be identified using file-naming conventions or folder hierarchies or both.
  • Collaboration: shared folders let you work on shared files as a team.
  • Access authorization: to prevent people from breaking the shared folder contents, file permissions or folder permissions can be applied by your system administrator.
  • Change collision protection: locking or merging strategies can be manually enforced. Through team communication, one-contributor-at-a-time can be enforced to implement locking. Alternatively, if multiple people want to contribute to the same file simultaneously, one solution is to let team members work on local copies, then assign one person to manually merge all changes into the shared master file.
  • Disaster recovery: manual file backups on a routine schedule are always possible. You can store data on removable drives or on servers you trust.
  • Detailed change history: a changelog (as a table in a .docx or .xlsx file) can be manually updated every time a change is applied to a master file.

When Would A Team Avoid Git Or Other Version Control Software?

  • Not everyone knows git offers more than a fancy “Save As”: Everyone on your team knows how to make folders and use “Save As” — it offers none of the protections of VC software but it’s dead simple and doesn’t require special training.
  • Not everyone realizes git plays nicely with other helpful tools: VC software can act as a springboard for additional improvements. Many available tools are designed to tie-in seamlessly, including CI / CD pipeline integration.
  • Not everyone realizes git is for more than just code: There are teams using git to manage entire hardware/software products including design assets. As an example of its flexibility, one of my previous jobs used Microsoft TFS (now Azure DevOps) version control software to track changes to 3D and AutoCAD files.

Why Is It Better To Use Version Control Software?

  • It provides a single source of truth across teams, code, and assets.
  • It speeds up teams who don’t want to waste time looking for the latest version of a file.
  • It enables team collaboration by allowing an unlimited number of people to work on files at once.
  • It is a prerequisite for many forms of task automation, including automated builds, code reviews, and CI / CD.
  • It facilitates identity and access management.
  • It tracks change history by providing traceability for every change ever made — useful for audits and compliance.
  • It can be incredibly useful for disaster recovery (can provide automatic backups, can store backups offsite, can easily rollback changes).

How Does It Work?

A typical git lifecycle looks something like this:

  1. Initialize an empty git repository in a shared location (e.g. a server)
  2. Create any branches needed to support your workflow
  3. Add existing files, if any, to the central repo in the appropriate branch
  4. Clone the central repo on each developer’s local machine
  5. Perform routine work (e.g. devs modify local working copies, branching and merging their contributions appropriately)
  6. That’s it! Unless you want to lose your change history, you’ll never delete the central repo
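
The lifecycle above can be sketched on a single machine. This is a minimal illustration rather than a recommended server setup: the directory names (central.git, alice, bob), the identity values, and the file README.txt are all hypothetical, and a local bare repository stands in for the shared server.

```shell
#!/bin/sh
set -eu

# Scratch directory so nothing outside it is touched (all names are hypothetical).
work=$(mktemp -d)
cd "$work"

# Step 1: initialize an empty, bare repository to stand in for the shared server.
git init --quiet --bare central.git

# Step 4: a developer clones the central repo to a local machine.
git clone --quiet central.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"

# Steps 3 and 5: add a file, commit it locally, then publish the current branch.
echo "hello" > README.txt
git add README.txt
git commit --quiet -m "Add README"
git push --quiet -u origin HEAD

# A second developer clones the same repo and receives the full history.
cd "$work"
git clone --quiet central.git bob
git -C bob log --oneline
```

Pushing HEAD rather than a named branch keeps the sketch independent of whether the default branch is called master or main on a given installation.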

Implicit in the above discussion is that by using git, devs never need to dig around for the latest version of a file, or ask who changed a file, or delay the project when they spill coffee all over their now-irreparable dev laptop. Git is free, yet its benefits are invaluable.

What Are Some Non-Standard Uses Of Git?

  • Authors in the digital age need their content to support various layouts. Separating text content from visual style proved to support this goal, so the use of plaintext editors (as opposed to word processors) has become the norm. To version-control creative work (in the form of text files), git has become a very natural choice for authors.
  • Art can follow a workflow very similar to the work of a dev. As work evolves, some ideas are kept but others are crumpled up and tossed in the wastebasket. Just as much as devs, artists benefit from disaster recovery, separating relevant work from old drafts, and keeping tabs on what changes were made over time. Even though this use-case may be data intensive, it comes as little surprise that many digital artists, designers, and musicians use git in their creative workflow.
  • For similar reasons as above, git has additionally been adopted in marketing, product management, customer support, human resources, and budget management.

What Terminology Comes Up When Discussing Version Control?

  • Add (add = index = stage): a command to stage changes. “Before you can commit code, you need to stage it with git add”
  • Atomic commit: A failsafe mechanism where failures during a commit never result in a partial commit. “In git, commits are atomic. They complete, or they don’t.”
  • Baseline (baseline = mainline = master = trunk): the main branch. “In git, Master is the name of the main branch by default”
  • Blame: a command to see the last person who edited a line “Find out who made that typo by running git blame”
  • Blob: Binary Large OBject. “Git works with blobs, but better with text files”
  • Branch: a command and a noun meaning a set of commits in an existing repository. “Running git branch lets you create and manipulate branches”
  • Change (change = delta = diff): the changes between commits. “See changes between two versions of a file by diffing them”
  • Change List (CL) (change list = changeset = patch): the set of changes made in a single commit. “Instead of committing a changeset to the central repo, git lets you share entire branches between repos”
  • Changeset (see change list)
  • Checkin: uploading code to the central repo for review or merge. “Git implements checkins via a pull request or a push”
  • Checkout (CO): a command to change the active branch or commit. “If you’re working on multiple branches, each for a different feature, git checkout is how you switch between them”
  • Cherry pick (cherry pick = selective merge): a command to merge individual commits from a branch. “When you want some but not all changes made in an experimental branch, run git cherry-pick”
  • Clone: a command to initialize and fill an empty repo with code and revisions from an existing one. “When you want to grab open-source code from a remote repo on GitHub without forking, run git clone”
  • Commit: a command used for saving changes. “Git doesn’t save changes automatically. You do that using git commit”
  • Complete merge: a merge between branches that retains all changes. “If you staged and committed all changes, git merge will implement a complete merge”
  • Config: a configuration file. “gitconfig stores your default configuration, which is important for masters of the art of git”
  • Conflict: when contributions from multiple users can’t be automatically merged. “Git will prompt you if it needs you to manually resolve a conflict”
  • Delta (change = diff = delta): the changes between commits. “Diff two files to see changes between them”
  • Delta compression: encoding a target file with respect to one or more reference files. “Delta compression helps git store packfiles compactly”
  • Depot (depot = repo = repository): where files’ current and historical data are stored. “When using git, your team relies on a central repo but you’ll each work on local clones or forks of the central repo”
  • Develop (develop = next): a common name for a parallel branch that’s always open. “Some devs prefer a workflow where only stable code lives in Master, so they work from a different branch called Develop or Next by convention”
  • Diff (change = diff = delta): the changes between commits. “See changes between two versions of a file by diffing them”
  • Digital signature: optionally added to commits to help ensure they were made by a trusted source. “Git is cryptographically secure but additionally lets you sign and verify work using GPG”
  • Export: copying the contents of a repo without the history. “To export by this definition, you can run git archive, or approximate it with a git clone of depth 1”
  • Feature: a code modification resulting in new behavior. “It’s considered a workflow best practice to work on a feature in its own branch, referred to as a feature branch”
  • Fetch: a command for getting changes from a remote repo without merging those changes into your local copy of the repo. “Git fetch is a safe way to see what other people contributed before incorporating changes into your working copy”
  • Fork: forking a repo both clones it and adds a special kind of link to the original. “Forks let you make changes and open pull requests to the original repo”
  • Forward integration: merging changes from higher to lower priority branches. “You can do this in git by merging changes from a parent branch (like Master) into a child branch (like Develop)”
  • Gitk: a builtin repository browser. “Gitk is maintained as an independent project but comes bundled with git for our convenience”
  • Gitweb: like gitk, Gitweb is a builtin repository browser. “To use Gitweb, run git instaweb”
  • Head (head = tip): a pointer to the active commit. “In git, the head defaults to the most recent commit in a checked out branch. Head can be detached to instead have it point to an earlier commit”
  • Hooks: extend the functionality of git. “Hooks automate tasks like sending email with every commit or enforcing coding standards”
  • Import: copying a local directory tree into a repo for the first time. “This is not a special concept in git. Implement with copy-paste”
  • Index (see add)
  • Init: a command to create a new repo. “Start an empty repo or convert an existing project into a repo with git init”
  • Interleaved deltas: a compression technique used by SCCS version control software. “Git uses delta compression but not interleaved deltas”
  • Label (label = tag): a commit marked for convenience. “Git lets you tag commits”
  • Log: a command to show git logs. “Git log output is verbose by default. Some devs prettify their logs by editing their preferences in gitconfig”
  • Mainline (see baseline)
  • Master (see baseline)
  • Merge: a command to join two or more development histories together. “To include the changes in your branch to Master, use git merge”
  • Next (see develop)
  • Origin: the remote repo a project was cloned from. “When you git clone a repo on GitHub, the URL of the repo you’re cloning is the origin of your clone”
  • Pack (pack = packfile): git saves space by combining objects into a single file called a packfile. “The git gc command lets you manually ask git to pack objects”
  • Packfile (see pack)
  • Parallel: branches with the same lifetime, modified concurrently. “Typical git workflows include parallel branches (like Develop) and non-parallel branches (like Topic)”
  • Patch (see change list)
  • Promote: the act of copying a file from a less controlled location to a more controlled location. “When you git init a previously unversioned directory, you’re promoting it”
  • Pull: a command for copying revisions from a remote repo, initiated by the receiving repo. “In git, pull is implemented by git fetch followed by git merge”
  • Pull Request: submitting changes for review prior to a merge. “This concept is not native to git, but GitHub and other common git hosting services have this functionality builtin”
  • Push: a command for copying revisions into a remote repo, initiated by the source repo. “Before you push changes, you should always git pull”
  • Rebase: like merge, but rewrites history. “Rebasing instead of merging can be used to clean up your commit history. This is a best practice in some workflows to help you trace bugs later”
  • Remote: a tracked repo that is typically read-write or read-only. “When collaborating, you don’t always need full control over all repos. Remotes let you pull changes contributed by others without giving you the keys to the kingdom”
  • Repo (see depot)
  • Repository (see depot)
  • Resolve: addressing a merge conflict. “Git will prompt you if it needs help resolving a merge conflict”
  • Reverse integration: merging changes from lower to higher priority branches “You can do this in git by merging changes from a child branch (like Develop) into a parent branch (like Master)”
  • Selective merge (see cherry pick)
  • SHA-1 hash: a function that maps any string to a 40-digit hex code. “Git hashes almost everything. It helps git ensure consistency across repos”
  • Share: making changes in one file or folder makes the change available in multiple branches at the same time. “Git does not feature sharing out of the box. Common use-cases (managing large binaries) might be resolved with the Git LFS extension”
  • Source Control Management (SCM) (SCM = VC = VCS = VCM): a system to record changes to files over time and to recall specific versions later. “Git is version control software”
  • Stage (see add)
  • Stream: a branch with special meaning for a given development workflow “Git branches with semantic meaning (Master, Develop, Topic) can be considered their own streams”
  • Sync (sync = update): merging changes from the central repo into a local working copy. “Git implements sync with the pull command”
  • Tag (see label)
  • Tip (see head)
  • Topic: a common name for a short lived branch. “Topic branches are part of a common workflow. The branch may be named after the feature you are adding, then merged and deleted when done”
  • Tree: a graph or a data structure. “File hierarchies are organized as a tree, as are histories in a git repo”
  • Trunk (see baseline)
  • Update (see sync)
  • Upstream branch: a remote repo branch tracked by your local branch. “Git push has a parameter (-u, or --set-upstream) that allows you to specify the upstream branch.”
  • Version Control (VC) (see SCM)
  • Version Control Management (VCM) (see SCM)
  • Version Control System (VCS) (see SCM)
  • Workflow: a shared pattern of collaboration, mutually agreed upon by the dev team in advance. “Some git-related workflows include: Centralized (a.k.a. Basic) Workflow, Feature Branching, Forking Workflow, Gitflow Workflow”
  • Working copy (working copy = working directory): a local copy of the files from a repo. “In git, all changes to the central repo start as changes to a local working copy”
  • Working directory (see working copy)
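
Several of the terms above (stage, commit, branch, checkout, merge, tag, head, SHA-1 hash) can be tried out in a throwaway repo. The following is a rough sketch, not a prescribed workflow: the file notes.txt, the branch name topic, the tag v0.1, and the identity values are hypothetical, and the base branch name is read from git rather than assumed.

```shell
#!/bin/sh
set -eu

# Throwaway repo in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Stage (add = index) and commit a first file on the default branch.
echo "v1" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"
base=$(git rev-parse --abbrev-ref HEAD)   # "master" or "main", depending on configuration

# Branch + checkout: create a short-lived topic branch and switch to it.
git checkout --quiet -b topic
echo "v2" >> notes.txt
git commit --quiet -am "Extend notes"

# Merge: bring the topic branch back into the base branch, then delete it.
git checkout --quiet "$base"
git merge --quiet topic
git branch --quiet -d topic

# Tag (label): mark the current commit for convenience.
git tag v0.1

# Head: HEAD resolves to the hash of the active commit (40 hex digits with SHA-1).
git rev-parse HEAD
git log --oneline
```

Because the topic branch is a direct descendant of the base branch here, the merge is a fast-forward and creates no merge commit; the log shows exactly the two commits made above.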

Part II: Using Git

Part II focuses on using git effectively. We have knowledge of the terminology and motivations for using git, so the question that should be burning in our minds is how exactly these concepts are implemented. There is no substitute for experience, so we will learn-by-doing.

Essential Knowledge

  1. Read this Microsoft article on branching strategies. This helps wrap our heads around why we might use git in certain ways.
  2. Follow this interactive git tutorial. It uses a pseudo-git syntax to highlight how git can help you implement various workflows.
  3. The above tutorial has a ‘sandbox mode’. Use it to replicate the branching strategies from the Microsoft article. This git cheat-sheet may help; keep it open in a separate tab.
  4. Now that you’re comfortable with git workflows, go over katacoda’s tutorials to practice full-fledged git on a live server.
  5. On Vogella’s git tutorial you’ll see commands we haven’t covered yet dealing with files (as opposed to the working tree). Practice these by getting creative with katacoda’s sandbox mode (or installing git).
  6. To fill remaining knowledge gaps, read everything on git’s official website. This seems like a tall order, but you’ll see much of it is old-hat for us now and can be skimmed.
  7. Learn to efficiently version control large files using Git LFS.

Best Practices

  • Follow a workflow: reduce communication overhead by establishing processes for merging branches
  • Work from the latest version: this helps to avoid conflicts
  • Commit often: give yourself ample opportunity to rollback changes
  • Document ‘what’ and ‘why’ in commit logs: leave helpful breadcrumbs for other devs or future-you
  • Use the staging area for review: it exists to give you a chance to optionally check your sanity. So opt in
  • Push clean work: commit often, but tidy up the code and your local commit history before sharing it with a wider audience
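
A few of these practices can be illustrated together. The sketch below is one possible way to do it, under stated assumptions: the files app.txt and scratch.txt, the commit messages, and the identity values are hypothetical, and amending is shown only for commits that have not been pushed yet.

```shell
#!/bin/sh
set -eu

# Throwaway repo in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "first draft" > app.txt
git add app.txt
git commit --quiet -m "Add app skeleton"

# Two unrelated edits land in the working copy.
echo "new feature" >> app.txt
echo "temporary debugging notes" > scratch.txt

# Use the staging area for review: stage only the intended file,
# then inspect exactly what the next commit would contain.
git add app.txt
git diff --staged --stat
git status --short

# Document 'what' and 'why': a subject line plus a short body.
git commit --quiet -m "Add feature line to app" \
    -m "Needed so the demo shows more than a skeleton."

# Push clean work: fold a forgotten follow-up into the previous commit
# instead of sharing a separate fix-up commit. Only do this before pushing.
echo "follow-up fix" >> app.txt
git commit --quiet -a --amend --no-edit
git log --oneline
```

The untracked scratch.txt never enters history because it was never staged, which is the point of reviewing the staging area before each commit.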

Part III: Git Interview Questions (Categorized)

The goal of Part III is to summarize interview questions from across the web. Questions have been paraphrased to avoid duplicate entries. Drill on these while practicing for the interview, and don’t hesitate to Google solutions.

On Version Control:

  • discuss benefits of using a version control system
  • compare distributed to centralized version control systems

On Git:

  • discuss what git is
  • list advantages advertised by git
  • list the languages git is written in
  • discuss git performance in terms of speed

On Related Tech:

  • discuss the git bash
  • discuss the function of URLs in relation to git
  • compare git to SVN
  • compare git to GitHub
  • discuss subgit
  • list git repo hosting services
  • list graphical git clients for Linux
  • discuss integrating git with Jenkins
  • discuss pushing a branch to GitHub

On Repositories:

  • define repo
  • define bare repo
  • define remote repo
  • define local repo

On Commits:

  • list the contents of commit objects
  • discuss staging areas (a.k.a. index, production area)
  • discuss commit messages
  • discuss why it is preferred to create a new commit rather than amend an existing one
  • list different ways to refer to a commit

On Branches:

  • discuss heads in git and specify how many can be created in a repo
  • discuss branching and its purpose
  • list the types of branching offered in git
  • discuss bringing a new feature to the main branch
  • list what is restored when a deleted branch is recovered
  • define long running branch
  • define Topic branch
  • define Release branch

On Conflicts:

  • define conflicts
  • discuss conflict resolution
  • discuss how to generate a list of all files creating merge conflicts
  • discuss how conflicts can be resolved using only the command line

On Tags:

  • discuss tags
  • list the types of tags supported by git

On Workflows:

  • discuss Centralized (Basic) Workflow
  • discuss Branching Workflow
  • discuss Forking Workflow
  • discuss Gitflow Workflow

On Basic Commands:

  • discuss git add
  • discuss git bisect
  • discuss git branch -d
  • discuss git checkout
  • discuss git cherry-pick
  • discuss git clone
  • discuss git commit
  • discuss git commit -a
  • discuss git commit --amend
  • discuss git commit -m
  • discuss git config
  • discuss git diff
  • discuss git init
  • discuss git instaweb
  • discuss git ls-tree
  • discuss git log
  • discuss git pull origin
  • discuss git pull origin master
  • discuss git push
  • discuss git rebase
  • discuss git reflog
  • discuss git remote add
  • discuss git reset
  • discuss git rm
  • discuss git stash
  • discuss git stash apply
  • discuss git stash drop
  • discuss git status
  • compare git clone to git remote
  • compare git diff to git status
  • compare git pull to git fetch
  • compare git rebase to git merge
  • compare git revert to git reset
  • name a command that provides a list of files that changed in a particular commit
  • name a command to remove a file from git without removing it from the filesystem
  • name a command to add new files to a git directory
  • name a command to list tags
  • name a command to annotate tags
  • name a command to write a commit message
  • name a command to check git status
  • name a command used to assist in fixing a broken commit
  • name a command to revert a pushed commit
  • name a command to view commit history
  • name a command to track remote branches
  • name a command to squash many commits into a single commit
  • name a command to copy a commit from one branch into another (an example scenario would be to include a hotfix on a release branch into a branch in active development)
  • name a command to identify if a branch has been merged into Master
  • name a command for merging changes without using git merge

On Git Administration:

  • discuss how to install a git client on a machine
  • discuss the first step to establish a connection to git
  • discuss the contents of git hooks
  • discuss how you set up a script to run every time a repo receives new commits through a push
  • discuss how you configure a git repo to run code sanity checking tools right before making commits, and preventing them if the test fails
  • list data transfer protocols used by git
  • compare the function of an author to that of a committer
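
As a starting point for drilling the hook questions above, here is a small sketch of a client-side pre-commit hook that blocks a commit when a check fails. It is an illustration under assumptions, not a standard configuration: the FIXME rule, the file todo.txt, and the identity values are hypothetical. Running a script when a repo receives a push is handled by server-side hooks such as post-receive, which this sketch does not cover.

```shell
#!/bin/sh
set -eu

# Throwaway repo in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Install a pre-commit hook. A non-zero exit status aborts the commit.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Hypothetical rule: refuse to commit staged lines that add the marker FIXME.
if git diff --cached --no-color | grep -q '^+.*FIXME'; then
    echo "pre-commit: staged changes contain FIXME" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# This commit is rejected by the hook.
echo "FIXME: finish this" > todo.txt
git add todo.txt
if git commit --quiet -m "Add todo"; then
    blocked=no
else
    blocked=yes
fi
echo "first commit blocked: $blocked"

# After the offending line is fixed, the commit goes through.
echo "finished" > todo.txt
git add todo.txt
git commit --quiet -m "Add todo"
git log --oneline
```

Hooks in .git/hooks are local to each clone and are not copied by git clone, so teams that rely on them usually need a separate way to distribute them.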

Closing Thoughts

I hope you’ve enjoyed this post as much as I enjoyed writing it. If you have thoughts you’d like to share, your editorial suggestions are always welcome. This post is part of a series on becoming a professional web developer. If you’d like to see more content like this, please click here to navigate to the table of contents.
