Introduction to Git and Github

Megha Aggarwal
12 min read · Mar 2, 2018

Why Git?

When you work on a project with friends, such as a college project, how do you share the files? I used Google Drive, Dropbox and similar tools, and you have most probably used similar methods too. But don’t you think these methods are very cumbersome for collaboration? You could even call them ‘a perfect recipe for disaster’.

What is Git?

Git is an open-source, distributed version control system designed for speed and efficiency.

  • open-source: its source code is freely available.
  • distributed version control: every developer has a full copy of the project and its history, so many developers can work on a project without sharing a common network.
  • efficiency: since nearly every operation in Git is local, working in Git is fast.

Some of the basic terms used while working with git are:

  • Repository: A basic folder or a collection of files that represents one project. The name of this repository is Git-Guide. When you clone, you clone an entire repository, and every repository is identified by a unique URL.
  • Local Repository: The project folder that exists locally on your machine. It is where you make your changes before pushing them to the GitHub repository. No one other than you can change your local repository directly. To push and pull changes, you need to connect this local repository to a repository on GitHub (or any other Git hosting service).
  • Remote Repository: This is the repository hosted on GitHub (or any other hosting service) to which your local repository is connected. You push your changes to it, and others working on the same project can then see them. Only those with write permission to this repository can change it directly. Many people can make changes to this repository; you can pull their changes into your local repository and push your own to it. A single local repository can be connected to multiple remote repositories. Remote repositories are referred to by names such as origin and upstream.
  • Fork: This is how you make a copy of a project owned by someone else, whether a person or an organisation. Only the owner of a repository (and collaborators with write permission) can change the project directly, so you fork it to get a copy that you own.
  • Clone: You have the project in your account, now what? A clone is just that, a copy, and it does not care about ownership. Cloning brings a copy of the project hosted on GitHub (or any other hosting service) onto your machine. This is where you make your changes and later update your remote project.
  • Commit: This is a checkpoint in your project history. Every commit is recorded in the Git log along with the message provided by the user. After you add, modify or remove files, you make a commit to save these changes in the history.
  • Push: This is how you send the changes made in your local repository to your remote. Your changes stay local until you push them, so pushing is the necessary step to keep the two repositories in sync. Only the changes you have committed (as in the previous definition) are pushed; the rest remain local to your project.
  • Fetch: In very simple terms, this means downloading any updates and changes from a remote repository. It does not include those changes in your own work. It just downloads them.
  • Merge: This means combining updates fetched from a remote repository with your local ones. This may lead to merge conflicts if a change in the remote is incompatible with a change in your local project.
  • Pull: This means fetching any updates that have occurred in a remote branch and merging them into your local repository. It is basically a fetch followed by a merge.
  • Pull Request (PR): A Pull Request, or PR as it is generally known, is a way to contribute bug fixes or new features to openly developed projects. It asks the owner of the project to include changes made in an external/forked repository.
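
The difference between Fetch, Merge and Pull is easier to see in action. Here is a minimal sketch, assuming a local bare repository can stand in for one hosted on GitHub. The folder names (mine, teammate), the file name and the demo identity are all hypothetical.

```shell
set -eu
# Demo identity so commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
remote="$work/remote.git"   # local stand-in for a repository hosted on GitHub

# Seed the remote with one commit, then clone it twice.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "first version" > "$work/seed/file.txt"
git -C "$work/seed" add file.txt
git -C "$work/seed" commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$remote"
git clone -q "$remote" "$work/mine"
git clone -q "$remote" "$work/teammate"

# A teammate commits and pushes a change.
echo "second version, from a teammate" > "$work/teammate/file.txt"
git -C "$work/teammate" commit -q -am "Update file"
git -C "$work/teammate" push -q origin master

cd "$work/mine"
before="$(git rev-parse HEAD)"
git fetch -q origin                          # Fetch: download only
after_fetch="$(git rev-parse HEAD)"          # your branch has not moved
remote_tip="$(git rev-parse origin/master)"  # but Git now knows the remote's latest commit
git merge -q origin/master                   # Merge: bring the download into your branch
after_merge="$(git rev-parse HEAD)"

echo "before fetch: $before"
echo "after fetch:  $after_fetch"
echo "after merge:  $after_merge"
```

Running $ git pull origin master in place of the fetch and the merge would leave the repository in the same final state.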

Setup

To set up your user identity from the terminal:

$ git config --global user.name "<name>"
$ git config --global user.email "<email>"

This is the name and email that your commits, and therefore your PRs, will show.
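
If you want to try the setup without touching your real configuration, the sketch below uses a throwaway repository and the --local scope instead of --global, then reads the values back. The name and email are hypothetical placeholders.

```shell
set -eu
# Throwaway repository so the demo does not touch your real ~/.gitconfig.
repo="$(mktemp -d)"
git init -q "$repo"

# Same keys as above, but scoped to this one repository (--local)
# instead of every repository on the machine (--global).
git -C "$repo" config --local user.name "Demo User"          # hypothetical name
git -C "$repo" config --local user.email "demo@example.com"  # hypothetical email

# Read the values back to confirm what Git will record in new commits here.
configured_name="$(git -C "$repo" config user.name)"
configured_email="$(git -C "$repo" config user.email)"
echo "name:  $configured_name"
echo "email: $configured_email"
```

A local setting overrides the global one inside that repository, which is handy when one project needs a different email.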

First Project

  1. Go to GitHub, create an account, and then create a new repository using ‘+’ on the GitHub homepage. Give the repository a name that is unique within your account.
  2. On your local machine, create a new project folder or choose an existing one, and change into it using ‘cd’.
  3. Initialize the git repository:
    $ git init
    This command adds a hidden folder named '.git' to your project. It stores your project history and is managed by git itself.
  4. Let's check the status of our repo:
    $ git status
    Files shown in red are untracked or have unstaged changes; files shown in green are staged and will be included in the next commit.
  5. Let's add some files:
    $ git add <file_name>
    The named file moves from the red list to the green one.
    Note: to add all files use: $ git add .
  6. Commit all the staged changes:
    $ git commit -m "<commit_message>"
    Make 'commit_message' a description of the task you completed in the commit.
  7. Add the origin of your local repo, that is, set up its remote:
    $ git remote add origin <remote_url>
    'remote_url' is the URL of the project you created on GitHub in step 1.
  8. Push your changes (commits) to GitHub:
    $ git push -u origin master
    '-u' makes your local branch track the remote one, 'origin' is the name of your remote and 'master' is the branch you are working on (newer repositories often name it 'main'). You need the full command only once; afterwards you can simply use $ git push
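
Steps 3 to 8 can be rehearsed offline. This is a rough sketch, assuming a local bare repository plays the role of the GitHub repository from step 1. The paths, the file name and the demo identity are hypothetical.

```shell
set -eu
# Demo identity so the commit works on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
remote="$work/remote.git"   # stand-in for the GitHub repository from step 1
project="$work/project"     # the local project folder from step 2

git init -q --bare "$remote"
mkdir "$project"
cd "$project"

git init -q                                  # step 3
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is called master
echo "hello" > README.txt
git status --short                           # step 4: "?? README.txt" means untracked
git add README.txt                           # step 5
git commit -q -m "Add README"                # step 6
git remote add origin "$remote"              # step 7
git push -q -u origin master                 # step 8

# After -u, the local branch remembers which remote branch it tracks.
tracking="$(git rev-parse --abbrev-ref 'master@{upstream}')"
remote_commits="$(git --git-dir="$remote" rev-list --count master)"
echo "master tracks $tracking; remote has $remote_commits commit(s)"
```

With a real GitHub repository, only the value of remote_url changes; the commands stay the same.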

Working on an already present repo

  • Cloning a repo: Go to GitHub and copy the URL of the repo you want. Then clone the repo onto your machine:
    $ git clone <url_of_repo>
  • Use the following command to push for the first time:
    $ git push -u origin master

Making changes to a repo that you own or collaborate on

  1. Make changes in your project and then stash them. ‘stash’ is a stack provided by git for setting aside uncommitted changes.
    $ git stash
  2. Pull changes from the remote repo, that is, synchronize your local repo with the remote repo
    $ git pull origin master
  3. Get your changes back from the stack
    $ git stash pop
  4. Add your changes to the staging area:
    $ git add .
  5. Make a commit of your changes
    $ git commit -m "<commit_message>"
  6. Push your changes to the remote repo
    $ git push origin master
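
The six steps above can be tried end to end without a network. This is a sketch under a few assumptions: a local bare repository stands in for the shared remote, and a second clone named teammate plays the collaborator. All names are hypothetical.

```shell
set -eu
# Demo identity so commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
remote="$work/remote.git"   # stand-in for the shared repository

# Seed the remote with one commit, then clone it twice:
# "mine" is your copy, "teammate" is a collaborator's.
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "line 1" > "$work/seed/notes.txt"
git -C "$work/seed" add notes.txt
git -C "$work/seed" commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$remote"
git clone -q "$remote" "$work/mine"
git clone -q "$remote" "$work/teammate"

# The teammate pushes a new file while you are still editing.
echo "from teammate" > "$work/teammate/other.txt"
git -C "$work/teammate" add other.txt
git -C "$work/teammate" commit -q -m "Add other.txt"
git -C "$work/teammate" push -q origin master

cd "$work/mine"
echo "my edit" >> notes.txt           # an uncommitted local change
git stash -q                          # step 1: set the change aside
git pull -q origin master             # step 2: fast-forward to the teammate's commit
git stash pop -q                      # step 3: bring the change back
git add .                             # step 4: stage it
git commit -q -m "Update notes"       # step 5
git push -q origin master             # step 6

remote_commits="$(git --git-dir="$remote" rev-list --count master)"
echo "remote now has $remote_commits commits"
```

Here the stashed edit and the teammate's commit touch different files, so the pop applies cleanly. If both had changed the same lines, the pop would report a conflict to resolve by hand.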

Forking a Project and Making Changes

To work on an existing project that you do not own, follow these steps:

  • Fork the project from its repository page. This redirects you to your fork, which is a copy of the original project that you own.
  • Copy the HTTPS link of your fork, then open a terminal or command prompt in the location where you want the local repository. Run

$ git clone <link to fork>

This gives you a local copy of the project to work with.

When a repository is cloned, it has a default remote called origin that points to your fork on GitHub, not the original repository it was forked from. To keep track of the original repository, you should add another remote named upstream:

  1. Open a terminal or git bash in your local repository and type:
    $ git remote add upstream <url of original project>
  2. Run git remote -v to check the remotes. You should see something like the following:
    origin https://github.com/YOUR_USERNAME/project-name.git (fetch)
    origin https://github.com/YOUR_USERNAME/project-name.git (push)
    upstream https://github.com/OWNER_USERNAME/project-name.git (fetch)
    upstream https://github.com/OWNER_USERNAME/project-name.git (push)
  3. To update your local copy with remote changes, run the following:
    $ git fetch upstream
    $ git merge upstream/master
    As you may have guessed already, a quicker way to do the same is $ git pull upstream master, because a pull is essentially a fetch followed by a merge.
    This gives you an exact copy of the current upstream branch, provided you have no local changes of your own.
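
To see origin and upstream side by side, here is a small offline sketch. It assumes two local bare repositories can stand in for the owner's project and your fork on GitHub. The paths and the demo identity are hypothetical.

```shell
set -eu
# Demo identity so commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
original="$work/original.git"   # stand-in for the owner's repository
fork="$work/fork.git"           # stand-in for your fork on GitHub

git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "first version" > "$work/seed/file.txt"
git -C "$work/seed" add file.txt
git -C "$work/seed" commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$original"
git clone -q --bare "$original" "$fork"      # forking is essentially a server-side copy

git clone -q "$fork" "$work/local"           # origin now points at the fork
cd "$work/local"
git remote add upstream "$original"          # step 1
git remote -v                                # step 2

# Someone lands a new commit in the original project.
echo "second version, from the owner" > "$work/seed/file.txt"
git -C "$work/seed" commit -q -am "Update file"
git -C "$work/seed" push -q "$original" master

git fetch -q upstream                        # step 3: download only
git merge -q upstream/master                 # step 3: integrate (a fast-forward here)

local_head="$(git rev-parse HEAD)"
upstream_head="$(git rev-parse upstream/master)"
echo "local:    $local_head"
echo "upstream: $upstream_head"
```

Note that the fork itself is still behind after this; a $ git push origin master would bring it up to date as well.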

Branching and Pull Requests

Nearly every VCS has some form of branching support. Branching means you diverge from the main line of development and continue to do work without messing with that main line. In many VCS tools, this is a somewhat expensive process, often requiring you to create a new copy of your source code directory, which can take a long time for large projects.

Some people refer to Git’s branching model as its “killer feature”, and it certainly sets Git apart in the VCS community. Why is it so special? The way Git branches is incredibly lightweight, making branching operations nearly instantaneous, and switching back and forth between branches generally just as fast. Unlike many other VCSs, Git encourages workflows that branch and merge often, even multiple times in a day. Understanding and mastering this feature gives you a powerful and unique tool and can entirely change the way that you develop.

Branches exist in git to enable you to work on different features simultaneously without them interfering with each other and also to preserve the master branch. Usually the master branch of your project should be kept clean and no feature should be developed directly in the master branch. Follow the following steps to create branches and be able to sync them:

  1. Make sure you are on the master branch: $ git checkout master
    To switch to a different branch, give its name instead of 'master'.
  2. Sync your copy with the remote copy: $ git pull
    Use '$ git pull remote_name branch_name' to pull the branch branch_name from the remote remote_name.
  3. Create a new branch with a meaningful name and switch to it: $ git checkout -b branch_name
    Use '$ git branch branch_name' to create the branch without switching to it.
  4. Add the files you changed: $ git add file_name (avoid using '$ git add .', which stages everything, including files you did not mean to include). This is called "staging" the files.
  5. Commit your changes: $ git commit -m "Message briefly explaining the feature"
  6. Push to your repo: $ git push origin branch_name

This pushes the changes you made to your fork on GitHub under the branch name you gave.

To have the owner of the original project review your changes, create a Pull Request explaining the changes you made. If the owner finds them satisfactory, they will be merged into the original project.
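
Here is a sketch of the branch workflow above, run against a local bare repository that stands in for your fork. The branch name add-greeting, the file names and the demo identity are hypothetical. It also shows why step 4 stages one named file instead of everything.

```shell
set -eu
# Demo identity so commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
fork="$work/fork.git"       # stand-in for your fork on GitHub

git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "demo project" > "$work/seed/README.txt"
git -C "$work/seed" add README.txt
git -C "$work/seed" commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$fork"
git clone -q "$fork" "$work/local"
cd "$work/local"

git checkout -q master                   # step 1
git pull -q                              # step 2
git checkout -q -b add-greeting          # step 3
echo "hello" > greeting.txt
echo "scratch" > scratch.tmp             # an unrelated file that should stay out of the commit
git add greeting.txt                     # step 4: stage only the intended file
git commit -q -m "Add greeting file"     # step 5
git push -q origin add-greeting          # step 6

# List the files the pushed branch contains on the remote side.
pushed_files="$(git --git-dir="$fork" ls-tree -r --name-only add-greeting)"
echo "$pushed_files"
```

scratch.tmp never reaches the remote because it was never staged; with $ git add . it would have been committed and pushed along with the feature.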

Common Branch Commands:

  • $ git branch # lists all the branches in your local repository
  • $ git branch -a # lists all local branches and remote-tracking branches
  • $ git branch -d the_local_branch # deletes the local branch with the given name. Git refuses the delete if the branch has unmerged commits. If you also wish to delete the remote branch of the same name afterwards, use the next command.
  • $ git push origin :the_remote_branch # deletes the branch on the remote, so be careful while using this
  • $ git branch -D branch_name # force deletes the specified branch, even if it has unmerged changes. This is the command to use if you want to permanently throw away all of the commits associated with a particular line of development.
  • $ git checkout -b branch_name origin/branch_name # creates a local branch from the remote branch origin/branch_name and switches to it
  • $ git branch --merged # lists branches already merged into the current branch
  • $ git branch --no-merged # lists branches not yet merged into the current branch
  • $ git branch -v # lists all branches with their latest commit
  • $ git merge branch_name # merges branch_name into the branch you are currently on (check out master first to merge into master)
  • $ git branch -m new_branch_name # renames the current branch
  • $ git mergetool # launches a merge tool to help resolve merge conflicts
  • $ git rebase branch_name_to_rebase_with # reapplies the commits of the current branch on top of the given branch
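
The difference between -d and -D is worth seeing once. This is a small sketch in a throwaway repository; the branch names merged-work and unmerged-work and the demo identity are hypothetical.

```shell
set -eu
# Demo identity so commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git branch merged-work                  # points at the same commit as master
git checkout -q -b unmerged-work        # gets a commit that master does not have
echo "experiment" >> file.txt
git commit -q -am "Experiment"
git checkout -q master

git branch --merged                     # lists master and merged-work
git branch --no-merged                  # lists unmerged-work

git branch -q -d merged-work            # the safe delete succeeds
if git branch -d unmerged-work 2>/dev/null; then
    safe_delete="allowed"
else
    safe_delete="refused"               # -d protects commits that are not merged
fi
git branch -q -D unmerged-work          # the force delete discards them

remaining="$(git for-each-ref --format='%(refname:short)' refs/heads)"
echo "safe delete of unmerged-work was $safe_delete; remaining branches: $remaining"
```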

Squashing Commits

It is often required when contributing that your entire feature arrives as a single commit. This is where squashing comes in. Be aware of which kind of commits you are trying to squash. There are two:

  • Commits in your local repository
  • Commits you have already pushed to remote

First, run git log to see your recent commit history; this will help you decide which commits to squash. If, for example, you wish to squash the last two commits into a single one, run:

$ git checkout my_branch # to make sure you are on the required branch

$ git reset --soft HEAD~2 # where 2 is the number of commits to be squashed; replace it as needed

$ git commit -m "New commit message to represent the squashed commits"

The reset and commit above squash the chosen number of commits locally. If you had already pushed the commits to your remote repository, run the following command to make your remote reflect the change:

$ git push --force origin my_branch

This is a forced update: it overwrites the branch on the remote so that it also holds the single squashed commit. Avoid it on branches that other people are working on.
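
The squash can be checked by counting commits before and after. This is a minimal sketch, assuming a local bare repository stands in for your remote. The file name, commit messages and demo identity are hypothetical.

```shell
set -eu
# Demo identity so commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
remote="$work/remote.git"   # stand-in for your remote repository
git init -q --bare "$remote"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/my_branch
git remote add origin "$remote"

# Three commits: a base commit plus two small ones we will squash.
for part in base one two; do
    echo "$part" >> feature.txt
    git add feature.txt
    git commit -q -m "Add part $part"
done
git push -q origin my_branch              # all three commits are already on the remote

before="$(git rev-list --count HEAD)"
git reset --soft HEAD~2                   # undo the last two commits, keep their changes staged
git commit -q -m "Add parts one and two"  # one commit replaces the two
after="$(git rev-list --count HEAD)"

git push -q --force origin my_branch      # overwrite the remote branch with the new history
remote_after="$(git --git-dir="$remote" rev-list --count my_branch)"
echo "commits before: $before, after: $after, on remote: $remote_after"
```

The content of feature.txt is the same before and after the squash; only the history changes.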

Fixing PR when other changes get Merged leading to Conflicts

This continues from the requirement of having a single commit in your PR. Suppose you did some work and submitted a PR to the main repository, but before your request could be merged, other developers changed the repository and now your PR has merge conflicts. You could fetch their work, resolve the conflicts and push again, but that produces a merge commit, which is undesirable at times. The flow below avoids it.

This assumes the following:

  • You have submitted a PR for a feature
  • Changes have been made to the remote before your PR got merged, and now there are merge conflicts between your PR and the remote

The steps to follow:

  • Check out the branch that holds your feature:
    $ git checkout my_branch
  • Fetch changes from the remote:
    $ git fetch upstream
  • Stash local changes (if any); you can restore them afterwards with $ git stash pop:
    $ git stash
  • Rebase onto the latest branch in upstream (master in our case):
    $ git rebase upstream/master
  • If the rebase stops with conflicts, fix them in the affected files and then mark each file as resolved:
    $ git add file_name
    If you do not want your own changes to a conflicted file, first run $ git checkout --ours -- file_name (during a rebase, --ours is the branch you are rebasing onto), then add the file.
  • After the conflicts have been fixed, run:
    $ git rebase --continue
    The remote changes are now in place and your work has been applied on top of them.
  • Force push to your branch to update the PR:
    $ git push --force origin my_branch
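
The whole flow, including a real conflict, can be rehearsed locally. This is a rough sketch under stated assumptions: two local bare repositories stand in for the original project and your fork, the file greeting.txt and its contents are hypothetical, and GIT_EDITOR=true is set only so that the rebase does not open an editor in a script.

```shell
set -eu
# Demo identity so commits work on any machine (hypothetical values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

work="$(mktemp -d)"
original="$work/original.git"   # stand-in for the main repository
fork="$work/fork.git"           # stand-in for your fork

git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/master
echo "hello" > "$work/seed/greeting.txt"
git -C "$work/seed" add greeting.txt
git -C "$work/seed" commit -q -m "Initial commit"
git clone -q --bare "$work/seed" "$original"
git clone -q --bare "$original" "$fork"

git clone -q "$fork" "$work/local"
cd "$work/local"
git remote add upstream "$original"

# Your PR branch edits greeting.txt and is pushed to the fork.
git checkout -q -b my_branch
echo "hello from my feature" > greeting.txt
git commit -q -am "Personalise greeting"
git push -q origin my_branch

# Meanwhile the main repository changes the same line.
echo "hello from upstream" > "$work/seed/greeting.txt"
git -C "$work/seed" commit -q -am "Change greeting upstream"
git -C "$work/seed" push -q "$original" master

git fetch -q upstream
if ! git rebase upstream/master >/dev/null 2>&1; then
    # The rebase stopped on a conflict: write the resolved content, then continue.
    echo "hello from upstream and my feature" > greeting.txt
    git add greeting.txt
    GIT_EDITOR=true git rebase --continue >/dev/null 2>&1
fi
git push -q --force origin my_branch

merge_commits="$(git rev-list --count --merges upstream/master..HEAD)"
ahead="$(git rev-list --count upstream/master..HEAD)"
local_tip="$(git rev-parse HEAD)"
fork_tip="$(git --git-dir="$fork" rev-parse my_branch)"
echo "merge commits: $merge_commits, commits ahead of upstream: $ahead"
```

The branch ends up exactly one commit ahead of upstream/master with no merge commit, which is what a single-commit PR needs.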

Megha Aggarwal

Software Engineer, Googler | Ex-Microsoft | Ex-Director, WWC-Hyderabad