“GitHub” fundamentals (Clone, Fetch, Push, Pull, Fork, Open Source)
In the ‘Git Fundamentals’ article, we talked about how Git was invented, its architecture and the basic commands to get started. This follow-up article covers the “GitHub fundamentals”: how GitHub integrates with Git to complete Git’s remote repository workflow. We’ll also see what other features GitHub offers to make code sharing and collaboration a pleasant experience.
To reiterate my point and dispel a common myth: Git and GitHub have separate identities of their own. Git is a version control system. GitHub is a code-sharing platform. They work hand in hand to provide an integrated platform, so it is easy to confuse them as one. Think of GitHub as the social network for programmers, where they can share their work and reach out to collaborators and team members.
Remote Repository
Coming back to the Git Architecture we saw in the last article, we didn’t talk much about the remote repository piece. Well, Github is, in most cases, the remote repository that is being used by individuals or organizations. The local repository and remote repository will need to be brought in sync with each other from time to time to let other team members get the latest code you wrote or vice versa.
Getting Started
To put your code on the remote repository, you first need to go to Github.com and create an account. Then create a repository there. Give it a name, and keep it public or private, whichever suits you. As soon as the repository is created, in the “Clone and Download” section, GitHub provides one Web URL and one SSH address for you to link with Git.
Web URL
Typically, you would connect Git with GitHub using the Web URL, which goes over the normal HTTPS route. Git will prompt you for your GitHub credentials so that it can connect to your GitHub repository. Note that GitHub now expects a personal access token at this prompt rather than your account password.
SSH
If you are hesitant to type your GitHub credentials into Git, SSH is the alternative. You would need to create an SSH key pair using the ssh-keygen tool, then add the SSH public key to your GitHub account. Once the setup is done, you can use the SSH address to link to Git.
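Here is a minimal sketch of the key-generation step, assuming OpenSSH's ssh-keygen is installed. The email address and the throwaway directory are placeholders made up for this demo, and the empty passphrase is only there so the command does not stop to prompt. A real key would normally live in ~/.ssh and have a passphrase.

```shell
set -eu
keydir=$(mktemp -d)   # throwaway directory for the demo (assumption, not a convention)

# Generate an Ed25519 key pair.
#   -C adds a comment to the key (a placeholder email here)
#   -f sets the output file
#   -N "" sets an empty passphrase so the demo does not prompt
ssh-keygen -q -t ed25519 -C "you@example.com" -f "$keydir/id_ed25519" -N ""

# The .pub file is the public half. Its contents are what you paste into
# the SSH keys page of your GitHub account settings.
cat "$keydir/id_ed25519.pub"
```

Once the public key is saved in your GitHub account, running ssh -T git@github.com is a quick way to check that GitHub recognizes it.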
Linking Git with Github (Remote and Clone)
Moving on, now that you have your remote repository set up, you are ready to link Git’s local repository with GitHub’s remote one. There are two cases that can arise:
Local Repository Exists (Remote Keyword Flow):
If you already have a local repository, you can connect it with GitHub using the git remote command.
This command lets you do the following things:
- Add a remote repository name and its corresponding URL (web address or SSH address).
- Rename a remote repository.
- List all the remote repositories.
- Delete a remote repository entry (this only removes the link on your machine; the repository on GitHub is not deleted).
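The four operations above can be sketched as follows. The URL is a placeholder rather than a real repository; none of these commands contact the network, so it is safe to try them as they are.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q project
cd project

# Add: register a remote under the name "origin".
# Placeholder URL: substitute the Web URL or SSH address GitHub shows you.
git remote add origin https://github.com/your-user/your-repo.git

# List: show every remote with its fetch and push URLs.
git remote -v

# Rename: change the label from "origin" to "github".
git remote rename origin github

# Delete: remove the entry from this local repository.
git remote remove github
```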
Local Repository doesn’t Exist (Clone Keyword Flow):
If the local repository doesn’t exist, you can simply use the clone command, which copies the remote repository to your machine and creates a local repository and working directory for you.
For those from an SVN background, Git cloning is similar to SVN checkout, which brings the code from the central repo to the local machine. The difference is that a clone also brings the full history along.
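A minimal sketch of the clone flow. To keep it runnable offline, a bare repository on disk stands in for the GitHub repository; the names seed, remote.git and my-project are made up for the demo. With GitHub you would pass the Web URL or SSH address to git clone instead of a local path.

```shell
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Setup: build a stand-in "remote" that already holds one commit.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main   # name the first branch "main"
echo "hello" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git

# The actual step: clone creates the local repository and working directory.
git clone -q remote.git my-project

ls my-project                        # the checked-out working directory
git -C my-project log --oneline      # the history came along too
```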
“Origin” Remote
You would come across the word ‘Origin’ a lot when you work with GitHub. Origin is the default ‘name’ or ‘label’ that Git gives to the remote repo when you clone, and it points to the GitHub repo. With the Remote command you pick the name yourself, but almost everyone uses ‘Origin’ there too, by convention.
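To see that the name is only a default label, here is a small sketch that clones the same repository twice, once normally and once with the --origin option. An empty bare repository on disk stands in for GitHub, and the directory names are made up. Git will warn that the cloned repository is empty, which is fine for this demo.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git        # local stand-in for a GitHub repository

# A plain clone labels the remote "origin" automatically.
git clone -q remote.git default-name
git -C default-name remote           # prints: origin

# --origin (short form -o) picks a different label at clone time.
git clone -q --origin github remote.git custom-name
git -C custom-name remote            # prints: github
```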
Decentralized Architecture
I want to stress this term once again at this point because a lot of people think that having GitHub as the remote repository breaks Git's decentralized architecture. Well, that is untrue.
In SVN, every user has to sync their changes to the central repo first; only then can another user sync them back to their own machine.
In Git, any user can sync their changes directly with any other user, provided they can reach each other's repositories over a valid communication channel (HTTPS or SSH). There is no central repository “by design”. GitHub happens to be the central repository “by convention” because it is an easily accessible and popular location to place the code for collaboration.
There is a big difference between what is enforced by design and what is enforced by convention. You are completely free to sync your code directly with another user, bypassing GitHub, and the Git architecture won’t whine back at you.
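A rough sketch of two users syncing with no hosting service in between. Alice and Bob are made-up names, and plain directories on one disk stand in for their two machines; between real machines the address would be an SSH or HTTPS URL rather than a local path.

```shell
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Alice starts a repository and commits.
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/main   # name the first branch "main"
echo "draft" > alice/notes.txt
git -C alice add notes.txt
git -C alice commit -q -m "Alice: first draft"

# Bob clones straight from Alice's repository.
git clone -q alice bob

# Alice adds more work.
echo "revised" >> alice/notes.txt
git -C alice commit -q -am "Alice: revise draft"

# Bob pulls it directly from Alice; no central repository is involved.
git -C bob pull -q origin main
```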
Sending Local code to Remote (Push Command)
Right then, we’ve created a remote repository and linked it to our Git. We are now ready to move our code onto GitHub. Just to note, remote repositories are also called “Upstream” by some developers.
This can be done using the Push command. Think of it as an Upload command. It lets us do the following things:
- Pushing only the commits to any remote repo.
- Pushing the tags to the remote repo.
- Pushing everything (the commits, tags) to any remote repo.
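The three variants above can be sketched like this. A bare repository on disk stands in for the GitHub repo, the branch is called main for the demo, and the file and tag names are made up.

```shell
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git                # local stand-in for the GitHub repo
git init -q project
cd project
git symbolic-ref HEAD refs/heads/main        # name the first branch "main"
git remote add origin "$work/remote.git"

echo "v1" > app.txt
git add app.txt
git commit -q -m "First version"
git tag v1.0

# Commits only: push the main branch. -u records origin/main as its upstream.
git push -q -u origin main

# Tags only: push every local tag.
git push -q origin --tags

# Commits and tags together in one command.
echo "v2" > app.txt
git commit -q -am "Second version"
git tag v2.0
git push -q origin main --tags
```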
You can push changes to any branch on the Upstream, be it Master or any other branch. We’ll be covering the concept of Master, and branching in general, in the Git intermediate article.
Taking Remote code to Local (Fetch and Pull commands)
In a collaborative working environment, from time to time you would need to check if someone has committed any new changes to the project you’re working on, and update your working copy with those changes. This is called “Downstream”.
There are two ways you can perform this task: one is the cautious way (the fetch command), the other is the nonchalant way (the pull command). You can think of both of these as Download commands.
Fetch
This command fetches the latest commits from the remote and places them only in the local repository, on remote-tracking branches such as origin/master. Your own branches, the working directory and the staging area remain unchanged.
This option is for those who first want to check what their fellow developers have changed, by looking at the diff between their own branch and the fetched remote-tracking branch.
Once they are satisfied with the changes, they bring them into their branch and working copy using the merge command.
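The cautious route might look like the sketch below. A bare repository on disk stands in for GitHub, the two clones mine and teammate are made-up names, and the branch is called main here (a Master branch works the same way).

```shell
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Setup: a stand-in remote, your clone and a teammate's clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "line 1" > seed/file.txt
git -C seed add file.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git
git clone -q remote.git mine
git clone -q remote.git teammate

# The teammate publishes a new commit.
echo "line 2" >> teammate/file.txt
git -C teammate commit -q -am "Teammate: add line 2"
git -C teammate push -q origin main

cd mine
# Fetch downloads the commit into origin/main; file.txt on disk is untouched.
git fetch -q origin
cat file.txt

# Review what came in before accepting it.
git log --oneline main..origin/main
git diff main origin/main

# Satisfied: merge the fetched commits into the current branch.
git merge -q origin/main
```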
Pull
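This command fetches the latest commits from the remote and immediately merges them into your current branch, so all Three Trees of Git (the local repository, the staging area and the working directory) are updated in one go.
If your local changes clash with the incoming ones, the Pull will stop with a merge conflict, which you would need to resolve on your own.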
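Here is a sketch of a pull that runs into a conflict and how it might be resolved. The setup is a local stand-in for GitHub with two made-up clones, and the resolution shown (keeping the teammate's value) is just one arbitrary choice. The --no-rebase flag asks explicitly for a merge, which newer Git versions require when the two histories have diverged and no default is configured.

```shell
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Setup: a stand-in remote, your clone and a teammate's clone.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "color = blue" > seed/config.txt
git -C seed add config.txt
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed remote.git
git clone -q remote.git mine
git clone -q remote.git teammate

# The teammate changes the line and pushes.
echo "color = green" > teammate/config.txt
git -C teammate commit -q -am "Teammate: green"
git -C teammate push -q origin main

# You change the same line locally and commit.
cd mine
echo "color = red" > config.txt
git commit -q -am "Me: red"

# Pull = fetch + merge. Both sides edited the same line, so the merge stops.
if ! git pull -q --no-rebase origin main; then
    git status --short               # "UU config.txt" marks the conflicted file
    # Resolve by hand: here we simply keep the teammate's value.
    echo "color = green" > config.txt
    git add config.txt
    git commit -q -m "Merge origin/main, keep green"
fi
```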
Open Source Projects
There are tons of open-source projects published on GitHub to which you can contribute. You can clone any of the open-source repositories from the wild and start making code changes in them. GitHub allows us to clone any public repository listed on its platform.
Pull Requests
Don’t you think it would be scary if any random person could clone your repository, start making changes to it, and push them back? Cloning is open to everyone, but pushing is not: only people with write access can push, and for everyone else GitHub has introduced a mechanism called “Pull Requests”. You can’t send any code changes upstream without the repository owner’s knowledge and permission.
But hang on, why is it called “Pull request”? I want to “Push” my changes to the open-source project, not pull them down, right?
This is because GitHub thinks you are NOT the Subject in this workflow but merely a worthless Object. You are requesting the repository owner to pull the changes from your copy of the repository.
For the repository owner to see those changes and make this decision, you would need to post them on GitHub. Surely, the owner is not going to look at your local repository or sync with it through some other medium. This is where Forks come in.
Fork
Github allows you to create a Fork from any open source project. This will create a replica of that project in your account, and add a reference to the original owner’s repository.
Please note that the fork exists in your own Github account. You can do anything with it. Clone it locally. Push changes to it. Mold it as you will. The original owner is not concerned with it.
However, if you want to let the original owner know about these changes, and request him to “pull” them down (incorporate them) to his original repository, you would have to send the pull request to him from your forked repository.
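If you don’t have write access to the original repository, the pull request can only be sent from the forked repository. (Collaborators who do have write access can also open pull requests from a branch inside the same repository.)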
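On the Git side, the fork workflow might look like this sketch. Two bare repositories on disk stand in for the original project (upstream.git) and your fork (fork.git); those names, the fix-typo branch and the remote label upstream are illustrative choices, not something GitHub sets up for you. The pull request itself is opened on the GitHub website, not from Git.

```shell
set -eu
# Throwaway identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
work=$(mktemp -d)
cd "$work"

# Setup: the original project, then a copy of it playing the role of your fork.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "# Project" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git    # on GitHub, the Fork button does this

# Clone your fork (it becomes "origin") and add the original as "upstream".
git clone -q fork.git my-fork
cd my-fork
git remote add upstream "$work/upstream.git"

# Do the work on a topic branch and push it to your fork, not to the original.
git checkout -q -b fix-typo
echo "Fixed wording." >> README.md
git commit -q -am "Fix typo in README"
git push -q -u origin fix-typo

# Later, fetch from the original project to keep up with its changes.
git fetch -q upstream
```

With the branch sitting in your fork, the pull request asks the original owner to pull fix-typo into their repository.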
Well, that’s it, folks.
Git and GitHub have become an integral part of the working lives of almost 80% of software developers and will continue to impact the generations to come.
To build on this knowledge check out the follow-up article,