Lesson 1 of Git and GitHub
An Intro to Git and GitHub for Beginners
To finally understand what the heck those engineers are talking about!
Engineer A: “Hey, have you merged your branch to develop yet?”
Engineer B: “No, I’m waiting for my previous PR to be merged so I can proceed. But I’ve already staged all my work and pushed.”
Engineer A: “Ok, I’ll review and merge it later. Don’t forget to rebase!”
Engineer B: “Thanks, I almost forgot that. In the meantime, I’ll work on another branch.”
You: “Huh? Were you speaking English? So what’s the progress now??”
Does it sound familiar? This is exactly what happened to me when I first met these high-tech-sounding terms. I genuinely thought PR referred to public relations, which turned the whole conversation into nonsense.
This story aims to help you avoid those awkward moments, so you won’t look as dumb as I did. Let’s crack on!
What is the problem Git is trying to solve?
To understand something thoroughly, we first need to know why it exists and what drove people to develop it.
Every kind of project, be it web development, dashboard design, or data analysis, revolves around collaboration, i.e. several people work on it at once. We want to keep an eye on everyone’s changes to minimise the chance of messing up the system or the work we’re developing. If someone is going to change something, we don’t want them to change it directly. Ideally, they should make a copy of the whole system and work on that, so their work is isolated, and someone else should review it before the changes are applied to the system permanently.
However meticulous the process is, people are still likely to make mistakes. People make mistakes! People forget things! That’s what makes us human! If something bad really happens and the whole system is messed up, we need a time machine that can take us back to the good old days when everything was fine. This is where a version control system comes in.
Version Control System
Remember our old friend Super Mario? Whenever we finally passed a level, the first thing to do was always ‘SAVE!’, so that when we ran out of lives, we didn’t need to start again from level 1.
A version control system does much the same thing. It keeps track of each stage (version) of the work in progress, so if something goes wrong, we can easily revert to whichever previous stage we want. For example, let’s say the initial stage of a project is version 1. Someone makes some changes and creates version 2. One day you find that those changes are faulty and are crashing the system, so you revert to version 1. Keeping track of each version of our project, from the beginning to the very end, makes it easier to recover from disasters and to synchronise everyone’s work smoothly.
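To make the save-point idea concrete, here is a minimal sketch using Git itself. Everything in it is made up for illustration: the temporary folder, the file name notes.txt, and the placeholder user name and email. It records version 1, then a faulty version 2, and finally undoes version 2 with a new commit, which is one of several ways to go back in Git.

```shell
set -e  # stop at the first failing command

# Hypothetical throwaway project in a temporary directory.
project=$(mktemp -d)
cd "$project"
git init -q
git config user.name "Example User"        # placeholder identity, this repo only
git config user.email "user@example.com"

# Version 1: the state where everything works.
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Version 1"

# Version 2: a change that turns out to be faulty.
echo "version 2 (faulty)" > notes.txt
git commit -q -a -m "Version 2"

# Go back: add a new commit that undoes version 2.
git revert --no-edit HEAD > /dev/null

cat notes.txt      # the file holds the version 1 content again
git log --oneline  # all three save points are still in the history
```

Note that the faulty version is not erased. The history keeps every save point, including the one that undid the mistake.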
Git is one of many version control systems, and it is the most widely used one. Other version control systems include BitKeeper, CVS, Subversion, and SVK.
What’s the difference between Git and GitHub?
The first thing to know is that Git and GitHub are two completely different things. Git is a version control tool that lets us track and maintain our work in progress.
GitHub, on the other hand, is a cloud-based service that hosts Git repositories. It lets you store your source code, which has to be tracked with Git, in the cloud. The purpose is to make sharing work with others effortless. You don’t need a GitHub account to use Git, but you do need Git to use GitHub.
There are alternatives such as GitLab and Bitbucket, but GitHub is the largest. When we are working in a repository, the copy hosted on one of these cloud services is referred to as the ‘remote’.
Before diving into Git, there are two more fundamental terms we should understand first: ‘repository’ and ‘branch’.
Repository/Repo
A repository is where we store all of the files that relate to the project. You can think of it as a folder that contains all of the project’s code, instructions, introductions, data, and so on.
Branch
Branches are one of the core concepts of Git. I like to think of them as parallel universes. The default branch in Git has traditionally been named ‘master’ (newer repositories, including those created on GitHub, often use ‘main’ instead). It usually holds deployable code and serves as the production branch, although that is not always the case. In this master universe, the code is ready to be shown to the world. This is where the public lives.
Next to the master universe, there is a ‘develop’ universe, where all the developers coordinate their work, either adding new functionality or fixing bugs. The develop universe is synchronised with the master universe once in a while, so that the latest good-to-go developments are deployed to master. This is the universe where all workers live together.
All workers can create their own universe. If you want to build a skyscraper, you have to create your own universe first and then build it there. Once you’re done, you need someone else’s approval to move, or ‘merge’, your skyscraper into the develop universe.
The purpose of these parallel universes is simple: to isolate them from each other. If one universe is destroyed by Thanos, we still have the others. Or we can always use the Time Stone to go back. If you don’t know who Thanos is, we’re probably in different universes.
Git Workflow
Now we can finally get to the juicy part, the Git workflow. A simplified version looks like this:
- Clone the repository from ‘remote’
- Create your own branch
- Work on your own branch, make changes
- Commit your changes
- Pull the latest develop branch, then rebase onto it (or merge it)
- Push your commit to remote
- Create Pull Request
- Merge to develop
Cloning is like making a copy of the project from the cloud on our local machine. You can also think of it as downloading all the files we need. When we clone the repository, Git automatically creates a local Git repository on our machine as well. You can verify this by looking for a hidden folder named ‘.git’ inside the folder you just cloned.
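Here is a small sketch of the clone step that runs entirely on one machine. As an assumption for the demo, a local folder called team-project plays the role of the remote; the folder names, the README.md file, and the placeholder identity are all hypothetical.

```shell
set -e  # stop at the first failing command

work=$(mktemp -d)
cd "$work"

# Setup (hypothetical): a tiny repository that plays the role of the remote.
git init -q team-project
cd team-project
git checkout -q -b develop
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# Team project" > README.md
git add README.md
git commit -q -m "Initial commit"
cd "$work"

# The actual step: clone the 'remote' into a new folder called my-copy.
git clone -q team-project my-copy

ls -a my-copy              # lists README.md plus the hidden .git folder
git -C my-copy remote -v   # 'origin' points back to where we cloned from
git -C my-copy branch      # we start on the develop branch
```

With a project hosted on GitHub, you would pass the repository’s URL to git clone instead of a local folder name.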
Now you have the develop branch on your local machine. Next, you need to create a branch to work on. Let’s call it ‘feature’. No matter what changes we make in this branch, even if we delete every file, the develop branch is not affected at all as long as this branch is not merged. Note that we are doing this on our own local machine, so no ‘feature’ branch appears on the remote yet. The point where we branch off from the develop branch is called the ‘base’ of the branch.
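The isolation is easy to try out. The sketch below is an illustration under made-up names (a temporary folder, a file called app.txt, a placeholder identity): it creates ‘feature’ from develop, deletes the only file there, and then switches back to develop to show that nothing was lost.

```shell
set -e  # stop at the first failing command

repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b develop                 # name the starting branch 'develop'
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "hello" > app.txt
git add app.txt
git commit -q -m "Initial commit on develop"

# Create the 'feature' branch from develop and switch to it.
git checkout -q -b feature

# Do something drastic on feature: delete the file and commit that.
git rm -q app.txt
git commit -q -m "Delete everything on feature"

# Back in the develop universe, the file is untouched.
git checkout -q develop
ls        # app.txt is still here
```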
Commit is the step that saves the changes we have made to the branch. Merely saving the files only changes the files on disk; the ‘feature’ branch itself is not changed. To record those changes on the branch, you first stage them (tell Git which changes to include) and then commit.
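A quick sketch of the difference between saving a file, staging it, and committing it. The temporary folder, the file report.txt, and the identity are placeholders invented for the example.

```shell
set -e  # stop at the first failing command

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Saving a file only changes the folder on disk. Git just notices a new file.
echo "first draft" > report.txt
git status --short     # ?? report.txt   (untracked)

# Stage it: tell Git this change belongs in the next commit.
git add report.txt
git status --short     # A  report.txt   (staged)

# Commit it: now the change is recorded on the branch.
git commit -q -m "Add report"
git status --short     # prints nothing, everything is committed
git log --oneline
```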
Now everything seems fine and we’re ready to ask someone else to apply our changes to develop. We normally don’t do it ourselves, otherwise there would be no point in branching and isolating our work. The other thing to consider is that while we were working on our branch, the develop branch might have changed as well. To preserve consistency, we need to make sure our base is the latest develop, not the older version we branched off from.
Because those updates to develop happened on the remote, they do not exist on your machine yet, so we need to ‘pull’ the latest develop from remote to local. Think of this step as ‘updating’ your local develop branch to the latest version.
Once you have pulled the latest develop, you can use ‘rebase’ to move the base of your branch from the older version of develop to the latest one. Merging develop into your branch achieves the same goal, but here let’s just focus on rebase. The takeaway is to understand why we are doing this. If you’d like to know the difference between merge and rebase, check The Difference between Rebase and Merge.
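The pull-then-rebase step can be rehearsed on a single machine. In this sketch, which is only one way to set it up, a bare repository on disk stands in for the remote, and two clones named teammate and me play the two engineers. All folder names, file names, and identities are hypothetical.

```shell
set -e  # stop at the first failing command

work=$(mktemp -d)
cd "$work"

# --- Setup (hypothetical): a bare repo on disk stands in for the remote ---
git init -q --bare remote.git
git clone -q remote.git teammate 2> /dev/null   # empty clone; hide the warning
cd teammate
git checkout -q -b develop
git config user.name "Teammate"
git config user.email "teammate@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"
git push -q origin develop
cd "$work"

# --- You: clone, branch off develop, and commit some work ---
git clone -q -b develop remote.git me
cd me
git config user.name "Me"
git config user.email "me@example.com"
git checkout -q -b feature
echo "my feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# --- Meanwhile, develop moves on at the remote ---
cd "$work/teammate"
echo "teammate update" > update.txt
git add update.txt
git commit -q -m "Teammate update"
git push -q origin develop

# --- The actual step: update local develop, then rebase feature onto it ---
cd "$work/me"
git checkout -q develop
git pull -q --ff-only origin develop
git checkout -q feature
git rebase -q develop

git log --oneline   # "Add feature" now sits on top of "Teammate update"
```

After the rebase, the feature branch contains both your file and your teammate’s update, and its base is the latest develop.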
When everything is set and you have built your changes upon the latest develop, you’re ready to ‘push’ your branch to remote. The branch you created and worked on exists only on your machine so far. We need to ‘push’ it from local to remote so others can see and review it.
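Here is a sketch of the push step, again with a bare repository on disk standing in for the remote (an assumption for the demo, as are the folder names, feature.txt, and the placeholder identity). To keep it short, the setup skips the develop branch and starts from an empty remote.

```shell
set -e  # stop at the first failing command

work=$(mktemp -d)
cd "$work"

# Setup (hypothetical): a bare repo on disk stands in for the remote.
git init -q --bare remote.git
git clone -q remote.git me 2> /dev/null   # empty clone; hide the warning
cd me
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"

# A feature branch with one commit that exists only on this machine so far.
git checkout -q -b feature
echo "my feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git ls-remote --heads origin   # prints nothing: the remote has no branches yet

# The actual step: publish the branch. -u links it to origin/feature for later pushes.
git push -q -u origin feature

git ls-remote --heads origin   # now lists refs/heads/feature
```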
Finally, create a pull request (PR), asking someone else to ‘pull the changes from your branch and merge them into develop’. It’s like submitting your assignment for someone in the team to review. Once it is approved, it can be ‘merged’ into the develop branch, i.e. your skyscraper can be built in the develop universe.
If you’d like to dive deeper into the Git workflow, this is a fantastic article. Now you understand most of the Git terminology as well as the workflow. If you just don’t want to look dumb in front of those nerdy engineers, this should be enough to impress them. Next, we’ll look at some hands-on practice with Git. Stay tuned!
My Git series stories: