Version Control via Git
Remember the feeling you had when you created your first ‘Hello World’ program? Super exciting, right? I imagine that, since then, you have made great strides in your learning and dived deeper and deeper into the syntax of your chosen programming language. As you’ve progressed, you’ve learned about loops, higher-order functions, classes, inheritance patterns, and more. Eventually you decide that you are ready to begin work on a complete application, and in doing so, you come to a point where you believe collaborating with other developers would be beneficial.
So what’s next? How do you send your code to other developers so they can work on features for the application? Do you send your entire folder, have them make changes, and send it back to you? You could, but then how would you know what changes were made and where? Later down the line, when something breaks, how do you know who’s to blame? How do you revert your code to a working state? So many questions and so much STRESS!
Now imagine a world where there is a system that makes collaboration effortless and efficient, keeps track of who made what changes, and somehow has the power of a flux capacitor and is able to take your code back into a past, working state…
Great Scott! Imagine no longer. With a Version Control System (VCS), you can do exactly that. In this article we will discuss what a VCS is, dive into Git, the most popular VCS in use today, explore its capabilities, and venture a bit into GitHub.
What is Version Control?
A VCS is a system that stores changes to one or more files over time, allowing you to refer to specific versions later. It involves taking snapshots of your files at different stages. Each snapshot records when it was made and what changed since the previous snapshot, allowing us to essentially rewind our files to an older version.
Benefits of Using a VCS
1. Complete History
- Complete long-term history of every file and its changes.
- Logs every change made by many individuals.
- Changes include the creation/deletion of files and edits to their contents.
- Ability to revert changes and move backwards through time to previous versions.
2. Branching & Merging
- Multiple individuals can work on a file simultaneously.
- Merge multiple versions of a file and manage any conflicts that arise.
- Allows for experimentation with different versions of a file while maintaining its original state.
Note: Although version control is particularly useful for collaboration, it is also beneficial for individuals working on solo projects by providing the ability to work on independent streams of changes.
3. Traceability
- Able to trace each change made to your project.
- Able to supply a message describing the purpose and intent of the change. When working with legacy code especially, this makes it easier to understand why the code was designed in such a way and allows for more harmonious changes in the future.
While it is obviously possible to develop software without the use of a VCS, doing so is an unnecessary risk that should be avoided. Let’s explore one of the most popular VCSs.
Git
Taken from Wikipedia,
Git is a distributed version-control system for tracking changes in source code during software development. It is designed for coordinating work among programmers, but it can be used to track changes in any set of files. Its goals include speed, data integrity, and support for distributed, non-linear workflows
So the question arises, what makes Git any different from other VCSs and why is it the most popular of them?
1. How Git Thinks About Data
- Many other VCSs store information as a set of files and the changes made to each file over time.
- In contrast, Git thinks of its data as a series of snapshots of a miniature filesystem. With every save (known as a commit, to be discussed momentarily), it essentially takes a picture of what all your files look like at that moment and stores it. If a file has not changed, Git does not store it again; it simply links to the identical copy it already has, which saves space.
2. Locality & Speed
- The whole history of your project is stored on your local disk. Because of this, when trying to browse the history of your project, Git responds almost instantaneously.
- Since almost everything is stored locally, most operations can occur offline.
3. Integrity
- Everything is checksummed before it is stored and is later referred to by that checksum. A checksum is a hash of the content, such as 61c3891747a3c82338ac995524e5d5958ec473b4, that can detect errors when used as a comparison. It is impossible to change the contents of a file without Git knowing about it.
- Because this is built into Git at its lowest levels and is part of its philosophy, it is nearly impossible to lose information in transit or to suffer file corruption without Git being able to detect it.
4. Only Adds Data
- Nearly all actions only add data to the Git database, so it is hard to get the system to do anything that cannot be undone or to make it erase data.
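To see the checksumming from point 3 in action, here is a minimal sketch using git hash-object, the command Git provides for computing the checksum of some content. The sample text is only an assumption for illustration. Identical content should always produce the same checksum, while even a one-character change should produce a different one.

```shell
# Compute Git's checksum for some sample content (the text itself is arbitrary).
first=$(printf 'Hello World\n' | git hash-object --stdin)

# Hashing identical content again gives the identical checksum.
second=$(printf 'Hello World\n' | git hash-object --stdin)

# A one-character change produces a different checksum.
changed=$(printf 'Hello World!\n' | git hash-object --stdin)

echo "original: $first"
echo "repeat:   $second"
echo "changed:  $changed"
```

Because every stored object is named by a checksum like these, any change to the content would show up as a mismatch.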
The Three Stages of a File
Pay attention here, as this is short but integral to your understanding of Git. Git has three stages that your files can be in: modified, staged, and committed.
- Modified: in this stage, your file has been modified in some way but has yet to be saved in your database.
- Staged: here, you have taken a modified file and signaled it to be part of your next save into the database.
- Committed: this signifies that your modified and staged file has now been safely saved in your database.
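The three stages can be watched from the command line. The sketch below is one way to do it, assuming a throwaway repository in a temporary directory, a hypothetical index.html file, and a made-up name and email. It uses git status --short, where the first column describes the staging area and the second describes your working files.

```shell
# Set up a throwaway repository (identity values are placeholders).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Create a first commit so there is something to modify.
echo "<h1>Hello World</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"

# Modified: the file changed, but nothing has been staged yet.
echo "<p>More content</p>" >> index.html
modified=$(git status --short)
echo "modified:  '$modified'"

# Staged: the change is marked for the next commit.
git add index.html
staged=$(git status --short)
echo "staged:    '$staged'"

# Committed: the change is saved, so status has nothing to report.
git commit -q -m "Add paragraph"
committed=$(git status --short)
echo "committed: '$committed'"
```

The M moves from the second column to the first when the file is staged, and the line disappears entirely once the change is committed.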
Now that we have a general understanding of Git and its purpose, let’s take a very brief look into GitHub, as it, or an alternative service, is used side-by-side with Git.
GitHub
Simply put, GitHub is a cloud-based hosting service that lets you manage Git repositories. If you have an open-source project that uses Git, GitHub is a means by which you can manage, store, and collaborate on it. If you haven’t already, create a profile on GitHub. In the section below where we explore Git commands, we will show a command that links your Git project to GitHub. That’s it? Well, GitHub has a plethora of features that require their own discussion, so for the sake of brevity, our exploration into GitHub will end here.
Git Commands
Git Init
This command turns a directory into a Git repository, so that files in it can be added and committed. Be sure to be in your desired directory, GitBlog in the case below, and run ‘git init’ to initialize a Git repository.
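As a rough sketch of the step described above, the commands below create a GitBlog directory inside a temporary folder (the temporary location is an assumption so the example does not touch your real files) and initialize it.

```shell
# Create a scratch location and the GitBlog project directory inside it.
workdir=$(mktemp -d)
mkdir "$workdir/GitBlog"
cd "$workdir/GitBlog"

# Turn the directory into a Git repository.
git init

# The repository data lives in a hidden .git directory.
ls -a
```

If the hidden .git directory shows up in the listing, the directory is now a Git repository.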
Git Status
Using this command will show you the current state of your Git repository. If we look at the image below, we can learn a few things. Firstly, we can see what branch we are currently on (branch master here), we see that we have no commits as of yet, and that we have an untracked file. A prompt gives us a hint, or rather explicitly tells us what to do…
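To reproduce this situation yourself, here is a small sketch that assumes a fresh repository in a temporary directory with a single new file named index.html. Note that newer Git setups may call the default branch main rather than master.

```shell
# Fresh repository with one brand-new file.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "<h1>Hello World</h1>" > index.html

# Show the current branch, the lack of commits, and the untracked file.
git status
```

The output should list index.html under the untracked files, along with a hint about using git add.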
Git Add
From the image above, we are told that we can use Git Add to include our file in what will be committed. This is exactly what this command does. Below we used the command along with the name of the file to add it to our staging area. Now if we run git status, we can see that our output is different: we are told that our index.html file is staged and ready to be committed. A prompt is also present notifying us that we may unstage the file.
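A minimal sketch of staging, again assuming a throwaway repository and a file named index.html:

```shell
# Fresh repository with one untracked file.
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "<h1>Hello World</h1>" > index.html

# Stage the file so it becomes part of the next commit.
git add index.html

# The file now appears under the changes to be committed.
git status
```

To stage everything in the current directory at once, git add . is a common alternative to naming each file.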
Git Commit
Commit allows us to save our staged changes into our database. When committing, as you can see, our identity (name and email address) is recorded along with the commit, and it can be changed as indicated by the prompt. At the very bottom, we get a message notifying us of how many files were changed and how many insertions/deletions were made to the contents of said files. Running git status at this point will reveal that there is nothing left to commit, that is, until more changes are made.
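Here is a sketch of a first commit. The name, email, and commit message are placeholders, and the identity is set with git config for this repository only so the example runs on a machine where Git has not been configured yet.

```shell
# Fresh repository with a placeholder identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Stage a file, then save it with a message describing the change.
echo "<h1>Hello World</h1>" > index.html
git add index.html
git commit -m "Add index.html"

# Nothing is left to commit until more changes are made.
git status
```

The -m flag supplies the commit message inline; without it, Git opens an editor and asks for one.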
Git Branch
In order to find out what branch you are currently on, create a new branch, or delete one, you can use the git branch command. Below you can see how this works; we created a branch “alpha” then deleted it using this command. Remember, branches are an important feature in Git. When we create a new branch, we are able to work on our project and make any changes we want without having to worry about affecting the state of our master branch. Any changes made in alpha will NOT affect master unless additional steps are taken.
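The alpha example can be sketched as follows, assuming a throwaway repository that already has one commit (a branch needs a commit to point to). The identity and file contents are placeholders.

```shell
# Throwaway repository with a single commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "<h1>Hello World</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"

# Create a branch named alpha, then list all branches.
git branch alpha
git branch
with_alpha=$(git branch --list alpha)

# Delete alpha and list the branches again.
git branch -d alpha
git branch
```

Creating a branch this way does not switch to it; you stay on the branch you were already on.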
Git Checkout
This command goes hand-in-hand with git branch. If we want to begin work on a new branch, we can check out that branch and proceed working. Note that the branch name highlighted in green, which git branch also marks with an asterisk, is our working branch.
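A short sketch of switching branches, under the same assumptions as before (temporary repository, placeholder identity, one existing commit):

```shell
# Throwaway repository with a single commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "<h1>Hello World</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"

# Create the alpha branch and switch to it.
git branch alpha
git checkout alpha

# The asterisk now sits next to alpha, our working branch.
git branch
```

As a shortcut, git checkout -b alpha creates the branch and switches to it in one step.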
Git Merge
As we create new branches to work on, we will eventually want to bring our changes into the master branch. We can do so by means of this command. First, check out the alpha branch, make some changes, add, and commit. After doing so, you must check out the master branch again (or whichever branch you wish to merge into), and run the merge command with the name of the branch from which you wish to grab changes. Similar to when we commit, we are notified how many files were changed along with the number of insertions and deletions.
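The whole sequence might look like the sketch below. It assumes a temporary repository with a placeholder identity, and it stores the starting branch name in a variable (main_branch, a name chosen here for illustration) because newer Git setups may call it main rather than master.

```shell
# Throwaway repository with a single commit on the starting branch.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "<h1>Hello World</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"
main_branch=$(git symbolic-ref --short HEAD)

# Work on alpha: change the file, add, and commit.
git branch alpha
git checkout -q alpha
echo "<p>Added on alpha</p>" >> index.html
git add index.html
git commit -q -m "Add paragraph on alpha"

# Return to the starting branch and bring in alpha's changes.
git checkout -q "$main_branch"
git merge alpha

# The paragraph written on alpha is now present here too.
cat index.html
```

Since nothing changed on the starting branch in the meantime, this merge has no conflicts to resolve.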
Git Remote
This command allows us to connect a local Git repository to a remote one. One way this is used is to connect your repository to others’ repositories when collaborating on a project. The example below shows how this is done. We call the command with “add”, signifying that we want to add a remote repository, then “origin”, which serves as the name of the remote, and then the link to the repository. Running git remote with the -v flag on its own (not together with add) shows which remotes we are currently connected to.
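Here is a sketch of adding and listing a remote. The GitHub URL is hypothetical and stands in for the link to your own repository; adding a remote only records the link, so nothing is contacted over the network at this point.

```shell
# Throwaway local repository.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Register a remote named origin (the URL is a made-up placeholder).
git remote add origin https://github.com/example-user/GitBlog.git

# List the remotes this repository knows about, with their links.
git remote -v
```

Each remote is listed twice, once for fetching and once for pushing.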
Git Clone
In order to create a local copy of a remote repository to work on, we must clone it. Doing so will create a directory on our local disk with the contents of the repository being cloned. We run the command followed by the link to the remote repository, which can be followed by an optional name that will serve as the name of the directory to be created.
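To keep the sketch below runnable offline, it clones from a local directory that stands in for the remote repository; that stand-in, along with the names source-repo and my-copy, is an assumption for illustration. With a real remote, you would pass its link in place of source-repo.

```shell
# Build a small stand-in "remote" repository with one commit.
workdir=$(mktemp -d)
cd "$workdir"
git init -q source-repo
cd source-repo
git config user.name "Example User"
git config user.email "user@example.com"
echo "<h1>Hello World</h1>" > index.html
git add index.html
git commit -q -m "Add index.html"

# Clone it: git clone <link> [optional-directory-name]
cd "$workdir"
git clone source-repo my-copy

# The new directory contains the repository's files and history.
ls my-copy
```

A cloned repository already has a remote named origin pointing back at the place it was cloned from.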
We could go on and on and on exploring Git commands but for the sake of brevity, we will cease here. Some Git commands that we did not explore that deserve an “honorable mention” are push, pull, log, and stash. I implore you to explore these yourself as they are important to your success utilizing Git. But don’t stop there! Explore the plethora of Git’s commands by use of a very ‘helpful’ command, Git Help.
Fin
Admittedly, this was a lot of information to retain… don’t be afraid to re-read this article and explore the web for more resources. Version Control Systems are important to your success as a programmer, whether working on solo projects or in collaboration with others. Let this serve as your first foray, not your last. Now Git outta here!