The Google Drive of Code

Published in simple CS · 9 min read · Aug 28, 2020

If you’re a millennial, likely the first thing you did when assigned a group project in high school or college was create a Google Drive document. Google Drive revolutionized virtual collaboration. Gone were the days of emailing hundreds of versions back and forth — the versioning was handled for you. What’s more, you could all edit the document simultaneously.

As a computer programmer, it might seem beneficial to use a similar concept to Google Drive for code-sharing. It would ensure that everyone on the team was viewing the most up-to-date version of the code at all times. But such a concept, while extremely useful for word documents, would create issues in the world of coding.

Suppose you wanted to write a function to print the word “hello” to the screen. Your function would probably be just two lines long: a definition on line 1 and a print statement on line 2.

But before you have a chance to test your function, another team member decides to change the “h” in hello to a capital “H”. If they are in the middle of editing the code when you try to test your function, line 2 might be only half typed.
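As a minimal sketch of that situation, assuming the function was written in Python (the language is a guess, and say_hello and first_syntax_error are made-up names), the snippet below holds the finished file and the half-edited file as strings and asks Python to compile each one.

```python
# The finished file, as you wrote it (a hypothetical two-line function).
COMPLETE = 'def say_hello():\n    print("hello")\n'

# The same file caught mid-edit: your teammate has deleted the old string
# and has not finished typing the new one, so line 2 is incomplete.
HALF_EDITED = 'def say_hello():\n    print("\n'


def first_syntax_error(source):
    """Return the line number of the first syntax error, or None if the code compiles."""
    try:
        compile(source, "<shared file>", "exec")
    except SyntaxError as exc:
        return exc.lineno
    return None


print(first_syntax_error(COMPLETE))     # the finished file compiles
print(first_syntax_error(HALF_EDITED))  # the half-edited file does not
```

Nothing is wrong with either person's work here. The problem is only that you tried to run the code while someone else was in the middle of typing.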

The test would throw an error, because line 2 is incomplete. So a Google Drive approach to version control for code is not very useful. What we want is a version control system that:

Allows users to work on their own version of code independently
Allows users to merge their own version of code with the original code once they are finished editing

Git

Git, first released in 2005 (seven years before Google Drive), is the most widely used version control system in the industry. It accomplishes both of these tasks and offers a great deal of additional functionality.

Servers

In order to understand Git, it is important to understand servers. Servers are machines that communicate with several different clients. In this case, the client is your laptop, and the server is a large machine located somewhere else in the world. The server can communicate with your team members’ laptops (other clients) as well. In this way, the server acts as a “home base” or “single source of truth” for you and your team. The version of code on your client laptop needs to be sent, or pushed, to the server in order for your other team members to receive, or pull, it. Google Drive uses the client-server model as well. In fact, much of the Internet uses this model, just on a larger scale.
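To make the push and pull idea concrete, here is a toy model in Python. It is only a sketch: the Server and Client classes are invented for illustration and say nothing about how Git actually transfers data. It just shows that your teammate sees your change only after you push it and they pull it.

```python
class Server:
    """The "home base": holds the shared copy of the code."""

    def __init__(self):
        self.code = ""


class Client:
    """A laptop with its own local copy of the code."""

    def __init__(self, server):
        self.server = server
        self.code = server.code  # start from whatever the server has

    def push(self):
        """Send the local copy to the server."""
        self.server.code = self.code

    def pull(self):
        """Replace the local copy with the server's copy."""
        self.code = self.server.code


server = Server()
you = Client(server)
teammate = Client(server)

you.code = 'print("hello")'  # you edit your local copy
print(repr(teammate.code))   # your teammate's copy is unchanged so far

you.push()                   # laptop -> server
teammate.pull()              # server -> teammate's laptop
print(repr(teammate.code))   # now your teammate has your change
```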

Installation

Git is an application. This means it is itself a collection of code that needs to be downloaded and installed (placed in the correct location) on your computer. If you are on macOS, Git comes with Apple’s Xcode Command Line Tools, so it may already be installed; if it is not, running git for the first time should prompt you to install it. If you are on Windows, you will need to install it yourself. I highly recommend installing MobaXterm to use as a command line interface, which automatically installs Git as a plug-in.

To confirm that you have Git installed, run the following command.

git --version

If you are on macOS, you should see output similar to this.

git version 2.21.1 (Apple Git-122.3)
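If you would rather run the same check from a script, the sketch below wraps git --version in Python. The function name git_version is made up for this example. It returns None when Git cannot be found or cannot run, instead of raising an error.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if Git is unavailable."""
    if shutil.which("git") is None:
        return None  # no git executable on the PATH
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None  # git exists but could not run
    return result.stdout.strip()


print(git_version() or "Git does not appear to be installed")
```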

Getting Started

Git does not automatically version control all files on your computer. It only version controls the directories (or repositories) that you tell it to.

The easiest way to do this is on GitHub. GitHub is a website that gives you access to the server discussed above. If you create your repository on the server, all of the clients can easily access it right away.

Create an account on GitHub, then create a new repository, naming it as you would a directory on your computer. Once you have created a repository on GitHub (the server), you can clone it to your laptop (the client). This cloning process will create a directory on your laptop with the repository name you chose. To clone the repository from GitHub to your laptop, navigate to your repository page on GitHub and press the green button. Copy the URL beginning with https:// to your clipboard.

Open a terminal on your computer and type the following command, replacing the URL with the one you copied.

git clone https://github.com/melissamullen/data-structures.git

Press enter. Then run the ls command. You will find that a directory has been created with the name you chose for your Git repository. This empty repository is the first version of your project. You can tell Git to create a new version by making a commit. Enter the directory and create a new file containing a few lines of code. You should do this on the command line using Vim. If you aren’t familiar with Vim, I suggest you skim this article.

Once you have created a new file in your local repository (local means your client laptop, while remote means the server), run the following command, replacing file.py with the name of your new file.

git add file.py

From now on, any changes you make to file.py will be detected by Git. In addition, file.py is in the staging area, which means that the next time you create a new version of your project, file.py will be included in that version. To create a new version of your project, run the following command.

git commit -m "Initial commit"

The -m stands for message, and is an option that means the next argument in the command will be the version message, describing the changes you made to the project in this version. In this case, the message is “Initial commit”.
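The relationship between your files, the staging area, and a commit can be sketched with a few Python dictionaries. This is a toy model with invented names (working_directory, staging_area, commits), not Git's real storage format, and notes.txt is a hypothetical second file added to show what happens to files you never stage.

```python
# Files as they currently exist on your laptop.
working_directory = {"file.py": 'print("hello")', "notes.txt": "todo"}
staging_area = {}  # what the next version will contain
commits = []       # the list of versions, oldest first


def add(name):
    """Like `git add`: copy the file's current contents into the staging area."""
    staging_area[name] = working_directory[name]


def commit(message):
    """Like `git commit -m`: record a snapshot of everything that is staged."""
    commits.append({"message": message, "files": dict(staging_area)})


add("file.py")
commit("Initial commit")

print(commits[0]["message"])
print(sorted(commits[0]["files"]))  # notes.txt was never added, so it is absent
```

Only files that have been added end up in the version. That is why git add comes before git commit.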

You have created the first version of your project! To view a list of all your project versions, or commits, run git log. The output will look something like this.

$ git log
commit ce0f73fc462021ceaec9c64e28f442daa03cd1bf
Author: Melissa Mullen <44929819+melissamullen@users.noreply.github.com>
Date:   Sat Aug 15 10:43:48 2020 -0400

    Initial commit

There is a unique key associated with each commit called the commit hash, which exists so that you can refer to the commit later if necessary. In this case, the commit hash is ce0f73fc462021ceaec9c64e28f442daa03cd1bf.
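If you are curious where those 40 characters come from: by default, Git names every object it stores by the SHA-1 hash of a short header followed by the object's contents. The sketch below reproduces that calculation for a file's contents (a "blob"); the helper name git_object_hash is made up for this example. A commit hash is computed the same way over the commit's own text, so we cannot reproduce the hash above without that exact text.

```python
import hashlib


def git_object_hash(kind, content):
    """Hash content the way Git does: SHA-1 over "<kind> <size>\\0" plus the bytes."""
    header = f"{kind} {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


# The hash Git would assign to an empty file.
print(git_object_hash("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Changing even one character of the content gives a completely different hash.
print(git_object_hash("blob", b'print("hello")\n'))
print(git_object_hash("blob", b'print("Hello")\n'))
```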

You now have a new version of your project that contains file.py. But this version only exists in your local repository. If you want your team members to be able to see your new version, you need to push it to the remote repository. You can do this by running git push.

Now, go back to the GitHub website and refresh the page. You should see file.py! When you ran git push, your new version of the repository containing file.py was sent from the client (your laptop) to the server (GitHub). Your team members can now view the new version of the repository on GitHub, and pull it from the remote repository to their local repositories. They can do this (assuming they cloned the project back when you did, before you ran git push) by entering their local Git repository and running git pull. If your teammates have not yet cloned the project, all they have to do is clone it, and the new version will be cloned to their laptop.

Branches

One of the most powerful features of Git (that is not offered by Google Drive!) is the ability to create branches. A branch behaves like an independent copy of the repository (under the hood, Git stores it as a lightweight pointer to a commit rather than duplicating your files). Up to this point, we have been working on the master branch, which is the original copy of the repository (repositories created on GitHub more recently name this branch main instead). The master branch is regarded as precious: it is code that is functional and ready for release. As such, it is generally best to create a copy of the master branch, and have the team develop there. If we create a copy of the master branch, we say we are “branching off” of master. The easiest way to do this is on GitHub.

Click on the “master” dropdown and create a new branch. We will name this branch “develop”.

When you have created the develop branch, it exists only on GitHub and not yet on your laptop. To pull the new branch to your laptop, run the command git pull on your laptop in your local repository. Your laptop is now aware that the develop branch exists, but we say it is still “on” master, meaning that any changes you make to the files right now will be changes made to the master copy of the repository. To switch to develop, run the command git checkout develop. You are now on the develop branch, and any changes you make to the files will now be changes made to the develop copy of the repository. If you ever forget which branch you are on, run the command git branch. The output will be a list of your local branches, with a * indicating the active branch.
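Here is a toy model of what being “on” a branch means, written in Python. Git does not duplicate your files for each branch; a branch is a name that points at a commit, and committing moves only the pointer of the active branch. Everything in the sketch (the commit ids c1 and c2, and the helper functions) is invented for illustration.

```python
# Each commit remembers its parent; each branch is a name pointing at a commit.
commits = {"c1": {"parent": None, "message": "Initial commit"}}
branches = {"master": "c1"}
current = "master"  # the branch we are "on"


def create_branch(name):
    """Start a new branch at the commit the active branch points to."""
    branches[name] = branches[current]


def checkout(name):
    """Like `git checkout <name>`: switch the active branch."""
    global current
    if name not in branches:
        raise ValueError(f"unknown branch: {name}")
    current = name


def commit(commit_id, message):
    """Add a commit on the active branch; only that branch's pointer moves."""
    commits[commit_id] = {"parent": branches[current], "message": message}
    branches[current] = commit_id


def list_branches():
    """Like `git branch`: one line per branch, with * marking the active one."""
    return [("* " if name == current else "  ") + name for name in sorted(branches)]


create_branch("develop")
checkout("develop")
commit("c2", "Work on develop")

print("\n".join(list_branches()))
print(branches)  # master still points at c1; develop has moved on to c2
```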

If you are working with a large team, it is considered best practice to create a new branch off of develop every time you make changes to the code. The idea is that master and develop will always exist, but the branches you make off of develop will only exist temporarily — until you are finished making changes.

Merging

Once you are finished making changes and ready to combine your code with the code on develop, or merge, you can open a pull request on GitHub (other platforms call this a merge request) and assign it to your lead developer. The lead developer can then review your code, make sure it is functional, then merge your branch into develop. After your branch is merged into develop, your changes exist on develop, so your personal branch is no longer necessary and can be deleted.

The merging process can be a frustrating one. We have two different versions of code, and we need to combine them into a single version that maintains the functionality of each version. Fortunately, Git handles this for us… usually. The difficulty arises when Git cannot handle the merge for us because there are merge conflicts. Merge conflicts occur when both branches have edited the same line of a file. It then becomes our responsibility to tell Git what we want this line to look like in the final version. Here is an example.

Suppose file.py on develop contains the following code.

print("On develop.")

Now suppose we have checked out a branch called issue-1 from develop. We have edited file.py on issue-1 to read

print("On issue-1.")

and another team member has edited file.py on develop to read

print("On develop…")

We are ready to merge back into develop, and we can do this by switching back to develop (git checkout develop) and running the following command.

git merge issue-1

Since the same line was edited on both branches, Git does not know which version to use in the merge, and we are notified of a merge conflict.

Auto-merging file.py
CONFLICT (content): Merge conflict in file.py
Automatic merge failed; fix conflicts and then commit the result.
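Why could Git not decide on its own? The rule it follows can be sketched for a single line as a “three-way” comparison between the original line and the two edited versions. This is a simplification: real Git compares whole groups of lines, and merge_line is a made-up helper, not part of Git.

```python
def merge_line(base, ours, theirs):
    """Merge one line three ways. Return the merged line, or None on a conflict."""
    if ours == theirs:
        return ours    # both sides agree (or neither changed anything)
    if ours == base:
        return theirs  # only the other branch changed the line
    if theirs == base:
        return ours    # only our branch changed the line
    return None        # both changed it differently: a human must decide


base = 'print("On develop.")'    # the line when issue-1 was branched off
ours = 'print("On develop…")'    # the line on develop now
theirs = 'print("On issue-1.")'  # the line on issue-1

print(merge_line(base, base, theirs))  # only issue-1 changed it: no conflict
print(merge_line(base, ours, theirs))  # both changed it: None, a conflict
```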

The merge output tells us that the conflict occurs in file.py. To resolve the conflict, we open file.py and notice it looks strange.

<<<<<<< HEAD
print("On develop…")
=======
print("On issue-1.")
>>>>>>> issue-1

We see both versions of the line, separated by =======. Everything between <<<<<<< HEAD and ======= is the code on develop, while everything between ======= and >>>>>>> issue-1 is the code on issue-1. We need to decide what we want the line to look like. Suppose we decide we want the line to read print("Resolving merge conflict!"). Then, we need to delete all of the lines in the file and replace them with that line. Then run git add file.py, git commit -m "Resolving merge conflict.", and git push. You should now be able to refresh GitHub and observe this change reflected in file.py there. At this point, you can delete issue-1 on GitHub by clicking master > View all branches > Trashcan, and on your local machine by running git branch -D issue-1.
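The conflict markers are ordinary text, which means you can read them with a few lines of code. The Python sketch below splits the conflicted file from our example into the two competing versions. The helper split_conflict is made up for this example and assumes the file contains a single conflict.

```python
CONFLICTED = '''<<<<<<< HEAD
print("On develop…")
=======
print("On issue-1.")
>>>>>>> issue-1
'''


def split_conflict(text):
    """Return (ours, theirs): the lines on each side of a single conflict."""
    ours, theirs = [], []
    side = None  # which list the current line belongs to
    for line in text.splitlines():
        if line.startswith("<<<<<<< "):
            side = ours    # lines from the branch we are on (HEAD)
        elif line == "=======":
            side = theirs  # lines from the branch being merged in
        elif line.startswith(">>>>>>> "):
            side = None    # end of the conflict
        elif side is not None:
            side.append(line)
    return ours, theirs


ours, theirs = split_conflict(CONFLICTED)
print("develop:", ours)
print("issue-1:", theirs)
```

Git makes no choice for you here. The resolved file is whatever you save in place of the marked section before you run git add.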

Final Thoughts

While there is a lot of additional functionality we could cover, this article offers a good introduction to Git that will help you get started. A final command that you may find helpful is git status, which displays the state of your working directory and staging area. It is instructive to run this command before and after you run the command git add file.py. You will notice that before you run git add, file.py is red and listed under “Untracked files”, but after, file.py is green and marked as “new file”.