Analytics Vidhya

Analytics Vidhya is a community of Generative AI and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

What a Git: A Tale of Version Control

Ever written a document which you thought was the final version, but you knew that there would always be a “more final” version? Yeah, we can all relate…

Screenshot of an essay file with different versions
When does a list like this ever end?

From our high school essays to university research papers, we have all, at some point in our lives, faced the challenge of keeping track of different versions of our own work. Collaborating with other people can turn this into a nightmare!

For developers, tracking versions and documenting the progress of a project has to be accurate, and the record has to stay up to date as the work changes. Git is a dependable system for this purpose.

How We Got Git

Linus Torvalds, the man who wrote the Linux kernel, wrote the Git system in 2005. Kernel development had been relying on BitKeeper, a proprietary source control management (SCM) tool that was available to the kernel developers under a free-of-charge license. That license was withdrawn after Andrew Tridgell, the creator of the Samba file server software, reverse-engineered BitKeeper's protocol, and in about ten days Git was born. Being open source, Git has evolved into what it is now: easy to use, able to handle large projects, and supportive of non-linear development.

How to Git

In a Git system, we have directories that are used as repositories. Each repository can be treated as a separate project, and all the files needed for that project can be stored in the same repository.
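To make the idea of "a directory used as a repository" concrete, here is a minimal sketch. The folder name my-project and the README content are made up for illustration, and everything happens in a throwaway temporary directory.

```shell
set -e  # stop at the first failing command

# Hypothetical project folder inside a throwaway temp directory
project="$(mktemp -d)/my-project"
mkdir -p "$project"
cd "$project"

# Turn the directory into a Git repository (this creates the hidden .git folder)
git init --quiet

# Add a project file; Git reports it as untracked until it is staged
echo "# My Project" > README.md
git status --short
```

The last command should list README.md with a `??` marker, which is how Git flags a file that it sees but does not track yet.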

Sometimes, a project can be so large (or exciting!) that people other than the main developer contribute to it. In that case, contributors push (submit) their commits to a shared remote copy of the repository, which Git names origin by default, usually onto its main branch (traditionally called master, and main on newer setups). They can also pull the latest changes from that remote into their local copy of the repository.
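The push and pull cycle can be sketched on a single machine. In the example below, a bare repository named shared.git stands in for the shared copy that would normally live on a server such as GitHub, and the contributors "alice" and "bob" are hypothetical. Note that origin is the name Git gives to the remote a clone came from; it is not itself a branch.

```shell
set -e  # stop at the first failing command
work="$(mktemp -d)"
cd "$work"

# A bare repository stands in for the shared copy (for example, one on GitHub)
git init --quiet --bare shared.git

# Alice clones the (still empty) shared repository and pushes a first commit
git clone --quiet shared.git alice
cd "$work/alice"
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "first draft" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"
git push --quiet -u origin HEAD   # push the current branch to the remote "origin"

# Bob clones afterwards and receives Alice's commit
cd "$work"
git clone --quiet shared.git bob

# Alice pushes another change...
cd "$work/alice"
echo "second draft" >> notes.txt
git commit --quiet -am "Revise notes"
git push --quiet

# ...and Bob pulls it into his local copy
cd "$work/bob"
git remote -v                     # lists "origin", the remote Bob cloned from
git pull --quiet --ff-only
```

After the pull, Bob's copy of notes.txt contains both lines, and both clones point at the same latest commit.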

Another awesome thing about Git is that every time you change or add lines of code, you can attach a note or message to the commit (a “save” in Git lingo) to remind you of the changes made. Each commit also gets a unique identifier called a hash, so that the changes can be tracked accurately.
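Here is a small sketch of a commit with a message and the hash that comes with it. The file name main.py, the message text, and the user details are placeholders chosen for this example.

```shell
set -e  # stop at the first failing command
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "print('hello')" > main.py
git add main.py

# -m attaches the note that describes this commit
git commit --quiet -m "Add hello script"

# Print the full hash of the latest commit, then the short log view
commit_hash="$(git rev-parse HEAD)"
echo "$commit_hash"
git log --oneline
```

In a repository with Git's default settings, the hash is a long hexadecimal string, and `git log --oneline` shows an abbreviated form of it next to the commit message.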

Git and GitHub

What is GitHub? It is an online platform that hosts Git repositories. Its users can collaborate on and maintain repositories there, even without keeping any project files on their own computers, because the repositories can be hosted on GitHub itself.

Logo of GitHub with the word GitHub beside it.
The GitHub mascot is not just a cat, but Octocat! (Check out its tail).

The GitHub Repo

Because pictures paint a thousand words and I type slowly, I’ll show some screenshots of my GitHub repository and give you a tour of how this works:

Let us first create a repository. I’m also doing some exercises in my data science training (hello, FTW!) so I’ll be using the same repository and files for this article as well.

A screenshot showing how to create a repository on GitHub website.
The interface on the GitHub website makes it easy to create a repository.

You can see in the image below what an almost empty GitHub repository looks like. It only contains the README file, which is usually created at the same time as the repository (if you tick the box as shown above).

A screenshot of my newly created, empty repository in GitHub.
My repository in GitHub. This is where we’re going to upload the project files.

Files added into the repository are displayed as shown below:

A screenshot of GitHub repository with files added into it.
Files can be added by clicking on the “Add file” button on the repository page or by pushing them from the command line interface (CLI).

Using Git on a Command Line Interface (CLI)

Some Git users may want to work on their computers, with a dedicated local copy of their repository. They may either be working on a private Git network or collaborating on the GitHub website.

Cloning a repository means copying the whole project, including all of its files, branches, and commit history, into a new local repository on your own machine.

Remember the three files in my GitHub repository? Those were all copied into my local repository using the clone command.
A screenshot of my local repository, showing the exact files that I have in my GitHub repository.
I now have a copy of those three files in my local directory, which shares the same name as my GitHub repository.
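The clone step can be tried without a GitHub account. This sketch builds a small local repository named source (a stand-in for a hosted one, with made-up file and branch names) and clones it, to show that the history and branches come along with the files.

```shell
set -e  # stop at the first failing command
base="$(mktemp -d)"
cd "$base"

# Build a small source repository to clone from (stand-in for a GitHub URL)
git init --quiet source
cd source
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "one" > a.txt
git add a.txt
git commit --quiet -m "First commit"
git branch experiment                      # a second branch at the first commit
echo "two" > b.txt
git add b.txt
git commit --quiet -m "Second commit"

# Clone it; with GitHub you would pass the repository URL instead of a path
cd "$base"
git clone --quiet source my-copy
cd my-copy

ls                  # the working files of the default branch
git log --oneline   # the commit history came along
git branch --all    # other branches show up as remotes/origin/<name>
```

With a repository hosted on GitHub, the command takes the form `git clone https://github.com/<user>/<repo>.git`, where the user and repository names are placeholders for your own.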

What if I wanted to add two more files to my repository? I can do that with the commit and push commands. Let’s take a look at the following images:

A screenshot of the terminal showing the git commit command.
Commit saves your changes.
A screenshot of the terminal showing the git push command.
The push command uploads the committed changes to the master branch of the remote repository.
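For readers who prefer text to screenshots, the same add, commit, and push sequence might look like the sketch below. The file names and commit message are invented for illustration, and a local bare repository named remote.git plays the role of the GitHub repository.

```shell
set -e  # stop at the first failing command
base="$(mktemp -d)"
cd "$base"

# Stand-in remote plus a local clone of it
git init --quiet --bare remote.git
git clone --quiet remote.git local
cd local
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Create the two new files (hypothetical names)
echo "data" > analysis.txt
echo "notes" > summary.txt

# Stage both files, check what is staged, then commit and push
git add analysis.txt summary.txt
git status --short
git commit --quiet -m "Add analysis and summary files"
git push --quiet -u origin HEAD

# The remote branch now points at the same commit as the local one
git ls-remote --heads origin
```

Staging with `git add` is the step that chooses which changes go into the next commit, and nothing reaches the remote until `git push` runs.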

What does my GitHub repository look like after all these changes?

A screenshot of my GitHub repository page, showing the newly added files.
The two files are now visible on my GitHub repo. The page also displays the message of my last commit and the hash of this commit (to the left of “4 minutes ago”).

Every update, addition, or removal made to any part of the project by any contributor is tracked. Anyone can propose changes to a public project by using the pull request feature of GitHub, and the project maintainers can then review those changes before merging them.
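To see what "well-tracked" means in practice, this sketch records two commits and then asks Git who changed what. The file list.txt, its contents, and the user details are placeholders.

```shell
set -e  # stop at the first failing command
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# First version of the file
printf 'alpha\nbeta\n' > list.txt
git add list.txt
git commit --quiet -m "Add list"

# Second version: one line removed, one line added
printf 'alpha\ngamma\n' > list.txt
git commit --quiet -am "Replace beta with gamma"

# Who committed what, newest first
git --no-pager log --format='%h %an: %s'

# The exact lines that changed between the two commits
git --no-pager diff HEAD~1 HEAD
```

The diff marks the removed line with a minus sign and the added line with a plus sign, which is the same information a pull request page on GitHub presents for review.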

No wonder Git (and GitHub) have been so useful to developers, especially those involved in open-source projects. Together they provide a solid environment for many kinds of development and for free-flowing collaboration.

Drop by my GitHub. Happy version tracking!

Published in Analytics Vidhya

Written by Roch Derilo

A lover of data, tech, and hot choco. Supports anything open source and its power in the data value chain.
