What Is Git? — Explore A Distributed Version Control Tool

Vardhan NS
Edureka
Published in
9 min readNov 16, 2016

Git is a free, open-source distributed version control system tool designed to handle everything from small to very large projects with speed and efficiency. It was created by Linus Torvalds in 2005 to develop Linux Kernel. Git has the functionality, performance, security, and flexibility that most teams and individual developers need. It also serves as an important distributed version-control DevOps tool.

In this ‘What is Git’ blog, you will learn:

  • Why Git came into existence?
  • What is Git?
  • Features of Git
  • How Git plays a vital role in DevOps?
  • How Microsoft and other companies are using Git

What is Git — Why Git Came Into Existence?

We all know “Necessity is the mother of all inventions”. And similarly, Git was also invented to fulfill certain necessities that the developers faced before Git.

What is the purpose of Git?

Git is primarily used to manage your project, comprising a set of code/text files that may change.

But before we go further, let us take a step back to learn all about Version Control Systems (VCS) and how Git came into existence.

Version Control is the management of changes to documents, computer programs, large websites and other collection of information.

There are two types of VCS:

  • Centralized Version Control System (CVCS)
  • Distributed Version Control System (DVCS)

Centralized Version Control System (CVCS)

Centralized version control system (CVCS) uses a central server to store all files and enables team collaboration. It works on a single repository to which users can directly access a central server.

Please refer to the diagram below to get a better idea of CVCS:

The repository in the above diagram indicates a central server that could be local or remote which is directly connected to each of the programmer’s workstations.

Every programmer can extract or update their workstations with the data present in the repository or can make changes to the data or commit in the repository. Every operation is performed directly on the repository.

Even though it seems pretty convenient to maintain a single repository, it has some major drawbacks. Some of them are:

  • It is not locally available; meaning you always need to be connected to a network to perform any action.
  • Since everything is centralized, in any case of the central server getting crashed or corrupted will result in losing the entire data of the project.

This is when Distributed VCS comes to the rescue.

Distributed VCS

These systems do not necessarily rely on a central server to store all the versions of a project file.

In Distributed VCS, every contributor has a local copy or “clone” of the main repository i.e. everyone maintains a local repository of their own which contains all the files and metadata present in the main repository.

You will understand it better by referring to the diagram below:

As you can see in the above diagram, every programmer maintains a local repository on its own, which is actually the copy or clone of the central repository on their hard drive. They can commit and update their local repository without any interference.

They can update their local repositories with new data from the central server by an operation called “ pull” and affect changes to the main repository by an operation called “ push “ from their local repository.

The act of cloning an entire repository into your workstation to get a local repository gives you the following advantages:

  • All operations (except push & pull) are very fast because the tool only needs to access the hard drive, not a remote server. Hence, you do not always need an internet connection.
  • Committing new change-sets can be done locally without manipulating the data on the main repository. Once you have a group of change-sets ready, you can push them all at once.
  • Since every contributor has a full copy of the project repository, they can share changes with one another if they want to get some feedback before affecting changes in the main repository.
  • If the central server gets crashed at any point of time, the lost data can be easily recovered from any one of the contributor’s local repositories.

After knowing Distributed VCS, its time we take a dive into what is Git.

What Is Git?

Git is a Distributed Version Control tool that supports distributed non-linear workflows by providing data assurance for developing quality software. Before you go ahead, check out this video on GIT which will give you better in-sight

Git provides with all the Distributed VCS facilities to the user that was mentioned earlier. Git repositories are very easy to find and access. You will know how flexible and compatible Git is with your system when you go through the features mentioned below:

What is Git — Features Of Git

Scalable:
Git is very scalable. So, if in the future, the number of collaborators increases Git can easily handle this change. Though Git represents an entire repository, the data stored on the client’s side is very small as Git compresses all the huge data through a lossless compression technique.

Secure:
Git uses the SHA1 (Secure Hash Function) to name and identify objects within its repository. Every file and commit is check-summed and retrieved by its checksum at the time of checkout. The Git history is stored in such a way that the ID of a particular version (a commit in Git terms) depends upon the complete development history leading up to that commit. Once it is published, it is not possible to change the old versions without it being noticed.

Economical:
In the case of CVCS, the central server needs to be powerful enough to serve the requests of the entire team. For smaller teams, it is not an issue, but as the team size grows, the hardware limitations of the server can be a performance bottleneck. In the case of DVCS, developers don’t interact with the server unless they need to push or pull changes. All the heavy lifting happens on the client-side, so the server hardware can be very simple indeed.

Supports non-linear development:
Git supports rapid branching and merging and includes specific tools for visualizing and navigating a non-linear development history. A core assumption in Git is that a change will be merged more often than it is written, as it is passed around various reviewers. Branches in Git are very lightweight. A branch in Git is only a reference to a single commit. With its parental commits, the full branch structure can be constructed.

Reliable:
Since every contributor has its own local repository, on the events of a system crash, the lost data can be recovered from any of the local repositories. You will always have a backup of all your files.

Easy Branching:
Branch management with Git is very simple. It takes only a few seconds to create, delete, and merge branches. Feature branches provide an isolated environment for every change to your codebase. When a developer wants to start working on something, no matter how big or small, they create a new branch. This ensures that the master branch always contains a production-quality code.

Free and open-source:
Git is released under GPL’s (General Public License) open source license. You don’t need to purchase Git. It is absolutely free. And since it is open-source, you can modify the source code as per your requirement.

Distributed development:
Git gives each developer a local copy of the entire development history, and changes are copied from one such repository to another. These changes are imported as additional development branches and can be merged in the same way as a locally developed branch.

Compatibility with existing systems or protocol
Repositories can be published via http, ftp or a Git protocol over either a plain socket or ssh. Git also has a Concurrent Version Systems (CVS) server emulation, which enables the use of existing CVS clients and IDE plugins to access Git repositories. Apache SubVersion (SVN) and SVK repositories can be used directly with Git-SVN.

What is Git — Role Of Git In DevOps?

Now that you know what is Git, you should know Git is an integral part of DevOps.

DevOps is the practice of bringing agility to the process of development and operations. It’s an entirely new ideology that has swept IT organizations worldwide, boosting project life-cycles and in turn increasing profits. DevOps promotes communication between development engineers and operations, participating together in the entire service lifecycle, from design through the development process to production support.

The diagram below depicts the DevOps life cycle and displays how Git fits in DevOps.

The diagram above shows the entire life cycle of DevOps starting from planning the project to its deployment and monitoring. Git plays a vital role when it comes to managing the code that the collaborators contribute to the shared repository. This code is then extracted for performing continuous integration to create a build and test it on the test server and eventually deploy it on the production.

Tools like Git enable communication between the development and the operations team. When you are developing a large project with a huge number of collaborators, it is very important to have communication between the collaborators while making changes in the project. Commit messages in Git play a very important role in communicating among the team. The bits and pieces that we all deploy lies in the Version Control system like Git. To succeed in DevOps, you need to have all of the communication in Version Control. Hence, Git plays a vital role in succeeding at DevOps.

Who uses Git? — Popular Companies Using Git

Git has earned way more popularity compared to other version control tools available in the market like Apache Subversion(SVN), Concurrent Version Systems(CVS), Mercurial etc. You can compare the interest of Git by time with other version control tools with the graph collected from Google Trends below:

In large companies, products are generally developed by developers located all around the world. To enable communication among them, Git is the solution.

Some companies that use Git for version control are Facebook, Yahoo, Zynga, Quora, Twitter, eBay, Salesforce, Microsoft, and many more.

Lately, all of Microsoft’s new development work has been in Git features. Microsoft is migrating .NET and many of its open-source projects on GitHub which are managed by Git.

One of such projects is the LightGBM. It is a fast, distributed, high-performance gradient boosting framework based on decision tree algorithms that is used for ranking, classification, and many other machine learning tasks.

Here, Git plays an important role in managing this distributed version of LightGBM by providing speed and accuracy.

Popular Question:

What is the difference between Git vs GitHub?

Git is just a version control system that manages and tracks changes to your source code whereas GitHub is a cloud-based hosting platform that manages all your Git repositories.

Why is Git so popular?

Git is a distributed version control system(VCS) that enables the developers to manage the changes offline and allows you to branch and merge whenever required, giving them full control over the local code base. It is fast and also suitable for handling massive code bases scattered across multiple developers, which makes it the most popular tool used.

This is the end of my article on Nagios interview questions. If you wish to check out more articles on the market’s most trending technologies like Artificial Intelligence, Python, Ethical Hacking, then you can refer to Edureka’s official site.

Do look out for other articles in this series which will explain the various other aspects of DevOps.

1. DevOps Tutorial

2. Git Tutorial

3. Jenkins Tutorial

4. Docker Tutorial

5. Ansible Tutorial

6. Puppet Tutorial

7. Chef Tutorial

8. Nagios Tutorial

9. How To Orchestrate DevOps Tools?

10. Continuous Delivery

11. Continuous Integration

12. Continuous Deployment

13. Continuous Delivery vs Continuous Deployment

14. CI CD Pipeline

15. Docker Compose

16. Docker Swarm

17. Docker Networking

18. Ansible Vault

19. Ansible Roles

20. Ansible for AWS

21. Jenkins Pipeline

22. Top Docker Commands

23. Git vs GitHub

24. Top Git Commands

25. DevOps Interview Questions

26. Who Is A DevOps Engineer?

27. DevOps Life cycle

28. Git Reflog

29. Ansible Provisioning

30. Top DevOps Skills That Organizations Are Looking For

30.Waterfall vs Agile

31. Jenkins CheatSheet

32. Ansible Cheat Sheet

33. Ansible Interview Questions And Answers

34. 50 Docker Interview Questions

35. Agile Methodology

36. Jenkins Interview Questions

37. Git Interview Questions

38. Docker Architecture

39. Linux commands Used In DevOps

40. Jenkins vs Bamboo

41.Nagios Tutorial

42. Nagios Interview Questions

43.DevOps Real-Time Scenarios

44.Difference between Jenkins and Jenkins X

45.Docker for Windows

46.Git vs Github

Originally published at https://www.edureka.co on November 16, 2016.

--

--