The Herpetologist Social Scientist guide to Git for Data Science — Part 2
In the previous part of this tutorial I introduced Git, GitHub, and GitKraken, and how these tools can help your research work. In this part, we will see how to set up a GitHub and a GitKraken account, and how to create and manage a repository. Unlike the previous part, this section is mostly a follow-along tutorial, with step-by-step instructions on how to create, clone, and update a repository.
Sign up for GitHub
You can easily create an account on GitHub from the following page: https://github.com/signup. Complete the sign-up procedure and confirm your account. You don’t have to download Git or the GitHub software, as we will use GitKraken for this.
Sign up for GitKraken
To sign up for a GitKraken account, follow the procedure described here: https://gitkraken.link/giulio (you can also win a 100 USD gift card!). After that, you can download the software from this page: https://www.gitkraken.com/git-client. Once it is installed, log in to your account and we are ready to start!
Create a repository on GitHub
To create a repository, visit www.github.com and, once logged in, click on the + icon in the upper right corner, then hit “New repository”. Give it a unique name (for this test you can also use GitHub’s built-in repository name suggestions). Then decide whether you want your repository to be Public (visible to anyone) or Private (visible only to you and the people you invite). You can also add a README file template, ignore certain file types (this is done with the .gitignore file, we’ll get to it later), or add a license. Once you are done, hit “Create repository”! Congratulations, you created your first repository! Here’s mine: https://github.com/Gabrock94/atestproject
If you initialize the repository with a README file and a license, your page should resemble the one depicted in the screenshot below:
As a note, you can also create a repository on GitHub directly within GitKraken.
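As a small preview of the .gitignore file mentioned above, here is a minimal sketch of how ignore rules behave. It assumes Git is installed on the command line (the GitKraken workflow in this tutorial does not require that), and the two patterns and the file name analysis.log are made-up examples, not part of the test repository.

```shell
# Throwaway repository so the example does not touch any real project.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Hypothetical ignore rules: every .log file and everything under data/raw/.
printf '%s\n' '*.log' 'data/raw/' > .gitignore

# Create a file that matches the first rule.
touch analysis.log

# Ask Git which rule (if any) causes the file to be ignored.
git check-ignore -v analysis.log

# Only .gitignore itself shows up as a new file; analysis.log is hidden.
git status --short
```

The template that GitHub offers when you create the repository is simply a pre-filled version of this same file.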
Clone a repository from GitHub to your machine
To “Clone” a repository from GitHub to your machine we’ll use GitKraken. From the GitKraken GUI select “Clone a repo”, then choose GitHub.com in the repository management pop-up window. Once you have connected your GitHub account, the list of your repositories will be available in the dropdown menu. Select your repository, and set the path where you want the repository to be cloned. When you are ready, select “Clone the repo!”. Once done, a green button in the top part of the interface will let you open the newly cloned repository.
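For readers curious about what happens under the hood, the clone step roughly corresponds to the git clone command. The sketch below assumes Git is installed on the command line, and it uses a local bare repository as a stand-in for GitHub so that it runs offline; the directory names are placeholders.

```shell
# Scratch area for the example.
workdir="$(mktemp -d)"

# Stand-in for the repository hosted on GitHub (assumption: with a real
# project you would skip this line and use the repository URL instead,
# e.g. https://github.com/Gabrock94/atestproject).
git init --quiet --bare "$workdir/remote.git"

# Clone the "remote" into a local working copy. Git warns that the
# repository is empty, which is expected here.
git clone --quiet "$workdir/remote.git" "$workdir/atestproject"

# Show where the local copy will fetch from and push to.
git -C "$workdir/atestproject" remote -v
```

The remote called origin that appears in the last command is the link GitKraken uses later when you push.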
Updating the repository
Now that our repository is cloned to our machine, it’s time to add some files to it, and sync the online version. Navigate to File/Open in File Manager (or hit ALT+O) to open the current repository in your system’s file manager. Now create a new file using your preferred method (Linux users can open a terminal within the folder and use “touch asamplefile.txt” to create an empty file). Now, head back to GitKraken: a green plus sign should have appeared at the top of your repository timeline. Clicking on it will show the changes to your repository, and you can stage all changes or select single files.
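The green plus sign and the staging step have command-line counterparts in git status and git add. This is a minimal sketch, assuming Git is installed on the command line; it runs in a throwaway repository rather than in your clone.

```shell
# Throwaway repository standing in for the cloned project.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# Create the empty file, as in the tutorial.
touch asamplefile.txt

# The new file is reported as untracked.
git status --short    # prints: ?? asamplefile.txt

# Stage the single file (use "git add -A" to stage all changes).
git add asamplefile.txt

# The file is now staged and ready to be committed.
git status --short    # prints: A  asamplefile.txt
```

Staging a single file here mirrors selecting single files in GitKraken instead of staging all changes.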
Stage all changes, then provide a sample title and description, such as “asamplefile added” and “asamplefile.txt has been added”. Then commit the staged files by clicking the green button. At this point, the commit exists only in your local repository, not on GitHub. To synchronize your online repository, hit the “push” button on the top toolbar. Once done, the changes will be reflected in your online repository.
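The commit and push steps can be sketched on the command line as well. As before, this assumes Git is installed, a local bare repository stands in for GitHub so the example runs offline, and the user name and email are placeholders for your own details.

```shell
# Stand-in remote plus a fresh clone of it.
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"
git clone --quiet "$work/remote.git" "$work/atestproject"
cd "$work/atestproject"

# Create and stage the sample file.
touch asamplefile.txt
git add asamplefile.txt

# Commit locally. The first -m is the title, the second the description.
# The -c options supply a placeholder identity for this one command.
git -c user.name="Example User" -c user.email="user@example.com" \
    commit --quiet -m "asamplefile added" -m "asamplefile.txt has been added"

# Until this point the commit exists only locally; push sends it to the remote.
git push --quiet origin HEAD

# Confirm that the remote now holds the commit.
git --git-dir="$work/remote.git" log --all --oneline
```

Pushing HEAD sends the current branch to a branch of the same name on the remote, which sidesteps the question of whether the default branch is called main or master.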
To sum up:
In this tutorial you have learned how to initialize a repository on GitHub, clone it to your local machine using GitKraken, and push a commit to the GitHub repository using GitKraken. This will allow you to host your data science projects on GitHub using a GUI.