A Quick Guide for Making a Git Ignore (.gitignore) File for Jupyter Notebook Based Projects/Repositories

Chad Ruble
3 min read · Dec 6, 2018

--

What the heck is a .gitignore file? It is a plain text file that lists the files, file types, and directories that Git should leave untracked and keep out of your repository.

A .gitignore file is dead simple to implement and will make your repository cleaner and more secure. This quick guide focuses on repositories that include Jupyter Notebooks, but in most projects you will find certain types of files that you don’t want to include in your repo. When you make changes to a notebook, two files are actually updated.

The project file (notebook-project.ipynb) is your main project notebook, and presumably you want to save those changes. Jupyter Notebooks also have robust autosaving as well as a handy “checkpoint” function that lets you roll back to a previous version of your notebook (not unlike a git commit). Those checkpoints are the second file, and they are stored locally in a hidden directory, .ipynb_checkpoints/.
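To see the two files from Git's point of view, here is a minimal sketch that runs in a throwaway directory. The checkpoint file name (notebook-project-checkpoint.ipynb) and the placeholder file contents are assumptions made for illustration; a real checkpoint is written by Jupyter itself.

```shell
# Work in a scratch repository so nothing here touches a real project.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q

# Stand-ins for the notebook and its checkpoint copy (names are hypothetical).
mkdir .ipynb_checkpoints
echo '{}' > notebook-project.ipynb
echo '{}' > .ipynb_checkpoints/notebook-project-checkpoint.ipynb

# List every untracked file individually; with no .gitignore, both show up.
status_before=$(git status --porcelain --untracked-files=all)
echo "$status_before"
```

Without a .gitignore, both the notebook and the checkpoint copy are listed as untracked, so a careless git add . would commit the checkpoint as well.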

First, many libraries and frameworks automatically create a .gitignore file when a new project is instantiated, but if you are making your repository from scratch, you’ll want to add one yourself. While in the main directory of your repository, type touch .gitignore to make the empty file (no file extension is required).

Then paste the following code into the file.

.ipynb_checkpoints
*/.ipynb_checkpoints/*

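The first line contains no slash, so Git matches it against any file or directory named .ipynb_checkpoints at any depth of the repository, and everything inside an ignored directory is ignored as well. The second line uses the wildcard * operator to match the contents of a .ipynb_checkpoints directory exactly one level below the repository root; the first line already covers that case, so the second is harmless but redundant. Note that neither pattern ignores the parent folders themselves. .gitignore files are particularly useful when you keep sensitive information like passwords and private API keys in your project directory (probably in a .env file).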
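One way to check what the two patterns actually cover is to ask Git directly with git check-ignore. The sketch below builds a scratch repository; the folder layout (analysis/models) and the file names are hypothetical and only there to give the patterns something to match.

```shell
# Work in a scratch repository so nothing here touches a real project.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q

# The two patterns from this guide.
printf '%s\n' '.ipynb_checkpoints' '*/.ipynb_checkpoints/*' > .gitignore

# Hypothetical layout: one notebook at the top level, one checkpoint in a nested folder.
mkdir -p .ipynb_checkpoints analysis/models/.ipynb_checkpoints
echo '{}' > notebook-project.ipynb
echo '{}' > .ipynb_checkpoints/notebook-project-checkpoint.ipynb
echo '{}' > analysis/models/.ipynb_checkpoints/model-checkpoint.ipynb

# check-ignore prints only the paths that an ignore rule matches;
# -v adds the .gitignore line responsible for each match.
ignored=$(git check-ignore -v \
    notebook-project.ipynb \
    .ipynb_checkpoints/notebook-project-checkpoint.ipynb \
    analysis/models/.ipynb_checkpoints/model-checkpoint.ipynb)
echo "$ignored"
```

The notebook itself should not appear in the output, while both checkpoint files should, and the -v column shows which line of the .gitignore file matched each one.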

Now Git will ignore those files. But it’s important to do this at the very beginning of your project, because a .gitignore file only affects files that Git is not already tracking. If you add it to an existing project, files that were committed earlier keep being tracked until you remove them from the index (for example with git rm --cached), and even then they remain accessible in previous commits, which are easy to find in a public repository.
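If a checkpoint was committed before the .gitignore file existed, a rough sketch of untracking it looks like the following. It runs in a scratch repository; the demo identity, file names, and commit messages are placeholders, not anything from a real project.

```shell
# Work in a scratch repository so nothing here touches a real project.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
git init -q
# Throwaway identity so the commits work on a machine without Git configured.
git config user.email "demo@example.com"
git config user.name "Demo User"

# Simulate a checkpoint that was committed before any .gitignore existed.
mkdir .ipynb_checkpoints
echo '{}' > .ipynb_checkpoints/notebook-project-checkpoint.ipynb
git add .
git commit -q -m "Add notebook checkpoint by mistake"

# Adding the pattern afterwards does not untrack the file.
echo '.ipynb_checkpoints' > .gitignore
tracked_before=$(git ls-files .ipynb_checkpoints)

# Remove it from the index only; --cached leaves the copy on disk untouched.
git rm -r -q --cached .ipynb_checkpoints
git add .gitignore
git commit -q -m "Stop tracking notebook checkpoints"
tracked_after=$(git ls-files .ipynb_checkpoints)

echo "tracked before: $tracked_before"
echo "tracked after:  $tracked_after"
```

This only stops tracking from the new commit onward. The checkpoint is still stored in the earlier commit, so anything sensitive that was committed should be treated as exposed.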

For a typical data science project using a Jupyter Notebook, adding the checkpoint directory to your .gitignore file keeps your repository cleaner. It’s also just a great habit to get into (much like using a .env file, but that’s for another post). I hope this helps!

There are resources where people share their .gitignore files as well as online tools to create .gitignore files after entering the language or libraries you are using in your projects. One of these is gitignore.io.
