Folder Structure for Machine Learning Projects

Simple steps to create an automated folder structure!

Surya Gutta
Analytics Vidhya
3 min read · Jul 22, 2021

Photo by Kevin Ku on Unsplash

A well-organized, general-purpose Machine Learning project structure makes a project easier to understand and change. Reusing the same structure across multiple projects also avoids confusion. In this post, we will use the Cookiecutter package to create a Machine Learning project structure.

Step 1: Make sure that you have the latest Python and pip installed in your environment.
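
From a terminal, `python --version` and `pip --version` report what is installed. The same check can be run from inside Python; the following is a minimal sketch, and the function name `describe_environment` is illustrative rather than part of any library:

```python
import importlib.util
import sys


def describe_environment():
    """Return the running Python version and whether pip is importable."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    # find_spec returns None when the module cannot be located.
    pip_available = importlib.util.find_spec("pip") is not None
    return {"python": version, "pip_available": pip_available}


print(describe_environment())
```

If `pip_available` is False, pip needs to be installed before moving on to the next step.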

Step 2: Install cookiecutter
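
The usual way to do this from a terminal is `pip install cookiecutter`. As a sketch of the same step driven from Python, the helper below calls pip through the current interpreter so the package lands in the environment you are actually using. The helper names are hypothetical:

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the command that installs `package` with this interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Run the install; raises CalledProcessError if pip reports a failure."""
    subprocess.run(pip_install_command(package), check=True)
```

Calling `install("cookiecutter")` performs the installation and needs network access.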

Step 3: Create a sample repository on github.com (e.g., my-test)

Note: Leave every option under ‘Initialize this repository with:’ unchecked when creating the repository, so that it starts out empty and the first push in Step 5 is not rejected.

Step 4: Create a project structure

On your local system, go to the folder where you want to set up the project and run the following:
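
The command itself did not survive in this copy of the post. Because the prompts mentioned later include ‘s3_bucket’ and ‘aws_profile’, the sketch below assumes the template is cookiecutter-data-science; from a terminal that would be `cookiecutter https://github.com/drivendata/cookiecutter-data-science -c v1`. Pinning `v1` is also an assumption: the template's repository has since moved to a newer layout, and `v1` selects the older one that matches this post. The same call through Cookiecutter's Python API might look like this:

```python
# Assumption: the template URL and the "v1" checkout are not stated in the
# post; they are inferred from the s3_bucket and aws_profile prompts.
TEMPLATE = "https://github.com/drivendata/cookiecutter-data-science"


def create_project(output_dir="."):
    """Generate the project under output_dir, asking the template's questions."""
    # Imported here so this module loads even before Step 2 has been run.
    from cookiecutter.main import cookiecutter

    return cookiecutter(TEMPLATE, checkout="v1", output_dir=output_dir)
```

Calling `create_project()` starts the same interactive prompts as the command-line tool and needs network access to fetch the template.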

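If you run the cookiecutter command more than once (for example, while practicing), Cookiecutter finds the copy of the template it downloaded earlier and asks whether it is okay to delete and re-download it. Answering yes is fine.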

The command then prompts you for a series of project options. Pressing Enter accepts the default shown for each one.

Note: You can ignore the ‘s3_bucket’ and ‘aws_profile’ options.
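
If you would rather not answer the prompts by hand, Cookiecutter's Python API accepts the answers up front. The sketch below assumes the cookiecutter-data-science template (version `v1`) and its option names `project_name`, `author_name`, and `description`; all values are placeholders. Leaving ‘s3_bucket’ and ‘aws_profile’ out of the dictionary keeps their defaults, which is the scripted equivalent of ignoring them:

```python
# Assumption: template URL, "v1" checkout, and option names come from the
# cookiecutter-data-science template, not from the post itself.
TEMPLATE = "https://github.com/drivendata/cookiecutter-data-science"

# Placeholder answers; s3_bucket and aws_profile are omitted on purpose.
ANSWERS = {
    "project_name": "my-test",
    "author_name": "Your Name",
    "description": "A sample Machine Learning project.",
}


def create_project_without_prompts(output_dir="."):
    """Generate the project using ANSWERS instead of interactive prompts."""
    # Imported lazily so the module loads even if cookiecutter is missing.
    from cookiecutter.main import cookiecutter

    return cookiecutter(
        TEMPLATE,
        checkout="v1",
        no_input=True,           # do not prompt; use defaults plus ANSWERS
        extra_context=ANSWERS,   # overrides the template's default values
        output_dir=output_dir,
    )
```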

Step 5: Add the project to the Git repository
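
The exact commands are not preserved in this copy of the post. A typical sequence for pushing a new local project to an empty GitHub repository is sketched below as a list of standard Git commands run from Python. The commit message, the `main` branch name, and the helper names are assumptions, and the remote URL must be replaced with that of your own repository:

```python
import subprocess


def git_commands(remote_url, branch="main"):
    """Return the Git commands for a first push, in the order they are run."""
    return [
        ["git", "init"],                                   # create the local repository
        ["git", "add", "."],                               # stage the generated files
        ["git", "commit", "-m", "Initial project structure"],
        ["git", "branch", "-M", branch],                   # name the current branch
        ["git", "remote", "add", "origin", remote_url],    # point at the GitHub repository
        ["git", "push", "-u", "origin", branch],           # upload and set upstream
    ]


def push_project(project_dir, remote_url):
    """Run each command inside project_dir, stopping at the first failure."""
    for command in git_commands(remote_url):
        subprocess.run(command, cwd=project_dir, check=True)
```

For example, `push_project("my-test", "https://github.com/<your-user>/my-test.git")` would run the whole sequence in the generated project folder.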

The final structure will be the following:
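
The listing is missing from this copy of the post. Assuming the cookiecutter-data-science `v1` template, the main directories it generates are the ones named in the sketch below, which also checks that a generated project contains them. `EXPECTED_DIRS` and `missing_dirs` are illustrative names:

```python
from pathlib import Path

# Assumption: directory names follow the cookiecutter-data-science v1 layout.
EXPECTED_DIRS = [
    "data/external",      # data from third-party sources
    "data/interim",       # intermediate, transformed data
    "data/processed",     # final data sets used for modeling
    "data/raw",           # the original, immutable data dump
    "docs",               # project documentation
    "models",             # trained and serialized models
    "notebooks",          # Jupyter notebooks
    "references",         # data dictionaries, manuals, other material
    "reports/figures",    # generated graphics for reports
    "src/data",           # scripts to download or generate data
    "src/features",       # scripts that turn raw data into features
    "src/models",         # scripts to train models and make predictions
    "src/visualization",  # scripts that create visualizations
]


def missing_dirs(project_root):
    """Return the expected directories that are absent under project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

An empty list from `missing_dirs("my-test")` means the generated project has every directory listed above.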

Note: The data folder won’t appear on GitHub; it stays in your local folder. It is not pushed because it is on the ignore list (the .gitignore file). If you want to check it in as well, comment out its entry in the .gitignore file and add the data folder to the repository.
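
Commenting out the entry can be done in any text editor. As a small sketch of the same edit in Python, the function below prefixes the matching line with `#` and leaves everything else untouched. It assumes the entry is written as `/data/`, which is how the cookiecutter-data-science template lists it; check your own .gitignore before relying on that:

```python
def include_data_folder(gitignore_text, entry="/data/"):
    """Comment out the ignore rule for the data folder; keep other lines as-is."""
    lines = []
    for line in gitignore_text.splitlines():
        if line.strip() == entry:
            line = "# " + line  # Git treats lines starting with # as comments
        lines.append(line)
    return "\n".join(lines) + "\n"
```

After writing the result back to .gitignore, `git add data` stages the folder for the next commit.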

Thank you for reading! Please 👏 and follow me if you liked this post, as it encourages me to write more!

Surya Gutta
Analytics Vidhya

Software Architect | Machine Learning | Statistics | AWS | GCP