Data Science Project Structure Simplified!
I recently came across a way to create a project structure with very little effort, using a single Python file. Yes, you heard it right: a few lines of code and you no longer have to create folders and files manually.
I would like to thank Mr. Krish Naik for sharing this information, which helps budding data scientists (including me) upgrade their skills through "subtle tweaks" in their project implementation.
Obsolete method (which I will never ever use after this, and you too should not):
Previously, to be honest, I used to create all the folders and subfolders of every data science project manually. If you are a data science practitioner, you probably know that modular code that follows industry standards needs a consistent folder structure, which in turn supports the DRY (Don't Repeat Yourself) principle.
New method (which I am going to adopt, and you too should):
At the start, I mentioned that we will create all of this using a single Python file. So, let's take a look at how this works.
Step 1: Create your main data science project folder.
Step 2: Inside the folder, create a Python file (.py extension) with a name of your choice. I named mine "template.py".
Step 3: Now write the code below. If you are short on time, just copy and paste it (I would recommend typing it out, as it will help you understand it better).
import os
from pathlib import Path
import logging

# Print a timestamped message for every folder and file that is handled.
logging.basicConfig(level=logging.INFO, format='[%(asctime)s]: %(message)s:')

# Replace this placeholder with the name of your own project.
project_name = "<your-project-name>"

# Every file the project should contain, written as a relative path.
list_of_files = [
    ".github/workflows/.gitkeep",
    f"src/{project_name}/__init__.py",
    f"src/{project_name}/components/__init__.py",
    f"src/{project_name}/utils/__init__.py",
    f"src/{project_name}/utils/utils.py",
    f"src/{project_name}/logging/__init__.py",
    f"src/{project_name}/config/__init__.py",
    f"src/{project_name}/config/configuration.py",
    f"src/{project_name}/pipeline/__init__.py",
    f"src/{project_name}/entity/__init__.py",
    f"src/{project_name}/constants/__init__.py",
    "config/config.yaml",
    "params.yaml",
    "app.py",
    "main.py",
    "Dockerfile",
    "requirements.txt",
    "setup.py",
    "research/trials.ipynb"
]

for filepath in list_of_files:
    # Path() uses the path separator of the current operating system.
    filepath = Path(filepath)
    filedir, filename = os.path.split(filepath)

    # Top-level files have no folder part, so there is nothing to create.
    if filedir != "":
        os.makedirs(filedir, exist_ok=True)
        logging.info(f"Created directory: {filedir} for the file {filename}")

    # Create the file only if it is missing or empty, so existing work is never overwritten.
    if (not os.path.exists(filepath)) or (os.path.getsize(filepath) == 0):
        with open(filepath, 'w') as f:
            pass
        logging.info(f"Created empty file: {filepath}")
    else:
        logging.info(f"{filename} already exists")
Here, I am not going to explain each step line by line, as at the end of this blog I have attached a YouTube video link that you can watch to understand it thoroughly.
Regardless, let's get a high-level understanding of this code:
- We are using Python's os module, combined with the pathlib and logging modules, to create folders and files. A Python list holds the folder and file names in "path" format, and pathlib's Path converts each one to the correct form for your operating system.
- Finally, we loop through each item in the list: os.path.split separates the folder from the file name, os.makedirs creates the folder, and open creates the empty file if it is missing or empty.
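To make the first lines of the loop concrete, here is a minimal sketch of what happens to a single entry of the list. The project name "my_project" is a made-up example and is not part of the original code.

```python
import os
from pathlib import Path

# "my_project" is a hypothetical project name used only for illustration.
entry = "src/my_project/utils/utils.py"

# Path() stores the entry using the separator of the current operating system.
filepath = Path(entry)

# os.path.split() accepts Path objects and returns (directory, file name).
filedir, filename = os.path.split(filepath)

print(filename)            # utils.py
print(Path(filedir).name)  # utils

# A top-level file has no directory part, which is why the loop checks filedir != "".
top_dir, top_name = os.path.split(Path("app.py"))
print(repr(top_dir), top_name)  # '' app.py
```

This is why entries such as "app.py" skip the folder-creation step, while nested entries get their parent folders created first.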
Important note: The "list_of_files" list contains file and folder names that depend on the project requirements. Some of them are common to almost all projects, and some may not be needed in yours.
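One possible way to adapt the list per project is sketched below. It is only an illustration under a few assumptions: the helper name create_structure and the two extra file names are hypothetical, and the helper uses pathlib's mkdir and touch instead of the os calls in template.py. It runs inside a temporary folder so it does not touch a real project.

```python
import tempfile
from pathlib import Path


def create_structure(base_dir, files):
    """Create each listed file (and its parent folders) under base_dir.

    Hypothetical helper, not part of the original template.py.
    Existing files are left untouched.
    """
    created = []
    for relative in files:
        target = Path(base_dir) / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            target.touch()
            created.append(relative)
    return created


# Common entries shared by most projects (taken from the list above).
common_files = ["requirements.txt", "setup.py", "config/config.yaml"]

# Project-specific additions; these two names are made-up examples.
extra_files = ["tests/__init__.py", "notebooks/eda.ipynb"]

# A temporary directory keeps the demo from touching a real project.
with tempfile.TemporaryDirectory() as tmp:
    first_run = create_structure(tmp, common_files + extra_files)
    second_run = create_structure(tmp, common_files + extra_files)
    tests_init_exists = (Path(tmp) / "tests" / "__init__.py").is_file()

print(len(first_run))  # 5
print(second_run)      # []
```

The second call creates nothing because every file already exists, which mirrors how template.py can be re-run safely after you add new entries to the list.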
After running the code:
Ta-daa! You have created the much-awaited folders and files.
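If you would like to check the result from Python rather than from your file explorer, a small sketch like the following can list what was created. The function name list_tree is hypothetical, and the demo builds a tiny throwaway folder instead of reading your real project.

```python
import tempfile
from pathlib import Path


def list_tree(root):
    """Return every file and folder under root as sorted POSIX-style relative paths."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


# Demo on a throwaway folder; in a real project you would call list_tree(".").
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config").mkdir()
    (Path(tmp) / "config" / "config.yaml").touch()
    (Path(tmp) / "app.py").touch()
    tree = list_tree(tmp)

for item in tree:
    print(item)
```

Note that calling list_tree(".") in a real project also lists anything else in the folder, such as template.py itself.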
That's it!
I hope you will adopt this methodology and make your data science journey smoother and more efficient!
This methodology is taken from the video linked below. Feel free to watch it and learn an industry-standard, end-to-end data science project implementation.
Link to video :
That's it from my side. Keep exploring and learning.