MLOps: My favorite GitHub project template for data science projects

Kais LARIBI
6 min read · Sep 4, 2022
Source: unsplash.com, @yancymin

TLDR:

In this story, I share a Git project structure that I often use as a starting point for data science projects and discuss a few packages that help organize code. I also implement a basic CI pipeline that automates code quality analysis.

Introduction

Setting up a GitHub repository from scratch always calls for some reflection when starting a new project. During my journey as a data scientist, I have come across a few packages and structures that made my working life easier. They help ensure code quality and organization, and they make it easier to move from experimentation and development to production.

Moreover, when working on a project in a team, setting rules on the GitHub project regarding code quality, coverage, reviews and file organization is a must. That said, this story should be helpful not only for data scientists but for anyone working on code in a data context, such as data engineers, ML engineers, or other practitioners aiming to produce a high-quality deliverable.

In the following, I will share the Git project structure I often use as a starting point when building a data science repository and discuss a few tips and packages that I find useful for maintaining good code quality…

Kais LARIBI

Engineer/Data Scientist, passionate about aviation. I write mostly about data science, software engineering and technology.