TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


How To Structure Your Git Branching Strategy — By A Data Engineer

Data pipelines require version control too!

Nicholas Leong
Published in TDS Archive
8 min read · Sep 20, 2021


Image by Author

If you’ve ever worked on code collaboratively, you understand the importance of version control and branching strategies. These are the key tools that allow multiple developers to work on a project in parallel. Without them, developers can easily overwrite each other’s changes, and your product is very likely to break.

For those unfamiliar with version control and branches, here is a brief summary: version control is the practice of tracking and managing changes to your source code. It allows developers to clone, work on, and deploy code without interfering with other developers’ work.

Branches are simply parallel versions of your source code. They are useful for separating code that is still in development from the working, stable code that runs in production environments.
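To make the idea concrete, here is a minimal sketch of how a branch keeps unfinished work apart from stable code. It assumes git is installed and runs in a throwaway directory; the file name `pipeline.sql`, the branch name `dev`, and the commit identity are hypothetical placeholders, not part of any particular project.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory so no real repository is touched.
repo="$(mktemp -d)"
cd "$repo"

git init --quiet
# Hypothetical identity, set locally so the commits below succeed anywhere.
git config user.email "dev@example.com"
git config user.name "Example Dev"

# Stable code lives on the default branch (its name depends on your git config).
echo "SELECT 1;" > pipeline.sql
git add pipeline.sql
git commit --quiet -m "Stable pipeline"
stable_branch="$(git rev-parse --abbrev-ref HEAD)"

# In-progress work goes on its own branch.
git checkout --quiet -b dev
echo "SELECT 2;" >> pipeline.sql
git commit --quiet -am "Work in progress"

# Switching back shows the stable branch is untouched by the dev commit.
git checkout --quiet "$stable_branch"
stable_lines="$(wc -l < pipeline.sql | tr -d ' ')"
git checkout --quiet dev
dev_lines="$(wc -l < pipeline.sql | tr -d ' ')"

echo "stable: $stable_lines line(s), dev: $dev_lines line(s)"
```

The stable branch still holds the one-line file while `dev` holds two lines, so the script prints `stable: 1 line(s), dev: 2 line(s)`. The in-progress change only reaches the stable branch once someone deliberately merges it.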

You’ve probably heard of the DEV, UAT, and MASTER branches used by software engineers and developers. But have you ever come across a branching strategy for Data Engineers and Data Scientists?

Instead of a product, Data Engineers and Data Scientists build and maintain data warehouses. Data Scientists do build data products, but they often cannot do so until a stable data warehouse is in place to gather the data.


Written by Nicholas Leong

Data Engineer — Crunching data and writing about it so you don’t get headaches. 1M+ reads on Medium. https://www.linkedin.com/in/nickefy/
