TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Five Software Engineering Principles for Collaborative Data Science

10 min read · Jan 13, 2023

Photo by Louis Hansel on Unsplash

“Well-engineered data science code can help you unlock valuable insights.”

1. Use a standard and logical project structure

Photo by Max Komthongvijit on Unsplash

2. Make your environment reproducible with dependency management

Photo by Duane Mendes on Unsplash

Virtual environments

Making sense of Python virtual environment tools and workflows on YouTube: https://youtu.be/YKfAwIItO7M

3. Make your code reusable by making it readable

“Code is read much more often than it is written.”

Elements of readable code: Jo Stichbury (2023) Public Domain

Don’t forget documentation

4. Refactor notebook code into pipelines

Benefits of pipelines

Testing, testing

“Move code out of notebooks into Python modules and packages to form pipelines as early as possible to manage complexity.”
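To make the idea concrete, here is a minimal sketch of what notebook cells might look like once they are moved into a Python module as small, named steps. The function names (`drop_missing`, `add_ratio`, `run_pipeline`) and the record fields `a` and `b` are hypothetical, chosen only for illustration; a real project would likely use a pipeline framework or its own domain-specific steps.

```python
from typing import Callable, Dict, List, Optional

# A record is a plain dict of named numeric fields (hypothetical shape).
Record = Dict[str, Optional[float]]
Step = Callable[[List[Record]], List[Record]]


def drop_missing(records: List[Record]) -> List[Record]:
    """Keep only records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


def add_ratio(records: List[Record]) -> List[Record]:
    """Add a 'ratio' field (a / b), skipping records where b is zero."""
    return [{**r, "ratio": r["a"] / r["b"]} for r in records if r["b"] != 0]


def run_pipeline(records: List[Record], steps: List[Step]) -> List[Record]:
    """Apply each step in order, feeding its output to the next step."""
    for step in steps:
        records = step(records)
    return records


if __name__ == "__main__":
    raw = [
        {"a": 1.0, "b": 2.0},
        {"a": None, "b": 3.0},  # removed by drop_missing
        {"a": 4.0, "b": 0.0},   # removed by add_ratio
    ]
    print(run_pipeline(raw, [drop_missing, add_ratio]))
```

Because each step is an ordinary function that takes and returns data, it can be imported into a notebook, reused in another project, and tested on its own with a few assertions.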

Photo by Josué AS on Unsplash

5. Invest some time in mastering version control

Learn Git in 15 minutes on YouTube: https://youtu.be/USjZcfj8yxE

Summary


Written by Jo Stichbury

Technical content creator writing about data science and software. Old-school Symbian C++ developer, now accidental cat herder and goose chaser.
