Created by Chanin Nantasenamat using the graphic by alexdndz on envato elements

DATA SCIENCE

How to use DagsHub for Data Science

A high-level overview of the DagsHub platform

7 min read · Jul 28, 2022


The data science lifecycle spans data collection, analysis, deployment, and monitoring. What is often overlooked, however, is the underlying infrastructure that keeps the entire lifecycle running smoothly.

This is especially true as data projects evolve: more data is collected, annotated, and modified, while models are built, optimized, and rebuilt (as they drift and are retrained on new data). The challenge of machine learning reproducibility lies in keeping track of a project's many loose ends, across parallel versions of the project and across the members of a data team.
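To make the tracking problem concrete, here is a minimal sketch of one common idea: giving every combination of data and settings a deterministic fingerprint, so that two runs can be compared at a glance. This is only an illustration using Python's standard library, not how DagsHub works internally; the `fingerprint` helper, the toy dataset, and the parameter names are all hypothetical.

```python
import hashlib
import json


def fingerprint(data: bytes, params: dict) -> str:
    """Return a short, deterministic ID for one (data, params) combination."""
    digest = hashlib.sha256()
    digest.update(data)
    # sort_keys makes the JSON text independent of dict insertion order
    digest.update(json.dumps(params, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()[:12]


# Hypothetical dataset contents and hyperparameters, for illustration only
data_v1 = b"id,label\n1,cat\n2,dog\n"
data_v2 = data_v1 + b"3,bird\n"  # one newly annotated row
params = {"model": "random_forest", "n_estimators": 100}

run_a = fingerprint(data_v1, params)
run_b = fingerprint(data_v2, params)
# Same parameters, inserted in the opposite order
run_c = fingerprint(data_v1, dict(reversed(list(params.items()))))

print(run_a == run_b)  # False: the data changed
print(run_a == run_c)  # True: same data, same parameters
```

Even a single added row changes the fingerprint, which is why ad hoc file names such as "data_final_v2.csv" stop being reliable once several people and several model versions are involved.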

In this article, we’re going to take a look at how we can use the DagsHub platform for managing our data science projects.

What is MLOps?

In a typical project, building machine learning models is rarely a one-time endeavor. Models evolve incrementally as more data is collected and annotated, and they are built and rebuilt along the way. It is therefore not feasible to build and rebuild them manually.


Data Professor on YouTube | Sr Developer Advocate | ex-Professor of Bioinformatics | Join https://data-professor.medium.com/membership