How to use DagsHub for Data Science
A high-level overview of the DagsHub platform
The data science lifecycle spans data collection, analysis, deployment and monitoring. What is often overlooked, however, is the underlying infrastructure that keeps the entire lifecycle running smoothly.
This is especially true as data projects evolve over time: more data is collected, annotated and modified, while models are built, optimized and rebuilt (as they drift and as they are trained on new data). The challenge of machine learning reproducibility lies in keeping track of the loose ends of a data project, across parallel versions of the project and across the members of a data team.
In this article, we’re going to take a look at how we can use the DagsHub platform for managing our data science projects.
What is MLOps?
Building machine learning models in a typical project is rarely a one-time endeavor: models evolve incrementally as more data is collected and annotated, and they are built and rebuilt along the way. It is therefore not feasible to build models manually.