Why you should track your experiments

Sabrina Horowitz
2 min read · May 17, 2019


Tracking experiments is not common among Data Scientists. However, tracking has many benefits.

Tracking saves time

How many times have you rerun the training of a model with the same parameters, again and again?

By tracking experiments (that is, logging parameters, results, dependencies, code and data), you can go back in time and check which parameters and data were used to achieve the best result.
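As a rough illustration of what such logging can look like without any dedicated tool, here is a minimal sketch that appends one record per experiment to a JSON Lines file and looks up the best run afterwards. The function names (`log_experiment`, `best_experiment`), the record fields and the file layout are assumptions made for this example, not the API of any tracking library.

```python
import hashlib
import json
import time
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, to identify the exact data used."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_experiment(log_path, params, metrics, data_path=None, code_version=None):
    """Append one experiment record to a JSON Lines file and return the record."""
    record = {
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
        "data_sha256": file_sha256(data_path) if data_path else None,
        "code_version": code_version,  # e.g. a git commit hash supplied by the caller
    }
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def best_experiment(log_path, metric, higher_is_better=True):
    """Return the logged record with the best value for `metric`, or None."""
    path = Path(log_path)
    if not path.exists():
        return None
    lines = path.read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines if line.strip()]
    records = [r for r in records if metric in r["metrics"]]
    if not records:
        return None
    chooser = max if higher_is_better else min
    return chooser(records, key=lambda r: r["metrics"][metric])
```

Calling `log_experiment("runs.jsonl", {"lr": 0.01}, {"accuracy": 0.91})` after each training run, and `best_experiment("runs.jsonl", "accuracy")` later, is enough to answer "which parameters gave the best result?" without rerunning anything.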

Tracking also saves you time because you know which dependencies and versions were used. This is useful when you deploy your model and want to know which packages are in production.
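Recording dependencies can be as simple as saving the installed package versions next to the experiment results. The sketch below uses the standard library's `importlib.metadata`; the function name `snapshot_dependencies` and the shape of the returned dictionary are assumptions made for this example.

```python
import platform
from importlib import metadata


def snapshot_dependencies(package_names=None):
    """Return the Python version and installed package versions.

    If package_names is None, every installed distribution is recorded.
    Requested packages that are not installed get a version of None.
    """
    versions = {}
    if package_names is None:
        for dist in metadata.distributions():
            name = dist.metadata.get("Name")
            if name:
                versions[name] = dist.version
    else:
        for name in package_names:
            try:
                versions[name] = metadata.version(name)
            except metadata.PackageNotFoundError:
                versions[name] = None
    return {"python": platform.python_version(), "packages": versions}
```

Storing the output of `snapshot_dependencies(["numpy", "scikit-learn"])` with each experiment record makes it possible to compare the training environment with what is running in production.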

Tracking can also help you reproduce the results of a particular experiment in case you haven’t saved your model.

Tracking saves money

In the context of Deep Learning and large datasets, experiments can cost hundreds to thousands of dollars in a cloud environment.

Tracking your experiments helps you avoid retraining models with the same parameters, and so avoid the lengthy and costly training of large DL models.
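One way to avoid paying for the same training run twice is to fingerprint the parameters and check the tracked runs before training. This is a minimal sketch under stated assumptions: `params_fingerprint`, `run_once` and the in-memory `tracked_runs` dictionary are hypothetical names for this example, and `train_fn` stands in for whatever training function you use.

```python
import hashlib
import json


def params_fingerprint(params):
    """Return a stable hash of a parameter dict, independent of key order."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_once(params, train_fn, tracked_runs):
    """Train only if these parameters have not been tracked before.

    tracked_runs maps a parameter fingerprint to previously logged results.
    Returns (results, reused), where reused is True when training was skipped.
    """
    key = params_fingerprint(params)
    if key in tracked_runs:
        return tracked_runs[key], True
    results = train_fn(params)
    tracked_runs[key] = results
    return results, False
```

Note that this sketch only compares parameters. If the data or the code has changed, the same parameters can give different results, so a real setup would include a data hash and a code version in the fingerprint as well.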

When training a DL model, it is helpful to track the trained model together with the data from the experiment, so that you do not have to rerun the experiment to get the model back.

Tracking improves collaboration

More and more Data Scientists collaborate by sharing code and data.

Unfortunately, Data Scientists who get hold of this data and code have to figure out the performance and the dependencies of the models for themselves.

When experiments are tracked and shared, other Data Scientists can figure out the behaviour of the models more quickly and avoid the pitfalls of code dependencies.

Tools to track your experiments

Many tools exist today to track your experiments.

MLflow (mlflow.org): An open source platform for the machine learning lifecycle.

Sacred (github.com/IDSIA/sacred): Sacred is a tool to help you configure, organize, log and reproduce experiments.

Comet (www.comet.ml): Comet lets you track code, experiments, and results on ML projects. It’s fast, simple, and free for open source projects.

Datmo (www.datmo.com): An open source tool for tracking Machine Learning experiments and making them reproducible.

MokaML is a platform for Data Scientists to develop, experiment with and deploy their models in the same place. Visit us at mokaml.ai.
