Are You Versioning Your ML Models Correctly?

Manav · Aruva.io Tech
Apr 2, 2021 · 2 min read

As machine learning matures, new and improved models are continuously developed, validated, and iterated on to produce better versions.

Photo by Roman Synkevych on Unsplash.

These continuous improvements need to be correctly tracked, managed, and reproduced.

This reproducibility helps data scientists hop between multiple versions by simply switching "branches" until the desired results are met and the model fares as expected against the various performance metrics.

Second, and perhaps more importantly, is adherence to Responsible AI. Models need to be auditable, reproducible, and transparent, not only to understand how a model works and performs, but also to audit its lineage and evolution when the nature of the business demands it (e.g., financial models that could help you make millions of dollars).

Finally, standard development practices require the operations team to deploy the model in a production environment and to reproduce that deployment across the environments leading up to production.

Given the above, version control should be applied not only to the model's source code but to several other facets as well.

Let's take a deep dive into these facets.

Implementation source code: Let's get the obvious one out of the way. This is the actual code relevant to model development.

Assumptions: These are the assumptions around the problem at hand, e.g., the scope of the problem, the data, environment variables, and other considerations.

Settings: Self-explanatory; model settings should be documented and versioned to reproduce a model execution within a specified environment.
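As a minimal sketch of versioning settings, the snippet below fingerprints a settings dictionary with a stable hash so each run can be tagged with the exact settings it used. The setting names are hypothetical, not taken from the article.

```python
import hashlib
import json

def settings_fingerprint(settings: dict) -> str:
    """Return a short, stable hash that can serve as a settings version tag."""
    # sort_keys makes serialization deterministic, so identical settings
    # always produce an identical version string regardless of key order.
    blob = json.dumps(settings, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:12]

# Hypothetical model settings; the keys are illustrative only.
settings = {"learning_rate": 0.01, "batch_size": 64, "n_estimators": 200}
tag = settings_fingerprint(settings)
```

Persisting the settings file under this tag, alongside the model artifact, lets a run be reproduced by loading exactly the settings it was trained with.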

Randomization: Any pseudo-random data generation, if used, should be consistent, predictable, and idempotent, and should be documented to ensure the model behaves the same way irrespective of the number of executions.

Data: The data sets, including the training, evaluation, and validation sets, should ideally be versioned and persisted; however, storing and versioning full data sets may be prohibitive depending on their size and structure. More mature tools like Xplenty could potentially be used in scenarios like these.
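When persisting the data itself is too costly, a lighter-weight alternative (a sketch, not a substitute for a dedicated data-versioning tool) is to record a content fingerprint of each data set so every training run logs exactly which data it saw. The toy rows below are hypothetical.

```python
import hashlib

def dataset_fingerprint(rows) -> str:
    """Hash a dataset's content so a training run can record which data
    it used, without persisting the data itself."""
    h = hashlib.sha256()
    for row in rows:
        h.update(repr(row).encode())  # order-sensitive: same rows, same order
        h.update(b"\n")
    return h.hexdigest()[:16]

# Hypothetical toy dataset; in practice you would stream file chunks.
train_rows = [(1.0, 2.0, "a"), (3.0, 4.0, "b")]
tag = dataset_fingerprint(train_rows)
```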

Results: In addition to versioning the input blocks of an ML model, the results, i.e. F1 score, confusion matrix, etc., should be stored as well, to better equip data scientists to compare performance across multiple versions.
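A minimal sketch of this idea: pair each model version with its metrics as a plain record, then compare across versions. The version labels, metric values, and helper names are all illustrative.

```python
def record_results(model_version: str, metrics: dict) -> dict:
    """Pair a model version with its evaluation metrics so performance
    can be compared across versions later."""
    return {"model_version": model_version, "metrics": metrics}

def best_by_f1(runs):
    """Pick the run with the highest F1 score from a list of records."""
    return max(runs, key=lambda r: r["metrics"]["f1"])

# Hypothetical results for two model versions.
runs = [
    record_results("v1", {"f1": 0.88, "precision": 0.90}),
    record_results("v2", {"f1": 0.91, "precision": 0.89}),
]
best = best_by_f1(runs)  # in practice, persist `runs` (e.g., as JSON lines)
```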

Libraries: The last on this list is library versions, e.g., Python package versions, encapsulating application versions, runtime library versions, OS patches, etc.
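A lockfile (e.g., the output of `pip freeze`) covers package versions; the sketch below captures the remaining runtime details from the Python standard library so they can be stored alongside the model version. Which fields matter will vary by platform.

```python
import platform

def environment_snapshot() -> dict:
    """Capture runtime details that affect reproducibility; store this
    record alongside the model version and the package lockfile."""
    return {
        "python": platform.python_version(),          # e.g. "3.11.4"
        "implementation": platform.python_implementation(),  # e.g. "CPython"
        "os": platform.system(),                      # e.g. "Linux"
    }

snap = environment_snapshot()
```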

While a standard version control system like Git could suffice for most of the facets mentioned above, this list provides a comprehensive view of what should be versioned and persisted for an efficient and robust Machine Learning Operations platform.
