Benefits of the MLflow Model Registry

Taking a look at the shared model registry from InfinStor MLflow

Published in InfinStor, Oct 14, 2021

The model registry is another important capability of MLflow, and it is also essential for data science and building AI applications. It is a system where data scientists can register models and share them with other members of their team.

Data science experiments, machine learning models, and the model registry are all connected.

Model management can be tedious in larger AI-driven enterprises with numerous models and data teams. Once a data scientist has finished running experiments, the MLflow model registry lets them register the resulting models and share them with others in the organization.

InfinStor streamlines the machine learning workflow with a shared model registry implementation with authorization. Here are some of the concepts of the model registry.

Components of a model registry

Taking models to production is already a sluggish process, especially without the help of platforms like InfinStor. A centralized repository for staging all models that are ready for production can accelerate the entire machine learning workflow.

According to MLflow documentation, the MLflow model registry component is a centralized model store, set of APIs, and UI, to collaboratively manage the full lifecycle of an MLflow Model.

With the MLflow model registry, team members can access shared models, their versions, and their metadata in a single place. Centralizing models this way supports model governance and security for organizations.

The MLflow model registry provides features such as model lineage, versioning, staging, and annotations.

Model Lineage: Which MLflow experiment and run produced the model
Model Versioning: A numbered version for each model registered under the same name
Model Stage: The current phase of a model version, with transitions between phases
Annotations: Written descriptions of the model, such as its datasets and methodology

Users can register a model in the model registry with attributes such as model name, version, date, and more. MLflow keeps track of model versions as they are updated, and each new registration under an existing name creates a new version. Versions can be assigned to MLflow's predefined stages (Staging, Production, or Archived), which correspond to different phases of the machine learning workflow.
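As a rough sketch of how these stages can be driven from code, the snippet below moves one version of a registered model to a new stage and then lists every version with its current stage. It assumes the `mlflow` package is installed, a tracking server with a registry backend is already configured, and an MLflow release that still supports stages. `normalize_stage` and `promote` are hypothetical helper names, not part of MLflow.

```python
# Stage names used by the MLflow model registry.
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def normalize_stage(stage: str) -> str:
    """Return the canonical spelling of a stage name, or raise ValueError."""
    wanted = stage.strip().lower()
    for valid in VALID_STAGES:
        if valid.lower() == wanted:
            return valid
    raise ValueError(f"unknown stage {stage!r}; expected one of {VALID_STAGES}")


def promote(name: str, version: int, stage: str) -> None:
    """Move one version of a registered model to the given stage."""
    # Imported here so normalize_stage can be used without mlflow installed.
    from mlflow.tracking import MlflowClient

    client = MlflowClient()
    client.transition_model_version_stage(
        name=name, version=str(version), stage=normalize_stage(stage)
    )
    # Print every version of the model together with its current stage.
    for mv in client.search_model_versions(f"name='{name}'"):
        print(mv.version, mv.current_stage)
```

For example, `promote("example-model", 1, "staging")` would move version 1 of a model named `example-model` (a made-up name) to Staging.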

Data scientists can annotate models and edit their descriptions to help others in the organization who use the registry. This lets users maintain model versions, annotate each version, and monitor which phase each version is in.
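Annotations can also be written from code. The sketch below is one possible way to attach a description to a specific model version, assuming `mlflow` is installed and a registry is configured. The helper names `describe` and `annotate_version`, and the "Dataset / Methodology" layout of the description, are illustrative choices rather than an MLflow convention.

```python
def describe(dataset: str, methodology: str) -> str:
    """Build a short plain-text description for a model version."""
    return f"Dataset: {dataset}\nMethodology: {methodology}"


def annotate_version(name: str, version: int, dataset: str, methodology: str) -> None:
    """Attach a description to one version of a registered model."""
    # Imported here so describe can be used without mlflow installed.
    from mlflow.tracking import MlflowClient

    client = MlflowClient()
    client.update_model_version(
        name=name,
        version=str(version),
        description=describe(dataset, methodology),
    )
```

A description for the registered model as a whole, rather than a single version, can be set in the same way with `MlflowClient.update_registered_model`.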

The MLflow model registry documentation covers these features in more detail.

The Shared Model Registry

Data scientists can learn from the work of others in their company, build on top of it, and contribute to collective knowledge. Just as with experiment tracking, enterprises can build collective knowledge with a shared model registry.

Services like InfinStor, which are built for AI-driven enterprises, can provide a shared model registry to data scientists.

A diagram of the InfinStor MLflow shared model registry at work.

Data science experiments produce models that can be registered in the MLflow model registry and shared with fine-grained access controls. Once a model is registered, users can deploy different versions of it for production, batch inference, or live inference. The InfinStor MLflow model registry manages all of this.
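To make the deployment step more concrete, here is a minimal sketch of loading a registered model for batch inference with MLflow's generic `pyfunc` loader. Registered models are addressed with `models:/<name>/<version>` or `models:/<name>/<stage>` URIs. The helper names `model_uri` and `batch_predict` are hypothetical, and the sketch assumes `mlflow` is installed and pointed at a registry that holds the model.

```python
from typing import Optional


def model_uri(name: str, version: Optional[int] = None, stage: Optional[str] = None) -> str:
    """Build a registry URI that selects a model by version number or by stage."""
    if (version is None) == (stage is None):
        raise ValueError("pass exactly one of version or stage")
    selector = version if version is not None else stage
    return f"models:/{name}/{selector}"


def batch_predict(name: str, data, version: Optional[int] = None, stage: Optional[str] = None):
    """Load the selected model version and score a batch of inputs."""
    # Imported here so model_uri can be used without mlflow installed.
    import mlflow.pyfunc

    model = mlflow.pyfunc.load_model(model_uri(name, version=version, stage=stage))
    # data is typically a pandas DataFrame matching the model's input schema.
    return model.predict(data)
```

Selecting by version pins a job to one exact model, while selecting by stage lets the job pick up whichever version is currently in that stage.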

InfinStor’s enterprise-grade MLflow service includes a shared model registry implementation. It is a multi-tenant and multi-user service.

1. In the run details, you can see that the run logged a model to MLflow.
2. When you press the “Register Model” button, you can choose a registered model.
3. The drop-down lists the models that have already been registered. You can also create a new registered model.
4. I’ve chosen ‘model-3-by-jagane’.
5. After the model has been registered, the run details show that the model logged to MLflow as part of the run has now been published as version 5 of the registered model model-3-by-jagane.
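The UI steps above have a rough programmatic counterpart in open-source MLflow. The sketch below registers a model that a run has already logged, assuming `mlflow` is installed and configured for the tracking server. The helper names are hypothetical, and the artifact path `"model"` is an assumption: it must match whatever path the run used when it logged the model.

```python
def run_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build the URI of a model artifact logged by a run."""
    return f"runs:/{run_id}/{artifact_path}"


def register_from_run(run_id: str, registered_name: str, artifact_path: str = "model") -> int:
    """Register a run's logged model and return the new version number."""
    # Imported here so run_model_uri can be used without mlflow installed.
    import mlflow

    # Creates the registered model if the name does not exist yet;
    # otherwise adds a new version under the existing name.
    mv = mlflow.register_model(run_model_uri(run_id, artifact_path), registered_name)
    return int(mv.version)
```

Calling `register_from_run(run_id, "model-3-by-jagane")` with the ID of the run shown in the run details would add a new version under that registered model, much as the “Register Model” button does.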

Conclusion

The MLflow model registry is essential for AI/ML because modern AI-driven enterprises need a centralized, collaborative repository for machine learning models in order to grow. MLflow’s model registry allows data scientists to collaborate on models with others in their organization.

“InfinStor MLflow provides security and scalability in an enterprise grade MLflow service.”

InfinStor MLflow is a hosted MLflow service with full open-source MLflow capabilities, including tracking, projects, and serving.

Here are a couple of articles from InfinStor with additional details:

The Importance of MLflow Experiment Tracking

MLflow has a capability called experiment tracking which is essential for data science and building AI applications…

Multicloud MLflow

An MLflow service integrated with a multicloud ML compute engine lets you run your experiments on diverse hardware…