MLFlow for MLOps

Alvaro Durán Tovar
Deep Learning made easy
4 min read · Jun 4, 2021

Recently I have been looking at different alternatives for MLOps at my job. We ended up choosing mlflow, and with this article I want to summarise my findings.

Resources

By far the best way to learn about it is watching these three videos from Databricks:

And here you have the documentation: https://www.mlflow.org/docs/latest/quickstart.html

Index

  1. What is mlflow?
  2. Server mode
    2.1. Experiments tracking
    2.2. Model tracking
  3. Client mode
    3.1. Tracking experiments
    3.2. Running models
  4. What else can you do with mlflow?

1. What is mlflow?

Basically it’s a library, and through the library you can do a ton of things. It can be used as a client library to run experiments, execute projects, build models, etc. It is also possible to run it as a server that receives experiment metrics and artifacts, and even to start an HTTP server that serves a machine learning model… All from the same package.

2. Server mode

2.1 Experiments tracking

By running mlflow server you will start an HTTP server that can receive tracking information, stored in a database (SQLite, MySQL, PostgreSQL…), and artifacts, stored in object storage (Google Cloud Storage, S3…). Example:

mlflow server \
--default-artifact-root gs://<bucket>/<folder> \
--backend-store-uri postgresql://user:pass@host:port/db
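
Once the server is up, the client just needs to be pointed at it before logging anything. A minimal sketch, assuming the server above is reachable at tracking-server:5000 and an experiment name I made up:

import mlflow

# Point the client at the remote tracking server (host and port are placeholders).
mlflow.set_tracking_uri("http://tracking-server:5000")

# Runs will be grouped under this experiment; it is created if it doesn't exist yet.
mlflow.set_experiment("my-experiment")

The same thing can be done without touching the code by setting the MLFLOW_TRACKING_URI environment variable.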

The UI offers a very nice and intuitive way to interact with the uploaded information.

Image from https://www.pye.ai/

You can compare different runs to find out what performed better across different models and/or different sets of parameters.

It even includes a section for hyperparameter tuning.
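
The same comparison can also be done programmatically by pulling the runs into a pandas DataFrame; a rough sketch, assuming the experiment name from before and a logged metric called "rmse":

import mlflow

# Fetch every run of an experiment as a pandas DataFrame and rank it.
exp = mlflow.get_experiment_by_name("my-experiment")
runs = mlflow.search_runs(experiment_ids=[exp.experiment_id])
best = runs.sort_values("metrics.rmse").head()  # lower rmse first
print(best[["run_id", "metrics.rmse"]])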

2.2 Model tracking

For each experiment you can (and should) log artifacts, including the model used. MLflow offers functions for automatically logging models to make this easier, but you can log anything you want (images, text, audio, binaries, parameters, metrics…). The cool thing about these models is that mlflow adds a conda.yaml file to specify the dependencies. No more headaches with conflicting package versions or figuring out the author’s setup! (You might still have headaches getting that file right, though.)
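
For instance, a run can carry plain metrics, free-form text and arbitrary local files next to the model; a small sketch where the file name and values are just illustrative:

import mlflow

with mlflow.start_run():
    mlflow.log_metric("rmse", 0.27)                                 # any numeric metric
    mlflow.log_text("free-form notes about this run", "notes.txt")  # stored as an artifact
    mlflow.log_artifact("confusion_matrix.png")                     # uploads an existing local file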

Another cool feature is the ability to upload a sample input along with the schema of the expected input and output. If the schema is provided, mlflow can validate it at inference time and reject invalid calls.
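
In code that means attaching a signature and an input example when logging the model; a sketch with a tiny invented dataset (the column names are made up):

import mlflow
import mlflow.sklearn
import pandas as pd
from mlflow.models.signature import infer_signature
from sklearn.ensemble import RandomForestRegressor

# Tiny illustrative training set; column names are invented.
X = pd.DataFrame({"feature_a": [0.0, 1.0], "feature_b": [1.0, 0.0]})
y = [0.5, 1.5]
rfr = RandomForestRegressor(n_estimators=10).fit(X, y)

# Infer the expected input/output schema from data and store it with the model.
signature = infer_signature(X, rfr.predict(X))
with mlflow.start_run():
    mlflow.sklearn.log_model(
        rfr,
        artifact_path="sklearn-model",
        signature=signature,
        input_example=X.head(2),
    )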

Image from https://docs.databricks.com/

From the above interface you can create a model. A registered model is basically a promoted run: registered models have a unique name, are versioned, and have stages, so you can easily tell which version is in production.
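
The same promotion can be done from code through the model registry API; a rough sketch where the run id, model name and stage are placeholders:

import mlflow
from mlflow.tracking import MlflowClient

# Register a model logged in a given run; each call creates a new version.
result = mlflow.register_model("runs:/<run_id>/sklearn-model", "my-regressor")

# Move that version through stages so consumers know which one is live.
client = MlflowClient()
client.transition_model_version_stage(
    name="my-regressor",
    version=result.version,
    stage="Production",
)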

3. Client mode

3.1 Tracking experiments

Tracking experiments is as simple as doing this:

import mlflow.sklearn  # also makes the top-level mlflow API available
from sklearn.ensemble import RandomForestRegressor

params = {"n_estimators": 10}  # example hyperparameters

with mlflow.start_run() as run:
    rfr = RandomForestRegressor(**params).fit([[0, 1]], [1])
    mlflow.log_params(params)
    mlflow.sklearn.log_model(rfr, artifact_path="sklearn-model")

This will record all the parameters used to create the RandomForest (including the default values) and upload the fitted model along with some extra attributes like the training set score, OOB score, etc.

After doing this multiple times you can go to the dashboard, select the model that performs best, and promote it to an mlflow model.

3.2 Running models

Once you have a model you can serve it with the command line executable:

mlflow models serve -m models:/model-name/version

The model might or might not contain a conda.yaml file. If it does, all the dependencies and the correct Python version will be installed automatically, which means the inference environment will be the same as the research environment used to train the model.
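
If you prefer not to go through HTTP, the same registered model can also be loaded directly in Python; a small sketch, assuming the model name and version from before and the invented feature columns above:

import mlflow.pyfunc
import pandas as pd

# Load a specific version of a registered model and run inference locally.
model = mlflow.pyfunc.load_model("models:/my-regressor/1")
preds = model.predict(pd.DataFrame({"feature_a": [0.0], "feature_b": [1.0]}))
print(preds)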

4. What else can you do with mlflow?

This is just an introduction to the many features mlflow offers. There are many more things you can do with it:

  • Parameterized tasks
  • Execute jobs on kubernetes
  • Create docker files
  • Authenticated connection to the tracking server via http basic auth or ssl certificates
  • Dataset versioning
