Supporting Arbitrary ML Models with MlFlow (Using PyCaret as an Example)

Benjamin Tan Wei Hao
Published in DKatalis
3 min read · Mar 14, 2022

MlFlow model tackling various tasks.

MlFlow is awesome. We use it all the time to track our ML models and their artifacts. Logging model training runs is super easy, and if you need any custom logic, the API is also pretty easy to use.

The one small caveat is that the model should already be supported by MlFlow. See the built-in model flavors section of the MlFlow documentation for all the models that MlFlow natively supports.

However, what happens when you have to log a model that isn’t natively supported by MlFlow?

Enter mlflow.pyfunc.PythonModel

Thankfully, MlFlow exposes a base class, mlflow.pyfunc.PythonModel. Once you’ve implemented the methods it needs, MlFlow can treat your custom model (almost) like a native one.

Here, we have a PyCaret model that we need to log to the model registry and load again later on. For our use case, we wanted to log the model from a Jupyter Notebook and later load it from a Kubeflow component.

While PyCaret supports MlFlow, it only does this for a fresh model. In other words, you have to train the model from scratch. However, in our case, we already had a trained model that had to be saved into the model registry.

Create a mlflow.pyfunc.PythonModel class

Benjamin Tan Wei Hao

Author of The Little Elixir & OTP Guidebook, Mastering Ruby Closures, and Building an ML Pipeline in Kubeflow. Currently Product Owner at @dkatalis.