Serving a model using MLflow

Sumeet Gyanchandani · Published in Analytics Vidhya · Nov 6, 2019

This is the sixth and final article in my MLflow tutorial series:

  1. Setup MLflow in Production
  2. MLflow: Basic logging functions
  3. MLflow logging for TensorFlow
  4. MLflow Projects
  5. Retrieving the best model using Python API for MLflow
  6. Serving a model using MLflow (you are here!)

Create environment

conda create -n production_env
conda activate production_env
conda install python
pip install mlflow
pip install scikit-learn

Run a sample machine learning model from the internet

mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5

Note: as pointed out by Mourad K in the comments, the above command will only work if your GitHub authentication is set up with SSH keys.
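
If your GitHub access is set up over HTTPS instead, the same project can be run from the HTTPS URL (a minimal alternative, assuming you have plain HTTPS access to the repository):

mlflow run https://github.com/databricks/mlflow-example.git -P alpha=0.5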

Check if it ran successfully

ls -al ~/mlruns/0
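
If you prefer not to browse the filesystem, the MLflow CLI can also list the runs in the default experiment (a quick sketch, assuming experiment ID 0 and that you run it from the directory containing the mlruns folder):

mlflow runs list --experiment-id 0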

Get the run ID (UUID) of the run we just executed from the listing above and serve the model:

mlflow models serve -m ~/mlruns/0/your_uuid/artifacts/model -h 0.0.0.0 -p 8001

Make inferences in a new terminal window. Go wild!

curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":["alcohol", "chlorides", "citric acid", "density", "fixed acidity", "free sulfur dioxide", "pH", "residual sugar", "sulphates", "total sulfur dioxide", "volatile acidity"],"data":[[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]]}' http://0.0.0.0:8001/invocations

To make inferences using Python, you can use the requests library:

import pandas as pd
import requests

host = '0.0.0.0'
port = '8001'

url = f'http://{host}:{port}/invocations'

# Match the pandas-split format used in the curl example above
headers = {
    'Content-Type': 'application/json; format=pandas-split',
}

# test_data is a Pandas DataFrame with the same columns the model was trained on
test_data = pd.DataFrame(
    data=[[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]],
    columns=["alcohol", "chlorides", "citric acid", "density", "fixed acidity",
             "free sulfur dioxide", "pH", "residual sugar", "sulphates",
             "total sulfur dioxide", "volatile acidity"])

http_data = test_data.to_json(orient='split')

r = requests.post(url=url, headers=headers, data=http_data)

print(f'Predictions: {r.text}')
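
The server returns the predictions as JSON, so you can also parse them back into Python objects (assuming the MLflow 1.x behaviour of returning a plain JSON list of predictions):

predictions = r.json()
print(predictions)  # e.g. one predicted quality value per input row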

The mlflow models serve command stops as soon as you press Ctrl+C or exit the terminal. If you want the model to stay up and running, you need to create a systemd service for it. Go to the /etc/systemd/system directory and create a new file called model.service with the following content:

[Unit]
Description=MLFlow Model Serving
After=network.target

[Service]
Restart=on-failure
RestartSec=30
StandardOutput=file:/path_to_your_logging_folder/stdout.log
StandardError=file:/path_to_your_logging_folder/stderr.log
Environment=MLFLOW_TRACKING_URI=http://host_ts:port_ts
Environment=MLFLOW_CONDA_HOME=/path_to_your_conda_installation
ExecStart=/bin/bash -c 'PATH=/path_to_your_conda_installation/envs/model_env/bin/:$PATH exec mlflow models serve -m path_to_your_model -h host -p port'

[Install]
WantedBy=multi-user.target

Reload systemd, then enable and start the service with the following commands:

sudo systemctl daemon-reload
sudo systemctl enable model
sudo systemctl start model
sudo systemctl status model
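
If the service fails to start, the log files defined in the unit (and the systemd journal) are the first place to look, for example (paths are the placeholders from the unit file above):

sudo journalctl -u model -f
tail -f /path_to_your_logging_folder/stderr.log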

The above example is a very simple one. For complex models like DeepLab, you need to define the input and output tensors explicitly when saving the model; consider using TensorFlow Serving (TF Serving) for that purpose. There are good guides online for serving a DeepLab model with TF Serving.
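
As an illustration only, exporting a frozen DeepLab graph with explicit input/output signatures for TF Serving might look roughly like the sketch below (TF 1.x APIs; the file name and the tensor names ImageTensor:0 and SemanticPredictions:0 are assumptions about your graph and should be replaced with your own):

import tensorflow as tf

# Load the frozen inference graph (path is illustrative)
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    # Explicitly name the input and output tensors of the serving signature
    image_tensor = graph.get_tensor_by_name('ImageTensor:0')
    predictions = graph.get_tensor_by_name('SemanticPredictions:0')

    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={'image': image_tensor},
        outputs={'segmentation': predictions})

    # Export a versioned SavedModel directory that TF Serving can load
    builder = tf.saved_model.builder.SavedModelBuilder('./deeplab_serving/1')
    builder.add_meta_graph_and_variables(
        sess,
        [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature})
    builder.save()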

References:

https://thegurus.tech/posts/2019/06/mlflow-production-setup/
