Serving a model using MLflow
This is the sixth and final article in my MLflow tutorial series:
- Setup MLflow in Production
- MLflow: Basic logging functions
- MLflow logging for TensorFlow
- MLflow Projects
- Retrieving the best model using Python API for MLflow
- Serving a model using MLflow (you are here!)
Create environment
conda create -n production_env
conda activate production_env
conda install python
pip install mlflow
pip install scikit-learn
Run a sample machine learning model from the internet
mlflow run git@github.com:databricks/mlflow-example.git -P alpha=0.5
Note: as pointed out by Mourad K in the comments, the above command will only run if your GitHub authentication is set up with SSH keys.
Check if it ran successfully
ls -al ~/mlruns/0
Find the run ID (a UUID) of the run you just created in the listing above, then serve the model:
mlflow models serve -m ~/mlruns/0/your_uuid/artifacts/model -h 0.0.0.0 -p 8001
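If you'd rather not copy the run ID by hand out of the ls listing, you can locate the newest run directory programmatically. A minimal sketch using only the standard library; latest_run_dir and its defaults are my own naming, and it assumes the default local ~/mlruns file-store layout shown above:

```python
from pathlib import Path

def latest_run_dir(mlruns_root="~/mlruns", experiment_id="0"):
    """Return the most recently modified run directory in an experiment."""
    exp_dir = Path(mlruns_root).expanduser() / experiment_id
    # Run folders are named by their run ID; skip plain files like meta.yaml
    runs = [p for p in exp_dir.iterdir() if p.is_dir()]
    if not runs:
        raise FileNotFoundError(f"No runs found under {exp_dir}")
    return max(runs, key=lambda p: p.stat().st_mtime)

if __name__ == "__main__":
    run = latest_run_dir()
    print(f"mlflow models serve -m {run / 'artifacts' / 'model'} -h 0.0.0.0 -p 8001")
```

This simply picks the run folder with the newest modification time, which for a fresh `mlflow run` is the run you just executed.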
Make inferences in a new terminal window. Go wild!
curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":["alcohol", "chlorides", "citric acid", "density", "fixed acidity", "free sulfur dioxide", "pH", "residual sugar", "sulphates", "total sulfur dioxide", "volatile acidity"],"data":[[12.8, 0.029, 0.48, 0.98, 6.2, 29, 3.33, 1.2, 0.39, 75, 0.66]]}' http://0.0.0.0:8001/invocations
To make inferences using Python, you can use the requests library:
import requests
host = '0.0.0.0'
port = '8001'
url = f'http://{host}:{port}/invocations'
headers = {
    # Match the curl example above: tell the server the payload uses
    # the pandas "split" orientation
    'Content-Type': 'application/json; format=pandas-split',
}
# test_data is a pandas DataFrame holding the rows to score
http_data = test_data.to_json(orient='split')
r = requests.post(url=url, headers=headers, data=http_data)
print(f'Predictions: {r.text}')
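In MLflow 1.x, the /invocations endpoint responds with a JSON array holding one prediction per input row (newer releases wrap this in a "predictions" object instead). A minimal sketch of decoding such a response body, with a made-up prediction value standing in for r.text:

```python
import json

# Stand-in for r.text from the request above; the value is hypothetical
response_text = "[5.58]"

# One prediction per row of test_data
predictions = json.loads(response_text)
print(f"Prediction for first row: {predictions[0]}")
```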
The mlflow models serve command stops as soon as you press Ctrl+C or exit the terminal. To keep the model up and running, create a systemd service for it. Go into the /etc/systemd/system directory and create a new file called model.service with the following content:
[Unit]
Description=MLFlow Model Serving
After=network.target
[Service]
Restart=on-failure
RestartSec=30
StandardOutput=file:/path_to_your_logging_folder/stdout.log
StandardError=file:/path_to_your_logging_folder/stderr.log
Environment=MLFLOW_TRACKING_URI=http://host_ts:port_ts
Environment=MLFLOW_CONDA_HOME=/path_to_your_conda_installation
ExecStart=/bin/bash -c 'PATH=/path_to_your_conda_installation/envs/model_env/bin/:$PATH exec mlflow models serve -m path_to_your_model -h host -p port'
[Install]
WantedBy=multi-user.target
Activate and enable the above service with the following commands:
sudo systemctl daemon-reload
sudo systemctl enable model
sudo systemctl start model
sudo systemctl status model
The above example is a very simple one. For complex models like DeepLab, you need to define the input and output tensors when saving the model. Consider using TF Serving for this purpose. This blog is a good guide to serving a DeepLab model with TF Serving.
References:
https://thegurus.tech/posts/2019/06/mlflow-production-setup/