Deploying an MLflow Model in Sagemaker
Recently, MLflow gained the ability to deploy MLflow models to Sagemaker. This is relatively easy, but there are some gotchas that tripped me up. This article is a step-by-step guide to deploying an MLflow model in Sagemaker.
Pre-requisites:
Configure your MLflow environment variables. For example, with our Infinstor free MLflow service, we set the MLFLOW_TRACKING_URI environment variable to infinstor://mlflow.free.infinstor.com/ and set MLFLOW_EXPERIMENT_ID to a suitable experiment ID, for example 2. If your MLflow service supports authentication, you must be logged in to the service. For example, the following command accomplishes this in our free MLflow service:
python -m infinstor_mlflow_plugin.login
Further, AWS credentials in ~/.aws/credentials must be set up correctly.
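Before proceeding, it is worth verifying that the MLflow environment variables are actually set in your current shell. Here is a quick stdlib-only sketch (the variable names are the ones mentioned above; `missing_mlflow_vars` is just a helper name I chose for illustration):

```python
import os

# Environment variables the MLflow client reads, as described above.
REQUIRED_VARS = ["MLFLOW_TRACKING_URI", "MLFLOW_EXPERIMENT_ID"]

def missing_mlflow_vars(environ=os.environ):
    """Return the names of any required MLflow variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]

# An empty list means both variables are set in this shell.
print(missing_mlflow_vars())
```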
Model
I have a transformers model that I fine-tuned in MLflow run 7-16420961991120000000062; this is the model that I want to deploy using Sagemaker. First, let's make sure that this model can be served locally. The command that I use to do this is:
mlflow models serve -m runs:/7-16420961991120000000062/model
Here is the response:
2022-01-17 09:33:21,209-9792 - botocore.credentials - INFO - Found credentials in shared credentials file: ~/.aws/credentials
2022/01/17 09:33:22 INFO mlflow.models.cli: Selected backend for flavor 'python_function'
2022/01/17 09:33:54 INFO mlflow.utils.conda: === Creating conda environment mlflow-751ecc3c22368dcf8c4f69f17296ddee1b3a0657 ===
2022/01/17 09:35:13 INFO mlflow.pyfunc.backend: === Running command 'source /home/jagane/anaconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-751ecc3c22368dcf8c4f69f17296ddee1b3a0657 1>&2 && gunicorn --timeout=60 -b 127.0.0.1:5000 -w 1 ${GUNICORN_CMD_ARGS} -- mlflow.pyfunc.scoring_server.wsgi:app'
[2022-01-17 09:35:13 -0800] [10277] [INFO] Starting gunicorn 20.1.0
[2022-01-17 09:35:13 -0800] [10277] [INFO] Listening at: http://127.0.0.1:5000 (10277)
[2022-01-17 09:35:13 -0800] [10277] [INFO] Using worker: sync
[2022-01-17 09:35:13 -0800] [10284] [INFO] Booting worker with pid: 10284
You can now invoke this local deployment of the model using curl as follows:
$ curl -X POST -H "Content-Type: application/json; format=pandas-split" --data '{"columns":["text"],"data":[["This is lousy weather"], ["This is great weather"]]}' http://127.0.0.1:5000/invocations
[{"text": "This is lousy weather", "label": 0.0, "score": 0.6562787294387817}, {"text": "This is great weather", "label": 4.0, "score": 0.508899450302124}]
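The same request can be made from Python using only the standard library. This is a minimal sketch; the URL, content type, and column name match the curl invocation above, and `build_payload`/`invoke_local` are helper names I chose for illustration:

```python
import json
import urllib.request

def build_payload(texts):
    """Build the pandas-split JSON body the MLflow scoring server expects."""
    return json.dumps({"columns": ["text"], "data": [[t] for t in texts]})

def invoke_local(texts, url="http://127.0.0.1:5000/invocations"):
    """POST the payload to a locally served MLflow model and return the parsed response."""
    req = urllib.request.Request(
        url,
        data=build_payload(texts).encode("utf-8"),
        headers={"Content-Type": "application/json; format=pandas-split"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With the local server from the previous step running, `invoke_local(["This is lousy weather", "This is great weather"])` returns the same list of predictions that curl printed.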
Required AWS Resources
The following three AWS resources are required for deploying an MLflow model into your default VPC using Sagemaker.
- Suitable container image stored in Amazon Elastic Container Registry (ECR)
- S3 bucket
- Role for Sagemaker
Required Resource 1: Create a suitable container and push it to AWS ECR
Run the command from the CLI:
mlflow sagemaker build-and-push-container --build --push -c for-sagemaker-deployment
If you now go to the AWS console -> Amazon ECR -> Repositories, you will see a repository named for-sagemaker-deployment. If you click on the repository, you will see an image tagged with the MLflow version used to build it. In my case, I was using MLflow 1.22.0, so the image is tagged 1.22.0.
Required Resource 2: Create a bucket for Sagemaker to use
Use the AWS Console or the CLI to create a bucket. In this example, I create a bucket called jaganes-sagemaker.
Required Resource 3: Create an Execution Role for Sagemaker to assume
In this step, we are going to create an execution role for Sagemaker to assume. This role gives Sagemaker permission to access S3, CloudWatch, and a few other necessary services on our behalf. Go to the AWS Console -> IAM -> Roles page. Click on 'Create Role' and choose AWS Service, Sagemaker in the 'Select type of trusted entity' page. Here is a screen capture of the same:
Next, in the permissions policies page, choose ‘AmazonSageMakerFullAccess’ permissions, as shown below:
Finally, add tags if you need them and provide a name for the role. In this example, I used the name role-for-mlflow-sagemaker-deploy
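Behind the scenes, the role the console creates carries a trust policy that lets the Sagemaker service assume it. Here is a sketch of that standard trust document, built with the Python standard library (the service principal `sagemaker.amazonaws.com` is the one the console wizard configures when you choose Sagemaker as the trusted entity):

```python
import json

# Standard trust policy allowing the SageMaker service to assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

If you prefer to create the role from code rather than the console, this document is what you would pass as the assume-role policy, alongside attaching the AmazonSageMakerFullAccess managed policy.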
Deploy Model
With the three resources created above, we are ready to deploy the model. Here is the command line I used for this:
$ mlflow sagemaker deploy --app-name xformer -m runs:/7-16420961991120000000062/model --execution-role-arn arn:aws:iam::612652722220:role/role-for-mlflow-sagemaker-deploy --bucket jaganes-sagemaker --image-url 612652722220.dkr.ecr.us-east-1.amazonaws.com/for-sagemaker-deployment:1.22.0 --region-name us-east-1 --mode create --instance-type ml.m5.large --instance-count 1 --flavor python_function
Parameters used in the above call:
- app-name: This name, xformer in my example, is what we will use while invoking the model in the next step
- -m: The MLflow run. The format must be runs:/<run-id>/model
- execution-role-arn: The ARN of the execution role created earlier
- bucket: The bucket that we created earlier
- image-url: The container image created earlier using the command mlflow sagemaker build-and-push-container
- region-name: AWS region for the model deployment
- mode: We use create here; options such as update and delete are also available
- instance-type: Must be an AWS ML instance type
- instance-count: Number of instances to create
- flavor: MLflow model flavor
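If you script your deployments, the same CLI invocation can be assembled programmatically. This is just a convenience sketch using the standard library; the flags and values are the ones from the command above (`deploy_command` is a helper name I chose, and the account ID, role, and bucket are this article's example values):

```python
import shlex

def deploy_command(options):
    """Assemble the `mlflow sagemaker deploy` CLI invocation from a dict of flag -> value."""
    args = ["mlflow", "sagemaker", "deploy"]
    for flag, value in options.items():
        args += [flag, str(value)]
    return args

options = {
    "--app-name": "xformer",
    "-m": "runs:/7-16420961991120000000062/model",
    "--execution-role-arn": "arn:aws:iam::612652722220:role/role-for-mlflow-sagemaker-deploy",
    "--bucket": "jaganes-sagemaker",
    "--image-url": "612652722220.dkr.ecr.us-east-1.amazonaws.com/for-sagemaker-deployment:1.22.0",
    "--region-name": "us-east-1",
    "--mode": "create",
    "--instance-type": "ml.m5.large",
    "--instance-count": 1,
    "--flavor": "python_function",
}

# Print the shell-safe command line; subprocess.run(deploy_command(options), check=True)
# would launch the deployment itself.
print(" ".join(shlex.quote(a) for a in deploy_command(options)))
```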
Invoke
Here is a small python script that I used to invoke this newly deployed model:
import boto3
import json

endpoint = 'xformer'
client = boto3.client('sagemaker-runtime')
response = client.invoke_endpoint(
    EndpointName=endpoint,
    ContentType='application/json; format=pandas-split',
    Body='{"columns":["text"],"data":[["This is terrible weather"],["This is great weather"]]}')
print(str(json.loads(response['Body'].read())))
And the response I got is:
[{'text': 'This is terrible weather', 'label': 0.0, 'score': 0.8280780911445618}, {'text': 'This is great weather', 'label': 4.0, 'score': 0.5088995099067688}]
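To act on these predictions programmatically, a small helper can map each input text to its predicted label. This is a stdlib-only sketch; note that interpreting label 0.0 as negative and 4.0 as positive is my assumption about this particular fine-tuned model, inferred from the example outputs above:

```python
# Turn the endpoint's response list into a {text: label} mapping.
# Interpreting 0.0 as negative and 4.0 as positive sentiment is an
# assumption about this specific model, inferred from the outputs above.
def labels_by_text(predictions):
    return {p["text"]: p["label"] for p in predictions}

# The response from the invocation above.
preds = [
    {"text": "This is terrible weather", "label": 0.0, "score": 0.8280780911445618},
    {"text": "This is great weather", "label": 4.0, "score": 0.5088995099067688},
]

print(labels_by_text(preds))
```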
That’s all folks!