How to Deploy an AWS SageMaker Container Using TensorFlow Serving

In the previous articles we learned how to create a TensorFlow Serving container for SageMaker and how to push the container image to an ECR repository.

In the last article, we created a docker image named sagemaker-tf-serving and pushed it to ECR. We will use that image to create a SageMaker endpoint.

This example requires an AWS IAM execution role that authorizes SageMaker to use other AWS services. If you do not have a role, this article teaches you how to create the execution role. I’m assuming that you have an execution role named SageMakerRole.

SageMaker Endpoints

A SageMaker endpoint is a fully managed, highly secure hosting environment. It supports auto scaling, one-click deployments, automatic A/B testing, and model versioning.

An endpoint is composed of one or more models.

Creating a SageMaker Model

We will use the sagemaker-tf-serving image to create a SageMaker model. A SageMaker model is essentially a pointer to a container image in an ECR repository:

creating an AWS SageMaker model

In the script above, we created the model half-plus-three-v1, which points to the ECR image sagemaker-tf-serving. We are now ready to create the endpoint.
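As a rough sketch, creating the model with the AWS CLI could look like the following. The account ID 123456789012 and the :latest tag are placeholders I am assuming for illustration; the region matches the us-west-2 ARN shown later in this article, and the role and image names are the ones used above. The create_model wrapper is a hypothetical helper, so nothing is sent to AWS until you call it.

```shell
#!/usr/bin/env bash
# Sketch only: the account ID and the image tag are placeholders (assumptions).
ACCOUNT_ID="123456789012"          # hypothetical AWS account ID
REGION="us-west-2"
MODEL_NAME="half-plus-three-v1"

# Full URI of the image in ECR and ARN of the execution role.
IMAGE="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/sagemaker-tf-serving:latest"
ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/SageMakerRole"

# Hypothetical wrapper: registers the model, pointing it at the ECR image.
create_model() {
  aws sagemaker create-model \
    --region "${REGION}" \
    --model-name "${MODEL_NAME}" \
    --primary-container "Image=${IMAGE}" \
    --execution-role-arn "${ROLE_ARN}"
}
```

Once your AWS credentials are configured, running create_model should register the model, and the CLI prints the ARN of the new model.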

Creating the Endpoint Configuration

creating an endpoint

The script above creates an endpoint config named half-plus-three-config-v1. It specifies that the endpoint will run on one instance of type ml.c4.large and that it will be deployed with the model half-plus-three-v1 that we created previously.
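A minimal sketch of an equivalent endpoint configuration with the AWS CLI might look like this. The names and the instance type come from the paragraph above; the variant name AllTraffic and the create_endpoint_config wrapper are my own assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: VARIANT_NAME is an assumed label, any valid name works.
MODEL_NAME="half-plus-three-v1"
ENDPOINT_CONFIG_NAME="half-plus-three-config-v1"
VARIANT_NAME="AllTraffic"          # assumption, not from the original script
INSTANCE_TYPE="ml.c4.large"
INSTANCE_COUNT=1

# One production variant: which model to serve, on what hardware, and how many instances.
VARIANT="VariantName=${VARIANT_NAME},ModelName=${MODEL_NAME},InitialInstanceCount=${INSTANCE_COUNT},InstanceType=${INSTANCE_TYPE}"

# Hypothetical wrapper: creates the endpoint configuration.
create_endpoint_config() {
  aws sagemaker create-endpoint-config \
    --endpoint-config-name "${ENDPOINT_CONFIG_NAME}" \
    --production-variants "${VARIANT}"
}
```

Calling create_endpoint_config only stores the configuration; no instances are launched until an endpoint is created from it.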

After creating the endpoint configuration, we create the endpoint using the command aws sagemaker create-endpoint --endpoint-name ${ENDPOINT_NAME} --endpoint-config-name ${ENDPOINT_CONFIG_NAME}.

I usually add the suffix -v1 to both the model and endpoint config names. It makes it easier to create new versions of the model, which I will cover in future articles.

Checking the Endpoint Status

After creating the endpoint, you can use the command aws sagemaker describe-endpoint --endpoint-name half-plus-three to check its status:

$ aws sagemaker describe-endpoint --endpoint-name half-plus-three
{
    "EndpointName": "half-plus-three",
    "EndpointArn": "arn:aws:sagemaker:us-west-2:369233609183:endpoint/half-plus-three",
    "EndpointConfigName": "sagemaker-byoc-tf-serving",
    "EndpointStatus": "Creating",
    "CreationTime": 1532266772.13,
    "LastModifiedTime": 1532266772.13
}
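If you only care about the status field, or want a script to block until the endpoint is ready, something along these lines may help. It is a sketch: endpoint_status and wait_for_endpoint are hypothetical helper names, while the --query and --output options and the endpoint-in-service waiter are part of the AWS CLI.

```shell
#!/usr/bin/env bash
ENDPOINT_NAME="half-plus-three"

# Hypothetical helper: prints only the status string, e.g. Creating or InService.
endpoint_status() {
  aws sagemaker describe-endpoint \
    --endpoint-name "${ENDPOINT_NAME}" \
    --query 'EndpointStatus' \
    --output text
}

# Hypothetical helper: blocks until the endpoint reaches InService.
# The waiter exits with a non-zero code if the endpoint fails or the wait times out.
wait_for_endpoint() {
  aws sagemaker wait endpoint-in-service \
    --endpoint-name "${ENDPOINT_NAME}"
}
```

A deployment script could call wait_for_endpoint right after create-endpoint and then print endpoint_status to confirm the result.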

The next article will cover how to make predictions against a SageMaker endpoint!

Did you use the container? Do you have questions or requests for more articles? Add comments below!

Márcio Dos Santos
ml-bytes - Code snippets on how to productionize Machine Learning models

Passionate about the engineering side of Machine Learning and training performance. Designed TensorFlow for AWS SageMaker. Opinions are my own.