How to Deploy an AWS SageMaker Container Using TensorFlow Serving
In the previous articles, we learned how to create a TensorFlow Serving container for SageMaker and how to push the container image to an ECR repository.
In the last article, we created a Docker image named sagemaker-tf-serving and pushed it to ECR. We will use that image to create a SageMaker endpoint.
This example requires an AWS IAM execution role that authorizes SageMaker to use other AWS services. If you do not have a role, this article teaches you how to create the execution role. I’m assuming that you have an execution role named SageMakerRole.
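SageMaker asks for the role's full ARN rather than its name. As a minimal sketch, assuming the AWS CLI is installed and configured, the ARN can be looked up like this. The function name get_role_arn is a hypothetical helper, not part of the AWS CLI:

```shell
#!/bin/sh
# Role name taken from the article; change it if your role is named differently.
ROLE_NAME="SageMakerRole"

# Print the ARN of an IAM role.
# Argument: role name.
get_role_arn() {
    aws iam get-role \
        --role-name "$1" \
        --query 'Role.Arn' \
        --output text
}

# Usage: ROLE_ARN=$(get_role_arn "${ROLE_NAME}")
```

The --query and --output text options trim the JSON response down to the bare ARN string, which is convenient for storing in a shell variable.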
SageMaker Endpoints
A SageMaker endpoint is a fully managed, secure hosting environment. It supports auto scaling, one-click deployments, automatic A/B testing, and model versioning.
An endpoint is composed of one or more models.
Creating a SageMaker Model
We will use the sagemaker-tf-serving image to create a SageMaker model. A SageMaker model is essentially a pointer to an image in an ECR repository:
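A model creation script might look like the following sketch. The model, image, and role names come from this article. The helper functions ecr_image_uri and create_model, the latest image tag, and the way the account ID and region are looked up are assumptions for illustration:

```shell
#!/bin/sh
# Names from the article; the account ID, region, and image tag are assumptions.
MODEL_NAME="half-plus-three-v1"
IMAGE_NAME="sagemaker-tf-serving"
ROLE_NAME="SageMakerRole"

# Build the ECR image URI.
# Arguments: account ID, region, image name.
ecr_image_uri() {
    echo "$1.dkr.ecr.$2.amazonaws.com/$3:latest"
}

# Register a SageMaker model that points at the container image.
# Arguments: model name, image URI, execution role ARN.
create_model() {
    aws sagemaker create-model \
        --model-name "$1" \
        --primary-container "Image=$2" \
        --execution-role-arn "$3"
}

# Example wiring (uncomment to run against your account).
# The role ARN below assumes the role was created without a path.
# ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
# REGION=$(aws configure get region)
# ROLE_ARN="arn:aws:iam::${ACCOUNT_ID}:role/${ROLE_NAME}"
# IMAGE_URI=$(ecr_image_uri "${ACCOUNT_ID}" "${REGION}" "${IMAGE_NAME}")
# create_model "${MODEL_NAME}" "${IMAGE_URI}" "${ROLE_ARN}"
```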
In the script above, we created the model half-plus-three-v1, which points to the ECR image sagemaker-tf-serving. We are ready to create the endpoint.
Creating the Endpoint Configuration
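An endpoint configuration script might look like the following sketch. The config name, model name, and instance type come from this article. The function name create_endpoint_config, the variant name default, and the variant weight are assumptions for illustration:

```shell
#!/bin/sh
# Names and instance type from the article.
MODEL_NAME="half-plus-three-v1"
ENDPOINT_CONFIG_NAME="half-plus-three-config-v1"
INSTANCE_TYPE="ml.c4.large"

# Create an endpoint config with a single production variant on one instance.
# Arguments: endpoint config name, model name, instance type.
create_endpoint_config() {
    aws sagemaker create-endpoint-config \
        --endpoint-config-name "$1" \
        --production-variants \
        "VariantName=default,ModelName=$2,InitialInstanceCount=1,InstanceType=$3,InitialVariantWeight=1.0"
}

# Usage:
# create_endpoint_config "${ENDPOINT_CONFIG_NAME}" "${MODEL_NAME}" "${INSTANCE_TYPE}"
```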
The script above creates an endpoint config named half-plus-three-config-v1. It specifies that the endpoint will run on one instance of type ml.c4.large and that it will be deployed with the model half-plus-three-v1 that we created previously.
After creating the endpoint configuration, we create the endpoint using the command aws sagemaker create-endpoint --endpoint-name ${ENDPOINT_NAME} --endpoint-config-name ${ENDPOINT_CONFIG_NAME}.
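For reference, here is a minimal sketch of that step with the two variables defined. The values assume the endpoint name half-plus-three and the config name half-plus-three-config-v1 used in this article, and the function name create_endpoint is a hypothetical wrapper:

```shell
#!/bin/sh
# Values assumed from the names used elsewhere in the article.
ENDPOINT_NAME="half-plus-three"
ENDPOINT_CONFIG_NAME="half-plus-three-config-v1"

# Create an endpoint from an existing endpoint configuration.
# Arguments: endpoint name, endpoint config name.
create_endpoint() {
    aws sagemaker create-endpoint \
        --endpoint-name "$1" \
        --endpoint-config-name "$2"
}

# Usage: create_endpoint "${ENDPOINT_NAME}" "${ENDPOINT_CONFIG_NAME}"
```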
I usually add the suffix -v1 to both the model and endpoint config names. It makes it easier to create new versions of the model, which I will cover in future articles.
Checking the Endpoint Status
After creating an endpoint, you can use the command aws sagemaker describe-endpoint --endpoint-name half-plus-three to check the endpoint status:
$ aws sagemaker describe-endpoint --endpoint-name half-plus-three
{
    "EndpointName": "half-plus-three",
    "EndpointArn": "arn:aws:sagemaker:us-west-2:369233609183:endpoint/half-plus-three",
    "EndpointConfigName": "sagemaker-byoc-tf-serving",
    "EndpointStatus": "Creating",
    "CreationTime": 1532266772.13,
    "LastModifiedTime": 1532266772.13
}
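Endpoint creation takes a while, so it can help to poll the status instead of rerunning the command by hand. The following is a minimal sketch. The function names endpoint_status and wait_for_endpoint and the 30-second default interval are assumptions for illustration:

```shell
#!/bin/sh
ENDPOINT_NAME="half-plus-three"

# Print only the EndpointStatus field (for example Creating, InService, or Failed).
# Argument: endpoint name.
endpoint_status() {
    aws sagemaker describe-endpoint \
        --endpoint-name "$1" \
        --query 'EndpointStatus' \
        --output text
}

# Poll until the endpoint leaves the Creating state, then print the final status.
# Arguments: endpoint name, seconds between polls (defaults to 30).
wait_for_endpoint() {
    status=$(endpoint_status "$1")
    while [ "$status" = "Creating" ]; do
        sleep "${2:-30}"
        status=$(endpoint_status "$1")
    done
    echo "$status"
}

# Usage: wait_for_endpoint "${ENDPOINT_NAME}"
```

The AWS CLI also ships a built-in waiter, aws sagemaker wait endpoint-in-service --endpoint-name half-plus-three, which blocks until the endpoint is in service. The loop above is useful when you want to print or act on the final status yourself.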
The next article will cover how to make predictions against a SageMaker endpoint!
Did you use the container? Do you have questions or requests for more articles? Add comments below!