Deploy TensorFlow Hub models on Amazon SageMaker
TensorFlow deployments made easy
In this article, we will take a look at TensorFlow Hub models and how to deploy them locally as well as on the AWS cloud.
Introduction
TensorFlow Hub is a repository of highly useful, pre-trained machine learning models. These models are freely available and can be used with Keras or TensorFlow.
These models are easy to deploy via TensorFlow Serving, which is designed to serve machine learning models in production environments behind a simple API. TensorFlow models integrate with TensorFlow Serving out of the box.
Local execution
We will first test TensorFlow Serving locally using its Docker image. We will start with one of the most widely used models: Google's Universal Sentence Encoder.
1. Download and install Docker, then pull the tensorflow/serving image from here.
2. Download the Google Universal Sentence Encoder v4 from here.
3. Extract the model from the archive.
4. To use any model with TensorFlow Serving, the files must follow these rules:
5. Create a directory named "models" with a subdirectory named after the model, e.g. "sentence-encoder".
6. Inside it, create a subdirectory named with a version number. By default, TensorFlow Serving serves the model with the highest version number.
7. Copy all the model files into that version subdirectory.
8. Your final directory structure will look like this:
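Assuming the extracted archive contains the standard SavedModel files (a saved_model.pb plus the variables and assets folders), the final layout would be:

```
models/
└── sentence-encoder/
    └── 1/
        ├── saved_model.pb
        ├── assets/
        └── variables/
```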
Now we can start TensorFlow Serving in a local Docker container. Run the command:
docker run -t --rm -p 8501:8501 -v "<ABSOLUTE_PATH>/models:/models" -e MODEL_NAME="sentence-encoder" tensorflow/serving
Now you can send requests and use the model through its REST API endpoint. If the request succeeds, you will get the encoded sentence back as a result.
curl -d '{"instances": ["sample data"]}' -X POST http://localhost:8501/v1/models/sentence-encoder:predict
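You can send the same request from Python as well. A minimal sketch, assuming the container from the previous step is running locally (the actual HTTP call is shown commented out so the snippet stands alone):

```python
import json

# Build the JSON payload expected by the TensorFlow Serving REST API.
payload = json.dumps({"instances": ["sample data"]})
url = "http://localhost:8501/v1/models/sentence-encoder:predict"
print(payload)

# With the container running, the request would be sent like this:
# import requests
# response = requests.post(url, data=payload)
# print(response.json())  # a dict with a "predictions" key
```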
Sagemaker execution
SageMaker allows you to bring in models trained outside of the SageMaker environment. So one option is to compress the model, arranged in the directory structure above, into tar.gz format, upload the archive to an S3 bucket, and pass the S3 location when deploying.
The issue with this approach is that you have to download the model, extract it and rearrange the directory structure, then re-compress it and upload it to the S3 bucket manually. That is a lot of overhead, and it's very inefficient.
But there is another option!
SageMaker training jobs let you run some processing and save the resulting model artifacts back to an S3 bucket on completion. You can provide your own custom code for a training job using SageMaker script mode.
Pre-requisites
We need to set up a few AWS services for this.
1. Create an S3 bucket.
2. Create a directory called "trainingData" and a subdirectory called "train".
3. Upload any CSV file named "train.csv" under the train directory. We just want it as a placeholder.
4. Create a SageMaker notebook instance and attach the "AmazonS3FullAccess" policy to the default SageMaker role.
5. Create a new SageMaker notebook with the "conda_amazonei_tensorflow_p36" kernel.
6. Create a new text file and save it with the name "encoder.py".
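For step 3, the placeholder CSV can be generated with a few lines of Python; its contents are arbitrary, since the training job never actually reads them:

```python
import csv

# Write a minimal placeholder CSV for the train channel.
with open("train.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["text"])
    writer.writerow(["placeholder"])
```

Upload the resulting train.csv to the trainingData/train directory in your bucket.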
Training script
SageMaker provides a script mode execution where the user supplies their own script to perform the training job. Let's add the script to "encoder.py".
encoder.py:
"""This module is responsible for creating training job and
saving the model back to the s3"""import argparse
import os
import subprocess
import sys
import tensorflow as tf# The tensorflow hub is not installed for sagemaker tensorflow image
# Therefore we need to install the package programmatically
def install(package):
"""
Function to install python package
Parameters:
package (str) : name of package to be installed
Returns:
None
"""
subprocess.check_call(
[sys.executable, "-q", "-m", "pip", "install", package])# Installing tensorflow-hub package and importing it
install('tensorflow-hub')
import tensorflow_hub as hub# When a training job is created then
# we can provide a script and execute the training job in script mode.
# This script should be passed as an entry_point while creating a training job.
# The main method should dictate the flow of execution.
if __name__ == '__main__': # To extract command line options if passed any.
# we can use argparse package
# Here we are creating an ArgumentParser object.
parser = argparse.ArgumentParser() # Data, model, and output directories these variables
# are set as environment variables by sagemaker.
# We need to extract these variables and
# add it to ArgumentParser object
parser.add_argument(
'--output-data-dir',
type=str,
default=os.environ.get('SM_OUTPUT_DATA_DIR'))
parser.add_argument(
'--model-dir',
type=str,
default=os.environ.get('SM_MODEL_DIR'))
parser.add_argument(
'--train',
type=str,
default=os.environ.get('SM_CHANNEL_TRAIN'))
parser.add_argument(
'--test',
type=str,
default=os.environ.get('SM_CHANNEL_TEST'))
# Extracting all known arguments and storing it in args variable
args, _ = parser.parse_known_args() # Now we need to load the required tensorflow-hub module.
# Once loaded into memory we get the model as return value
model = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4") # The model need to be saved as per
# sagemaker tensorflow servings format
# Notice we are saving in directory structure
# like <model_name>/<version_number>
# This method will save the output model to S3 bucket
tf.saved_model.save(
model,
os.path.join(
args.model_dir,
"sentence-encoder/1"))
Jupyter notebook
First, let's set up S3 configurations.
s3_bucket = '<Bucket_name>'
prefix = 'sentence-encoder' # model will be saved under this prefix
training_data_prefix = 'trainingData/train' # training data location
Next is to import all required packages.
from sagemaker.tensorflow import TensorFlow
from sagemaker.tensorflow.model import TensorFlowModel
from sagemaker import get_execution_role
from sagemaker.inputs import TrainingInput
from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer
import sagemaker
import time
import boto3
Next, we will create a TensorFlow estimator
tf_estimator = TensorFlow(entry_point='encoder.py',
                          role=get_execution_role(),
                          instance_count=1,
                          instance_type='ml.m5.large',
                          framework_version='2.0.0',
                          sagemaker_session=sagemaker.Session(),
                          output_path="s3://{}/{}/".format(s3_bucket, prefix),
                          py_version='py3')
Next, set up the training data input. This is only a placeholder; the job will not actually process this data for training.
train_input = TrainingInput("s3://{}/{}/".format(s3_bucket, training_data_prefix),
                            content_type="text/csv")
Now, just start the training job by calling the fit method.
job_name = 'sentence-encoder-training-job-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
print(job_name)
tf_estimator.fit({"train": train_input}, job_name=job_name)
This will start a training job with a TensorFlow 2 image. Once ready, it will execute the training script. The script will:
- Install tensorflow-hub
- Download the Universal Sentence Encoder model
- Save the model in the desired directory structure
After the job completes, SageMaker saves the model artifacts to the S3 output path:
s3://<bucket_name>/<prefix>/<job_name>/output/model.tar.gz
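To make the path concrete, the model_data URI used for deployment is just this artifact path assembled from the values defined earlier (the bucket and job names below are hypothetical placeholders):

```python
# Hypothetical placeholder values, for illustration only.
s3_bucket = "my-bucket"
prefix = "sentence-encoder"
job_name = "sentence-encoder-training-job-2021-01-01-00-00-00"

model_data = "s3://{}/{}/{}/output/model.tar.gz".format(s3_bucket, prefix, job_name)
print(model_data)
# → s3://my-bucket/sentence-encoder/sentence-encoder-training-job-2021-01-01-00-00-00/output/model.tar.gz
```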
Now all artifacts are ready. With these stored artifacts, the model can be deployed to an endpoint —
model = TensorFlowModel(
    model_data="s3://{}/{}/{}/output/model.tar.gz".format(
        s3_bucket, prefix, job_name),
    role=get_execution_role(),
    framework_version="2.0.0",
    sagemaker_session=sagemaker.Session())

predictor = model.deploy(initial_instance_count=1,
                         instance_type='ml.m5.large',
                         endpoint_name='sentence-encoder-test')
print(predictor.endpoint_name)
The deployment takes some time to complete, and it prints the endpoint name at the end.
Once the model is deployed, we can make some predictions using a SageMaker Predictor —
predictor = Predictor(endpoint_name=predictor.endpoint_name,
                      sagemaker_session=sagemaker.Session(),
                      serializer=JSONSerializer(),
                      deserializer=JSONDeserializer())
result = predictor.predict({'instances': ['hello world']})
print(result)
This returns a JSON response with the encoded sentence values, which means the model was deployed successfully.
In the end, delete the deployed endpoint to stop billing.
client = boto3.client('sagemaker')
response = client.delete_endpoint(EndpointName=predictor.endpoint_name)
print(response)
That’s it!
In this way, you can deploy TensorFlow Hub models to SageMaker while leveraging SageMaker's own features. You can also create a pipeline to automate this workflow, which makes deploying the model in a different environment extremely easy.
References
https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/deploying_tensorflow_serving.html
https://www.tensorflow.org/tfx/serving/docker
https://www.tensorflow.org/api_docs/python/tf/saved_model/save