Serving Watson NLP Models on Amazon ECS
Enjoy the performance and scalability of cloud, without giving up the versatility, convenience and sense of control of your local development environment.
Watson NLP Library for Embed helps develop enterprise-ready solutions through robust AI models, extensive language coverage and scalable container orchestration. Watson NLP runtime and models can be easily packaged into a single container image, and then deployed in any container environment, be it on Docker, a Kubernetes/OpenShift cluster or a serverless container service.
Amazon ECS is a fully managed, highly scalable container orchestration service, integrated with both AWS and third-party tools, such as Amazon Elastic Container Registry and Docker. AWS Fargate is a serverless, pay-as-you-go compute engine built into Amazon ECS. With AWS Fargate, you no longer have to manage servers or clusters of Amazon EC2 instances, handle capacity planning, or figure out how to isolate container workloads for security. Just define your application’s requirements, select Fargate as your launch type, and let Fargate take care of all the scaling and infrastructure management required to run your containers.
This blog will walk you through the steps to deploy a standalone Watson NLP Runtime on Amazon ECS with AWS Fargate.
Prerequisites
- Ensure you have access to the IBM Entitled Registry
- Ensure you have an AWS account
- Install AWS CLI
- Configure a default profile with a proper default region name
- Download and install Docker Desktop
Tip:
- If you don’t have an AWS account, you may want to consider AWS Free Tier.
- Follow the security best practices for the root user of your AWS account, and create an admin user for daily use.
- Make sure you have the required permissions on your AWS account to run applications on Amazon ECS.
Create a runtime container image
Watson NLP runtime and pretrained models are provided as container images in the IBM Entitled Registry. You need your entitlement key from the container software library to access those container images. Because the runtime image doesn’t include any models by default, you need to build a container image to include the models you want.
Step 1: Login to the IBM Entitled Registry
echo $IBM_ENTITLEMENT_KEY | docker login -u cp --password-stdin cp.icr.io
Step 2: Download a list of pretrained models to a local directory
Create a local directory named models:
mkdir models
Set a variable REGISTRY to pull the images from the IBM Entitled Registry:
REGISTRY=cp.icr.io/cp/ai
Use a variable MODELS to provide the list of pretrained models you want:
MODELS="watson-nlp_syntax_izumo_lang_en_stock:1.0.6 watson-nlp_syntax_izumo_lang_fr_stock:1.0.6"
Copy the models into the local directory models:
for i in $MODELS
do
  image=${REGISTRY}/$i
  docker run -it --rm \
    -e ACCEPT_LICENSE=true \
    -v `pwd`/models:/app/models $image
done
Step 3: Create a Dockerfile using a text editor of your choice
ARG REGISTRY
ARG TAG=1.0.18
FROM ${REGISTRY}/watson-nlp-runtime:${TAG}
COPY models /app/models
Step 4: Build the container image
docker build . -t my-watson-nlp-runtime:latest --build-arg REGISTRY=${REGISTRY}
Upload the runtime container image to Amazon ECR
You need to put your runtime image into Amazon ECR, so that it can be used for deployment.
Step 5: Login to the default private registry
Each AWS account comes with a default private registry. Set an environment variable for the default private registry as appropriate:
export DEFAULT_REGISTRY=<your-12-digit-account-id>.dkr.ecr.<region>.amazonaws.com
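If you don't have your account ID handy, you can retrieve it with aws sts get-caller-identity --query Account --output text. A minimal sketch of composing the registry hostname (the account ID and region below are hypothetical placeholders):

```shell
# Compose the default private registry hostname.
# ACCOUNT_ID and REGION are hypothetical placeholder values --
# substitute your own (e.g. via `aws sts get-caller-identity`).
ACCOUNT_ID="123456789012"
REGION="us-east-1"
export DEFAULT_REGISTRY="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"
echo "$DEFAULT_REGISTRY"
```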
Login to the default private registry:
aws ecr get-login-password | docker login --username AWS --password-stdin ${DEFAULT_REGISTRY}
Step 6: Create a repository in the default registry
aws ecr create-repository --repository-name my-watson-nlp-runtime
Step 7: Upload the runtime image
Tag the image:
docker tag my-watson-nlp-runtime:latest ${DEFAULT_REGISTRY}/my-watson-nlp-runtime:latest
Push the image:
docker push ${DEFAULT_REGISTRY}/my-watson-nlp-runtime:latest
Deploy the Watson NLP Runtime on Amazon ECS
There are several ways to deploy containerized applications on Amazon ECS. Other than manual deployment with AWS Management Console or AWS Command Line Interface (CLI), you can automate the deployment process with Infrastructure as Code (IaC) tools, such as AWS CloudFormation and Terraform, or one of the application development tools, such as AWS Copilot and AWS CDK. For those who are familiar with Docker and Docker Compose, the best option might be the Docker ECS integration, which allows users to:
- Use native Docker commands to run applications on Amazon ECS
- Simplify the development workflow of multi-container applications on Amazon ECS using Compose files
- Switch between local and cloud deployment quickly and easily
Docker ECS integration transforms the Compose application standard into a collection of AWS resources, defined as an AWS CloudFormation template.
Step 8: Create an AWS ECS context in Docker
Create an Amazon ECS Docker context named myecscontext
:
docker context create ecs myecscontext
Select an AWS profile, as shown in the example below:
$ docker context create ecs myecscontext
? Create a Docker context using: An existing AWS profile
? Select AWS Profile default
Successfully created ecs context "myecscontext"
Step 9: Create a Compose file
Create a Compose file named compose.yaml
in the current directory as follows:
services:
  runtime:
    image: "${DEFAULT_REGISTRY}/my-watson-nlp-runtime:latest"
    environment:
      - ACCEPT_LICENSE=true
    deploy:
      x-aws-autoscaling:
        min: 1
        max: 2 # required
        cpu: 75
        # mem: - mutually exclusive with cpu
      resources:
        limits:
          cpus: '2'
          memory: 4096M
    ports:
      - target: 8080
        x-aws-protocol: http

networks:
  default:

x-aws-cloudformation:
  Resources:
    Runtime8080TargetGroup:
      Properties:
        HealthCheckPath: /swagger
        Matcher:
          HttpCode: 200-499
Notice the YAML block under x-aws-cloudformation:, which is an overlay that customizes the properties of the Runtime8080TargetGroup resource, for which no specific x-aws-* custom extension is available. An overlay is a YAML object containing attributes to be updated or added. It uses the same data structure as the CloudFormation template generated by the ECS integration, and it is merged with the generated template before being applied to the AWS infrastructure.
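As an illustration of how the overlay is merged, the relevant excerpt of the generated CloudFormation template might look roughly like the following; the exact resource shape is an assumption based on what the ECS integration typically generates for a target group:

```yaml
# Hypothetical excerpt of the merged CloudFormation template
Resources:
  Runtime8080TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      Port: 8080
      Protocol: HTTP            # from x-aws-protocol
      HealthCheckPath: /swagger # from the overlay
      Matcher:
        HttpCode: 200-499       # from the overlay
```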
Step 10: Validate the Compose file
You can use the docker compose convert command to generate a CloudFormation stack file from your Compose file and inspect the resources it defines. The command also checks the Compose file for syntax errors while generating the CloudFormation stack file.
docker --context myecscontext compose --project-name sample-project convert
Step 11: Deploy it on Amazon ECS
If everything looks good and you are ready to deploy the Watson NLP Runtime, run the docker compose up command as follows:
docker --context myecscontext compose --project-name sample-project up
NOTE: It may take a few minutes for the deployment to complete.
Step 12: Check your deployment
If the deployment is successful, you can list the deployed service with the docker compose ps command, which shows the hostname:port of the service endpoint of the deployed Watson NLP Runtime. You can access the Swagger UI in a browser at http://<hostname>:<port>/swagger to interact with the REST API resources provided by the Watson NLP Runtime.
docker --context myecscontext compose --project-name sample-project ps
You can also check the application logs with the docker compose logs command.
docker --context myecscontext compose --project-name sample-project logs
Step 13: Use curl to make a REST call
When the Watson NLP Runtime service is ready, you can send an inference request to the REST API endpoint using a curl command.
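The HOSTNAME and PORT variables in the command below refer to the service endpoint reported by docker compose ps. A sketch of splitting that endpoint into the two variables (the endpoint value here is a hypothetical placeholder):

```shell
# Split a hostname:port endpoint (as reported by `docker compose ps`)
# into HOSTNAME and PORT. The endpoint below is a hypothetical placeholder.
ENDPOINT="example-lb.us-east-1.elb.amazonaws.com:8080"
HOSTNAME="${ENDPOINT%:*}"   # everything before the last colon
PORT="${ENDPOINT##*:}"      # everything after the last colon
echo "$HOSTNAME $PORT"
```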
curl -s -X POST "http://${HOSTNAME}:${PORT}/v1/watson.runtime.nlp.v1/NlpService/SyntaxPredict" \
  -H "accept: application/json" \
  -H "grpc-metadata-mm-model-id: syntax_izumo_lang_en_stock" \
  -H "content-type: application/json" \
  -d "{ \"rawDocument\": { \"text\": \"This is a test.\" }, \"parsers\": [ \"TOKEN\" ]}" \
  | jq -r .
Tip:
- The value of the grpc-metadata-mm-model-id header should match the folder name of the model when it was downloaded and saved in ./models in Step 2.
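As a sketch of the relationship between the image names in Step 2 and the model ID: the folder name appears to be the image name minus the watson-nlp_ prefix and the tag. This stripping rule is an assumption based on the image and model-id names used in this walkthrough:

```shell
# Derive the model ID (folder name under ./models) from a pretrained-model
# image name. The prefix/tag stripping is an assumption based on the
# names used in this walkthrough.
IMAGE="watson-nlp_syntax_izumo_lang_en_stock:1.0.6"
MODEL_ID="${IMAGE%%:*}"            # drop the ":1.0.6" tag
MODEL_ID="${MODEL_ID#watson-nlp_}" # drop the "watson-nlp_" prefix
echo "$MODEL_ID"  # syntax_izumo_lang_en_stock
```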
If you get a JSON response containing the tokenized syntax analysis results, the Watson NLP Runtime is working properly.
Clean up
Don’t forget to clean up afterwards with the docker compose down command, to avoid paying for cloud resources you no longer need.
docker --context myecscontext compose --project-name sample-project down
Conclusion
Now you’ve seen how easy it is to run a single-container Watson NLP runtime service with a set of pretrained models baked in, on Amazon ECS with AWS Fargate, or locally on Docker, using the same Compose files and Docker commands you are familiar with. You can also seamlessly switch between cloud and local deployment, whichever is more suitable for the task at hand.
To learn more about how to embed Watson NLP into your applications to provide enhanced user experience through powerful AI models, go to Embeddable AI on IBM Developer.
Not sure what you need or where to start? Try this guided wizard to find, try, train, use and embed IBM AI into your product.
IBM Business Partners can also browse the collection of Embeddable AI self-serve assets at IBM Technology Zone.