Serving Watson NLP Models on Amazon ECS
Enjoy the performance and scalability of cloud, without giving up the versatility, convenience and sense of control of your local development environment.
Watson NLP Library for Embed helps develop enterprise-ready solutions through robust AI models, extensive language coverage and scalable container orchestration. Watson NLP runtime and models can be easily packaged into a single container image, and then deployed in any container environment, be it on Docker, a Kubernetes/OpenShift cluster or a serverless container service.
Amazon ECS is a fully managed, highly scalable container orchestration service, integrated with both AWS and third-party tools, such as Amazon Elastic Container Registry and Docker. AWS Fargate is a serverless, pay-as-you-go compute engine built into Amazon ECS. With AWS Fargate, you no longer have to manage servers or clusters of Amazon EC2 instances, handle capacity planning, or figure out how to isolate container workloads for security. Just define your application’s requirements, select Fargate as your launch type, and let Fargate take care of all the scaling and infrastructure management required to run your containers.
This blog will walk you through the steps to deploy a standalone Watson NLP Runtime on Amazon ECS with AWS Fargate.
Prerequisites
- Ensure you have access to the IBM Entitled Registry
- Ensure you have an AWS account
- Install AWS CLI
- Configure a default profile with a proper default region name
- Download and install Docker Desktop
Tip:
- If you don’t have an AWS account, you may want to consider AWS Free Tier.
- Follow the security best practices for the root user of your AWS account, and create an admin user for daily use.
- Make sure you have the required permissions on your AWS account to run applications on Amazon ECS.
Create a runtime container image
Watson NLP runtime and pretrained models are provided as container images in the IBM Entitled Registry. You need your entitlement key from the container software library to access those container images. Because the runtime image doesn’t include any models by default, you need to build a container image to include the models you want.
Step 1: Login to the IBM Entitled Registry
echo $IBM_ENTITLEMENT_KEY | docker login -u cp --password-stdin cp.icr.io
Step 2: Download a list of pretrained models to a local directory
Create a local directory named models:
mkdir models
Set a variable REGISTRY to pull the images from the IBM Entitled Registry:
REGISTRY=cp.icr.io/cp/ai
Use a variable MODELS to provide the list of pretrained models you want:
MODELS="watson-nlp_syntax_izumo_lang_en_stock:1.0.6 watson-nlp_syntax_izumo_lang_fr_stock:1.0.6"
Copy the models into the local directory models:
for i in $MODELS
do
  image=${REGISTRY}/$i
  docker run -it --rm \
    -e ACCEPT_LICENSE=true \
    -v `pwd`/models:/app/models $image
done
Step 3: Create a Dockerfile using a text editor of your choice
ARG REGISTRY
ARG TAG=1.0.18
FROM ${REGISTRY}/watson-nlp-runtime:${TAG}
COPY models /app/models
Step 4: Build the container image
docker build . -t my-watson-nlp-runtime:latest --build-arg REGISTRY=${REGISTRY}
Upload the runtime container image to Amazon ECR
You need to put your runtime image into Amazon ECR, so that it can be used for deployment.
Step 5: Login to the default private registry
Each AWS account comes with a default private registry. Set an environment variable for the default private registry as appropriate:
export DEFAULT_REGISTRY=<your-12-digit-account-id>.dkr.ecr.<region>.amazonaws.com
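If you don't have your account ID handy, you can retrieve it with aws sts get-caller-identity --query Account --output text. A minimal sketch of composing the registry hostname (the account ID and region below are hypothetical placeholders):

```shell
# Compose the default private registry hostname.
# ACCOUNT_ID and REGION are hypothetical placeholder values --
# substitute your own (e.g. via `aws sts get-caller-identity`).
ACCOUNT_ID="123456789012"
REGION="us-east-1"
export DEFAULT_REGISTRY="${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"
echo "$DEFAULT_REGISTRY"
```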
Login to the default private registry:
aws ecr get-login-password | docker login --username AWS --password-stdin ${DEFAULT_REGISTRY}
Step 6: Create a repository in the default registry
aws ecr create-repository --repository-name my-watson-nlp-runtime
Step 7: Upload the runtime image
Tag the image:
docker tag my-watson-nlp-runtime:latest ${DEFAULT_REGISTRY}/my-watson-nlp-runtime:latest
Push the image:
docker push ${DEFAULT_REGISTRY}/my-watson-nlp-runtime:latest
Deploy the Watson NLP Runtime on Amazon ECS
There are several ways to deploy containerized applications on Amazon ECS. Other than manual deployment with AWS Management Console or AWS Command Line Interface (CLI), you can automate the deployment process with Infrastructure as Code (IaC) tools, such as AWS CloudFormation and Terraform, or one of the application development tools, such as AWS Copilot and AWS CDK. For those who are familiar with Docker and Docker Compose, the best option might be the Docker ECS integration, which allows users to:
- Use native Docker commands to run applications on Amazon ECS
- Simplify the development workflow of multi-container applications on Amazon ECS using Compose files
- Switch between local and cloud deployment quickly and easily
Docker ECS integration transforms the Compose application standard into a collection of AWS resources, defined as an AWS CloudFormation template.
Step 8: Create an AWS ECS context in Docker
Create an Amazon ECS Docker context named myecscontext
:
docker context create ecs myecscontext
Select an AWS profile, as shown in the example below:
$ docker context create ecs myecscontext
? Create a Docker context using: An existing AWS profile
? Select AWS Profile default
Successfully created ecs context "myecscontext"
Step 9: Create a Compose file
Create a Compose file named compose.yaml
in the current directory as follows:
services:
  runtime:
    image: "${DEFAULT_REGISTRY}/my-watson-nlp-runtime:latest"
    environment:
      - ACCEPT_LICENSE=true
    deploy:
      x-aws-autoscaling:
        min: 1
        max: 2 # required
        cpu: 75
        # mem: - mutually exclusive with cpu
      resources:
        limits:
          cpus: '2'
          memory: 4096M
    ports:
      - target: 8080
        x-aws-protocol: http

networks:
  default:

x-aws-cloudformation:
  Resources:
    Runtime8080TargetGroup:
      Properties:
        HealthCheckPath: /swagger
        Matcher:
          HttpCode: 200-499
Notice the YAML block under x-aws-cloudformation:, which is an overlay that customizes the properties of the Runtime8080TargetGroup resource, for which no specific x-aws-* custom extension is available. An overlay is a YAML object containing attributes to be updated or added. It uses the same data structure as the CloudFormation template generated by the ECS integration, and it is merged with the generated template before being applied to the AWS infrastructure.
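As an illustration of how the overlay is merged, the relevant excerpt of the generated CloudFormation template might look roughly like the following; the exact resource shape is an assumption based on what the ECS integration typically generates for a target group:

```yaml
# Hypothetical excerpt of the merged CloudFormation template
Resources:
  Runtime8080TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      Port: 8080
      Protocol: HTTP            # from x-aws-protocol
      HealthCheckPath: /swagger # from the overlay
      Matcher:
        HttpCode: 200-499       # from the overlay
```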
Step 10: Validate the Compose file
You can use the docker compose convert command to generate a CloudFormation stack file from your Compose file and inspect the resources it defines. The command also checks the Compose file for syntax errors while generating the CloudFormation stack file.
docker --context myecscontext compose --project-name sample-project convert
Step 11: Deploy it on Amazon ECS
If everything looks good and you are ready to deploy the Watson NLP Runtime, run the docker compose up command as follows:
docker --context myecscontext compose --project-name sample-project up
NOTE: It may take a few minutes for the deployment to complete.
Step 12: Check your deployment
If the deployment is successful, you can list the deployed service with the docker compose ps command, which shows the hostname:port of the service endpoint of the deployed Watson NLP Runtime. You can access the Swagger UI in a browser at http://<hostname>:<port>/swagger to interact with the REST API resources provided by the Watson NLP Runtime.
docker --context myecscontext compose --project-name sample-project ps
You can also check the application logs with the docker compose logs command.
docker --context myecscontext compose --project-name sample-project logs
Step 13: Use curl to make a REST call
When the Watson NLP Runtime service is ready, you can send an inference request to the REST API endpoint using a curl command.
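The HOSTNAME and PORT variables in the command below refer to the service endpoint reported by docker compose ps. A sketch of splitting that endpoint into the two variables (the endpoint value here is a hypothetical placeholder):

```shell
# Split a hostname:port endpoint (as reported by `docker compose ps`)
# into HOSTNAME and PORT. The endpoint below is a hypothetical placeholder.
ENDPOINT="example-lb.us-east-1.elb.amazonaws.com:8080"
HOSTNAME="${ENDPOINT%:*}"   # everything before the last colon
PORT="${ENDPOINT##*:}"      # everything after the last colon
echo "$HOSTNAME $PORT"
```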
curl -s -X POST "http://${HOSTNAME}:${PORT}/v1/watson.runtime.nlp.v1/NlpService/SyntaxPredict" \
  -H "accept: application/json" \
  -H "grpc-metadata-mm-model-id: syntax_izumo_lang_en_stock" \
  -H "content-type: application/json" \
  -d "{ \"rawDocument\": { \"text\": \"This is a test.\" }, \"parsers\": [ \"TOKEN\" ]}" \
  | jq -r .
Tip:
- The value of the grpc-metadata-mm-model-id header should match the folder name of the model when it was downloaded and saved in ./models in Step 2.
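As a sketch of the relationship between the image names in Step 2 and the model ID: the folder name appears to be the image name minus the watson-nlp_ prefix and the tag. This stripping rule is an assumption based on the image and model-id names used in this walkthrough:

```shell
# Derive the model ID (folder name under ./models) from a pretrained-model
# image name. The prefix/tag stripping is an assumption based on the
# names used in this walkthrough.
IMAGE="watson-nlp_syntax_izumo_lang_en_stock:1.0.6"
MODEL_ID="${IMAGE%%:*}"            # drop the ":1.0.6" tag
MODEL_ID="${MODEL_ID#watson-nlp_}" # drop the "watson-nlp_" prefix
echo "$MODEL_ID"  # syntax_izumo_lang_en_stock
```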
If you get a JSON response containing the tokenized syntax analysis results, the Watson NLP Runtime is working properly.
Clean up
Don’t forget to clean up afterwards with the docker compose down command, to avoid paying for cloud resources you no longer need.
docker --context myecscontext compose --project-name sample-project down
Conclusion
Now you’ve seen how easy it is to run a single-container Watson NLP runtime service with a set of pretrained models baked in, on Amazon ECS with AWS Fargate, or locally on Docker, using the same Compose files and Docker commands you are familiar with. You can also seamlessly switch between cloud and local deployment, whichever is more suitable for the task at hand.
To learn more about how to embed Watson NLP into your applications to provide enhanced user experience through powerful AI models, go to Embeddable AI on IBM Developer.
Not sure what you need or where to start? Try this guided wizard to find, try, train, use and embed IBM AI into your product.
IBM Business Partners can also browse the collection of Embeddable AI self-serve assets at IBM Technology Zone.