Deploying a WKS Model into NLU: Whys and Wherefores

Reginald Raye
Nov 25, 2019 · 4 min read

By Reggie Raye and Sean Sodha

Watson Natural Language Understanding (NLU) is a powerful API that allows developers to analyze text input along dimensions such as categories, entities, relations, keywords, sentiment, and more. These text enrichments are all available “out of the box” to NLU users. Suppose, however, that you want your NLU API calls to return information with more granularity than comes standard — what then?

Let's look at an example text:

“Game of Thrones” took home the Emmy for outstanding drama 10 years after the show first aired. The win was preceded by Peter Dinklage, who took home his fourth Emmy for his role as Tyrion Lannister.

When running the NLU Entities API call on this piece of text, the following entities are returned:

You can see that some relevant entities are detected, but not very accurately: “Thrones” is tagged as a television show instead of the full title, and both “Peter Dinklage” and “Tyrion Lannister” are tagged simply as people.
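For reference, here is a minimal sketch of the request body that produces these out-of-the-box results. NLU's `/v1/analyze` endpoint takes the text plus a `features` object; an empty `entities` object asks for the standard entity model. (Endpoint URL and credentials are omitted here.)

```python
import json

# Request body for NLU /v1/analyze with the standard entities feature.
payload = {
    "text": (
        '"Game of Thrones" took home the Emmy for outstanding drama '
        "10 years after the show first aired."
    ),
    "features": {
        "entities": {}  # empty object = use NLU's built-in entity model
    },
}

# This JSON would be POSTed to the /v1/analyze endpoint.
print(json.dumps(payload, indent=2))
```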

How can we improve these results?

Watson Knowledge Studio (WKS), a companion NLP tool, provides a ready answer. The purpose of WKS is to output custom models that extract information from domain-specific text — for example, documents hailing from a technical sales department or an oncology center. Custom models consist of a set of entities and relations, collectively referred to as a type system. A type system, in turn, can be deployed to NLU, where it gives you the ability to enrich your text with unprecedented precision.

If your use case requires you to connect an entity like “unilateral phase detractor” to another entity like “Rockwell Retro Encabulator” via a relation like “isContainedIn”, a WKS custom model is the solution for you! A custom model with these entities and relations, deployed inside NLU, allows you to identify all such occurrences in your text.
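To make this concrete, here is a hand-written sketch of the kind of relation result NLU can return once such a model is deployed. The entity types (`Component`, `Device`) and the relation name are the made-up examples from above, not real NLU types; the nested `arguments`/`entities` shape mirrors NLU's relations output.

```python
# Illustrative relation result using the hypothetical type system above.
relation = {
    "type": "isContainedIn",
    "arguments": [
        {"entities": [{"type": "Component", "text": "unilateral phase detractor"}]},
        {"entities": [{"type": "Device", "text": "Rockwell Retro Encabulator"}]},
    ],
}

# Pull out the two ends of the relation.
subject = relation["arguments"][0]["entities"][0]["text"]
obj = relation["arguments"][1]["entities"][0]["text"]
print(f"{subject} --{relation['type']}--> {obj}")
```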

OK, this is the perfect answer for you — what now?

First things first: build your WKS custom model. Because your use case requires identifying entities and relations that are apparent only to subject matter experts (SMEs), SMEs are needed to annotate the desired entities and relations in representative documents. After enough annotations have been made, the model can be tested, refined as necessary, and deployed to NLU.

Entities (blue) and relations (grey) as identified by a WKS custom model

Here are the steps in a bit more detail:

  1. Based on a set of domain-specific source documents, the team creates a type system that defines entity types and relation types for the information of interest to the application that will use the model.
  2. A group of human annotators annotates a small set of source documents to label words that represent entity types, and to identify relation types where the text identifies relationships between entity mentions. Any inconsistencies in annotation are resolved, and one set of optimally annotated documents is built, which forms the ground truth.
  3. Watson Knowledge Studio uses the ground truth to train a model.
  4. The trained model is used to find entities, relations, and coreferences in new, never-before-seen documents.

Workflow for developing a machine learning custom model in WKS

Let’s say you’ve built your custom model, love its performance, and want to leverage its type system as part of the entity and relation enrichments in NLU. Here’s what comes next:

  1. WKS gives you a model ID, which you will need to save.
  2. Next, deploy the model from WKS to your NLU instance. This can be done easily within WKS's user interface.
  3. When making an NLU API call, pass the model ID in the entities parameter; it overrides the standard entity detection and searches for your custom entities instead.

Let's look back at the first example and try it with a custom model:

UI showing an NLU API call result

We can see improved accuracy in the detection of entities. With our new custom model, Watson has learned that “Game of Thrones” is the full name of the television series, that “Peter Dinklage” is an actor, and that “Tyrion Lannister” is a role rather than a person. These subtle differences can make a world of difference when it comes to analyzing one's textual data.
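As an illustration, here is a hand-written response in the shape NLU returns, showing the custom results described above; the entity types (`TelevisionShow`, `Actor`, `Role`) are examples of what a type system like this might define, not guaranteed labels.

```python
# Illustrative entities result from the custom model (hand-written).
sample_response = {
    "entities": [
        {"type": "TelevisionShow", "text": "Game of Thrones"},
        {"type": "Actor", "text": "Peter Dinklage"},
        {"type": "Role", "text": "Tyrion Lannister"},
    ]
}

# Map each detected mention to its custom type.
detected = {e["text"]: e["type"] for e in sample_response["entities"]}
print(detected)
```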

We’re not currently aware of any unsupervised modeling tool on the market — regardless of vector dimension or sequence transduction method — that can enrich text with this level of accuracy and domain specificity. We hope you’ll give WKS and NLU a test drive to see just how well these companion tools can work for you.

Find out more about WKS and NLU and get started building on the IBM Cloud for free today!

IBM Watson: AI Platform for the Enterprise

Written by Reginald Raye, IBM Watson Knowledge Studio Product Manager
