Explaining a Chatbot Assistant at runtime with IBM Watson OpenScale — An end-to-end example

Damian Cummins · Published in The Startup · 7 min read · Oct 13, 2019


The ability to explain why a Chatbot Assistant understood a particular intent from a user’s message can be helpful when you are looking to evaluate its performance in production.

In this post I’ll walk through an end-to-end example of a Node.js Chatbot Assistant application, complete with a React.js UI, Express.js REST APIs, and a TensorFlow.js NLP classification model. Then, I’ll demonstrate how to integrate the app with the IBM Watson OpenScale service in order to make use of its powerful Explainability feature and better understand the predictions of the underlying intent classifier.

Before we dive in, let’s take a look at how the different parts fit together.

The System

The Node.js app uses Express to serve 3 endpoints that are consumed by the Front End, and by the Watson OpenScale service:

  • GET https://<host>/ - The base path exposes the Server-Side Rendered React.js UI.
  • GET https://<host>/v1/deployments - This API is used to discover the deployed assistant’s metadata: things like the message prediction endpoint and the kind of problem the underlying model solves (i.e. structured/unstructured data, classification/regression, etc.).
  • POST https://<host>/v1/deployments/assistant/message - This API is used to send new messages to the assistant. The response body will include the reply, the predicted intent, and the raw probabilities from the classifier.

The last two APIs follow the Watson OpenScale Custom Serve Engine Specification, which must be adhered to in order for Watson OpenScale both to discover the assistant deployment and to send it messages to be classified.

The Model

For the purpose of this example, the ML model is a TensorFlow neural network multi-class classifier, trained to classify intents from text encoded with the Bag-of-Words technique. With Bag-of-Words, each message becomes a fixed-length vector indicating which vocabulary words it contains, so word order is discarded. We will see later, from the explanations, why this approach could be improved.

For this example, the model has been trained to classify questions about my resume with 9 possible classes:

['education', 'experience', 'goodbye', 'greeting', 'hobbies', 'options', 'projects', 'skills', 'thanks']

The following JSON snippet shows how both message patterns (used for model training and for encoding new messages at runtime) and message responses (used by the app at runtime) can be labelled by intent in an easily consumable format:
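The original embedded snippet isn’t reproduced here, so here is an illustrative sketch of the structure, showing two of the nine intents; the tag/patterns/responses key names and the sample text are assumptions, not the exact training data:

{
  "intents": [
    {
      "tag": "greeting",
      "patterns": ["Hi", "Hello", "Good morning"],
      "responses": ["Hello! Ask me anything about my resume."]
    },
    {
      "tag": "education",
      "patterns": ["Do you have a degree?", "Where did you study?"],
      "responses": ["I studied Computer Science at university."]
    }
  ]
}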

The model creation is outside the scope of this post, but feel free to take a look at the notebook here.

The trained model has been saved and converted to the appropriate format to be loaded by the TensorFlow.js library.
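As a rough sketch of those steps (assuming the model was trained in Keras and saved as model.h5; paths are illustrative), the conversion is done with the tensorflowjs_converter CLI and the result is loaded with tfjs-node:

$ tensorflowjs_converter --input_format=keras model.h5 ./model

// Then, inside the Node.js app (in an async context):
const tf = require('@tensorflow/tfjs-node');
const model = await tf.loadLayersModel(`file://${__dirname}/model/model.json`);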

The Application

In this section, we will take a look at some of the key parts of the code that make up a single Node.js web application which serves both the React.js chatbot UI, and the intent classification model. The complete code for this example can be found here.

Routes

First, let’s take a look at the Express routes.

Here we can see the GET / route that is responsible for sending the rendered <App/> component as part of a text/html response. Take a look here to learn more about Server Side Rendering with React.
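The embedded code isn’t shown here, but a minimal sketch of such a route, assuming an App component and the standard ReactDOMServer API, looks something like this:

const React = require('react');
const ReactDOMServer = require('react-dom/server');
const App = require('./components/App'); // component path assumed

// GET / renders the <App/> component to an HTML string and sends it
// back as a text/html response, along with the client-side bundle.
app.get('/', (req, res) => {
  const markup = ReactDOMServer.renderToString(React.createElement(App));
  res.set('Content-Type', 'text/html');
  res.send(`<!DOCTYPE html>
<html>
  <body>
    <div id="root">${markup}</div>
    <script src="/bundle.js"></script>
  </body>
</html>`);
});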

This route handles requests to GET /v1/deployments, responding with the metadata that Watson OpenScale requires to send messages to the underlying classifier.
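A sketch of that route is below; the exact response shape must follow the Custom Serve Engine Specification, so treat the field names here as indicative rather than definitive:

app.get('/v1/deployments', (req, res) => {
  res.json({
    count: 1,
    resources: [{
      metadata: { guid: 'assistant' },
      entity: {
        name: 'Chatbot Assistant',
        description: 'Intent classifier for resume questions',
        // Where Watson OpenScale should send messages to be classified.
        scoring_url: 'https://<host>/v1/deployments/assistant/message',
        asset_properties: {
          problem_type: 'multiclass',
          input_data_type: 'unstructured_text'
        }
      }
    }]
  });
});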

Then, the route that handles requests to POST /v1/deployments/assistant/message sends the incoming payload to the assistantUtil.js module for classification, then sends the message, the prediction, the probabilities of each intent, and a unique message ID (the scoring_id) to Watson OpenScale’s payload logging database.

We only want to log payloads for requests coming from the Front End of the application, as these are the predictions we are interested in explaining, so the UI passes a logPayload=true query parameter. When explaining a classification, Watson OpenScale will invoke this endpoint with 5,000 perturbed messages to be classified. As Watson OpenScale does not set the logPayload flag, these requests will not be sent to the payload logging database.

This application also prints the payload logging request body to the console log so that we can use the scoring_id value to generate explanations later.
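Putting those pieces together, a sketch of the route might look like the following, where logPayload is an assumed helper that POSTs the record to the Watson OpenScale payload logging REST API using the configured data_mart_id and subscription_id:

const { v4: uuidv4 } = require('uuid');
const assistantUtil = require('./assistantUtil');
const { logPayload } = require('./openscaleUtil'); // assumed helper module

app.post('/v1/deployments/assistant/message', async (req, res) => {
  // The request body follows the scoring payload shape, e.g.
  // { fields: ['message'], values: [['Do you have a degree?'], ...] }
  const values = await Promise.all(
    req.body.values.map(([message]) => assistantUtil.predict(message))
  );
  const response = { fields: ['reply', 'intent', 'probabilities'], values };

  // Only log payloads for real user messages from the Front End; the
  // perturbed scoring requests from Watson OpenScale omit this flag.
  if (req.query.logPayload === 'true') {
    const record = { scoring_id: uuidv4(), request: req.body, response };
    console.log(`scoring_id: '${record.scoring_id}',`);
    await logPayload(record);
  }
  res.json(response);
});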

Making Predictions

The key function in the assistantUtil module is the predict function:

The function creates a Bag-of-Words tensor, which is then passed to the model; the model returns an array of probabilities, one for each possible classification (intent). The reply for the intent with the highest probability is retrieved, and the function returns an array containing the reply, the predicted intent, and the full set of probabilities.
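A sketch of the function along those lines is shown below, assuming the model loaded earlier and vocabulary, intents, and responses objects derived from the labelled JSON:

// Encode a message as a binary Bag-of-Words vector over the vocabulary.
function bagOfWords(message, vocabulary) {
  const tokens = message.toLowerCase().split(/\W+/);
  return vocabulary.map(word => (tokens.includes(word) ? 1 : 0));
}

async function predict(message) {
  const input = tf.tensor2d([bagOfWords(message, vocabulary)]);
  // The model returns one probability per intent.
  const probabilities = Array.from(await model.predict(input).data());
  const intent = intents[probabilities.indexOf(Math.max(...probabilities))];
  // Pick one of the canned replies labelled with the winning intent.
  const options = responses[intent];
  const reply = options[Math.floor(Math.random() * options.length)];
  return [reply, intent, probabilities];
}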

Front End

This example chatbot UI makes use of the Progress Kendo React Conversational UI component as a quick way of getting a front end up and running.

The Assistant.js component maintains the messages sent by both the user and the bot in its state. When a user sends a message, it is added to the messages list, then the getMessage function invokes the intent classifier through the POST /v1/deployments/assistant/message?logPayload=true API. The reply is then added to the messages list so that it is displayed in the browser.
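A sketch of that handler, with the request and response shapes assumed to match the API described above, and this.bot standing in for the bot’s author object:

// Inside the Assistant.js component:
getMessage = async (userMessage) => {
  const res = await fetch('/v1/deployments/assistant/message?logPayload=true', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ fields: ['message'], values: [[userMessage.text]] })
  });
  const body = await res.json();
  const [reply] = body.values[0];
  // Append the bot's reply so the Conversational UI re-renders with it.
  this.setState(({ messages }) => ({
    messages: [...messages, { author: this.bot, text: reply, timestamp: new Date() }]
  }));
};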

Service Credentials Config

The application requires a number of configuration values and credentials to interact with Watson OpenScale.

The openscale.data_mart_id, openscale.subscription_id, and openscale.binding_id values will be available once Watson OpenScale has discovered the deployment. The host should match the deployed application’s host name.
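A config.json along these lines is assumed; the placeholder values are filled in from the integration steps below:

{
  "host": "<deployed application host name>",
  "openscale": {
    "data_mart_id": "<data mart id>",
    "binding_id": "<binding id from the Provider Summary>",
    "subscription_id": "<subscription id from Configure Monitors>"
  }
}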

Integration

In this section we will look at configuring the Watson OpenScale service to discover the application.

The application can be deployed as a Cloud Foundry application to IBM Cloud with the command ibmcloud cf push. A new instance of the Watson OpenScale service can be created from the IBM Cloud Catalog.

Creating a Custom Machine Learning Provider

From the Watson OpenScale System Setup page, create a new Machine Learning Provider, selecting the Custom Environment tile and providing the base URL of the deployed app. If your API requires basic auth, you can provide the username and password in this form. If not, simply add dummy values and click Save. The service binding_id value can be found in the Provider Summary.

Adding a new Model Monitor

Navigate to the Dashboard, select the Model Monitors tab, and add a new monitor to the dashboard.

Make sure to select the new Custom Provider and then select the deployment for the assistant application that has been discovered. Clicking Configure Monitors at the end of the wizard will display the rest of the details (subscription_id, deployment_id) required for the application to successfully begin logging payloads.

At this point, update these values in the config.json file and redeploy the application.

Configuring Payload Logging and Model Details

To complete the Payload Logging configuration, open the deployed application and send a message to the chatbot. The application will then log the payload of the message and the predicted intent.

The Model Details tab should then be enabled, where we can tell Watson OpenScale which fields returned by the application should be used as the prediction column, and which should be used as the probabilities column. Once these have been set, the Explainability Monitor will be enabled and the configuration is complete.

Explainability

Now that some payloads have been logged, let’s retrieve the scoring_id for a particular message from the logs and generate an explanation.

$ ibmcloud cf logs damian-assistant-app --recent | grep "scoring_id"
2019-10-10T14:06:02.41+0100 [APP/PROC/WEB/0] OUT scoring_id: '88ef29f6-28ef-42a0-b587-9383920abb0f',
2019-10-10T14:06:19.35+0100 [APP/PROC/WEB/0] OUT scoring_id: 'ac2d6e98-bd46-491f-8353-8019326bd37e',
2019-10-10T14:06:26.60+0100 [APP/PROC/WEB/0] OUT scoring_id: '963b93d9-2b45-49b6-b498-dbcab02ef235',
2019-10-10T14:06:32.67+0100 [APP/PROC/WEB/0] OUT scoring_id: '6939cb56-f916-4fdd-9da3-8e461093c11a',

Copying and pasting one of the Scoring IDs into the Explanation search input kicks off a new Explanation job. Note the added suffix of -1: Watson OpenScale supports logging multiple payloads in a single payload logging request, so suffixes are added to the Scoring ID to uniquely identify each payload.

The Explanation shows the confidence the model has that the message is conveying a particular intent (in this case education), and also shows the feature importance. This is in fact a local interpretation of the model, reflecting the behaviour of the classifier “around” the instance being predicted.

Interestingly, the model appears to place more importance on words like a, you, and do than on words like degree, which may indicate that the model is overfitting to uninformative words in the training data and is therefore less likely to perform well in production. This may be down to the use of the Bag-of-Words approach, which could be improved upon with a more sophisticated word embedding technique.

Conclusion

In this post we have covered how a Chatbot Assistant Node.js application can be built around an unstructured text classification model, how to configure Watson OpenScale to discover and monitor the application, and how to generate explanations of individual messages.
