Watson Assistant API: From V1 to V2

The new Watson Assistant experience is powered by a redeveloped API. This article outlines what’s changed and what hasn’t in the evolution from V1 to V2.

Andy Stoneberg
IBM watsonx Assistant
13 min read · Oct 13, 2022


© IBM 2022

Watson Assistant is leaning hard into the new experience, and a key element of the new information architecture is the V2 API. Lots of exciting new features to manage your assistant’s lifecycle, and bring value to your clients, are available through the new experience and the V2 API.

If you find yourself stuck on the V1 API because you fear what changes lurk in the new experience, I hope to illustrate how to leverage the V2 API in your application and give you the confidence to make the switch!

The V1 API

Let’s review how to get up and running with the V1 API. We will step through, in painstaking detail, what requests and responses are involved in creating your workspace, ensuring the natural language understanding (NLU) models have been successfully trained, and leveraging the models to classify utterances provided by a user. That journey looks like the following:

Simple V1 API Workflow

A workspace is the fundamental building block of the V1 API. All training data used in intent classification and entity detection is provided as part of a workspace.

To get started, a user must create a workspace with relevant data.

POST v1/workspaces
  • From the above screenshot, we can see a workspace was created with a workspace_id value of 3c9f7e53-a225-47f0-89c9-24ac5e041098
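For readers who prefer code to screenshots, here is a minimal sketch of how such a request could be assembled with the Python standard library. The service URL, API key, version date, and all training data are placeholders I made up for illustration; only the attribute names (intents, entities, dialog_nodes) mirror the workspace format discussed here. The request is built but not sent.

```python
import base64
import json
import urllib.request

# Assumptions (not from the article): placeholder URL, key, and version date.
SERVICE_URL = "https://api.us-south.assistant.watson.cloud.ibm.com/instances/YOUR_INSTANCE_ID"
API_KEY = "YOUR_API_KEY"
API_VERSION = "2021-11-27"


def build_create_workspace_request(name, intents, entities, dialog_nodes):
    """Build (but do not send) a POST v1/workspaces request."""
    body = {
        "name": name,
        "intents": intents,
        "entities": entities,
        "dialog_nodes": dialog_nodes,
    }
    # Basic auth with the literal user name "apikey" is assumed here.
    token = base64.b64encode(f"apikey:{API_KEY}".encode()).decode()
    return urllib.request.Request(
        url=f"{SERVICE_URL}/v1/workspaces?version={API_VERSION}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical training data: one intent, one entity, one dialog node.
request = build_create_workspace_request(
    name="Demo workspace",
    intents=[{"intent": "greeting", "examples": [{"text": "hello"}, {"text": "good morning"}]}],
    entities=[{"entity": "beverage", "values": [{"value": "coffee", "synonyms": ["espresso"]}]}],
    dialog_nodes=[{
        "dialog_node": "greet",
        "conditions": "#greeting",
        "output": {"generic": [{"response_type": "text", "values": [{"text": "Hi there!"}]}]},
    }],
)

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       workspace_id = json.load(response)["workspace_id"]
```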

Once a workspace is created, the system will begin to train a set of machine learning (ML) and NLU models based on the included training data. Before the /message endpoint can be used, we want to ensure the NLU models are Available.

GET v1/workspaces/{workspace_id}
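In practice a client would poll this endpoint until training completes. The sketch below shows one way to do that, assuming a caller-supplied fetch_status function (hypothetical) that performs the GET and returns the workspace's status string. Treating "Failed" as a terminal error is my assumption, as are the polling interval and attempt limit.

```python
import time


def wait_until_available(fetch_status, poll_seconds=5.0, max_attempts=60, sleep=time.sleep):
    """Poll fetch_status() until it returns "Available".

    fetch_status is a hypothetical callable that issues
    GET v1/workspaces/{workspace_id} and returns the status string.
    Returns the number of polls that were needed.
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        if status == "Available":
            return attempt + 1
        if status == "Failed":  # assumed terminal state
            raise RuntimeError("workspace training failed")
        sleep(poll_seconds)
    raise TimeoutError(f"workspace not Available after {max_attempts} polls")


# Usage with a fake status source, so the example runs without a network call.
fake_statuses = iter(["Training", "Training", "Available"])
polls = wait_until_available(lambda: next(fake_statuses), sleep=lambda seconds: None)
```

Injecting the sleep function keeps the loop easy to test; a real client would leave the default in place.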

Once the NLU models are Available, we are ready to invoke the /message API. A V1 /message API request is composed of three types of information:

  • input
  • context
  • user_id

input determines how the system processes the request. It contains the “utterance” of the user (input.text) plus optional flags that influence the processing or response of the API call.

context is additional information that can be associated with the multi-turn conversation throughout its duration and be leveraged by client applications. It does NOT influence the behavior of Watson Assistant when processing a request.

user_id is a unique identifier for a given user interacting with your assistant. While the system provides a randomly generated string if this value is not specified, it is a good idea for your system to identify users. The majority of Watson Assistant plans are billed based on “users” as identified by this attribute — so relying on randomly generated identifiers can lead to unnecessary charges to your IBM Cloud account!

For an initial /message API request, it’s likely context is not very interesting.

POST v1/workspaces/{workspace_id}/message
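As a rough sketch, an initial request body with the three parts described above could be built like this. The helper name, the utterance, the user_id, and the value assigned to something_arbitrary are all made up for illustration.

```python
import json


def build_v1_message(text, user_id, context=None):
    """Assemble a V1 /message request body from input, context, and user_id."""
    body = {"input": {"text": text}, "user_id": user_id}
    if context:
        body["context"] = context
    return body


# Hypothetical values: a stable user_id avoids randomly generated identifiers.
initial_request = build_v1_message(
    "Hello",
    user_id="customer-1234",
    context={"something_arbitrary": "any value the client cares about"},
)
print(json.dumps(initial_request, indent=2))
```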

The response for the above request contains a lot more information because of the system processing the message.

To help understand what is being returned, let’s break the response into two groups: everything except context and context.

First, the non-context information:

POST v1/workspaces/{workspace_id}/message Response
  • intents is the collection of intents classified based on the user’s utterance in reference to the trained NLU models
  • entities is the collection of entities detected based on the user’s utterance in reference to the trained NLU models
  • input is merely a reflection of what was provided in the request
    - The spellcheck feature can augment this attribute to provide additional information
  • output is constructed based on evaluating the intents/entities/context against the dialog_nodes provided in the training data of the workspace. This attribute then provides the information around how a bot should respond to an end user
  • user_id reflects the value provided in the request (or a system generated value if no value was provided)

Now, the context information:

POST v1/workspaces/{workspace_id}/message Response Context
  • conversation_id, system, and metadata are important for the duration of a given user’s multi-turn conversation — and typically should not be modified
  • init / greeted / a_null_var are user_defined context variables configured in a dialog_node of the workspace to populate. In V1 /message, user_defined attributes can appear anywhere in context
  • something_arbitrary is the user_defined context variable that was specified in the body of the request

Based on the example above, any subsequent /message API request needs to “carry forward” the context. While user defined attributes can be modified, conversation_id, system, and metadata should be preserved.

POST v1/workspaces/{workspace_id}/message Subsequent Request
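The carry-forward can be sketched as a small helper that copies the previous response's context into the next request. The helper name and all sample values are hypothetical; the only rule taken from the discussion above is that conversation_id, system, and metadata are left untouched while user-defined attributes may change.

```python
import copy

# Context attributes the client should carry forward unmodified.
PRESERVED_KEYS = ("conversation_id", "system", "metadata")


def build_v1_followup(previous_response, text, context_updates=None):
    """Build the next V1 /message body from the previous response."""
    context = copy.deepcopy(previous_response["context"])
    for key, value in (context_updates or {}).items():
        if key in PRESERVED_KEYS:
            raise ValueError(f"{key} must be preserved, not modified")
        context[key] = value
    return {
        "input": {"text": text},
        "context": context,
        "user_id": previous_response["user_id"],
    }


# Hypothetical previous response, trimmed to the parts the helper reads.
previous_response = {
    "user_id": "customer-1234",
    "context": {
        "conversation_id": "hypothetical-conversation-id",
        "system": {"placeholder": "system-managed state"},
        "metadata": {"placeholder": "system-managed metadata"},
        "init": True,
        "greeted": True,
        "a_null_var": None,
        "something_arbitrary": "any value",
    },
}

followup = build_v1_followup(
    previous_response,
    "I would like a coffee",
    context_updates={"something_arbitrary": "updated by the client"},
)
```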

This propagation of context into a subsequent /message API request is what we refer to as V1 /message being stateless. The system does not remember any stateful information from one /message API request to the next, meaning all the necessary information must be provided by the client to advance a multi-turn conversation.

The response for the above request contains information similar to what we saw on the initial request, but relevant to the specific point of the conversation based on the dialog_nodes configuration.

Again, we will first look at the non-context information:

POST v1/workspaces/{workspace_id}/message Subsequent Response
  • The explanation of the above attributes is identical to the analysis of the initial /message API invocation:
    - The actions attribute present in this response is due to a Dialog Callout I have configured on the dialog_node that handled this request
    - ℹ️ Note: this actions attribute is completely unrelated to the actions skill that will be discussed in the following section

And now, context.

POST v1/workspaces/{workspace_id}/message Subsequent Response
  • As before, the explanation of the above attributes is identical to the analysis of the initial /message API invocation

Congratulations, you are now intimately familiar with the structure of the V1 /message API!

Let’s “up the ante” and take a similarly analytical approach to the V2 API.

The V2 API

Building on the prior section, a comparable V2 workflow has an additional step. First, an assistant is created (which automatically creates a draft and a live environment), then an actions skill.

Then, we are on more familiar footing: the training status needs to be checked to confirm various NLU models are available before user utterances can be analyzed. Again, for the “picture people” out there, the following diagram illustrates what we are going to work through.

Simple V2 API Workflow

While in V1 a workspace was the fundamental building block of the API, the V2 API has been designed to handle more complex use cases. The V2 API works seamlessly over a variety of channels, integrations, and customer-defined webhooks. With extensibility a design point, the API can more easily adapt to future functional enhancements with minimal effort required for customers to adopt.

In V2, the fundamental building blocks are skills and environments.

  • A skill is a resource that represents a particular runtime capability of Watson Assistant. Today we support skill types of dialog, actions, and search:
    - The dialog skill is the spiritual successor to the workspace
    - The actions skill is the evolution of the dialog skill
    - The search skill integrates Watson Discovery into the V2 /message API
  • An environment is a resource that organizes and orchestrates the invocation of skills

The assistant is then the container for skills, environments, and other resources that help create and manage a production assistant. Again, the assistant does not orchestrate messages — that role is served by an environment.

Assistants define the scope in which a skill may be accessed. For skills to be invoked via /message, a skill must be assigned to an environment. This relationship between a skill and an environment is referred to in the API as a skill reference.

To get started in V2, a user need only create an assistant. Once the assistant is created, the system will automatically:

  • create 2 environments
    - draft: automatically updates as changes are made to any skills
    - live: only updates via explicit user action
  • create (empty) actions and search skills defined on the assistant
  • assign the actions skill to the draft environment
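To make the relationships concrete, here is a toy in-memory model of what the list above describes. It is not the service's schema: the class names, field names, and generated skill identifiers are all invented for illustration, and a skill reference is modeled as nothing more than a skill_id stored on an environment.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Skill:
    skill_id: str
    type: str  # "dialog", "actions", or "search"


@dataclass
class Environment:
    name: str
    # Skill references: ids of the skills this environment may invoke.
    skill_references: List[str] = field(default_factory=list)


@dataclass
class Assistant:
    name: str
    skills: Dict[str, Skill] = field(default_factory=dict)
    environments: Dict[str, Environment] = field(default_factory=dict)


def create_assistant(name):
    """Mimic the automatic setup performed when an assistant is created."""
    assistant = Assistant(name)
    for env_name in ("draft", "live"):
        assistant.environments[env_name] = Environment(env_name)
    for skill_type in ("actions", "search"):
        skill = Skill(skill_id=f"{skill_type}-skill", type=skill_type)
        assistant.skills[skill.skill_id] = skill
    # Only the actions skill is assigned, and only to the draft environment.
    assistant.environments["draft"].skill_references.append("actions-skill")
    return assistant


assistant = create_assistant("Demo assistant")
```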

While all this might seem like information overload, getting up and running has never been easier through the New Watson Assistant experience! As we dive into what this looks like from an API experience to compare and contrast with the V1 perspective, please note not all the APIs I reference here are publicly available (yet). For those that are not publicly available, the Watson Assistant Tooling serves those use cases well.

ℹ️ The following use case, for simplicity, will only leverage the draft environment. Working amongst different environments within an assistant will be covered in a later article.

First, we create an assistant:

  • Again, the creation of the assistant does a lot of heavy lifting in the background creating and setting up additional resources
POST v2/assistants (⚠️ private api subject to change⚠️)
  • There is a lot of information embedded in this response, but it’s out of scope for the task at hand. I hope to dive into these details at some point in the future. For now, the relevant information to note is the assistant_id value of d27fb9ca-ae10-4f39-b36e-db467fbe58f8 and the draft environment_id value of 66ab8757-d410-41d2-9ce0-00e7e11e51bd

With an assistant created, we can now add a dialog skill to do a more apples-to-apples comparison to the V1 scenario above.

While the New Watson Assistant experience prefers the actions skill as the training data container, a dialog skill can be added alongside the actions skill to help support established systems. When a dialog skill is active on the assistant, it acts as the primary skill and can delegate to the actions skill where appropriate.

POST v2/assistants/{assistant_id}/skills (⚠️ private api subject to change⚠️)
  • From the above screenshot, we can see a skill was created with a skill_id value of b83e1306-4550-46c3-9ed7-4e5b1ed6b973
  • Notice the request payload has a workspace attribute that includes training data in the same format we observed for a V1 workspace
  • The status attribute present at the root of the response reflects the state of the skill. In V2, in order to more reliably handle larger payloads, the API can work asynchronously. The root level status attribute indicates if the asynchronous processing of the skill is complete

As with V1, you want to make sure your skill has finished training before invoking the V2 /message API. While the Watson Assistant Tooling provides a beautiful banner to keep you informed, a peek behind the curtain from an API standpoint looks like this:

POST v2/assistants/{assistant_id}/skills/{skill_id} (⚠️ private api subject to change⚠️)
  • The status attribute embedded under the workspace_reference property is analogous to the status attribute from V1

With our assistant now configured with a properly trained dialog skill, we are ready to invoke the V2 /message API! To appreciate the differences in V2, we will replicate the same scenario we played out for the V1 API analysis.

Before we start to dive into JSON payloads again, I want to briefly call out another important difference in the V2 API. The /message API in V2 comes in 2 different varieties: stateful and stateless.

Please note the following examples will leverage the stateless variant of the V2 /message API. Now, let’s start to see what this looks like in practice by sending our initial call!

⚠️ One last explanatory interlude:

In the following screenshots, a discerning reader may notice and wonder why we are providing the environment_id value in the v2/assistants/{id}/message endpoint.

This is supported in order to allow existing customers to adopt the new Watson Assistant experience with limited changes. In time, we will be rolling out a POST v2/assistants/{assistant_id}/environments/{environment_id}/message endpoint that more naturally maps to the new data model.

POST v2/assistants/{environment_id}/message
  • input closely follows its V1 equivalent, except now a message_type: text attribute is included. Today, text is the only supported value for message_type, but the attribute leaves open the ability to support non-textual input in the future
  • context is more expressive now in order to differentiate across multiple skills. The skills attribute encompasses all context across any skill associated with the assistant. Valid attributes for the skills object are:
    - main skill (dialog skill)
    - actions skill (actions skill)
    - search skill (search skill)
    ℹ️ Yes, I do regret and humbly apologize for the whitespace included in those attribute names!
  • Within a given skills child object, any attributes you wish to provide to the system should appear under the user_defined attribute. This additional level of nesting ensures new features or attributes introduced by the Watson Assistant development team won’t interfere with behavior you rely upon!
  • user_id behaves in the same manner as V1.
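Putting those bullets together, a V2 stateless request body could be assembled roughly as follows. This is a sketch with a made-up helper name and sample values; the attribute names (message_type, context.skills, main skill, user_defined) are the ones described above, whitespace included.

```python
import json


def build_v2_message(text, user_id, user_defined=None, skill="main skill"):
    """Assemble a V2 stateless /message body.

    Client-supplied context is nested under
    context.skills[<skill>].user_defined, as described above.
    """
    body = {
        "input": {"message_type": "text", "text": text},
        "user_id": user_id,
    }
    if user_defined:
        body["context"] = {"skills": {skill: {"user_defined": dict(user_defined)}}}
    return body


# Hypothetical values mirroring the V1 example.
v2_request = build_v2_message(
    "Hello",
    user_id="customer-1234",
    user_defined={"something_arbitrary": "any value the client cares about"},
)
print(json.dumps(v2_request, indent=2))
```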

Let’s dive into the response. Again, as we did with V1, let’s break the response into two groups: everything except context and context.

POST v2/assistants/{environment_id}/message Response

With the V1 response in mind, there are a few key differences to call out here:

  • The intents collection is now nested under output
  • The entities collection is now nested under output
  • Both intents and entities have a skill attribute as a result of configuring the assistant with a dialog skill. The skill attribute has a value aligned with the context.skills child attributes discussed as part of the V2 /message request payload. This attribute informs you, in more complex cases when a dialog skill delegates to an actions skill, how the system is interpreting the request
  • The input element is no longer “echoed back” in the response
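A client that has to cope with both response shapes during a migration could hide the difference behind a small accessor. The sketch below assumes only what the list above states (top level in V1, nested under output in V2); the sample responses are trimmed, hypothetical payloads.

```python
def get_classification(response, key):
    """Return the intents or entities list from a V1 or V2 /message response.

    V1 places these collections at the top level of the response;
    V2 nests them under output.
    """
    if key in response:
        return response[key]
    return response.get("output", {}).get(key, [])


# Hypothetical, heavily trimmed responses.
v1_response = {
    "intents": [{"intent": "greeting", "confidence": 1.0}],
    "entities": [],
    "output": {"generic": []},
}
v2_response = {
    "output": {
        "intents": [{"intent": "greeting", "confidence": 1.0, "skill": "main skill"}],
        "entities": [],
        "generic": [],
    },
}

v1_intents = get_classification(v1_response, "intents")
v2_intents = get_classification(v2_response, "intents")
```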

Next, the context information:

POST v2/assistants/{environment_id}/message Response

This is a significant departure from the V1 context but enables greater extensibility and coherency. Let’s look into what is going on here:

  • context is divided into 2 categories: global and skills
  • context.global.system acts much like a system attribute in V1. Namely, it’s not meant to be messed with. To further reinforce the importance of not modifying this information, the majority of it is encoded. Values rendered readably provide necessary information to facilitate debugging but still should not be modified
  • context.global.system.session_id serves the same purpose as context.conversation_id in the V1 API
  • skills.main skill.user_defined clearly captures the values provided by the client, both from the dialog skill definition and any additional context defined in the request
  • skills.main skill.system acts as you would probably expect given it’s named system: it’s internal data leveraged by the Watson Assistant service that is not intended to be modified

As with the V1 API, any subsequent /message API request needs to “carry forward” the context. While context.skills.* skill.user_defined attributes can be modified, context.global.system and context.skills.* skill.system values should be preserved.

POST v2/assistants/{environment_id}/message Subsequent Request
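The same carry-forward idea in V2 can be sketched as below: copy the whole context from the previous response and touch only the user_defined attributes of the relevant skill. The helper name and the sample payload (including the contents of the system objects) are placeholders of mine, not real service output.

```python
import copy


def build_v2_followup(previous_response, text, user_id,
                      user_defined_updates=None, skill="main skill"):
    """Build the next V2 stateless /message body from the previous response.

    context.global.system and context.skills[*].system are copied as-is;
    only user_defined values for the given skill are updated.
    """
    context = copy.deepcopy(previous_response["context"])
    skill_context = context.setdefault("skills", {}).setdefault(skill, {})
    skill_context.setdefault("user_defined", {}).update(user_defined_updates or {})
    return {
        "input": {"message_type": "text", "text": text},
        "context": context,
        "user_id": user_id,
    }


# Hypothetical previous response, trimmed to its context.
previous_v2_response = {
    "context": {
        "global": {"system": {"session_id": "hypothetical-session-id"}},
        "skills": {
            "main skill": {
                "user_defined": {"greeted": True, "something_arbitrary": "any value"},
                "system": {"placeholder": "encoded, system-managed state"},
            }
        },
    }
}

v2_followup = build_v2_followup(
    previous_v2_response,
    "I would like a coffee",
    user_id="customer-1234",
    user_defined_updates={"something_arbitrary": "updated by the client"},
)
```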

This example, in order to align with what was outlined in the V1 API, leverages the V2 stateless /message API. As such, propagation of context into a subsequent /message API request is still necessary. Out of scope to this article, there is also a stateful variant of the /message API available in V2.

Once again, we will first look at the non-context information of the response:

POST v2/assistants/{environment_id}/message Subsequent Response

The explanation of the above attributes is identical to the analysis of the initial V2 /message API invocation. As with V1, the actions attribute present in the output attribute of the response is due to a Dialog Callout I have configured on the dialog_node of the dialog skill that handled this request.

And, lastly, the context information.

POST v2/assistants/{environment_id}/message Subsequent Response

No real surprises here. The explanation of the above is identical to the analysis of the initial V2 /message API invocation.

Pat yourself on the back. That was an intensive API overview.

Let’s summarize everything we have discussed and highlight some additional differences not covered in the trivial example above.

Key Differences

Following the Pareto Principle, it is not unreasonable to assume this review of the differences between V1 and V2 API behavior covers 80% of the use cases a client would be interested in. That said, failure to appreciate differences on (API) paths less travelled can still result in a very frustrating debugging experience for a Watson Assistant customer, and potentially an unhappy user of that customer’s assistant.

In the interest of complete disclosure, the following tables outline all the differences you can expect on request and response payloads:

/message schema differences

API Versions

Until now, we have focused on the major API version differences. However, there is an additional piece to this puzzle: the minor API version. While minor versions are “tweaks” (whereas major versions are typically complete overhauls), failure to account for minor API version differences can still break your application!

It’s not far-fetched to think, for an established customer that has been happy with Watson Assistant, that a given client application is not using the latest supported minor API version. While minor API version releases are always documented in the IBM Watson Assistant Release Notes, there is a lot of other information unrelated to the API that is documented. This makes it sometimes tricky to identify what you should be aware of when upgrading from an old minor version.
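One way to stay in control of minor version changes is to pin the version explicitly in client code rather than scattering it across requests. The sketch below assumes the minor version is the date-formatted version query parameter on each request; the service URL and the specific date are placeholders, and the helper name is mine.

```python
from datetime import date
from urllib.parse import urlencode

# Assumption: the minor API version is a YYYY-MM-DD date sent as ?version=...
PINNED_VERSION = "2021-11-27"  # placeholder date, chosen for illustration


def message_url(service_url, environment_id, version=PINNED_VERSION):
    """Build a V2 stateless /message URL with an explicit minor version."""
    date.fromisoformat(version)  # raises ValueError if not a valid date
    query = urlencode({"version": version})
    return f"{service_url}/v2/assistants/{environment_id}/message?{query}"


url = message_url(
    "https://example.test/instances/YOUR_INSTANCE_ID",
    "66ab8757-d410-41d2-9ce0-00e7e11e51bd",
)
```

Keeping the date in a single constant means an upgrade to a newer minor version is one reviewed change, which makes it easier to test against the differences summarized below before rolling forward.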

Lucky for you, I have done all the necessary scrolling and squinting required to summarize the relevant API changes below! Changes come in a couple different flavors:

  • Schema changes
    - request/response attributes outright removed, or with changes to data type
  • Data type changes
    - specific type of schema change that only involves the data type of the attribute changing
  • Behavioral changes
    - request/response schemas unchanged, but response content could be different

I have tried to explicitly call out what type of changes occur in each API version in the table below.
- The Release Notes column is a link to the official Watson Assistant Release Notes for the given API version.

Supported API versions summary table
