How to Make Chatbot Orchestration Easier

There’s way more to building an enterprise-grade chatbot than just training data and response content

Mitchell Mason
IBM watsonx Assistant
9 min read · Feb 2, 2021


A few years ago now (wow! time flies), I wrote a blog post outlining the overall architecture for integrating your assistant with just about anything. Watson Assistant at the time, like most developer APIs, provided the natural language interaction, but it required a developer to build an application that connected to the end-user channel/experience, orchestrated back-end services, and handed off to live agents when all else failed.

Most of the time, that architecture looked something like this:

Orchestration App: the old architectural pattern for building a chatbot

Yes, that’s a big squiggly mess with a lot of very “custom” components that connect into a super fragmented customer care tech stack (which is common at most enterprise companies). This fragmented tech stack includes contact center tools, existing content, CRM tools, RPA tools, offline analysis tools, home-grown backend systems, and a bunch more — all of which play a critical role in delivering a personalized and efficient assistant experience, but typically exist in fairly un-coordinated silos. The boxes representing “glue” look nice and pretty in the diagram above, but make no mistake, they’re hiding more squiggly lines that only their creator can possibly comprehend. As a result, building an assistant a few years back was a lengthy and high-risk undertaking — definitely not for the faint of heart.

So… since about 2019, we set out on a mission to make it way easier for companies to build, launch, and maintain enterprise-grade assistants. We transformed Watson Assistant from the developer API seen above to a true end-to-end virtual assistant platform! This new, transformed architecture allows you to focus more on the content and personalizing the conversation rather than on building a large, complicated orchestration app. This cuts your time to value from months to weeks, or possibly down to a few days. We have already seen many major enterprises benefit from these changes, deploying assistants in less than 5 business days.

Our new approach to simplification attacks the critical points in the flow. It reduces complexity so that you only need to write the smallest possible amount of glue code where it counts.

Let’s look at the new and improved architecture:

Watson does the heavy lifting: the new architectural pattern for building an assistant

Much prettier, right? But of course, it always helps to have a walk-through of how the data flows.

The key points, by order of operation, are:

  1. Channels: Web chat, telephony & messaging
  2. Orchestration: Pre-message webhook callout
  3. Task Completion: Task-specific webhook callouts
  4. Handoffs: Existing content + human agents
  5. Orchestration: Post-message webhook callout

The rest of this post will explore each piece of the architecture and how to customize it to suit any use cases or complex needs that might arise.

1 | Channels: Web chat, telephony & messaging

In 2020, we released the web chat integration, a front-end client that anyone can embed on their website with a simple copy/paste. While the out-of-the-box user experience is efficient and delightful to use, that initial version didn't give enterprises the customization they almost always need to build a world-class, personalized assistant.

Because of that, we opened up the ability to build custom response types, custom theming, and widgets within web chat, letting customers create just about any custom experience, form factor, or brand image. Developers can also go a level deeper and integrate web chat with their underlying app; we make this possible by firing off client-side events that the underlying app can listen to. We'll explore more of web chat's extensibility in a future post.
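
To give a flavor of that deeper integration, here is a rough TypeScript sketch of a page that embeds web chat and subscribes to its client-side events. The IDs are placeholders, and the event details shown are illustrative of the pattern rather than a complete reference; the web chat documentation covers the full API.

```typescript
// Rough sketch: embedding web chat and listening to its client-side events
// so the host page can react to the conversation. The option fields and the
// "receive" event follow the web chat docs, but treat IDs and details here
// as placeholders rather than a complete reference.
(window as any).watsonAssistantChatOptions = {
  integrationID: "YOUR_INTEGRATION_ID",   // from the web chat integration settings
  region: "us-south",
  serviceInstanceID: "YOUR_INSTANCE_ID",
  onLoad: async (instance: any) => {
    // Fire app logic (analytics, page updates, etc.) whenever the assistant
    // sends a response back to the user.
    instance.on({
      type: "receive",
      handler: (event: { data: any }) => {
        console.log("Assistant responded:", event.data);
      },
    });
    await instance.render();
  },
};
```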

In addition to web chat, we provide a number of other channel integrations that allow quick and easy deployment to the most prominent customer care channels: telephone, WhatsApp, Facebook, SMS, and others. Many of these channels allow some customization from within their own console, but most of the customization and orchestration for them happens through other extension points like the pre- and post-message orchestration webhooks (below).

Finally, if you need your assistant to respond on a channel that we don’t directly support, we have a robust REST API that you can easily call from within any application.
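
As a minimal sketch, assuming the ibm-watson Node SDK and the stateless v2 message API, a custom channel adapter might call the assistant like this (IDs, the version date, and environment variables are placeholders):

```typescript
// Minimal sketch of calling the assistant from your own channel adapter via
// the v2 REST API, using the ibm-watson Node SDK. The version date, IDs, and
// environment variables are placeholders for your own service instance.
import AssistantV2 from "ibm-watson/assistant/v2";
import { IamAuthenticator } from "ibm-watson/auth";

const assistant = new AssistantV2({
  version: "2020-09-24",
  authenticator: new IamAuthenticator({ apikey: process.env.ASSISTANT_APIKEY! }),
  serviceUrl: process.env.ASSISTANT_URL!,
});

// Send one user utterance and return the assistant's generic responses.
// messageStateless keeps no server-side session, so your app carries the
// returned context forward itself if it needs multi-turn state.
async function askAssistant(assistantId: string, text: string) {
  const response = await assistant.messageStateless({
    assistantId,
    input: { message_type: "text", text },
  });
  return response.result.output.generic;
}
```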

2 & 5 | Orchestration — Pre & Post-message webhook callouts

With the recent update to webhooks in Watson Assistant, combined with the power and ease of use of our channel integrations, you can get started much faster and have a much more intuitive place to put all of the custom orchestration code that you still need to manage.

Orchestrate any external system with pre and post-message webhook callouts

Instead of building a large and complex orchestration app (see the first diagram in this blog post), mastering a bunch of integration APIs (including our own), and managing the state/history of the conversation, you can use our pre-message and post-message webhooks to intercept the input and output between your user and Watson Assistant.

Why would you want to do this? Let's imagine you have users who speak multiple languages. You obviously don't want to manage a separate assistant for each language — heck, you might not even speak them all yourself. You can use the pre-message webhook to intercept and translate from any language to the one native language you support, like English. Watson Assistant can "do its thing" in English, with a single set of training data, generate a response, and then you can intercept it again with the post-message webhook to translate back to your user's native language.

In case the flow wasn’t clear, the pre-message webhook will allow you to intercept the user’s input on every turn of the conversation, manipulate it however you see fit, and then return a similar JSON payload back to Watson Assistant for analysis. On the flip side, you can also use the post-message webhook to intercept every response from the assistant and customize the JSON payload as needed before responding.

In addition to language translation, some other common use cases for the pre-message webhook include PII filtering/redaction, user_id management and aliasing, or adding context from other systems like a user’s basic information or location. Common use cases for the post-message webhook include PII replacement or pulling answers from a CMS*.

*Imagine you have a team of content writers and legal reviewers, different responses per channel, or even hand-written responses per language; you might want to keep all of that written content in a content management system outside of Watson Assistant. Each response in the assistant is associated with an ID, and your post-message webhook can combine that ID with other variables to look up the actual response content in the CMS and send it to the user, based on the key that Watson returns from dialog or actions.
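
As an illustration of that CMS pattern, here is a hypothetical helper a post-message webhook could call. The fetchFromCms function and the "cms:<key>" convention are inventions for this sketch; the key you actually match on depends on how you structure your responses. It would plug into an HTTP handler like the pre-message example above.

```typescript
// Hypothetical helper for a post-message webhook: swap response keys coming
// out of dialog/actions for the approved copy stored in an external CMS.
// fetchFromCms() and the "cms:<key>" convention are inventions for this
// sketch; plug the helper into an HTTP handler like the pre-message example.
async function fetchFromCms(key: string, channel: string, locale: string): Promise<string> {
  // Look the key up in your content management system.
  return `Approved copy for ${key} (${channel}, ${locale})`; // placeholder
}

export async function resolveCmsResponses(payload: any, channel: string, locale: string) {
  for (const item of payload?.output?.generic ?? []) {
    // Assumed convention: the skill returns "cms:<key>" as its text and the
    // webhook resolves it to channel- and locale-specific content.
    if (item.response_type === "text" && typeof item.text === "string" && item.text.startsWith("cms:")) {
      item.text = await fetchFromCms(item.text.slice(4), channel, locale);
    }
  }
  return payload;
}
```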

3 | Task Completion: Task-specific webhook callouts

This type of orchestration comes from within your dialog (or actions) skills and is tied to a very specific, execution-oriented task the assistant wants to perform. Whereas pre- and post-message webhooks fire on every single turn of the conversation, a task-specific webhook callout is only called when it's needed. This is ideal for transaction-related requests like paying a bill, submitting a new lead to a CRM, looking up account info, or checking a balance. You don't need the user's balance on every single message, but when a user asks to check their balance, you can grab it via an API and return exactly the info they're looking for.
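
Here is a hedged sketch of what the service behind a balance-check callout might look like. Everything specific in it (the route, the account_id parameter, the lookupBalance helper) is hypothetical; the real shape depends on the parameters you configure in the dialog or actions node.

```typescript
// Hedged sketch of the service behind a task-specific webhook callout for a
// balance check. The /check-balance route, account_id parameter, and
// lookupBalance() helper are hypothetical; you decide what the dialog or
// actions node sends, and the JSON returned here is what the conversation
// gets back to build its response from.
import express from "express";

const app = express();
app.use(express.json());

async function lookupBalance(accountId: string): Promise<number> {
  // Call your core banking system, CRM, or other back end here.
  return 1234.56; // placeholder
}

app.post("/check-balance", async (req, res) => {
  const balance = await lookupBalance(req.body.account_id);

  // The assistant only calls this when the user actually asks for a balance,
  // so the lookup cost is paid per task, not per message.
  res.json({ account_id: req.body.account_id, balance });
});

app.listen(8080);
```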

The assistant can handle tasks with task-specific webhook callouts

4 | Handoffs

Just like a regular employee, your assistant can’t know everything. That’s why conversation handoffs are a critical part of the Watson Assistant architecture. There are two key handoffs to consider when building an assistant, both of which we have optimized for.

4a | Your Existing Content

When an assistant is deciding how to respond to a given request, it will first check to see if the dialog/actions skills have an explicit response available. If not, the next thing it checks is your existing help content (which typically has a ton of knowledge available, but is fairly hard for users to find).

The Search Skill, more specifically, is how that existing content is surfaced within Watson Assistant. Powered by Watson Discovery under the covers, it crawls and indexes your existing content and searches against it to find the most concise answer possible for a given query, all without needing any training.

The Search Skill crawls existing content and retrieves answers without being trained

The Search Skill is ideal when you have a website with a wealth of helpful information. You can simply point the web crawler at the site, tell it how often to sync, and the assistant will automatically stay up to date with your existing content. This way, if a user asks a question that you didn't explicitly train your dialog to handle, Watson can fall back to the Search Skill to find a response from your help content. In addition to the web crawler, other content connectors are available (including Box, Salesforce, and SharePoint), supporting text extraction from file types like PDF, Word, Excel, PPT, JPEG, JSON, and HTML.

4b | Human Agents

The last resort for an assistant when it has no other responses is to hand the conversation to a live agent service desk (in a web/messaging interaction) or to redirect the conversation* to a contact center tool over the phone. This can happen under a number of conditions: explicitly based on the question, when the assistant has no response at all, when it has given an insufficient answer too many times, or upon explicit request from the user to speak to someone. We have connectors to several service desks like Zendesk, Salesforce, Twilio Flex, and Genesys PureCloud, but once again, we know we can’t connect to everything. For this reason, we have released adapters to let you build your own service desk connector following our examples for service desks like Genesys and Twilio.

*See the vgwActTransfer command in the phone integration documentation. The phone integration relies on a SIP handoff back to your telephony system to transfer users to your existing service agents. The vgwActTransfer command carries all the parameters your telephony system needs to decide which team or destination to transfer to.

As a last resort, the assistant can escalate to human agents in a number of different tools

My Final Thoughts

All of the extensions and customizations discussed above should let you build a solution for just about any enterprise assistant use case while still taking advantage of our pre-built assets and integrations. We already see lots of customers mixing and matching "out of the box" functionality with advanced customization in certain areas of their architectures, which pairs simplicity and low cost of ownership with nearly unlimited enterprise-level customization.

If you haven’t already, sign up for Watson Assistant to create your assistant, and if you have already signed up, join the IBM Watson Apps Community to continue the discussion with our experts and other Watson Assistant users.

Also, be sure to subscribe to this publication as our product experts will soon be posting follow-on deep dives into the specifics of each extension point with real-world use cases and sample code!
