Chatbot Parts

Thomas Packer, Ph.D.
Published in TP on CAI
Nov 6, 2019

This story is a rough draft. Check back later for the fully polished story, or post a comment telling me what you’d like me to research and write for you.

Here I describe the subsystems that come together to make a complete chatbot. By “chatbot”, I mean any system built using conversational AI, including both text-based and voice-based bots. Not every chatbot has every subsystem, but the approach a chatbot takes to each subsystem, including whether the subsystem is present at all, is a good way to distinguish between chatbots and to characterize their capabilities.

Photo by Eric Prouzet on Unsplash

I list these subsystems in roughly the order they are used in a dialog cycle. A chatbot’s parts include the following:

Input Acceptance

For example, speech recognition or a text input UI.

Natural Language Understanding

Also called NLU or message interpretation. This part converts a statement from the user into a structured semantic representation, usually an intent (analogous to a function) plus associated slot-value pairs (analogous to function parameters or arguments). The processes needed are intent recognition, entity recognition, and possibly entity resolution.

As an example, if a user asks “Can I book a room from April 1 to April 4?”, the chatbot should recognize the intent as “reserve hotel room” or “inquire if a hotel room is available for booking”. The associated slots, start date and end date, should be filled with April 1 and April 4, respectively.
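To make the intent-plus-slots idea concrete, here is a minimal sketch of the booking example. It is a toy, rule-based stand-in rather than how a production NLU works (those usually rely on trained models), and the names `Interpretation`, `interpret`, `reserve_hotel_room`, `start_date`, and `end_date` are my own assumptions for illustration.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Interpretation:
    """Structured meaning of one user message: an intent plus slot values."""
    intent: str
    slots: dict = field(default_factory=dict)

# Hypothetical pattern for dates written like "April 1".
DATE = (r"(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December) \d{1,2}")

def interpret(message: str) -> Interpretation:
    """Toy NLU: regex slot filling plus keyword-based intent recognition."""
    slots = {}
    match = re.search(rf"from ({DATE}) to ({DATE})", message)
    if match:
        slots["start_date"] = match.group(1)
        slots["end_date"] = match.group(2)
    if re.search(r"\b(book|reserve)\b", message, re.IGNORECASE):
        return Interpretation("reserve_hotel_room", slots)
    return Interpretation("unknown", slots)

result = interpret("Can I book a room from April 1 to April 4?")
print(result.intent, result.slots)
```

Entity resolution, which this sketch skips, would be the step that turns the string "April 1" into an actual calendar date.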

Knowledge Acquisition

Machine learning or data ingestion: we need some way to get knowledge or data into the chatbot. Machine learning would scale better than finding existing data to import and tweak to fit the chatbot’s needs and design, which in turn would scale better than entering everything manually. Using reinforcement learning to automatically refine the dialog manager’s strategy would be awesome, but some currently consider it impractical.

This piece may or may not fall outside the dialog loop, depending on the approach. It is also focused on improving the performance of the chatbot over a larger timescale than a single dialog.

Knowledge Representation

Conversation Context

Also called Dialog State Tracking, conversation context is the state of a conversation at any given moment, including what has been learned, said, or established so far in a dialog. It is sometimes grouped with Dialog Policy as part of Dialog Management.

Who currently has the “initiative”? This should be tracked in a hierarchy or stack data structure, because one agent may take the initiative temporarily to ask clarifying questions.

User intent can change if the user changes their mind, and each intent must be assigned to a particular level of the initiative hierarchy. A change of intent would cause context switching.
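One way to picture the initiative stack is the rough sketch below. Everything in it (`InitiativeFrame`, `ConversationContext`, the method names, and the intent labels) is hypothetical and only meant to show the push-and-pop behavior when one agent briefly takes over to ask a clarifying question.

```python
from dataclasses import dataclass, field

@dataclass
class InitiativeFrame:
    """One level of the initiative stack: who is leading, and toward which intent."""
    holder: str                                  # "user" or "bot"
    intent: str
    slots: dict = field(default_factory=dict)    # what has been learned at this level

@dataclass
class ConversationContext:
    """Hypothetical dialog state: a stack of frames plus what has been said."""
    frames: list = field(default_factory=list)
    history: list = field(default_factory=list)

    def record(self, speaker, text):
        """Remember an utterance so later turns can refer back to it."""
        self.history.append((speaker, text))

    def take_initiative(self, holder, intent):
        """Push a frame, e.g. when the bot interrupts with a clarifying question."""
        self.frames.append(InitiativeFrame(holder, intent))

    def release_initiative(self):
        """Pop the top frame, handing initiative back to the level beneath it."""
        return self.frames.pop()

    @property
    def current(self):
        """The frame that holds the initiative right now, or None if there is none."""
        return self.frames[-1] if self.frames else None

ctx = ConversationContext()
ctx.take_initiative("user", "reserve_hotel_room")
ctx.take_initiative("bot", "clarify_end_date")   # bot takes over temporarily
print(ctx.current.holder)
ctx.release_initiative()                          # back to the user's request
print(ctx.current.holder, ctx.current.intent)
```

A plain list used as a stack is probably the simplest choice here. A tree would be needed only if sibling sub-dialogs had to be kept around after they were closed.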

Goal Management

This may seem like part of the dialog management system, but as chatbots become more capable, they need to manage goals outside of the dialog — things such as fulfilling requests for the user.

Task Executor

The ability to execute tasks, or fulfill requests in computer systems or in the tangible world, for the user.
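If an intent is like a function and its slots are like arguments, then a task executor can be sketched as a lookup table from intent names to handler functions. This is only an illustration under that assumption. The handler, the `HANDLERS` registry, and the fallback message are all made up, and a real handler would call a booking system instead of returning a string.

```python
def reserve_hotel_room(start_date, end_date):
    """Hypothetical handler; a real one would call a booking system."""
    return f"Room reserved from {start_date} to {end_date}."

# Registry mapping each intent name to the function that fulfills it.
HANDLERS = {"reserve_hotel_room": reserve_hotel_room}

def execute(intent, slots):
    """Run the handler for an intent, passing slot values as keyword arguments."""
    handler = HANDLERS.get(intent)
    if handler is None:
        return "Sorry, I can't do that yet."
    return handler(**slots)

print(execute("reserve_hotel_room",
              {"start_date": "April 1", "end_date": "April 4"}))
```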

Dialog Management

Dialog Management or Dialog Policy is one of the most important chatbot parts. It consists of the data structures and algorithms that enable chatbots to understand the high-level features of multi-turn conversations or dialogs and to participate in them. These include the ability to maintain conversation context, to relate each human utterance to that context (e.g., whether the next statement is part of an existing question or a new one) and to assess whether the conversation is approaching a mutual goal or not.

This might include a dialog stack (a list of unresolved topics that have come up in the dialog so far) and an expectation agenda (things the chatbot expects to hear from the user).
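Here is a small sketch of how a dialog stack and an expectation agenda could fit together. It assumes each open topic carries its own list of expected slots and that the agenda is simply those lists read from the newest topic down. The class and method names are hypothetical, and real dialog managers are considerably richer than this.

```python
from dataclasses import dataclass

@dataclass
class Topic:
    """An unresolved topic and the slot names still expected from the user."""
    name: str
    expected: list

class DialogManager:
    """Hypothetical sketch: a dialog stack with a derived expectation agenda."""

    def __init__(self):
        self.stack = []   # unresolved topics, most recent last

    def open_topic(self, name, expected):
        self.stack.append(Topic(name, list(expected)))

    def agenda(self):
        """Expected slots across all open topics, most recent topic first."""
        return [slot for topic in reversed(self.stack) for slot in topic.expected]

    def hear(self, slot):
        """Match a slot the user supplied against the agenda.

        Returns False when nothing on the agenda expected it, which may
        signal that the user is starting a new topic.
        """
        for topic in reversed(self.stack):
            if slot in topic.expected:
                topic.expected.remove(slot)
                break
        else:
            return False
        # Pop topics from the top of the stack once nothing more is expected of them.
        while self.stack and not self.stack[-1].expected:
            self.stack.pop()
        return True

dm = DialogManager()
dm.open_topic("reserve_hotel_room", ["start_date", "end_date"])
dm.open_topic("confirm_city", ["city"])   # a clarifying sub-dialog
print(dm.agenda())
print(dm.hear("city"), [t.name for t in dm.stack])
```

Searching the stack from the top down means the most recent topic gets the first chance to claim each utterance, which is one simple way to relate an utterance to its context.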

Read more about dialog management.

Dialog Policy Learning

Natural Language Generation

Response Rendering

For example, text-to-speech.
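Putting the parts in dialog-cycle order, one turn of a chatbot might be wired together roughly like this. Every function here is a trivial stand-in I invented for illustration. Input acceptance is reduced to a function argument and response rendering to `print`.

```python
def understand(text):
    """Stand-in NLU: returns an (intent, slots) pair."""
    if "hello" in text.lower():
        return "greet", {}
    return "unknown", {}

def decide(intent, slots, context):
    """Stand-in dialog policy: updates the context and picks the next action."""
    context.append((intent, slots))   # conversation context grows every turn
    return "greet_back" if intent == "greet" else "ask_rephrase"

def generate(action):
    """Stand-in NLG: turns the chosen action into a sentence."""
    responses = {
        "greet_back": "Hello! How can I help?",
        "ask_rephrase": "Sorry, could you rephrase that?",
    }
    return responses[action]

def dialog_cycle(user_text, context):
    """One turn: understand the input, decide what to do, generate a reply."""
    intent, slots = understand(user_text)      # natural language understanding
    action = decide(intent, slots, context)    # dialog management
    return generate(action)                    # natural language generation

context = []
print(dialog_cycle("Hello there", context))    # response rendering as plain text
```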


I do data science (QU, NLP, conversational AI). I write applicable-allegorical fiction. I draw pictures. I have a PhD in computer science and I love my family.