Introduction to Conversational AI

Adesh Bansode
Published in Subex AI Labs
3 min read · Apr 20, 2021

One of the most visible applications of AI is Conversational Artificial Intelligence, or Conversational AI: the set of technologies behind automated messaging and speech-enabled applications that offer human-like interactions between computers and humans.

In practice, Conversational AI is any machine that a person can talk to. Today, people mostly engage with it through chatbots and voice assistants.

Components of Conversational AI

A Conversational AI system could be a chatbot on a website or social messaging app, a voice assistant or voice-enabled device, or any other interactive messaging-enabled interface. These solutions allow people to ask questions, get opinions or recommendations, execute transactions, find support, or otherwise achieve a context-dependent goal through conversation.

Conversational AI brings together five technology components:

  1. Automatic Speech Recognition (ASR)
  2. Natural Language Understanding (NLU)
  3. Dialogue Management
  4. Natural Language Generation (NLG)
  5. Text-to-Speech (TTS)
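
To make the hand-off between the five components concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function names (asr, nlu, dialogue_manager, nlg, tts, respond) are hypothetical, and each stage is a trivial stub rather than a real model. In particular, the "audio" is just UTF-8 bytes so that the example runs without any speech library.

```python
# A toy end-to-end pipeline. All names are hypothetical stand-ins;
# real systems replace each stub with a trained model or a service.

def asr(audio: bytes) -> str:
    # Stub: pretend the audio is UTF-8 text instead of a waveform.
    return audio.decode("utf-8")

def nlu(text: str) -> dict:
    # Stub: a single keyword decides the intent.
    intent = "greet" if "hello" in text.lower() else "unknown"
    return {"intent": intent, "text": text}

def dialogue_manager(frame: dict) -> dict:
    # Stub: map the recognised intent to the next system action.
    action = "say_hello" if frame["intent"] == "greet" else "ask_rephrase"
    return {"action": action}

def nlg(decision: dict) -> str:
    # Stub: one fixed sentence per action.
    templates = {
        "say_hello": "Hello! How can I help you?",
        "ask_rephrase": "Sorry, could you rephrase that?",
    }
    return templates[decision["action"]]

def tts(text: str) -> bytes:
    # Stub: return bytes instead of synthesised speech.
    return text.encode("utf-8")

def respond(audio: bytes) -> bytes:
    # Speech in -> text -> meaning -> action -> text -> speech out.
    return tts(nlg(dialogue_manager(nlu(asr(audio)))))
```

Calling respond(b"Hello there") walks the input through all five stages in order. The point is only the shape of the data passed from one stage to the next, not the quality of any single stage.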

Automatic Speech Recognition (ASR):

[Figure: Example of ASR]

Speech recognition, or Automatic Speech Recognition (ASR), is the computer-based processing and identification of the human voice. It converts a speech signal into a sequence of words by means of an algorithm implemented as a computer program. Its role in Conversational AI is to convert speech to text.

Natural Language Understanding (NLU):

NLU is a branch of Natural Language Processing (NLP) that transforms human language into a machine-readable format. It focuses on a machine’s ability to understand human language: unstructured text is rearranged into structured data that machines can “understand” and analyze.

[Figure: NLP and NLU by Stanford NLP Group]
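
As a rough illustration of "unstructured text in, machine-readable structure out", the sketch below turns a sentence into an intent plus slots using keyword matching and one regular expression. This is a deliberately simple rule-based stand-in, not how production NLU works (that usually relies on trained models). The intent names, the keyword table and the function parse_utterance are all hypothetical.

```python
import re

# Hypothetical intents and the keywords that trigger them.
INTENT_KEYWORDS = {
    "check_balance": ("balance",),
    "transfer_money": ("transfer", "send"),
}

# Matches an amount followed by a currency word, e.g. "250 rupees".
AMOUNT_PATTERN = re.compile(r"\b(\d+(?:\.\d+)?)\s*(dollars|rupees|euros)\b")

def parse_utterance(text: str) -> dict:
    """Turn free text into a small structured frame (intent + slots)."""
    lowered = text.lower()

    # Intent detection: the first intent with a matching keyword wins.
    intent = "unknown"
    for name, keywords in INTENT_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            intent = name
            break

    # Slot filling: pull out an amount and currency if present.
    slots = {}
    match = AMOUNT_PATTERN.search(lowered)
    if match:
        slots["amount"] = float(match.group(1))
        slots["currency"] = match.group(2)

    return {"intent": intent, "slots": slots}
```

For example, parse_utterance("Please transfer 250 rupees to Asha") yields the intent transfer_money with an amount of 250.0 and the currency rupees, which is the kind of structured frame a later stage can act on.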

Dialogue Management (DM):

Dialogue Management is the central component of Conversational AI. It takes input from the ASR and NLU systems, interacts with external knowledge sources, and produces the messages to be output to the user. The dialogue management process has two main tasks:

  • Dialogue modelling: keeping track of the state of the dialogue.
  • Dialogue control: making decisions about the next system action.

Overall, it controls the flow of the dialogue between the agent and the user.
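
The two tasks above can be sketched as two small functions: one that updates the dialogue state and one that picks the next action. This is a minimal, rule-based sketch under stated assumptions. The DialogueState class, the slot names (amount, recipient) and the action names are hypothetical, and real dialogue managers are often statistical or learned rather than hand-written rules.

```python
from dataclasses import dataclass, field

# Hypothetical slots the "transfer_money" task needs before it can finish.
REQUIRED_SLOTS = ("amount", "recipient")

@dataclass
class DialogueState:
    intent: str = "unknown"
    slots: dict = field(default_factory=dict)

def update_state(state: DialogueState, nlu_frame: dict) -> DialogueState:
    """Dialogue modelling: fold the latest NLU result into the state."""
    # Keep the earlier intent when the new turn carries none.
    if nlu_frame.get("intent", "unknown") != "unknown":
        state.intent = nlu_frame["intent"]
    state.slots.update(nlu_frame.get("slots", {}))
    return state

def next_action(state: DialogueState) -> str:
    """Dialogue control: decide what the system should do next."""
    if state.intent != "transfer_money":
        return "ask_intent"
    # Ask for the first missing slot, in a fixed order.
    for slot in REQUIRED_SLOTS:
        if slot not in state.slots:
            return f"request_{slot}"
    return "confirm_transfer"
```

A short conversation shows the split: after a turn that supplies only the amount, next_action returns request_recipient. Once a later turn supplies the recipient, the state remembers the earlier intent and the action becomes confirm_transfer.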

Natural Language Generation (NLG):

NLG is a branch of Natural Language Processing (NLP), defined as the “process of producing meaningful phrases and sentences in the form of natural language.” It automatically generates narratives that describe, summarize or explain structured input data in a human-like manner.
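
One of the simplest ways to go from structured data to a sentence is template filling, sketched below. This is an assumption chosen for brevity: the action names, the TEMPLATES table and the generate function are hypothetical, and many modern systems use neural language models instead of fixed templates.

```python
# Hypothetical system actions mapped to sentence templates.
TEMPLATES = {
    "request_amount": "How much would you like to transfer?",
    "request_recipient": "Who should receive the {amount:.2f}?",
    "confirm_transfer": "Transfer {amount:.2f} to {recipient}. Shall I go ahead?",
}

FALLBACK = "Sorry, I did not understand that."

def generate(action: str, slots: dict) -> str:
    """Render a sentence for an action from structured slot values."""
    template = TEMPLATES.get(action)
    if template is None:
        return FALLBACK
    try:
        return template.format(**slots)
    except KeyError:
        # A slot the template needs is missing from the input data.
        return FALLBACK
```

With the input {"amount": 250.0, "recipient": "Asha"} and the action confirm_transfer, the function produces a complete confirmation sentence. An unknown action or a missing slot falls back to a generic reply instead of raising an error.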

Text-to-Speech (TTS):

TTS is the last stage of Conversational AI. A text-to-speech system takes the text response generated by the NLG stage and converts it into natural-sounding speech. It is the inverse of ASR: ASR maps speech to text, while TTS maps text to speech. The figure below shows a TTS architecture:

[Figure: TTS system by Facebook AI]

With advancing technologies and growing computational power, Conversational AI may become part of many day-to-day activities. Because it can answer quickly, it is likely to improve customer satisfaction.

