Harnessing Generative AI to Tackle Conversational AI Challenges

Jose Valdivia Leon
4 min readJun 21, 2023

--

Explore how Generative AI, particularly OpenAI’s ChatGPT, can address the challenges in Conversational AI (CAI), with a focus on enhancing voice bot interactions. Join us as we bridge tech and dialogue for a better user experience.

Intro

Kicking off a series of deep dives into tackling Conversational AI (CAI) challenges with Generative AI, we’ll be exploring how the current CAI landscape with customer interactions, intent detection, and integrations can be re-imagined. Existing systems, while undoubtedly cool, do come with a few hiccups like accuracy in intent recognition and complexities in testing automation. By tapping into Large Language Models (LLMs), we can see some of these challenges significantly reduced or even eliminated.

Our debut issue focuses on a persistent thorn in customer voice interactions: capturing non-standard words or phrases like proper nouns, jargon, emails, or neologisms. We’ll show how this can be eased by using LLMs, so let’s dive in.

Putting the Problem in Perspective

Voice chat bots have unique challenges that their text counterparts don’t. A significant one relates to the accuracy of the speech-to-text (STT) recognizer service when a customer’s utterance contains proper nouns, jargon, or differs in accent from the bot engine’s model.

Some CAI providers have adopted an omnichannel approach to navigate this. They prompt users to switch to another channel (like SMS or a Web App) for data input, ensuring the accuracy of captured information. While this does the job, it disrupts the user’s experience by forcing them to hop between channels — something a human agent wouldn’t do.

Ideally, a bot should capture data accurately as part of the conversation, just as customers are used to. Luckily, with the advent of LLMs like OpenAI’s ChatGPT, this is becoming possible. Services like ChatGPT can act as a virtual interviewer, requesting specific user information and outputting it in a specific format — all as part of a fluid dialogue. If the prompts are crafted with care, ChatGPT is a dream to work with. Let’s see how in the following section.

Generative AI: A Solution to Reckon With

In the CAI world, OpenAI’s ChatGPT is a potent tool, designed for interactive conversations with users. It can do more than just respond; it can engage in a dialogue that’s friendly, considerate, and goal-oriented when properly prompted. It goes beyond simple tit-for-tat, guiding users toward the chat’s objective — particularly data collection.

ChatGPT shows a remarkable understanding of user utterances, even those laden with typos or misspellings. It exemplifies how Generative AI, like ChatGPT, can tackle numerous CAI challenges, especially for voice bots where STT services often stumble.

Capturing proper nouns or emails during a voice bot conversation can be tricky, as STT services often struggle with non-common names or unusual spellings. The solution? Craft the right prompt for ChatGPT to engage in a dialogue with the user to capture, for instance, their surname. It can interact with the user to confirm the correct spelling, as illustrated below:

This interaction wouldn’t be necessary for a text bot, as users would likely write their names correctly. But it’s a game-changer for voice bots. Through the ChatGPT API, the CAI platform can bridge STT/ TTS services and ChatGPT, allowing voice dialogue until the correct spelling is achieved and the output is generated enclosed with triple “<>”, signaling the next step of the process.

Technical Implementation using Cognigy.AI

To bring this to life, we’ll use Cognigy.AI as our CAI platform. Here’s what we need to start:

The flow of logic is straightforward, as depicted in the diagram below:

The Cognigy.AI implementation is available as a package on GitHub and is up for grabs under the Apache 2.0 license. For testing using Cognigy.AI, you won’t need a Voice Gateway. Just use the built-in voice tester embedded in the interaction panel.

For deployment you need to follow these steps:

  1. Download the package from the GitHub repository.
  2. Open your agent or create a new one in your Cognigy instance.
  3. Go to “Manage -> Packaging” and import the package you have just downloaded.
  4. Open the imported flow (Medium — Capture variable over voice bot)
  5. Go to “Settings -> Default Context” and insert your Open AI Key
  6. In the “Chart” view, edit the “Go To” node in the flow and set the target to the “Wait for Input” node.
  7. Open the interaction panel and you are ready to go.

Wrapping it Up

In this compact exploration, we’ve seen how Generative AI tools, such as OpenAI’s ChatGPT, can be effectively used to tackle challenges in Conversational AI projects, specifically capturing proper nouns or jargon over voice. But that’s not all. There are countless other use cases that we’ll be diving into as this series unfolds.

If you’re keen on discussing this solution further or have a unique challenge in your Conversational AI project that you’d like to talk about, don’t hesitate to reach out.

Upcoming Topics

  • How to detect Answering Machines in outbound calls with LLMs
  • Advantages of Turn-key Conversational AI solutions for SMBs

Sources

  1. EvoAI GitHub Repository
  2. Top 10 Irish surnames that always get spelt wrong, ranked
  3. Cognigy.AI

--

--