Prompting Conversation — Exploring the Nuance Mix Builder (Copilot) Capabilities

Sam Bobo
8 min read · May 25, 2023

--

Imagined by Bing & DALL-E

Generative Artificial Intelligence has introduced a new computing paradigm, prompt engineering, into the low/no code tooling space: the transformer architectures powering large language models accept commands in natural language rather than in high-level programming languages such as Python or R. Large language models are notoriously powerful at generating text probabilistically, composing complete sentences on the subject matter at hand. Such text generation has extended to code creation through models such as Codex, unveiling new market solutions such as GitHub Copilot.

Within engineering, there is a concept called “Time to ‘Hello World’”, broadly defined as the time elapsed from account sign-up to a functioning, deployed “hello world” application. Microsoft sets a highly ambitious goal of 5 minutes, but the larger outstanding question is: what does the hello world comprise? Historically, “Time to Hello World” guidance manifested itself in starter packs, pre-loaded templates, or guided walkthroughs. These applications are typically simplistic and serve as a tutorial on the tooling platform for the engineer. Until now: enter GPT-4 and Codex.

GitHub Copilot, announced by Microsoft-owned GitHub, is a Generative AI tool that lets developers write prompts as comments within an Integrated Development Environment (IDE) and receive generated code as output. The value proposition is that developers spend less time writing boilerplate code and more time solving complex engineering problems.

This works because the underlying Codex model is a GPT model trained on examples of code scraped from major sites such as GitHub or Stack Overflow, and it can then generate the code required.
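To make the comment-as-prompt idea concrete, here is a minimal, hand-written illustration. It is not actual Copilot output; the function name `is_palindrome` and the comment wording are assumptions chosen purely for the example.

```python
# Prompt written by the developer as a plain comment:
# return True if the word reads the same forwards and backwards, ignoring case

def is_palindrome(word: str) -> bool:
    # The kind of completion an assistant might propose for the comment above
    # (illustrative only, written by hand for this article).
    normalized = word.lower()
    return normalized == normalized[::-1]
```

The developer's job shifts from typing the body to reviewing whether the suggestion matches the intent stated in the comment.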

Conversational AI systems require intricate conversational design, tactfully crafted intents, and carefully written utterances to create delightful experiences for end users, regardless of communication channel (chat, telephony, voice assistant, etc.). Conversational Designers, as a result, specialize in building dialog structures that route the caller straight to the branch that solves their needs efficiently and without escalation. Often, however, designers run into the blank canvas problem when they start building a flow; only after it is deployed into production can they leverage call flow analytics to understand which paths are working and where attrition occurs.

Recently, Nuance announced the concept of Mix Builder, a Conversational Design Copilot that transforms natural language into a starter application. The backend takes a prompt, leverages a Codex model trained on the proprietary dialog language and structure, and transforms the request into a dialog structure complete with tagged entities and intents, albeit with a small number of training utterances (another use case for Generative AI, but I digress).

I experimented with Mix Builder and want to share my “time to hello world” experience.

DISCLAIMER: I am employed by Nuance :)

1. Concept

Being a board game geek, I sought to create a simple bot for a fictitious board game cafe. The bot’s intended purpose would be twofold: (1) Book a table or seat at the board game cafe for game night, with required information such as name, party size, date, and time and (2) Provide board game recommendations based on genre, mechanics, number of players, and time duration.

2. Generating the Application

After creating an account on Nuance Mix, I navigated to the project creation tab and selected the Conversational Builder button. I was presented with a screen where I could enter a prompt, with helper text guiding me to describe the intended actions of the user within the conversation.

I entered:

Furnishing board game recommendations for users including genre, mechanics, time duration, and number of players. The bot can also reserve a table at a local board game shop and requires name, number of players, date, and time.

The text prompt was engineered to provide the level of detail the LLM requires to:

  1. Generate various dialog branches based on the intent matched on a generic opener
  2. Generate and pre-populate intents and associated entities, so that natural language understanding can be run against the user's utterance

After selecting the “Generate Dialog” box, a sample dialog was rendered on the screen:

Sample Dialog preview generated from text prompt on Nuance Mix Builder

What I observed is that the generated dialog matched exactly what I had provided to the GPT-powered system in the prompt. Should any information be mis-rendered due to hallucination, I had the option to edit, directly in the dialog bubble, the exact phrasing I expected the bot to furnish to the user.

Next, with Mix Answers seamlessly integrated into the application generation process, I was prompted to add external sources to fall back on should the bot calculate a low intent-matching confidence for out-of-scope questions. Given that Board Game Geek contains a multitude of wiki-like pages and forums covering a vast array of published board games, I deemed it appropriate to include within the query scope. If I actually had a website for my fictitious board game cafe, I could link that page to furnish answers to questions such as “what are your hours of operation?”, which my bot was not programmed to answer.
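The fallback behaviour described above can be sketched as a simple confidence check. This is only a sketch under stated assumptions, not how Mix Answers is implemented: the threshold value, the `NluResult` type, and the `route` and `fake_search` helpers are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed cut-off; a real system would tune this value.
CONFIDENCE_THRESHOLD = 0.6


@dataclass
class NluResult:
    """Hypothetical container for the top intent and its confidence score."""
    intent: str
    confidence: float


def route(result: NluResult, utterance: str,
          search_fallback: Callable[[str], str]) -> str:
    """Send confident matches to the dialog, everything else to the fallback."""
    if result.confidence >= CONFIDENCE_THRESHOLD:
        return f"dialog:{result.intent}"
    # Low confidence: answer from the external sources instead.
    return search_fallback(utterance)


def fake_search(utterance: str) -> str:
    """Stand-in for a scoped search over the configured external sources."""
    return f"search:{utterance}"
```

With these definitions, a confident RESERVE_TABLE match stays in the dialog, while a question about opening hours that scores poorly against both intents is handed to the search stand-in.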

Lastly, I selected “Next” to name my application and start the generation process. What impressed me the most about the wait screen and animation was the side panel listing the steps the script was undergoing to craft a first draft of my bot. List entries included complex and tedious steps such as generating entities, populating literals, generating sample utterances for an NLU model, and training my model for me, all at the click of a button. This deliberately enumerated the value I was obtaining by using the Copilot.

While Speech Scientists and Conversation Designers may opt to build these from scratch, with a specific taxonomy and structure that optimizes the intended application, Mix Builder does, again, solve the blank canvas problem for less experienced designers and general app builders, and it saves a tremendous amount of time when starting a bot project.

3. Explore the App

Within a few seconds, my newly minted application was generated. The Codex model translated my prompt into the Dialog-specific language and imported the application directly into Mix. The result was the starter application depicted below, in which the main intent is to reserve a table and each of the horizontally aligned nodes at the bottom maps to a slot to fill and disambiguate against, all without the first stroke on my keyboard.

Nuance Mix Builder Generated Dialog in Mix.Dialog

The most impressive code generation aspect, in my opinion, was the auto-generation of intents, specifically the level of relevant board game knowledge the GPT Large Language Model injected into the project. Two intents were created as a direct result of my prompt: (1) RECOMMEND_GAME and (2) RESERVE_TABLE, with 54 and 32 sample utterances respectively.

Utterance samples within the RECOMMEND_GAME intent

Delving into the entity definitions within the RECOMMEND_GAME intent, I noticed that the “mechanic” entity contained a list including deck building, dice rolling, and more. Since my nerdiness extends deep into the realm of board games, I would certainly need to add other mechanics such as worker placement and dungeon crawling. Even so, the head start and the basic understanding of board games were extremely refreshing and certainly reduced my time to deployment.

Mechanics Entity and literals generated
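For readers who think in code, the generated project can be pictured as intents that own entities, with each entity carrying a list of literals. The sketch below is my own simplification, not Mix's proprietary format: only the intent names and the “mechanic” literals come from the project above, while the `Entity` and `Intent` classes, the other entity names, and `find_literal` are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    name: str
    literals: List[str] = field(default_factory=list)


@dataclass
class Intent:
    name: str
    entities: List[Entity]


# Literals that were generated for the "mechanic" entity.
mechanic = Entity("mechanic", ["deck building", "dice rolling"])

# Entity names other than "mechanic" are assumed for this sketch.
recommend_game = Intent(
    "RECOMMEND_GAME",
    [Entity("genre"), mechanic, Entity("duration"), Entity("players")],
)
reserve_table = Intent(
    "RESERVE_TABLE",
    [Entity("name"), Entity("party_size"), Entity("date"), Entity("time")],
)


def find_literal(entity: Entity, utterance: str) -> Optional[str]:
    """Return the first literal of the entity mentioned in the utterance."""
    text = utterance.lower()
    for literal in entity.literals:
        if literal in text:
            return literal
    return None


# Extending the starter list by hand, as a designer would after generation.
mechanic.literals += ["worker placement", "dungeon crawling"]
```

Substring matching is far cruder than a trained NLU model, but it shows why a richer literal list directly widens what the bot can recognize.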

4. Try the Bot in Action

Without adding training utterances, entity definitions, or dialog branches, I decided to give the bot a try using the Nuance Mix Try Mode. Entering the dialog, I was greeted with a fictitious board game shop name, “XYZ Board Game Shop”, and the conversation started with a generic opener as a prompt, searching for one of the two aforementioned intents. I chose the RECOMMEND_GAME intent. The bot proceeded to take me turn by turn through gathering the appropriate information, including game type, mechanic, length, and number of players. While it was turn based rather than multi-slot entity recognition, it still completed the intended function, again, with zero coding.

Try Mode Simulation of the Board Game Bot
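The turn-by-turn behaviour I saw in Try Mode resembles a classic slot-filling loop. Here is a minimal sketch of that pattern, assuming hypothetical slot names and a hypothetical `ask` callback; it is not the dialog logic Mix actually generates.

```python
from typing import Callable, Dict, List

# Assumed slot names for the RECOMMEND_GAME intent.
REQUIRED_SLOTS: List[str] = ["genre", "mechanic", "duration", "players"]


def fill_slots(required: List[str], known: Dict[str, str],
               ask: Callable[[str], str]) -> Dict[str, str]:
    """Ask one question per missing slot, skipping slots already known."""
    slots = dict(known)
    for slot in required:
        if slot not in slots:
            slots[slot] = ask(f"What {slot} would you like?")
    return slots


# Scripted answers stand in for a live user in this example.
answers = iter(["strategy", "deck building", "60 minutes"])
result = fill_slots(REQUIRED_SLOTS, {"players": "4"}, lambda prompt: next(answers))
```

The `known` argument hints at the difference I mentioned: with multi-slot recognition, an opening utterance such as “a strategy game for 4” would pre-fill several slots and the loop would ask fewer questions.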

In conclusion, Nuance Mix has integrated several vital and powerful Generative AI capabilities:

  1. Codex: translating an inputted prompt into generated code, specifically proprietary backend code that defines an entire bot framework, from the dialog logic to the entities and defined literals
  2. Example Generation: simulating a variety of utterances for a particular question to expedite NLU intent training
  3. Agile Drafting: through an intuitive interface, Mix creates a project within the low/no code environment and lets a developer edit in place to quickly make modifications to the generated content, add new content, and build towards a complete solution
  4. Question Answering: incorporating Bing search queries, bounded to a defined scope, into a graceful fallback that answers out-of-scope questions when a bot does not have a pre-trained answer within its capabilities

These capabilities encapsulate the most essential and powerful aspects of Generative AI and the associated LLMs while exposing them within the paradigm of low/no code tooling. Ultimately, the Mix Builder and Mix Answers solutions should help developers get to a “hello world” application within 5 minutes, an extremely powerful value statement.

I look forward to building more apps with these new AI capabilities within Mix and sharing my results here!

--

Sam Bobo

Product Manager of Artificial Intelligence, Conversational AI, and Enterprise Transformation | Former IBM Watson | https://www.linkedin.com/in/sambobo/