Lessons Learned

How to Make Your Voice Assistant React Properly to Silence?

Implementing Silence in Voice-First Technology

Published in

PromethistAI

5 min read · Apr 19, 2021

As a conversation designer of complex mental health applications, I often take on challenges at the edge of what current technology allows. One issue that comes up fairly often in voice-first application development is that voice assistants cannot react suitably when the user says nothing.

The Problem with Delayed Reactions

The problem occurs mostly when a voice assistant asks an open question. Users hesitate while they think about how to answer, and the resulting delay causes the assistant to end the conversation.

However, this can happen with simpler, closed questions as well. I think that everyone who uses Alexa knows this from their own experience:

ME: “I want to order a pizza.”

ALEXA: “And what pizza do you want?”

At this moment, I realize that I should perhaps order two pizzas, one for me and one for my friend who is coming to see me tonight, so I pause for a while…

→ END

With strictly delimited and goal-oriented conversations like this one, this may not be a frequent issue. Most people count the number of pizzas they want before they talk to their Alexa…

But what if we want to be able to expect more from our assistants?

The Future Is Human-Like Conversation

Should conversation designers simply accept that their applications can only be used in this narrow area of goal-oriented tasks? Or does the future of intelligent systems lie in open-ended, human-like conversations on less strictly delimited topics, in which assistants can ask people unexpected questions that take some time to think about?

Isn’t this what makes for true, deep conversations between humans? We do not always agree with each other, we confront the people we talk to with different perspectives, and each participant in the conversation tries (more or less successfully) to adapt to the point of view of their conversation partners.

If a designer pursues such a goal when creating their assistant, they will quickly run into the limits of current conversation design tools such as Alexa Developer Console, CoCoHub, VoiceFlow, or others. One of the most obvious obstacles is the one described above. With questions like “What is your current mood?” or “What is your greatest failure and what did you learn from it?” most people will not react within ten seconds (which is the technical limit for Alexa). Yet scenarios like mental health assistance or HR recruitment assistance require such questions.

It is not easy to find a simple and efficient solution. In fact, it seems that the dialogue architecture introduced by Alexa and adopted by most other platforms never took such scenarios seriously: it offers only one re-prompt. If the user doesn’t react after the re-prompt, the conversation ends.
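To make the limitation concrete, here is a minimal Python sketch of the JSON response body that an Alexa custom skill returns when it asks a question. The helper name `build_alexa_response` is a hypothetical choice for illustration. The field names (`outputSpeech`, `reprompt`, `shouldEndSession`) follow the Alexa custom skill response format as I understand it, so check them against the current documentation before relying on them. The point is that the format holds a single `reprompt` entry and has no place for a second or third fallback.

```python
import json


def build_alexa_response(question: str, reprompt: str) -> dict:
    """Build a response body that asks a question and keeps the session open.

    Hypothetical helper for illustration only.
    """
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": question},
            # Spoken once if the user stays silent after the question.
            # There is room for exactly one re-prompt here.
            "reprompt": {
                "outputSpeech": {"type": "PlainText", "text": reprompt}
            },
            # False keeps the session open so the user can answer.
            "shouldEndSession": False,
        },
    }


if __name__ == "__main__":
    body = build_alexa_response(
        "What is your current mood?",
        "Is your current feeling more positive or negative?",
    )
    print(json.dumps(body, indent=2))
```

Once the re-prompt has been spoken and the user is still silent, this structure gives the designer nothing further to work with.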

What you would need as a designer is an adaptive response system like this:

INTELLIGENT VOICE ASSISTANT (IVA): “What is your current mood?”

USER stayed silent

IVA: “I know that talking about one’s mood is not an easy task. Tell me at least — is your current feeling more positive or negative?”

USER stayed silent

IVA: “Take a deep breath. Try to think about parts of your body, one after another. Your head. Your face. Your stomach. — Tell me now, what is the first emotion or feeling coming to your mind?”

This relatively straightforward task proves to be far beyond what Alexa can do. Not only do you need more than one re-prompt, but you may also want to use a different intent group after each prompt. Last but not least, you may want to redirect the conversation to a different track, for example like this:

USER is still silent

IVA: “Okay, it seems to me that you don’t want to talk to me about your feelings. Never mind. Is there something else I can help you with?”
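The escalation in this dialogue can be sketched in a few lines of Python. This is a platform-independent illustration, not the API of Alexa or of any other tool: `SilenceStep`, `SilenceEscalation`, and the intent group labels are all hypothetical names. Each silence moves one step further through a list of re-prompts, each step can activate a different intent group, and once the list is exhausted the conversation is redirected.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SilenceStep:
    """One reaction to silence: what to say and which intents to listen for next."""
    prompt: str
    intent_group: str  # hypothetical label for the intents active after this prompt


@dataclass
class SilenceEscalation:
    """Walks through a list of re-prompts, then redirects the conversation."""
    steps: List[SilenceStep]
    fallback: SilenceStep
    silences: int = 0

    def on_silence(self) -> SilenceStep:
        """Return the next reaction each time the user stays silent."""
        index = self.silences
        self.silences += 1
        if index < len(self.steps):
            return self.steps[index]
        return self.fallback

    def on_speech(self) -> None:
        """Reset the counter as soon as the user says anything."""
        self.silences = 0


if __name__ == "__main__":
    mood = SilenceEscalation(
        steps=[
            SilenceStep("Is your current feeling more positive or negative?",
                        "positive_negative"),
            SilenceStep("Take a deep breath. What is the first feeling "
                        "coming to your mind?", "emotions"),
        ],
        fallback=SilenceStep("Never mind. Is there something else I can "
                             "help you with?", "main_menu"),
    )
    for _ in range(3):
        step = mood.on_silence()
        print(f"[{step.intent_group}] {step.prompt}")
```

Keeping the counter per question rather than per session is a design choice: a user who went quiet on one hard question should still get the gentlest re-prompt on the next one.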

There Is a Solution — Its Name Is Promethist Platform

If you want your applications to react in this way, you should explore Promethist Platform, a new start-up platform that has ventured to develop its own revolutionary conversation model. While compatible with Alexa, this model goes in many respects far beyond what Alexa can offer.

Promethist’s model combines standard intent classes (what the user says) with what they call actions (the kinds of dialogue acts that occur within the conversation). Being silent is a prominent kind of dialogue act that occurs often, even in conversations between humans. For our use case, we use the action “silence”. This action can be defined anywhere in the dialogue, at the local level, the global level, or both, which allows a customized reaction that we can make even more adaptive with simple low-code rules. This logic can be implemented very easily.
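The idea of local and global actions can be illustrated with a small Python sketch. It does not reproduce the Promethist Platform's editor or API, which are not shown in this article. All names here (`Node`, `react`, `GLOBAL_ACTIONS`, `silence_count`) are hypothetical. The sketch assumes that a handler defined on the current dialogue node takes priority, that a global handler catches the action everywhere else, and that a simple rule inside the handler adapts the reaction to how often the user has already been silent.

```python
from typing import Callable, Dict, Optional

Context = Dict[str, int]            # e.g. {"silence_count": 1}
Handler = Callable[[Context], str]  # returns the assistant's next utterance


def global_silence(ctx: Context) -> str:
    """Default reaction used wherever no local handler is defined."""
    return "Are you still there?"


def mood_silence(ctx: Context) -> str:
    """Local rule for the mood question: adapt to repeated silence."""
    if ctx.get("silence_count", 0) == 0:
        return "Is your current feeling more positive or negative?"
    return "Take a deep breath. What is the first feeling coming to your mind?"


GLOBAL_ACTIONS: Dict[str, Handler] = {"silence": global_silence}


class Node:
    """A dialogue node with optional local action handlers."""

    def __init__(self, name: str,
                 local_actions: Optional[Dict[str, Handler]] = None):
        self.name = name
        self.local_actions = local_actions or {}


def react(node: Node, action: str, ctx: Context) -> Optional[str]:
    """Prefer the node's local handler, then fall back to the global one."""
    handler = node.local_actions.get(action) or GLOBAL_ACTIONS.get(action)
    return handler(ctx) if handler else None


if __name__ == "__main__":
    mood_node = Node("mood_question", {"silence": mood_silence})
    small_talk = Node("small_talk")
    print(react(mood_node, "silence", {"silence_count": 0}))
    print(react(mood_node, "silence", {"silence_count": 1}))
    print(react(small_talk, "silence", {"silence_count": 0}))
```

The lookup order is what makes the reaction customizable: the designer writes a special rule only where a question needs one and inherits a sensible default everywhere else.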

Such a solution on the level of conversation architecture opens new horizons for anyone with the ambition to create their own conversation applications. The motto of PromethistAI says that they want to achieve the “liberalization of conversation design”. This can be understood in two ways:

i) creating voice applications becomes accessible to really everyone, even to those with a minimum of technical skills;

ii) designers working in the Promethist Platform can realize almost anything, because the whole system is much more flexible than the comparatively rigid Alexa and other current systems that basically only mirror Alexa’s model.

Of course, the more complex your ideas are, the more technical skills you will need to realize them, but with basic use cases you can start with essentially no prior knowledge.

Shortly you will be able to unlock your creative mind. Even though talking about one’s emotions with a digital persona may seem like a futuristic idea, with Flowstorm, this and similar ideas can soon come true. The tool is open for everyone, and the use cases it supports are countless. Just come up with your idea and make it happen in the Promethist Platform!

Would you like to follow our journey? Follow us on Facebook, Twitter, YouTube, Instagram, and LinkedIn.

Check out the Promethist Platform for creating smart conversational AI applications and virtual personas.
