Changes in intonation, posture, raised eyebrows, glances and shifting weight all help us keep track of where we are in a conversation. Absent these cues we use conversational pauses and tone. Absent these rougher cues — in messenger platforms or e-mail — we use written history to keep track of our conversation.
The biggest problem with Alexa?
Mike Hudack

This is a great point, and I guess this will not change. Chatbots have the same issue: they respond to commands monotonously, with a pre-defined “tone” rather than a reactive one.

Both voice and text bots will suffer until NLP *really* gets there and we have some kind of digital screen on the wall showing robotic facial expressions.

