Teaching Kamu to speak
Final outcomes from Migri’s chatbot voice experiment
In a previous article we shared the results of the first user testing round of our voice chatbot experiment, in which we tested the English implementation of voice-based Kamu with immigrant users in Finland. The overall goal of the “Voice of Kamu” experiment was to understand the maturity of current chatbot voice technology and then decide on our next steps.
During June 2019 we ran a second testing round to understand how the Finnish version of the implementation differs from the English one. For these user tests we did not recruit real immigrant users but relied on our Migri experts to test Kamu in Finnish. We know this affects the results, but above all we wanted to evaluate the quality of the speech-to-text transcription in Finnish.
While the participant overview shows only two locations, these Migri colleagues originally come from a much wider range of regions.
This time, the tests took place as individually performed sessions based on written instructions. The instructions also included questions to answer after the tasks were completed.
Users called with their work phones and recorded the session with another device (mobile phone or computer). Most participants used their phone’s external speakers, so the echo of the room may have affected the results.
Each test user performed two or three tasks, which were the same as in the previous testing round.
We evaluated the results similarly to the previous round with immigrant users and also used the same five areas for our analysis:
Finnish speech-to-text transcription works better than English
To us, this was a surprising result: the Finnish speech-to-text transcription worked more reliably than the English one. Of course, this time we tested with native speakers, while in the previous round many participants were not native English speakers. In the future we also want to test the implementation with non-native Finnish speakers and compare the results.
However, we also found problems in the Finnish text-to-speech output. These included long words and the declension of numbers, which will need to be solved for production use.
Changing language during a conversation remains unsupported
We had already found in the previous testing round that users cannot switch language during a conversation in the voice-based implementation. Our chatbot software supports this out of the box, but Twilio does not support it easily.
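As a sketch of why this is hard: in Twilio’s TwiML, the language is set per verb at the moment the server generates a response, so the application itself would have to detect a language switch and issue new instructions. The snippet below is a hypothetical illustration only; the `/kamu/handle-speech` endpoint and the prompt are our own examples, not Kamu’s actual configuration.

```xml
<!-- Hypothetical TwiML sketch: the language attribute is fixed on each
     <Gather> (speech recognition) and <Say> (speech synthesis) verb when
     the response is generated. A mid-call switch to English would require
     the application to recognise the request and respond with a fresh
     <Gather language="en-US"> of its own. -->
<Response>
  <Gather input="speech" language="fi-FI" action="/kamu/handle-speech">
    <!-- "Hi, how can I help?" -->
    <Say language="fi-FI">Hei, miten voin auttaa?</Say>
  </Gather>
</Response>
```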
Same unexpected and unresponsive behaviour
While we noticed fewer delays in the answers in the Finnish user tests, Kamu cut the phone connection more often than in the English testing round. We also noticed that our Finnish test users tried to interrupt Kamu more often to ask follow-up questions. These results validate our earlier suggestion: we need to introduce additional voice-based commands to Kamu if we want to take the voice-based implementation into production use.
Less concern about the talking speed
We saw that the Finnish users were less concerned about Kamu’s talking speed. Obviously, our limited target group of Migri experts influences this result, and we need to revalidate it with a wider testing audience.
However, one thing we noticed with our Finnish users: they were often confused about when the call was ending and who should end it (Kamu? the caller?). We learned that we need to design a better closing for Kamu’s voice-based implementation.
Needed content adjustments stay the same
In our previous testing round we had already highlighted problems with existing Kamu content that is not suitable for voice-based interaction. This includes, for example, when Kamu says “click below to continue”: during a phone call the user cannot click anything.
Other learnings: Kamu’s voice
In this testing round, we included a question about how suitable the users considered the voice to represent Migri’s customer service. The initial input and thoughts from Migri experts are valuable for the next steps:
At the moment, Kamu’s voice is an out-of-the-box female voice. Interestingly, the users’ impressions of this voice ranged from “iron lady” and “robot-like” to clear, peaceful, understandable and suitable for Migri. Users also pointed out that the pronunciation of some words is a challenge, especially long words and loanwords from other languages. Another open question is whether Kamu may use spoken-language expressions. Migri experts noted that Migri needs to remain a formal and trustworthy authority on the one hand, while on the other Kamu’s personality is friendly and collegial. When deciding on this aspect we need to balance these two standpoints.
Our main learning from this feedback: we need to do more research and better understand the influence of the voice on the content of the conversation. At the same time, we recently stumbled upon a new approach to generated voice: “Q” is a genderless voice generated on the basis of research into the nature of gendered voices. You can listen to it yourself here.
Our Voice of Kamu experiment is now over. It was a short experiment to understand the work needed before we could go live with a voice-based Kamu: the work ranges from simple content adjustments to more complicated technological questions. As we have explained above, the voice technology is not yet mature enough to meet the needs of a public organisation like Migri. Hopefully another experiment will follow with more mature technology in the coming year(s).
author: Suse Miessner