Watson Speech improvements for British English, German, and French

Rachel Liddell
IBM Watson Speech Services
Aug 19, 2020

--

Speech recognition quality is key to answering your customers’ questions, reducing average call handling time, and delivering the best possible customer experience. To help our clients meet these goals, the IBM Watson Speech team has upgraded our British English, French, and German speech-to-text models. Using deep learning and neural networks, we improved the models’ performance for customer care, focusing on the telecommunications, banking, retail, and insurance domains. The upgrade delivered substantial improvements in accuracy: word error rates decreased by as much as 50% relative to previous model versions. These upgrades will help you communicate with your customers, support your employees, and caption media to increase accessibility.

Alphanumerics and Accents

Beyond general accuracy, we targeted some particular transcription tasks. We improved the models’ performance on letters and numbers, expanded their vocabularies, and made them more robust to background noise and varied speaking styles. We also made the British English models more versatile so that they better handle different regional accents. Accuracy increased for Irish, Scottish, Welsh, Midland, and Northern English accents, as well as accents across Greater London.

Model Types

All of our languages support both telephony (narrowband) and high-quality (broadband) audio. Telephony audio has a sampling frequency of 8 kHz and is typically found flowing through call centers and over landlines. The narrowband model is designed for telephony, but you can use it for audio from other sources as well. We design our broadband models to transcribe audio with a sampling frequency of 16 kHz. You’ll find this audio in meeting recordings, video conferencing applications, mobile applications, and rich media sources. The German and British English updates apply to both the broadband and narrowband models. For French, we’ve released the upgraded broadband model, and the narrowband model is on its way!
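
If you are unsure which model type fits a recording, the sampling frequency is a good first check. The sketch below is illustrative only: the helper names `wav_sample_rate` and `pick_model` are hypothetical, and the model identifiers assume a `<locale>_BroadbandModel` / `<locale>_NarrowbandModel` naming pattern, so confirm the exact names against the service's model list before relying on them. It also treats anything sampled at 16 kHz or above as broadband, which is an assumption on our part.

```python
import wave

# Assumed locale codes for the three upgraded languages.
LOCALES = {
    "british_english": "en-GB",
    "german": "de-DE",
    "french": "fr-FR",
}


def wav_sample_rate(path: str) -> int:
    """Read the sampling frequency (in Hz) from a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def pick_model(language: str, sample_rate_hz: int) -> str:
    """Suggest a model identifier based on the audio's sampling frequency.

    Assumption: 16 kHz and above maps to broadband, 8 kHz up to 16 kHz
    maps to narrowband. The identifier pattern is also an assumption.
    """
    locale = LOCALES[language]
    if sample_rate_hz >= 16000:
        return f"{locale}_BroadbandModel"
    if sample_rate_hz >= 8000:
        return f"{locale}_NarrowbandModel"
    raise ValueError(f"Sampling frequency too low: {sample_rate_hz} Hz")
```

For example, `pick_model("german", wav_sample_rate("call.wav"))` would suggest a narrowband model for a typical 8 kHz call-center recording, where `call.wav` stands in for your own file.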

Access the upgrade

You can start using these upgraded models here. If you need guidance, check out our documentation. If you’re an existing user of British English, French, or German, we will automatically upgrade you to the newest model. That way, you can immediately use the update. If you currently use custom models, we won’t automatically switch you over to the new version. That way, your performance stays consistent. When you’re ready to benefit from the release, upgrade your custom model using these instructions.

If you want to learn how to assess these new models, check out our article “How to Properly Evaluate Speech Models.”
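
As a rough starting point, word error rate (WER) is the number of word substitutions, insertions, and deletions needed to turn a transcript into the reference text, divided by the number of words in the reference. The sketch below is a minimal, generic way to compute it and to express a relative reduction between two model versions. It is not the evaluation method from the article above; the function names are hypothetical, and the text normalization (lowercasing and splitting on whitespace) is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("Reference transcript must not be empty")

    # previous_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(
                min(
                    previous_row[j] + 1,                      # deletion
                    current_row[j - 1] + 1,                   # insertion
                    previous_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous_row = current_row
    return previous_row[-1] / len(ref_words)


def relative_reduction(old_wer: float, new_wer: float) -> float:
    """Fraction by which the WER dropped relative to the old value."""
    return (old_wer - new_wer) / old_wer
```

Under this definition, a model whose WER drops from 0.20 to 0.10 on the same test set shows a 50% relative reduction, even though the absolute change is 10 percentage points.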

New Voices

Also, don’t forget to listen to our new and improved neural voices. We improved the naturalness of Kate, a British English voice, and added two new neural British English voices, James and Charlotte. We also added the Nicolas voice, a new neural voice for French. You can listen to them here. These improvements enhance and broaden our support for French and British English.

Feel free to reach out with any questions or concerns. Happy transcribing. :)

--

Rachel is a Product Manager for Watson Assistant. She focuses on channels and integrations.