Can Speech Transcription Be Innovative?

Behavioral Signals Team
Behavioral Signals - Emotion AI
Nov 15, 2019

VoiceSignals #8 — Musings on Voice tech news

Most of us think of speech-to-text transcription as a useful tool to avoid typing, like when you’re in your car, and you want to send a message to your spouse or a friend. But some very clever innovators are using transcription and artificial intelligence to make our lives easier in completely new ways. So here we go…

Have you ever tried to create a podcast? Then you probably know how difficult editing is: removing things you or your speaker didn’t want to say, or re-recording sections you missed. Descript is changing that, and even challenging the audio editing software industry, by letting users edit audio by editing the text of the podcast transcription. If that is not enough to impress any serious podcaster, it also lets you add text to the transcript and have it injected into the audio by synthesizing your own voice, which it has already analyzed from the rest of the recording.
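To make the idea of editing audio through its transcript more concrete, here is a minimal sketch. It assumes a transcription that provides a start and end time for every word; the `Word` class, the `keep_segments` function, and the sample timestamps are hypothetical illustrations, not Descript's actual data model or method. Deleting a word in the text simply removes its time span from the list of audio segments to keep.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A transcribed word with its (assumed) position in the audio, in seconds."""
    text: str
    start: float
    end: float


def keep_segments(words, deleted):
    """Return merged (start, end) audio spans covering the words not deleted.

    `deleted` is a set of word indices removed from the transcript.
    Runs of consecutive kept words are merged into a single span.
    """
    segments = []
    prev_kept = None
    for i, word in enumerate(words):
        if i in deleted:
            continue
        if segments and prev_kept == i - 1:
            # Extend the current span to the end of this word.
            segments[-1] = (segments[-1][0], word.end)
        else:
            # A deletion (or the start of the file) begins a new span.
            segments.append((word.start, word.end))
        prev_kept = i
    return segments


# Hypothetical word timings for a short clip.
words = [
    Word("so", 0.0, 0.3),
    Word("um", 0.3, 0.7),
    Word("welcome", 0.7, 1.2),
    Word("back", 1.2, 1.5),
]

# Deleting "um" from the text leaves two audio spans to stitch together.
segments = keep_segments(words, deleted={1})
print(segments)
```

An audio tool would then cut the recording at these boundaries and join the remaining spans; a real editor would also need to smooth the joins so the cut is not audible.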

What if your discussions with your doctor could be recorded, transcribed, and available at any time? Notable Health uses AI to automate and digitize every physician-patient interaction. What’s really interesting is that the company uses voice recognition and natural language processing to automatically record doctor’s visits and structure the data for inclusion in the patient’s electronic health records. While its main goal is to assist physicians and medical personnel, the recorded data could also be used by the patient in the future. Imagine the richness of the data if they added a layer of emotion recognition technology that would also record the mental state of both patients and doctors during a visit.

If you want to learn a new language, you might turn to Duolingo, where you’ll find the experience is text-based, with an English interface and very limited use of voice recognition for pronunciation training. But with globalization and English already established as an international communication tool, several entrepreneurs in countries like China are developing solutions for workers who want to learn English and pronounce it properly. ELSA and Liulishuo are two such startups that use voice recognition and AI to conduct real-world conversations and provide a personal training experience for each user, so that she or he may learn to pronounce English like an… American, as ELSA promises.

We’re bound to see a lot more innovation in the industry as conversational AI develops further and companies start to include voice in their offerings, so stay tuned.

What we read online…

Is Ekman still relevant when it comes to Emotion AI?

Ekman’s discoveries have been the driving force behind emotion detection in law enforcement, research, and technology for the better part of 50 years. The problem, however, is that never before have these findings (which were previously thought to be nearly universal) been subjected to the kind of scale that voice assistants have put them through. With hundreds of millions of devices now in hands around the globe, cracks start to show in the way the technology is being used based on these six core emotions.

While there are certainly commonalities in human emotion and Ekman’s work has been instrumental in many developments in the last half-century, there are simultaneously nuances that were not previously measured. Take for instance the fact that emotions are consistently defined in different ways by different psychologists. The mere fact that we call them “emotions” is relatively new itself. Read more >

How AI Will Revolutionize the World of eSports

Artificial Intelligence and eSports are a match made in digital nirvana. The tech world is always racing to one-up itself, and pairing AI with the competitive gaming world is a symbiosis that will crown winners among the brave… and leave the others quivering on the sidelines.
Becoming tastemakers and thought leaders on the front lines of eSports domination requires innovation. Cutting edge products will enable creators to bring their visions to life, which will convince investors to back those at the cutting edge of the field, in turn drawing more clients and consumers to the projects accelerating the fastest due to their implementation of Emotion AI. It’s a chain reaction of success, and the proverbial fuse is about to be lit.
Read more >

Speech Emotion Recognition vs Sentiment Recognition

Want to read more about the research on Speech Emotion Recognition (SER) conducted by the Machine Learning team at Behavioral Signals? Read the latest research paper outline by Thodoris Giannakopoulos, Director of Machine Learning at Behavioral Signals, on Unsupervised Dimensionality Reduction in Speech Emotion Recognition. As Thodoris mentions in this Medium article, “SER focuses on automatically analyzing speech signals to extract the underlying emotions of the speakers. This is more a matter of terminology, but SER is not to be confused with Sentiment Recognition, which is text-based analytics applied either on text documents or on the output of ASR or STT (Automatic Speech Recognition, Speech To Text) Systems. SER, on the other hand, makes use of the audio information itself, through analyzing the low-level audio features that are directly related to the spectral and prosodic characteristics of a human’s voice.” Read more >
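To give a feel for what "low-level audio features" can mean, here is a minimal sketch that computes two classic ones, short-time energy (a rough proxy for loudness) and zero-crossing rate (a rough proxy for how much high-frequency content a frame has), directly from raw samples. The `frame_features` and `tone` helpers, the sample rate, and the synthetic tones are assumptions made for illustration; this is not the Behavioral Signals pipeline, and real SER systems rely on much richer spectral and prosodic features.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this example)


def tone(freq_hz, amplitude, n_samples):
    """Generate a synthetic sine tone as a list of float samples."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
        for t in range(n_samples)
    ]


def frame_features(samples, frame_len):
    """Return one (energy, zero_crossing_rate) pair per non-overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose sign differs.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        features.append((energy, zcr))
    return features


# Two 0.1-second stand-ins for speech: a quiet, low tone and a loud, higher one.
quiet_low = tone(100, 0.2, 800)
loud_high = tone(400, 0.8, 800)

features = frame_features(quiet_low + loud_high, frame_len=800)
for energy, zcr in features:
    print(f"energy={energy:.3f}  zcr={zcr:.3f}")
```

The second frame shows both higher energy and a higher zero-crossing rate than the first. Neither number depends on which words were spoken, which is the point of the distinction above: a text-based sentiment system would see the same transcript in both cases, while an audio-based one sees different signals.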

Why Music Makes Us Feel, According to AI

Your heart beats faster, palms sweat and part of your brain called the Heschl’s gyrus lights up like a Christmas tree. Chances are, you’ve never thought about what happens to your brain and body when you listen to music in such a detailed way. But it’s a question that has puzzled scientists for decades: Why does something as abstract as music provoke such a consistent response? In a new study, a team of researchers at the USC Signal Analysis and Interpretation Laboratory (SAIL), with the help of artificial intelligence, investigated how music affects listeners’ brains, bodies, and emotions.
Read more >

Written by Vicki Kolovou for Behavioral Signals

Do you want our bi-weekly newsletter in your mailbox? Sign up here: https://behavioralsignals.com/sign-up-for-our-newsletter/
