Debiasing speech recognition systems for kids

Amelia Kelly
The SoapBox Tech Blog
2 min read · Jan 12, 2022

What is debiasing?

Artificial intelligence systems will reflect the conscious and unconscious biases of their creators and create poor, and often prejudicial, user experiences for underrepresented users. Machine learning algorithms differ from conventional software in that they make decisions based on patterns in the dataset they were trained on, rather than being explicitly programmed with a predetermined set of rules. A speech recognition system built largely on data from one demographic will perform accurately for that group, but less accurately for everyone else.

Debiasing is the conscious and intentional process of counteracting inherent biases that are found in databases and artificial intelligence systems.

Intentional processes, such as utilizing a varied, diverse, and proportionately representative dataset of voices to train machine learning algorithms, can reduce or remove the presence of unintended bias in voice technologies.
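
As a rough illustration of what "proportionately representative" can mean in practice, the sketch below compares each group's share of a training set against a target share. The group labels, the target proportions, and the `representation_gaps` helper are all hypothetical, chosen for the example; they do not describe SoapBox's actual data or tooling.

```python
from collections import Counter


def representation_gaps(speaker_groups, target_shares):
    """Return observed share minus target share for each target group.

    speaker_groups: one group label per recording (hypothetical metadata).
    target_shares: mapping of group label to desired share, between 0 and 1.
    A negative gap means the group is underrepresented relative to the target.
    """
    total = len(speaker_groups)
    if total == 0:
        raise ValueError("speaker_groups is empty")
    counts = Counter(speaker_groups)
    return {
        group: counts.get(group, 0) / total - target
        for group, target in target_shares.items()
    }


# Made-up example: ten recordings labeled by accent group.
labels = ["accent_a"] * 6 + ["accent_b"] * 3 + ["accent_c"] * 1
targets = {"accent_a": 0.4, "accent_b": 0.3, "accent_c": 0.3}

gaps = representation_gaps(labels, targets)
for group, gap in sorted(gaps.items()):
    print(f"{group}: {gap:+.2f}")
```

In this made-up example, `accent_a` is overrepresented and `accent_c` is underrepresented, which would suggest collecting more `accent_c` recordings before training.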

Why is debiasing important for SoapBox’s voice engine and AI systems in general?

A biased system can amplify and propagate deep-seated prejudices held by the designers of that system, be they explicit or unintended. The effects of such biases in the context of educational technology, assessment platforms, and learning tools for kids can be disastrous.

For example, if a biased system fails to understand a child’s accent or dialect, it can consistently tell that child they are a poor reader when, in fact, they are reading correctly. An unbiased system, on the other hand, can offer fair and uncompromised information to facilitate edtech platforms and services. AI companies need to make a concerted effort to debias their technology.
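
One common way to surface this kind of failure is to measure word error rate (WER) separately for each group of speakers rather than as a single overall figure. The sketch below is a minimal example under stated assumptions: the group labels and transcripts are invented, and `wer_by_group` is a hypothetical helper, not part of any SoapBox API.

```python
def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1], len(ref)


def wer_by_group(samples):
    """Aggregate WER per group from (group, reference, hypothesis) tuples."""
    errors, words = {}, {}
    for group, reference, hypothesis in samples:
        err, n = word_errors(reference, hypothesis)
        errors[group] = errors.get(group, 0) + err
        words[group] = words.get(group, 0) + n
    return {g: errors[g] / words[g] for g in errors if words[g] > 0}


# Invented transcripts: what the child read vs. what the recognizer heard.
samples = [
    ("group_a", "the cat sat on the mat", "the cat sat on the mat"),
    ("group_b", "the cat sat on the mat", "the cat sat on a mat"),
    ("group_b", "she reads well", "he reads"),
]

rates = wer_by_group(samples)
for group, rate in sorted(rates.items()):
    print(f"{group}: WER = {rate:.2f}")
```

A large gap between groups, as in this toy data, is a signal that the system may be penalizing some children for how they speak rather than for how they read.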
