How Product Builders Can Mitigate Bias In Conversational AI

Guy TONYE
Published in Voice Tech Global
Jul 2, 2020

Conversational AI is about assembling building blocks

Conversational AI is about assembling building blocks, and for me it was also my first exposure to Artificial Intelligence. After a couple of years of interacting with it, I fell in love with using it. What I liked most about conversational AI was how accessible it is to get started and do something with it.

Building conversational products today is more about putting pre-built or pre-trained pieces together than becoming a machine learning expert. When I have to describe it to anyone, it looks like this:

An example of a weather conversational application

In the example above, if you build an Alexa skill or an Action on Google, you are not in charge of implementing the speech-to-text (the Google Nest Hub or the smart speaker handles it) or the actual natural language understanding (i.e., the algorithm that takes a text and matches it to a predefined intent). You can use building blocks that Amazon, Google, Microsoft, or IBM already make available.
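To make the flow of those blocks concrete, here is a minimal sketch in Python. The intent table, the keyword matching, and the weather response are all illustrative stand-ins for what the platforms actually do, not real platform code.

```python
# A minimal sketch of the pipeline in the diagram above. The first two
# blocks (speech-to-text and natural language understanding) are provided
# by the platform; the names and matching logic here are illustrative.

INTENTS = {
    "GetWeatherIntent": ["weather", "forecast", "is it raining"],
}

def understand(transcript: str) -> str:
    """Platform block: match the transcript to a predefined intent."""
    for intent, keywords in INTENTS.items():
        if any(keyword in transcript.lower() for keyword in keywords):
            return intent
    return "FallbackIntent"

def fulfill(intent: str) -> str:
    """The block a product builder actually implements."""
    if intent == "GetWeatherIntent":
        return "It is sunny and 22 degrees."
    return "Sorry, I didn't get that."

# The smart speaker would produce this transcript from the user's audio.
transcript = "what's the weather like today"
print(fulfill(understand(transcript)))
```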

AI is a reflection of its creators and their biases

In light of recent events that occurred in the United States, highlighting the problems of systemic racism, and the protests that followed around the world, we have been looking at ways to address the struggles faced by the black community.

In these conversations, technology has not been scrutinized as much, mostly because these systems, rooted in mathematics, are perceived to be based on logic, not human interaction. But when we reflect on it more deeply, our existing technology has a single flaw that no proof or algorithm can easily solve: it is built by humans. Therefore, human failings can be built into the machine, however unaware we are of them.

I decided to reflect on conversational AI. As seen in the diagram above, when we build our conversational AI, we rely on that block from Google or Amazon to handle the speech-to-text functionality. If we look at this recent Stanford study, speech-to-text engines make significantly more errors for African American speakers. When the device does not produce a proper transcript, the request never reaches the next block, so African Americans will face difficulties accessing your services, and a lot of people may never get a response from your conversational agent.

As I started digging deeper, I realized that the systems we rely on in these products, whether convenient APIs (for sentiment analysis, for example) or pre-trained language models, were prone to something called bias.

Bias has as many definitions as there are contexts in which you use it; I picked these two:

The way our past experiences distort our perception of and reaction to information, especially in the context of mistreating other humans.

Algorithmic bias occurs when a computer system reflects the implicit values of the humans who created it.

I started looking at my diagram through the lens of potential bias, and it now looks like this:

If the top uses for conversational AI today are entertainment and automation, what happens when enterprise adoption turns loan assessment into a conversational agent? If we continue to drill down, what happens if that agent uses a biased risk assessment API or model to conduct the evaluation?

Do the same questions apply to a conversational agent for a news feed? What happens when my smart speaker or phone curates Twitter trends but filters out content because it is considered hate speech or inappropriate language?

Why aren’t the big tech companies fixing it?

As I kept discovering more studies and articles denouncing bias and calling for changes and improvements, my first thought was, “What are Amazon, Google, IBM, or Microsoft doing?” The answer is that they are trying (Google, IBM, Microsoft), but they have three challenges to overcome:

  • Despite their best efforts, with a workforce of limited diversity, these organizations will have difficulty correcting an issue that may have been caused, indirectly, by a lack of representation in the first place.
  • The challenges are complex: in our earlier example of speech-to-text engines misunderstanding African American voices, one tentative solution would be to retrain the language models with thousands of African American voice samples. That is not impossible, but it takes time, and unfortunately the data is sometimes not available.
  • No single solution can fix all the issues: what works for mitigating bias in facial recognition may not apply to the struggles in conversational AI.

Top strategies for practitioners to mitigate bias

I started thinking about how product builders can help. The first thing I noticed was that even though we are using pre-built building blocks, these systems still allow us to do additional training specific to the product we are working on.

Google and Alexa, for example, both allow you to define language models on their platforms. One of the things you provide is called sample utterances or training phrases; these help the system better understand people when they speak to it.
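On the Alexa side, for instance, those sample utterances live in the skill's interaction model. The snippet below mirrors that shape as a Python dictionary; the intent name, the utterances, and the FOOD_TYPE slot are made up for illustration, while AMAZON.DURATION is Alexa's built-in duration slot type.

```python
# Sketch of sample utterances for a timer intent, written as a Python
# dictionary that mirrors the shape of an Alexa-style interaction model.
# "SetFoodTimerIntent" and "FOOD_TYPE" are illustrative names.
timer_intent = {
    "name": "SetFoodTimerIntent",
    "slots": [
        {"name": "food", "type": "FOOD_TYPE"},
        {"name": "duration", "type": "AMAZON.DURATION"},
    ],
    "samples": [
        "set {food} timer for {duration}",
        "start a {food} timer for {duration}",
        "time my {food} for {duration}",
    ],
}
```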

Coming back to our earlier example of poor speech-to-text, drawing on the findings from https://fairspeech.stanford.edu/, I found two ways to mitigate the issues: one for speech-to-text and one for natural language understanding.

Mitigate the speech-to-text bias

When doing high-fidelity prototype user testing with the team, using an Amazon Echo Show for example, we would review the Voice History (i.e., the transcript and the recording of what your Alexa-enabled device heard).

Voice history of my attempt to set a timer for beans

This would enable us, during testing, to find out whether Amazon's speech-to-text (or any other voice-enabled device's) was having issues with some accents.
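One lightweight way to systematize that review could be to diff what the tester meant to say against what the voice history shows. The sketch below uses Python's difflib and assumes the transcripts were copied out of the voice history by hand; it does not call any device API.

```python
import difflib

# Sketch: compare the phrase a tester intended to say with the transcript
# shown in the voice history, and list the words the device got wrong.
# Transcripts are assumed to be copied manually from the voice history.
def transcript_diff(expected: str, heard: str) -> list[str]:
    """Return the words that differ between the intended phrase and the transcript."""
    diff = difflib.ndiff(expected.lower().split(), heard.lower().split())
    return [token for token in diff if token.startswith(("- ", "+ "))]

print(transcript_diff(
    "set beans timer for eight hours",
    "set beats timer for eight hours",
))
# Output: ['- beans', '+ beats']
```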

In the example above, I wanted to say “set beans timer for eight hours.” For an application, I could add the wrong transcripts (for example, “set beats timer for eight hours”) to the sample utterances. Even though that does not solve the root of the problem, it helps mitigate the issue for the users affected by the biased speech-to-text, because Amazon will not simply end the conversation or provide the wrong response.
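Concretely, that mitigation could look like folding the transcripts collected during testing back into the intent's sample utterances. In this sketch the sample list and the {duration} placeholder are illustrative, and the misheard phrase is the one from the voice history above.

```python
# Sketch: add the misheard transcripts observed in the voice history to the
# intent's sample utterances, so those phrasings still resolve to the intent.
samples = ["set beans timer for {duration}"]
misheard_transcripts = ["set beats timer for {duration}"]  # from the voice history review

for variant in misheard_transcripts:
    if variant not in samples:
        samples.append(variant)

print(samples)
```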

Mitigate the natural language understanding bias

In the Amazon console, in the Build section, there is a very useful tool that is a little hidden but gives you access to a sample of natural language understanding results.

Under “unresolved utterances,” it shows the phrases users said that could not be mapped to an intent. We usually map them to the closest intent. As with speech-to-text before, this does not entirely solve the issue, but it provides a good mitigation.
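As a rough sketch of that triage, the snippet below scores each unresolved utterance against every intent's sample phrases and suggests the closest match. The intents and phrases are made up, and the actual remapping still happens by hand in the console.

```python
import difflib

# Sketch: suggest the closest existing intent for utterances the NLU could
# not resolve. Intent names and sample phrases are illustrative; the real
# remapping is done manually in the console.
INTENT_SAMPLES = {
    "SetFoodTimerIntent": ["set beans timer for eight hours", "start a rice timer"],
    "CheckTimerIntent": ["how long is left on my timer", "check my timer"],
}

def closest_intent(utterance: str) -> tuple[str, float]:
    """Return the intent whose sample phrases best match the utterance."""
    best_intent, best_score = "FallbackIntent", 0.0
    for intent, samples in INTENT_SAMPLES.items():
        for sample in samples:
            score = difflib.SequenceMatcher(None, utterance.lower(), sample).ratio()
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent, round(best_score, 2)

for unresolved in ["set a bean timer", "what's left on the timer"]:
    print(unresolved, "->", closest_intent(unresolved))
```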

This is only the beginning of the journey. Join us at our July 9th event to find out more strategies for product builders to mitigate bias in conversational AI products.

Acknowledgments

A big THANK YOU and shoutout to Polina Cherkashyna, Aimee Reynolds, and Corin Lindsay for editing, reviewing, and ideating with me on this.

Guy TONYE is a software engineer, Google Cloud Platform Certified Data Engineer, and co-founder at Voice Tech Global.