Robustifying Conversational AI with Counterfactuals

Sinch Blog · Aug 16, 2023

Hi, I’m Maarten De Raedt, and I’m a Natural Language Processing (NLP) engineer at Sinch.

Hello, I’m Fréderic Godin, and I’m Head of Artificial Intelligence (AI) at Sinch.

In case you don’t know it, EMNLP, or Empirical Methods in Natural Language Processing, is a renowned annual conference in the field of Artificial Intelligence that focuses on practical, data-driven approaches to Natural Language Processing (NLP) tasks. At the EMNLP 2022 conference, we had the opportunity to represent Sinch Chatlayer and present a research paper aimed at enhancing the out-of-distribution robustness of conversational agents.

At EMNLP 2022, our primary objective was to address the challenges posed by distribution shifts in conversational AI systems. Distribution shifts occur when the data distribution during training differs from that encountered during real-world inference. The paper sought to adapt intent classifiers to handle these distributional differences, while minimizing additional annotation cost and computational overhead.

To set the stage, it’s important to know that conversational AI platforms, such as Sinch Chatlayer, need to be robust against distribution shifts. This robustness is crucial for maintaining high intent prediction accuracy on diverse user queries with varying styles, grammar, and vocabulary. Nothing is worse for a person chatting with a conversational AI platform than asking a simple question and receiving the wrong answer or, even worse, no response at all.

Understanding Distribution Shift

To understand why this happens, we need to know that in the context of AI and NLP, distribution shift refers to a discrepancy between the data distribution used for training AI models and the distribution encountered during real-world usage. It can negatively impact AI models’ performance and generalizability, leading to reduced accuracy and reliability on out-of-distribution data.

Distribution shifts are common in conversational AI scenarios. For instance, chat-based queries and transcribed speech-to-text queries may have slight differences, and formal and informal user queries can vary in phrasing and lexical usage, all contributing to distribution shifts. Let’s see some examples:

  • Example 1: Channel-Based Distribution Shift

Chat-based query: “What time does the store close today?”

Transcribed speech-to-text query: “I’m planning to stoop by the store after work and was wondering what time it closes today. Oh, and do they have any ongoing promotions? Thanks!”

In this example, both queries pertain to the same intent of inquiring about the store’s closing time. However, the transcribed speech-to-text query contains a small mistake (“stoop” instead of “stop”) and additional contextual information that is irrelevant to the intent. These characteristics emphasize the distribution shift between the more concise chat-based queries and the imperfect transcriptions.

  • Example 2: Style-Based Distribution Shift

Formal query: “Could you please provide me with the product specifications?”

Informal query: “Hey, what’s the deal with this product? Tell me all about it!”

In this example, the distribution shift occurs due to differences in the level of formality between two users. Both queries convey the same intent of requesting product specifications but are phrased distinctly with little to no lexical overlap.

As you can see, these two examples highlight how distribution shifts can arise from variations in input channel and style, posing challenges for accurate intent classification.
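To make this concrete, here is a toy sketch in Python (using scikit-learn, with hypothetical example queries, and by no means our production system) of how a purely lexical intent classifier trained on one style of query can stumble on a differently phrased query with the same intent:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: formal, chat-style queries only.
train_texts = [
    "Could you please provide the product specifications?",
    "What are the technical specifications of this product?",
    "What time does the store close today?",
    "Could you tell me the store's closing time?",
]
train_intents = ["product_specs", "product_specs", "store_hours", "store_hours"]

# A purely lexical classifier: TF-IDF features + logistic regression.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(train_texts, train_intents)

# In-distribution query: high lexical overlap with the training data,
# so the prediction is reliable.
print(clf.predict(["Please provide the product specifications."]))

# Out-of-distribution query: same intent, almost no lexical overlap,
# so the prediction is unreliable.
print(clf.predict(["Hey, what's the deal with this product? Tell me all about it!"]))
```

Because the informal query shares almost no vocabulary with the training data, the classifier has no reliable features to latch onto, which is exactly the failure mode distribution shifts cause in practice.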

Improving Robustness with Counterfactuals

To tackle this problem, we set out to improve the robustness of text classification models under distribution shifts using counterfactually augmented data. Counterfactuals are minimally revised versions of original samples, designed to highlight the distinct features that strongly contribute to the desired output. By training on both the original and the counterfactually augmented data, AI models become more robust to out-of-distribution data. One real-world application of counterfactuals is the task of sentiment classification.

For example, Kaushik et al. (2019) introduced Counterfactually Augmented Data (CAD) for sentiment analysis. They used the IMDb movie review dataset to train a sentiment classifier, and human annotators minimally revised each training sample to flip its label. By training on both the original and the counterfactually augmented data, the robustness of the sentiment classifier on out-of-distribution data was significantly improved.

Consider the following examples from Kaushik et al. (2019) of counterfactual revisions made by humans for IMDb movie reviews:

Negative -> Positive
Original: One of the worst ever scenes in a sports movie. 3 stars out of 10.
Revised: One of the wildest ever scenes in a sports movie. 8 stars out of 10.

Positive -> Negative
Original: The world of Atlantis, hidden beneath the earth’s core, is fantastic.
Revised: The world of Atlantis, hidden beneath the earth’s core is supposed to be fantastic.

Table 1: Two examples from Kaushik et al. (2019) of counterfactual revisions made by humans for IMDb.

The intuition behind using counterfactuals is that the classifier can rely only on the distinct features between an original sample and its counterfactually revised version. These distinct features are the minimally edited words or phrases and thus strongly contribute to sentiment. By training on counterfactually augmented data, the classifier is constrained to rely on features that are indicative of sentiment, rather than on spurious, non-sentiment-related features.

The practical impact of training on both original and counterfactually augmented data was observed in the zero-shot sentiment transfer for Amazon reviews, tweets, and Yelp reviews (Kaushik et al., 2019). The sentiment classifier trained on IMDb movie reviews, but evaluated on different domains, demonstrated substantially improved zero-shot sentiment accuracy compared to the classifier trained only on original data.
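To illustrate the recipe, here is a minimal Python sketch that reuses the two pairs from Table 1; the TF-IDF classifier is a stand-in of our choosing, not the model used by Kaushik et al.:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Original reviews with their sentiment labels (1 = positive, 0 = negative).
originals = [
    ("One of the worst ever scenes in a sports movie. 3 stars out of 10.", 0),
    ("The world of Atlantis, hidden beneath the earth's core, is fantastic.", 1),
]
# Their counterfactual revisions, with flipped labels.
counterfactuals = [
    ("One of the wildest ever scenes in a sports movie. 8 stars out of 10.", 1),
    ("The world of Atlantis, hidden beneath the earth's core is supposed to be "
     "fantastic.", 0),
]

# The CAD recipe: train on the union of original and revised samples.
texts, labels = zip(*(originals + counterfactuals))
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(list(texts), list(labels))

# Each pair differs only in its minimally edited, sentiment-bearing words
# ("worst" vs. "wildest", "is" vs. "is supposed to be"), so those words are
# the only features that can separate the flipped labels.
```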

Improving Robustness with a Minimum Number of Counterfactuals

While counterfactually revising each training sample and training on the original and augmented data significantly enhances out-of-distribution generalization, manual annotation of counterfactuals remains costly. To address this, at Sinch Chatlayer, we explored a novel solution to minimize the extra annotation efforts required by Chatlayer’s users while still leveraging counterfactuals effectively.

Our proposed solution involves annotating counterfactuals for only a small fraction (e.g., 1%) of the original training data. For the remaining 99% of the original samples without manual counterfactuals, we generate counterfactuals automatically in the vector space of transformer-based encoders. Our approach employs a simple linear transformation that can be computed on a CPU within seconds, minimizing the additional computational overhead of counterfactual generation.
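To give a flavor of what this looks like, the NumPy sketch below fits such a linear map with closed-form least squares. It is a simplified illustration of the general idea rather than the exact formulation in our paper, and the random vectors are placeholders for embeddings that a transformer-based encoder would produce:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 768  # typical embedding size of a transformer-based encoder

# Placeholder embeddings for the ~1% of samples with manual counterfactuals:
# X_pairs holds the original embeddings, Y_pairs their counterfactuals.
X_pairs = rng.normal(size=(50, d))
Y_pairs = rng.normal(size=(50, d))

# Fit a linear map W with X_pairs @ W ≈ Y_pairs via closed-form least
# squares; this runs on a CPU in well under a second.
W, *_ = np.linalg.lstsq(X_pairs, Y_pairs, rcond=None)

# Generate counterfactual embeddings for the remaining 99% of originals.
X_rest = rng.normal(size=(5000, d))
X_rest_counterfactual = X_rest @ W

# A classifier head can then be trained on the original embeddings with
# their labels plus the generated embeddings with flipped labels.
```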

Using the evaluation setup of Kaushik et al. (2019), we trained on the original IMDb sentiment reviews, the 1% manually annotated counterfactuals, and the generated counterfactuals for the remaining 99% of the original reviews, and achieved notable out-of-distribution accuracy improvements. Adding just 1% of manual counterfactuals yielded an accuracy boost of +3% compared to adding an additional 100% of original in-distribution training samples. Increasing the fraction of manual counterfactuals to 7.5% led to further out-of-distribution accuracy gains of up to 6%.

In this post, we introduced the challenge of distribution shifts and their impact on intent classifiers, and we explored how counterfactuals can be applied to improve the robustness of these classifiers. We truly hope you enjoyed it.

For those interested in diving deeper into our research, you can find our full paper, written in collaboration with Prof. Demeester and Prof. Develder of Ghent University, titled “Robustifying Sentiment Classification by Maximally Exploiting Few Counterfactuals,” at the following link: https://aclanthology.org/2022.emnlp-main.783/.

Interested in learning more about Sinch and perhaps becoming part of our team? Check out our Careers page!

References

Kaushik, D., Hovy, E., & Lipton, Z. (2019). Learning the Difference That Makes a Difference with Counterfactually-Augmented Data. In International Conference on Learning Representations.

De Raedt, M., Godin, F., Develder, C., & Demeester, T. (2022). Robustifying Sentiment Classification by Maximally Exploiting Few Counterfactuals. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11386–11400, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
