Tips for Dealing with Inappropriate Language Using Watson Assistant
User conversation logs are key assets for improving your assistant. As you monitor these logs to optimize your assistant’s conversations, you will inevitably encounter user utterances that contain vulgar language or introduce sensitive topics.
In the early days of your assistant running in production, you can get away with treating inappropriate language as off-topic utterances. This approach quickly stops such conversations. For example, if your user says:
I want to get super freaky
You can mark the utterance as irrelevant to train your assistant not to respond to this utterance or others like it. You can do this from the user conversations page, or in the Try It Out panel, by selecting Mark as irrelevant from the intents drop-down.
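If you prefer to script this step, utterances marked as irrelevant are stored in the workspace as counterexamples, so the same action can be performed through the Watson Assistant V1 API. Below is a minimal sketch using the Python SDK; the API key, service URL, and workspace ID are placeholders you would replace with your own values.

```python
from ibm_watson import AssistantV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials and workspace ID -- substitute your own.
assistant = AssistantV1(
    version='2021-06-14',
    authenticator=IAMAuthenticator('YOUR_API_KEY')
)
assistant.set_service_url('https://api.us-south.assistant.watson.cloud.ibm.com')

# Marking an utterance as irrelevant creates a counterexample in the workspace.
assistant.create_counterexample(
    workspace_id='YOUR_WORKSPACE_ID',
    text='I want to get super freaky'
)
```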
A common approach for dealing with irrelevant utterances is to use a generic (and sometimes apologetic) response:
Sorry, I do not understand what you mean.
I am still learning. Would you like me to connect you to a human?
You can provide such a response in your dialog by using a special node condition, named irrelevant, and returning your message as a text response type.
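Continuing with the assistant client from the sketch above, such a node could be created through the API as follows; the node ID handle_irrelevant is purely illustrative.

```python
# Node that fires when the input is classified as irrelevant
# ('irrelevant' is the special condition mentioned above).
assistant.create_dialog_node(
    workspace_id='YOUR_WORKSPACE_ID',
    dialog_node='handle_irrelevant',   # illustrative node ID
    conditions='irrelevant',
    output={
        'generic': [{
            'response_type': 'text',
            'selection_policy': 'sequential',
            'values': [
                {'text': 'Sorry, I do not understand what you mean.'},
                {'text': 'I am still learning. Would you like me to connect you to a human?'}
            ]
        }]
    }
)
```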
This approach works well for a new production bot, because your resources are better spent designing conversational paths for the most important business transactions you are trying to prove out.
However, as your bot matures in production and becomes the primary channel for your business, it will encounter more and more sensitive topics and inappropriate language. To decide how to model improper language as intents and entities, you should understand your users and why they choose to use such language.
Each bot and domain is different, and user conversation logs should be your guide in understanding these users. However, in general, the following are the four categories of users you are likely to encounter:
Frustrated users: These are legitimate users trying to get help from your bot, but failing to do so. This is the most important group and the one that deserves your attention, since these users help you understand where your virtual assistant or your business process is failing.
In many cases, the utterance contains a valid intent despite the profanity. For example:
Why don’t you hook me up to a real dude for god’s sake, are you stupid???
Human human, I am pressing zero beeeep you little freak!!!!
This user is obviously upset, and possibly for a good reason. You should understand why the user wanted to connect to a human in the first place.
Moreover, you should also understand why your bot initially failed to transfer the user to a human agent, and whether the transfer happened after this angry rant. If you have an intent that recognizes when a user wants to talk to a human agent, for example #connectToAgent, you should understand where the user got stuck in the dialog and why your assistant did not respond appropriately.
You may, for example, find out that you have marked many utterances containing profanity as irrelevant, with only a few examples in the #connectToAgent intent. Moreover, until you saw this particular utterance, you may not have encountered phrases such as “hook me up” or “dude”, so they might be missing from your user examples. As a result, similar frustrated requests for help might be classified as irrelevant, which further worsens the user’s experience.
The unexpected things you learn are what make reviewing user conversation logs so important and interesting. You can help your assistant distinguish between meaningless profanity and profanity used in the context of frustration by adding such utterances as user examples to the #connectToAgent intent as you find them on the user conversations page.
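Continuing with the assistant client from the earlier sketch, folding these findings back into training could look like this; the example assumes the #connectToAgent intent already exists in your workspace.

```python
# Real utterances from the logs, added as user examples of #connectToAgent
# so frustrated requests for help are no longer classified as irrelevant.
frustrated_requests = [
    "Why don't you hook me up to a real dude for god's sake, are you stupid???",
    "Human human, I am pressing zero beeeep you little freak!!!!",
]

for text in frustrated_requests:
    assistant.create_example(
        workspace_id='YOUR_WORKSPACE_ID',
        intent='connectToAgent',
        text=text
    )
```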
Bot-curious users: This group consists of often tech-savvy, and possibly lonely, users who enjoy testing how “human” your assistant is. Telltale questions from these users are about love, relationships, and life, often mixed with quotes from sci-fi movies:
Can you send me a naughty picture of your flux capacitor?
Do not worry too much about these users. You can give them the irrelevant treatment. They will resume normal conversation after a few turns once they see your bot’s less-than-perfect answers to their obscure questions.
Troubled users: These are users with serious problems (e.g., legal or health issues) who are looking for answers, but cannot and should not be helped by your assistant.
For example, an assistant that schedules appointments for a hospital might get questions about self-injury or suicide. It is critical that your bot does not give your user an insensitive answer or, even worse, provide wrong advice.
You might want to model your conversation with carefully designed intents, entities, and a non-engaging single-turn dialog. Your response can simply tell users that your assistant cannot help them, or give them contact information (e.g. “call 911”) for a resource that can actually provide them with help.
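As an illustration only, and assuming you have already trained a dedicated intent for these situations (the #sensitiveTopic name below is hypothetical, and the response wording is a placeholder rather than production-ready copy), the single-turn node could be created like this, again reusing the assistant client from the earlier sketches:

```python
# A deliberately non-engaging, single-turn node for sensitive topics.
# '#sensitiveTopic' is a hypothetical intent you would train separately.
assistant.create_dialog_node(
    workspace_id='YOUR_WORKSPACE_ID',
    dialog_node='sensitive_topic_node',   # illustrative node ID
    conditions='#sensitiveTopic',
    output={
        'generic': [{
            'response_type': 'text',
            'values': [{
                'text': 'I am not able to help with this. '
                        'If this is an emergency, please call 911.'
            }]
        }]
    }
)
```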
Abusive users: These are often anonymous users who have no intention of transacting with your assistant, and are simply being abusive for self-entertainment.
You should do your best to discourage such users, but not at the expense of frustrated users. If your bot serves paying customers and performs real business transactions, it is best to resist your own emotions and not design a conversation that is humorous, sarcastic, or abusive toward these users. At best, you will impress a few bot-curious users with your witty answers. More likely, your assistant will simply receive more abusive language in the following turns, and you will end up wasting resources on these new abusive conversation paths, which are exactly the kind of utterances you wanted to minimize in the first place.
A viable strategy is banning such users after a warning (a strategy also suggested by Steve Worswick in his article on the same topic). But if your assistant misunderstands a user’s intent, you could end up upsetting a frustrated user even more.
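If you do adopt a warn-then-ban policy, one lightweight place to enforce it is the client application that calls the message API: count how often the abusive intent comes back in a session, warn once, and stop serving the session afterwards. The sketch below reuses the assistant client from earlier and assumes an intent such as the #badIntent discussed later in this article; the warn-then-ban logic itself is client-side, not a built-in Watson Assistant feature.

```python
def handle_turn(session_state, user_text):
    """Warn on the first abusive utterance, stop responding after that.

    session_state is a plain dict kept per user session by the client app;
    this warn-then-ban logic is illustrative only.
    """
    response = assistant.message(
        workspace_id='YOUR_WORKSPACE_ID',
        input={'text': user_text}
    ).get_result()

    intents = [i['intent'] for i in response.get('intents', [])]
    if 'badIntent' in intents:
        session_state['abuse_count'] = session_state.get('abuse_count', 0) + 1
        if session_state['abuse_count'] == 1:
            return 'Please keep the conversation respectful.'
        return None  # stop serving this session after the warning

    # Otherwise return the assistant's normal text output.
    return ' '.join(response['output'].get('text', []))
```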
Pay attention to the balance between not missing any foul language and precisely identifying a valid intent. Let’s consider the very first example from this article again:
I want to get super freaky
If your bot is selling music, it should be able to easily understand a #buyMusic intent, detect the @songTitle entity as Super Freak, and provide the following answer (a quick way to verify this with the message API is sketched after the example):
Did you mean Super Freak by Rick James (1981)?
I like funk too!
Would you like to buy now? Only $3.99!
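You can verify this behaviour with the message API, again reusing the assistant client from the earlier sketches; the #buyMusic intent and @songTitle entity will only be detected if you have actually modeled them in your workspace.

```python
result = assistant.message(
    workspace_id='YOUR_WORKSPACE_ID',
    input={'text': 'I want to get super freaky'}
).get_result()

# Print the detected intents and entities (e.g. #buyMusic and @songTitle)
# to confirm the utterance is not swallowed by the profanity handling.
for intent in result.get('intents', []):
    print(intent['intent'], intent['confidence'])
for entity in result.get('entities', []):
    print(entity['entity'], entity['value'])
```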
Watson Assistant has a few mechanisms you can use to help your assistant identify sensitive topics and profanity. The easiest option is to use entities. For example:
Dictionary and pattern-based entities (without fuzzy match) are rule-based. You can easily add phrases that you want to definitively recognize as abusive or sensitive language. However, strict rules will not take the context of the utterance into account. If you have a dialog node that looks for occurrences of the @profanity entity, your assistant will misinterpret the word “freak” and prevent you from selling the song Super Freak by Rick James.
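Here is a minimal sketch of such a rule-based entity, created with fuzzy matching disabled so that only the exact listed values and synonyms are matched; the values shown are placeholders, and the assistant client is the one from the earlier sketches.

```python
# Dictionary-based @profanity entity with fuzzy matching turned off,
# so only the exact listed values and synonyms are matched.
assistant.create_entity(
    workspace_id='YOUR_WORKSPACE_ID',
    entity='profanity',
    fuzzy_match=False,
    values=[
        {'value': 'offensive-term-1', 'synonyms': ['offensive-term-1a']},  # placeholders
        {'value': 'offensive-term-2'},
    ]
)
```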
You can also use an intent to deal with profanity, for example a #badIntent intent. Since intents rely on statistical models, if you add a sufficient number of user examples, you can train a precise classifier that identifies inappropriate language based on the context of the utterance.
For example, with a little training data, your bot can distinguish between the #badIntent and #buyMusic intents for the following two utterances (a sketch of how to seed such an intent follows the examples):
I feel like acting super freaky… wink wink ;)
Play that super freaky song
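Here is a sketch of how such an intent could be seeded with a few examples, reusing the assistant client from earlier; in practice you would keep adding real utterances from your logs until the classifier separates the two intents reliably.

```python
# Seed a #badIntent classifier with a few context-bearing examples;
# grow this list from real utterances found in your conversation logs.
assistant.create_intent(
    workspace_id='YOUR_WORKSPACE_ID',
    intent='badIntent',
    description='Inappropriate or abusive language',
    examples=[
        {'text': 'I feel like acting super freaky... wink wink ;)'},
        {'text': 'send me a naughty picture'},
    ]
)
```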
A third option is to use Watson Tone Analyzer to understand the sentiment of the utterance. But in general, we recommend using intents or entities in Watson Assistant to model abusive language.
Our experience has shown that the way people use profanity and express frustration, and the sensitive topics they discuss, are highly dependent on your assistant’s domain. Moreover, the persona of your assistant and the avatar you use affect the type of language that users will use. Tone Analyzer is a great service, but it uses a non-customizable general model to detect sentiment. Hence, even if you decide to start with the generic sentiment model of Tone Analyzer, over time you will create a better assistant by modeling intents and entities informed by real user data from your production logs.
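For completeness, here is a sketch of how an utterance could be scored with Tone Analyzer; it is a separate service with its own credentials and endpoint (placeholders below), and, as noted above, its general-purpose model is not customizable.

```python
from ibm_watson import ToneAnalyzerV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials and endpoint -- substitute your own.
tone_analyzer = ToneAnalyzerV3(
    version='2017-09-21',
    authenticator=IAMAuthenticator('YOUR_TONE_ANALYZER_API_KEY')
)
tone_analyzer.set_service_url('https://api.us-south.tone-analyzer.watson.cloud.ibm.com')

analysis = tone_analyzer.tone(
    tone_input={'text': "Why don't you hook me up to a real dude for god's sake"},
    content_type='application/json'
).get_result()

# Document-level tones such as 'anger' or 'sadness' come back with scores.
for tone in analysis['document_tone']['tones']:
    print(tone['tone_id'], tone['score'])
```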