Ricardo Balduino
Inside Machine learning
9 min read · Oct 29, 2019


AI and Machine Learning to Improve Customer Contact Experience

Think of the last time you contacted a call center (whether it was your bank, credit card company, airline company, or any other company). Did you have to wait a long time after pressing all the numbers on your phone’s screen to move through the prompts to finally be able to speak to a live person? And have you had to call them back more than once just to explain the same situation to the next agent all over again?

Organizations are realizing that their call centers (and other customer interaction channels) offer a unique opportunity to improve customer satisfaction and engagement by delivering a consistent experience across all channels, rapid responses to questions, and quick resolution of issues. There are added benefits for companies aiming to reduce call center operational costs, for example by shifting interactions to lower-cost channels such as intelligent chatbots and by reducing call handling time. Ultimately, this also helps grow revenue: it increases customer (and service representative) retention and turns some customer interactions into potential new sales.

Imagine if every service representative could identify the caller’s intent and engage in a conversation that aligns with that intent and the business objectives.

For example, a customer calls about being late on a lease payment for the current month, because of a job transition, advising that the paycheck won’t arrive until the next month. Instead of the service rep asking the usual dozen questions, imagine if the system could quickly identify the caller’s intent as “schedule payment” and qualify them for later payment. This information could be relayed to the service representative in real-time, who is guided to offer a customized payment option. The end result is reduced call handling time and a positive customer experience.

While this is not a new problem, and there may be different ways to solve it, the use of Artificial Intelligence and Machine Learning has been at the forefront of recent implementations. The IBM Data Science Elite Team has been engaged with several companies in industries such as banking, finance, and healthcare, to name a few, using AI and ML tools and techniques to solve this problem.

The typical workflow for analysis of call transcripts is shown in Figure 1. Several IBM tools and open-source libraries have been used to implement the steps of this workflow, as explained below.

Figure 1 — Analysis of Interactions in Call Center Transcripts

To start the analysis and machine learning modeling process depicted above, input data is required; in this type of use case it is typically obtained through speech-to-text transcription of the recorded calls. Note that this process can also be applied to other sources of customer interaction, such as chatbot transcripts or email bodies and attachments. For this blog, we focus on call transcripts as input. Also note that the process is iterative: it can be repeated as more data becomes available, when the content needs review, or when existing machine learning models need fine-tuning or new ones need to be created.

Speech to Text Transcription

Speech-to-text technology continues to improve in its ability to deliver high-quality transcriptions. One of the main considerations is to ensure that the audio quality is adequate for transcription: higher-quality audio can drastically improve outcomes. For example, when designing a solution, it is better to use audio sampled at 16 kHz rather than at 8 kHz as the source for the transcription service.

Transcription is done with IBM Watson Speech to Text, a robust service that provides real-time, high-quality transcription and can recognize multiple speakers, capture possible word alternatives, and filter the content. It supports multiple languages along with different dialects. The output is provided in JSON format, which makes it easy to integrate with applications that process the transcription in real time.

Here is an example of an application using IBM Watson Speech to Text: https://speech-to-text-demo.ng.bluemix.net/
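For illustration, here is a minimal sketch of requesting a transcription with the ibm-watson Python SDK; the API key, service URL, and audio file name are placeholders, and the exact parameters would be tuned per project:

```python
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials and file name -- replace with your own.
stt = SpeechToTextV1(authenticator=IAMAuthenticator('YOUR_API_KEY'))
stt.set_service_url('YOUR_SERVICE_URL')

with open('call_recording.wav', 'rb') as audio:
    result = stt.recognize(
        audio=audio,
        content_type='audio/wav',
        model='en-US_BroadbandModel',    # broadband model for 16 kHz audio; use the narrowband model for 8 kHz
        speaker_labels=True,             # distinguish multiple speakers
        word_alternatives_threshold=0.9  # capture possible word alternatives
    ).get_result()

for chunk in result['results']:
    print(chunk['alternatives'][0]['transcript'])
```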

Ingest and prepare the data

This step is done programmatically using a language such as Python, in a Jupyter notebook inside IBM Watson Studio.

A frequent challenge is removing sensitive information such as account numbers, personal names, etc. from the transcription. IBM Watson Natural Language Understanding (NLU) is used for this task. NLU's entity recognition provides an out-of-the-box solution for identifying people, places, companies, and more, and IBM Watson Knowledge Studio can be used to build additional models that identify new types of entities and their relationships. After Watson NLU has tagged the sensitive information in the text with entity types, it can be replaced programmatically with placeholders.
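As a sketch of this masking step (the credentials and example sentence are hypothetical), Watson NLU's entity results can be substituted with type placeholders:

```python
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_watson.natural_language_understanding_v1 import Features, EntitiesOptions
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

# Placeholder credentials -- replace with your own.
nlu = NaturalLanguageUnderstandingV1(version='2019-07-12',
                                     authenticator=IAMAuthenticator('YOUR_API_KEY'))
nlu.set_service_url('YOUR_SERVICE_URL')

snippet = "Hi, this is John Smith calling about my account with Acme Bank."
entities = nlu.analyze(text=snippet,
                       features=Features(entities=EntitiesOptions())).get_result()['entities']

# Replace each detected entity mention with a placeholder for its type, e.g. [PERSON].
masked = snippet
for entity in entities:
    masked = masked.replace(entity['text'], '[' + entity['type'].upper() + ']')
print(masked)
```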

One of the tasks is to break down each call transcript into fragments, or snippets, which is useful for analyzing intent as the call progresses (e.g., someone may call asking about their benefits or deductible, and end up paying for services or setting up auto-payments). Various techniques are applied to find the best snippet size (i.e., the number of sentences in a snippet) and snippet overlap (i.e., fixed windows or sliding windows through the call transcript).
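A minimal sliding-window sketch of this snippeting step might look like the following; the snippet size and stride values are illustrative, not the ones used in any particular engagement:

```python
def make_snippets(sentences, size=4, stride=2):
    """Slide a fixed-size window over a call's sentences.

    size   -- number of sentences per snippet
    stride -- step between windows; stride < size gives overlapping snippets
    """
    return [' '.join(sentences[start:start + size])
            for start in range(0, max(len(sentences) - size + 1, 1), stride)]

call = ["Hello, thanks for calling.", "How can I help you today?",
        "I was late on my lease payment.", "I changed jobs last month.",
        "My paycheck arrives in two weeks.", "Can I schedule the payment then?"]
print(make_snippets(call, size=4, stride=2))
```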

Once calls have been broken down into snippets, standard Python libraries are used to further process the data: removing stop words, identifying bi-grams and tri-grams, and tokenizing the data (i.e., extracting terms from the call snippets so that the analysis focuses on the main terms that make up each snippet, as shown in Figure 2).

Figure 2 — Tokens extracted from call transcript snippets
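As one possible implementation of the preparation described above, here is a short sketch using Gensim's tokenizer, stop-word list, and phrase detector; the snippets and thresholds are toy values:

```python
from gensim.utils import simple_preprocess
from gensim.parsing.preprocessing import STOPWORDS
from gensim.models.phrases import Phrases, Phraser

snippets = ["i was late on my lease payment because of a job transition",
            "can i schedule the lease payment for next month instead"]

# Tokenize each snippet and drop stop words.
tokens = [[t for t in simple_preprocess(s) if t not in STOPWORDS] for s in snippets]

# Learn frequently co-occurring pairs (bi-grams) such as "lease_payment";
# running Phrases again over the output would yield tri-grams.
bigrams = Phraser(Phrases(tokens, min_count=1, threshold=1))
tokens = [bigrams[t] for t in tokens]
print(tokens)
```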

Develop a clustering model

Once the data is prepared, an unsupervised machine learning technique such as clustering is applied to get an initial identification of topics, or potential intents, for each snippet. We use Python libraries such as Gensim and pyLDAvis, and the output is the interactive diagram shown in Figure 3, with topics (or clusters) represented as circles on the left side and the salient terms for each topic listed on the right. (If you are interested in the details of this technique, you can find an explanation in this Medium blog.)

Figure 3 — Output of pyLDAvis in Watson Studio showing topics and salient terms
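A minimal sketch of this clustering step, assuming Gensim's LDA implementation and toy token lists, might look like this; the number of topics is an assumption to be tuned per dataset:

```python
from gensim import corpora, models
import pyLDAvis
import pyLDAvis.gensim as gensimvis  # named pyLDAvis.gensim_models in newer releases

# Tokenized snippets from the preparation step (toy examples here).
tokens = [['late', 'lease_payment', 'job', 'transition'],
          ['schedule', 'lease_payment', 'month'],
          ['remaining', 'balance', 'account'],
          ['owe', 'month', 'balance']]

dictionary = corpora.Dictionary(tokens)
corpus = [dictionary.doc2bow(t) for t in tokens]

# Fit an LDA topic model; num_topics is an assumption to tune per dataset.
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, passes=10)

# Render the interactive topic/term diagram (as in Figure 3) in the notebook.
pyLDAvis.display(gensimvis.prepare(lda, corpus, dictionary))
```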

The step above does not use any domain input, since it is an unsupervised machine learning technique applied to the data; as such, the topics have numbers but no meaningful names for the given domain. We apply some judgment based on the salient terms and the word clouds generated for each topic, giving each a tentative name that can be validated by domain experts later on.

Review and label call transcripts with SME input

The prepared data, annotated with a list of topics, can now be reviewed by domain experts, who help validate the assigned topics and label the data that will become the ground truth used to train predictive models later on.

This step can become a huge task if done manually (i.e., reviewing each call transcript, or even each snippet of each call transcript, in order to label them accordingly). To aid the process, a content mining tool is used. It allows domain experts to slice and dice the content and find patterns in the data more quickly, using Natural Language Processing (NLP) capabilities such as parts of speech (verbs, nouns), phrases, and sentiment analysis, as well as the querying and visualization options available in the content miner. IBM Watson Natural Language Understanding and Watson Knowledge Studio may also be used for initial annotation of the content. Domain experts can then annotate, or label, several snippets that satisfy given criteria at once. The end result of this step is a set of labeled data based on a reviewed (and refined) set of intents. Figure 4 shows a typical user interface for content mining, which is currently available in IBM Watson Discovery.

Figure 4 — Reviewing intents and labeling snippets in content miner

Develop supervised ML models for classification

In this step, the labeled data (i.e., snippets and intents) from the previous step is loaded into IBM Watson Studio to train a machine learning model that classifies call snippets. Since multiple intents can occur during the progression of a call, a multi-class classification model is developed to predict the intent of each call snippet.

One of the key challenges is the availability of sufficient labeled data. Frequently, organizations have masses of information but only a few hundred labeled samples. In such cases, transfer learning and semi-supervised machine learning have been used successfully to address the challenge.
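As a sketch of the semi-supervised idea, scikit-learn's label spreading can propagate the few known intent labels to unlabeled snippets (marked -1 below); the snippets and labels are toy examples, not data from the engagements described here:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.semi_supervised import LabelSpreading

# Toy data: two labeled snippets plus two unlabeled ones.
snippets = ["i need to schedule my lease payment for next month",   # schedule payment
            "what is my remaining balance on the account",          # balance inquiry
            "can i push the payment until my paycheck arrives",     # unlabeled
            "how much do i still owe this month"]                   # unlabeled
labels = np.array([0, 1, -1, -1])  # -1 marks unlabeled samples

X = TfidfVectorizer().fit_transform(snippets).toarray()

# Propagate labels from the labeled samples to their unlabeled neighbors.
model = LabelSpreading(kernel='knn', n_neighbors=2)
model.fit(X, labels)
print(model.transduction_)  # inferred labels for every snippet
```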

In addition, each intent will have a different number of samples available, which creates another challenge. Class balancing and class weighting are some of the approaches available to tackle this issue. Further, a performance metric such as accuracy is selected based on the business objective; if there is class imbalance, accuracy is not a good metric, and metrics like micro- and macro-averaged precision and recall can be used instead.

Techniques such as TF-IDF, embeddings, and sentiment analysis can be applied using standard libraries and packages available in Python. A typical approach is to split the dataset into training and validation sets and apply 5-fold cross-validation on the training set, which helps minimize overfitting and helps the model generalize to unseen data. It is also typical in these use cases to keep a separate holdout dataset for scoring. Some examples of call snippets and the intents predicted by a machine learning model are shown in Figure 5.

Figure 5 — Sample call snippets and predicted intent
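Pulling the modeling choices above together, here is a scikit-learn sketch combining TF-IDF features, a class-weighted classifier to offset intent imbalance, and cross-validated macro F1; the snippets, intents, and model choice are illustrative only:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Labeled snippets and intents from the SME review step (toy examples here).
snippets = ["i need to schedule my lease payment for next month",
            "can i pay after my paycheck arrives",
            "please push my payment date back two weeks",
            "what is my remaining balance",
            "how much do i still owe on the account",
            "can you tell me my current balance",
            "i want to set up automatic payments",
            "enroll me in autopay please",
            "switch my account to automatic monthly payments"]
intents = (["schedule_payment"] * 3 + ["balance_inquiry"] * 3 + ["auto_payment"] * 3)

clf = Pipeline([
    ('tfidf', TfidfVectorizer(ngram_range=(1, 2))),
    # class_weight='balanced' compensates for intents with fewer samples
    ('model', LogisticRegression(class_weight='balanced', max_iter=1000)),
])

# Cross-validated macro F1 treats all intents equally under class imbalance.
# cv=3 only because this toy set is tiny; the text above suggests 5 folds.
scores = cross_val_score(clf, snippets, intents, cv=3, scoring='f1_macro')
print(scores.mean())
```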

Deploy and manage models

Once one or more machine learning models have been created and tested in a development environment in IBM Watson Studio, the models and other project assets (notebooks, scripts, datasets, etc.) can be tagged and version controlled using the native Git integration or pushed to one of the supported external repositories.

The next step is to create a project release in IBM Watson Machine Learning that contains the versioned assets. There, users can deploy assets into production, monitor their performance, and retrain models as performance changes over time.

Integrate ML models with applications

REST endpoints for deployed assets are available for integration with business applications. For example, even though the machine learning models in this example were trained to predict the intent of call snippets, the same models could be integrated with a chatbot application, adding intelligence to the conversation: the models can be invoked in real time to predict the intent of the customer interacting with the chatbot.
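As an illustration of such an integration, a business application could POST a snippet to the deployed model's REST endpoint; the URL, token, and payload shape below are placeholders, since the exact format depends on the Watson Machine Learning API version:

```python
import requests

# Placeholder endpoint and token, taken from the deployment details in Watson
# Machine Learning; the exact payload shape depends on the WML API version.
SCORING_URL = 'https://<wml-host>/v4/deployments/<deployment-id>/predictions'
TOKEN = '<bearer-token>'

payload = {'input_data': [{
    'fields': ['snippet'],
    'values': [['i was late on my lease payment because of a job transition']],
}]}

response = requests.post(SCORING_URL, json=payload,
                         headers={'Authorization': 'Bearer ' + TOKEN})
print(response.json())  # e.g. the predicted intent "schedule_payment"
```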

An advanced example of integration is a cognitive application that assists an agent by:

· transcribing audio in real time and enriching the transcription with the developed models, annotating the content with customer intent

· fetching context-specific help for the topics being discussed

· bringing in additional support from the team if distress is detected during the conversation

Conclusion

This blog post presented several tools and techniques that can be used to enhance customer interactions by predicting customer intent in a call center or chatbot setting, helping organizations improve their customers' experience and minimize operational costs.

This article maps the components and technologies needed for developing cognitive applications on unstructured data. Topics such as ML operationalization, ML model development, monitoring model fairness, and explaining model decisions will be covered in future blog posts.

If you have a similar use case — or any other AI use case — and would like help, please contact the IBM Data Science Elite Team.

This blog was co-authored by: Ricardo Balduino, Aleksandr Petrov, and Vinay Rao Dandin. The solution presented here was co-developed by the blog authors in collaboration with: John Thomas, Avijit Chatterjee, and Maxime Allard, all from IBM.
