Recommendations for Hierarchically-Classified Dispositions in Chat Service
To provide an exceptional online customer experience, we need to understand customers' behavior, needs, and the issues they face. One way to gather these insights is to analyze the characteristics of incoming chat support requests. In our chat service, these characteristics are defined as categories and subcategories, together called Dispositions, such that every chat request is categorized into one of them.
After the chat between the customer-support agent and the customer has ended, the agent must infer and select the disposition for that chat. To increase the efficiency of customer support and make the chat portal more interactive, the time spent on repetitive tasks such as determining the chat disposition can be substantially reduced. This can be done by analyzing chats in real time and automatically recommending dispositions to agents once the chat is complete.
Problem Statement
Infer and recommend the most suitable disposition for a conversation between a customer and a customer-support agent, from a set of hierarchically structured dispositions.
Existing Data Overview
The two-level hierarchy of chat dispositions is designed such that there are 34 Level-I dispositions, each with its own corresponding Level-II dispositions, which together comprise a total of 148 dispositions.
Because the dispositions are classified hierarchically, the idea is to first predict the Level-I disposition for a chat and then, based on it, predict the Level-II disposition. This article covers the first iteration: predicting Level-I dispositions.
Since 80–85% of conversations fall into just 7 of the 34 Level-I dispositions, the first iteration predicts only these top 7 dispositions.
Every conversation in the database has an ID, the messages within that conversation along with the message date and user type (agent or customer), and the corresponding dispositions.
The chat messages between the user and the customer-support agent are streams of strings, but they need to be converted into numerical features (vectors) before machine learning algorithms can be applied to them. Thus, vectorization is performed on all chat messages, with each chat treated as a 'document' by the algorithm.
Difference between TF-IDF Vectorizer and CountVectorizer
CountVectorizer counts the frequency of the words/tokens in a text.
With the TF-IDF Vectorizer, a word's value increases proportionally with its count in a document, but is scaled down by how frequently the word appears across the whole corpus; that is the inverse document frequency (IDF) part.
The inverse document frequency scales down the effect of words that appear frequently in general. For example, even though words like "we" and "the" appear often in documents, they don't tell us much about what makes a particular document unique.
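To make the difference concrete, here is a small sketch using scikit-learn's CountVectorizer and TfidfVectorizer on a toy corpus (the example sentences are illustrative, not drawn from the actual chat data):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy corpus: "the" appears in every document, "refund" in only one
corpus = [
    "the order arrived late",
    "the refund was processed",
    "the order was cancelled",
]

cv = CountVectorizer()
counts = cv.fit_transform(corpus)
print(dict(zip(cv.get_feature_names_out(), counts.toarray()[1])))

tv = TfidfVectorizer()
weights = tv.fit_transform(corpus)
print(dict(zip(tv.get_feature_names_out(), weights.toarray()[1])))
# In the TF-IDF output, "refund" carries a noticeably higher weight than "the":
# "the" occurs in every document, so its IDF scales it down.
```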
Programming language used → Python
Steps in creating the machine learning model
1. Examine the distribution of dispositions across the existing conversations and choose the top 7 of the 34 Level-I dispositions: the distribution has a long tail, so coverage remains at 80–85% while accuracy increases drastically.
2. Sample conversations based on their Level-I dispositions and store them in CSVs. The system first predicts the Level-I disposition, and Level-II dispositions are then predicted based on it.
3. Pre-process every conversation (which acts as a document) using the following steps:
a. Word tokenization → the method of segmenting a text into smaller units called tokens. It is used to build a vocabulary, the set of unique tokens in the corpus. The TF-IDF Vectorizer and CountVectorizer use the vocabulary as features: each word in the vocabulary is treated as a unique feature.
b. Punctuation removal → all punctuation is removed, as it provides no context for predicting what a document or chat is about.
c. Stopword removal → stopwords are the most common words in a natural language and add little meaning to a text. Example: in "There is a book on the table", the words "is", "a", "on", and "the" add no meaning while parsing. Removing them gives more weight to the words that carry a document's meaning.
d. Lemmatization → the method of grouping all inflected forms of a word into its root form with the same meaning, based on morphological analysis of the words. Example: "Troubled", "Troubling", and "Trouble" are all grouped as "Trouble".
Pre-processing pipeline Code Snippet →
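A minimal sketch of such a pipeline, assuming NLTK's tokenizer, stopword list, and WordNet lemmatizer (our choices for illustration, not necessarily the original implementation), could look like this:

```python
import string

import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize

# One-time downloads of the NLTK resources used below
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")

STOP_WORDS = set(stopwords.words("english"))
LEMMATIZER = WordNetLemmatizer()

def preprocess(document: str) -> str:
    """Tokenize, strip punctuation, drop stopwords, and lemmatize one chat document."""
    tokens = word_tokenize(document.lower())                      # a. word tokenization
    tokens = [t for t in tokens if t not in string.punctuation]   # b. punctuation removal
    tokens = [t for t in tokens if t not in STOP_WORDS]           # c. stopword removal
    return " ".join(LEMMATIZER.lemmatize(t) for t in tokens)      # d. lemmatization
```

For example, `preprocess("There is a book on the table")` returns `"book table"`, keeping only the words that carry the document's meaning.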
4. Convert the text messages into weights using TF-IDF Vectorizer →
a. A document for the vectorizer can be an individual sentence or even an entire aggregated chat. Here we treat one complete chat as a document, and the set of these documents is called the corpus.
b. Then inspect the top keywords by applying the vectorizer to the whole corpus.
c. First apply CountVectorizer per disposition to create a ranked list of terms, then apply the TF-IDF Vectorizer to pick the top-weighted words and build the vocabulary.
Vectorization Code Snippet →
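A sketch of this step with scikit-learn's TfidfVectorizer, assuming the `preprocess` function above and a hypothetical `raw_chats` list loaded from the sampled CSVs:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# raw_chats: one aggregated chat per element, loaded from the sampled CSVs (assumed)
corpus = [preprocess(chat) for chat in raw_chats]

vectorizer = TfidfVectorizer(max_features=5000)  # the vocabulary cap is illustrative
X = vectorizer.fit_transform(corpus)             # documents x vocabulary matrix of TF-IDF weights

# Inspect the top-weighted terms across the whole corpus
mean_weights = np.asarray(X.mean(axis=0)).ravel()
top = mean_weights.argsort()[::-1][:20]
print(vectorizer.get_feature_names_out()[top])
```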
5. Split the sampled and pre-processed data into training and test sets using the scikit-learn library.
6. Apply the K-Neighbors Classifier with n_neighbors = 50 → following is the report for different values of k in the K-Neighbors Classifier. Note that very small values of k lead to a sensitive model, since the prediction may depend on a single nearby point. As k increases, accuracy improves because the prediction draws on a well-defined class of points belonging to one disposition.
7. Dump the trained vectorizer and model into pickle files, which can later be loaded to predict dispositions for incoming chats. Steps 5–7 are sketched below.
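A minimal sketch of steps 5–7, assuming the TF-IDF matrix `X` from the vectorization step and a label array `y` of Level-I dispositions (the split ratio and random seed are illustrative):

```python
import pickle

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# X: TF-IDF matrix from the previous step; y: Level-I disposition labels (assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = KNeighborsClassifier(n_neighbors=50)  # k = 50 as chosen in step 6
model.fit(X_train, y_train)

# Step 7: dump the trained vectorizer and model for the prediction service
with open("tfidf_vectorizer.pkl", "wb") as f:
    pickle.dump(vectorizer, f)
with open("knn_model.pkl", "wb") as f:
    pickle.dump(model, f)
```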
How do we calculate the accuracy of this model?
- Accuracy Score
- F1-Score
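Both can be computed with scikit-learn, continuing from the sketch above (the `weighted` averaging for F1 is our assumption; the original does not specify it):

```python
from sklearn.metrics import accuracy_score, f1_score

y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
# 'weighted' averaging accounts for the class imbalance across the 7 dispositions
print("F1 score:", f1_score(y_test, y_pred, average="weighted"))
```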
Production deployment
The following flowchart shows the architecture of this disposition prediction system, which we call the Synapse DS Service.
- Set up the infrastructure (EC2 instance and load balancer).
- The trained vectorizer and model, dumped as pickle files, are uploaded to S3 and pre-loaded by the service when an input conversation has to be classified (see the loading sketch below).
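A sketch of how the service might load these artifacts at startup, assuming boto3 and hypothetical bucket and key names:

```python
import pickle

import boto3

# Bucket and key names are hypothetical; actual values depend on the deployment
s3 = boto3.client("s3")
s3.download_file("synapse-ds-artifacts", "models/tfidf_vectorizer.pkl", "/tmp/tfidf_vectorizer.pkl")
s3.download_file("synapse-ds-artifacts", "models/knn_model.pkl", "/tmp/knn_model.pkl")

with open("/tmp/tfidf_vectorizer.pkl", "rb") as f:
    vectorizer = pickle.load(f)
with open("/tmp/knn_model.pkl", "rb") as f:
    model = pickle.load(f)

def predict_disposition(chat_text: str) -> str:
    """Pre-process an incoming chat and return the predicted Level-I disposition."""
    features = vectorizer.transform([preprocess(chat_text)])
    return model.predict(features)[0]
```

Pre-loading the pickles once at startup keeps per-request latency low, since only `transform` and `predict` run for each incoming chat.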
Further improvements
Further iterations can improve the efficiency of disposition prediction: this is a categorical classification problem, and Level-II prediction will have to consider more variables and machine learning approaches. The following improvements can be made:
- A cron task can be created so that the model and vectorizer are re-trained at regular intervals, incorporating recent conversations between users and support agents.
- Different classification models such as Logistic Regression, XGBoost, etc. can be used in place of the KNN Classifier and their accuracy levels compared.
- A/B testing can be performed to compare models in production.
- For Level-II dispositions, separate prediction models can be developed, though they will increase the system's maintenance cost.
- Multiple relevant dispositions can be suggested, e.g. the top 2 or 3 predicted dispositions.