Response recommendation system for intelligent contact center. Part 1

AILabs
3 min readJan 20, 2018


“AI”, “machine learning”, and “bots” have been among the most fashionable words in technology for the last five years. Before, when people heard the word “bot”, they usually thought of a robot or an automated game player. Today, people think of built-in extensions for Facebook, Telegram, Skype, or Slack messengers that can perform a limited set of actions, like retrieving some results on a button click. This framing makes bots seem simple, or at least not very sophisticated. Maybe this is the reason why big companies like Interactions prefer to call their automated programs IVAs (intelligent virtual assistants) instead of just bots. Personally, I don’t see any difference in what to call them, bots or IVAs. What is more important is that their capabilities are limited only by our imagination. Here at AILabs, we believe that automated systems like bots will be able to fully replace humans in many important jobs, like customer support or analytics, within the next 5 to 10 years. Today’s bots are far from being truly intelligent, yet they can already process up to 80% of customer support queries. Therefore we are working hard to take the advantages of both human beings and bots and translate them into the best customer-business interaction experience. In other words, we are building AI-driven, cloud-based contact center software as a service.

To achieve this goal, there are a lot of challenges we are going to face in the long run:

- User intent understanding

- Reducing customer effort

- User-friendly interface for operators

- User identification and personalization

- Action recommendation system for operators: response recommendation, etc.

- High-quality speech recognition for a closed domain

- Real-time Insights visualization

Today’s topic is the response recommendation system for operators, or rather a part of it. The idea of the project is to decide whether two queries (a question or a set of user-operator dialog messages) expect the same answer or not. For example, both “How to open a bank account” and “I’d like to open a bank account” lead to the same answer. Using materials from one of the AILabs enterprise clients in the finance industry, we have run several experiments on this task.
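To make the task concrete, here is a hypothetical illustration of the training-pair format (the field names are assumptions, not the actual dataset schema): each example is a pair of queries plus a binary label saying whether they expect the same answer.

```python
# Hypothetical query-pair format for the duplicate-answer task.
# "duplicate" is 1 if the two queries expect the same answer, else 0.

pairs = [
    {"q1": "How to open a bank account",
     "q2": "I'd like to open a bank account",
     "duplicate": 1},
    {"q1": "How to open a bank account",
     "q2": "What is the current mortgage rate?",
     "duplicate": 0},
]

def label(pair):
    """Return the gold label for a query pair."""
    return pair["duplicate"]

print([label(p) for p in pairs])  # [1, 0]
```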

Applied datasets:

1) User questions and operator answers from the forum (UE). Around 250 examples annotated by our team members.

2) Quora duplicate questions dataset (Quora). Luckily, in the second half of last year Quora published its duplicate question classification problem on Kaggle. The training dataset consisted of 400,000 examples in English. From this dataset, we took 1,000 examples and translated them into Russian using the Google Translate API.

3) Knowledge database of question and answer pairs (QAdb). 350 different intents with 3–10 variations each. The existing bot uses this dataset to automatically interact with customers.

Why take several datasets?

- Quora question pairs are not closely related to finance, but they can still help, since the problem being solved is the same: deciding whether two questions are duplicates or not. We call this “joint learning”.
- Both UE and QAdb contain important knowledge about finance

Training and testing

- All 3 datasets were used for training
- Part of UE was held out for testing

Additional knowledge:

- Word2Vec. We trained a Word2Vec model on a 100 MB finance corpus, with window size = 5 and vector dimension = 100
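In practice the embeddings would be trained with a library such as gensim; the toy sketch below only illustrates what the window-size parameter controls, namely which (center word, context word) pairs a skip-gram Word2Vec model is trained on.

```python
# Toy illustration of skip-gram training pairs. A real run would use e.g.
# gensim's Word2Vec on the 100 MB finance corpus with window=5, size=100.

def skipgram_pairs(tokens, window=5):
    """Enumerate (center, context) pairs within the given window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "how to open a bank account".split()
print(skipgram_pairs(sentence, window=2)[:4])
# [('how', 'to'), ('how', 'open'), ('to', 'how'), ('to', 'open')]
```

With window = 5, every word in this six-word sentence pairs with all five others, so the window effectively spans the whole query.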

Machine learning model architecture:

A 4-layer neural network model:

1) Input layer with dimensions (MAX_SEQUENCE_SIZE, EMBEDDING_DIMENSION). Each word in a sequence is represented by its pre-trained Word2Vec vector.

2) Unidirectional LSTM with dropout 0.4 as a regularizer

3) Fully connected layer of size 64 with an L2 regularizer and a linear activation function

4) Binary fully connected output layer with softmax activation
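As a rough sketch of how data flows through these four layers, here is a minimal NumPy forward pass with placeholder (untrained) weights and assumed hidden sizes; dropout and L2 regularization only matter at training time and are omitted.

```python
import numpy as np

# Sketch of the forward pass: Word2Vec inputs -> LSTM -> dense(64) -> softmax.
# HIDDEN is an assumption; the blog does not state the LSTM size.
MAX_SEQUENCE_SIZE, EMBEDDING_DIMENSION, HIDDEN, DENSE = 10, 100, 32, 64
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_state(x, W, U, b):
    """Run a unidirectional LSTM over x and return the final hidden state."""
    h = np.zeros(HIDDEN)
    c = np.zeros(HIDDEN)
    for t in range(x.shape[0]):
        z = W @ x[t] + U @ h + b            # all four gate pre-activations
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Placeholder weights; a real model learns these during training.
W = rng.standard_normal((4 * HIDDEN, EMBEDDING_DIMENSION)) * 0.1
U = rng.standard_normal((4 * HIDDEN, HIDDEN)) * 0.1
b = np.zeros(4 * HIDDEN)
W_dense = rng.standard_normal((DENSE, HIDDEN)) * 0.1
W_out = rng.standard_normal((2, DENSE)) * 0.1

x = rng.standard_normal((MAX_SEQUENCE_SIZE, EMBEDDING_DIMENSION))  # Word2Vec vectors
h = lstm_last_state(x, W, U, b)
dense = W_dense @ h                          # linear activation
probs = softmax(W_out @ dense)               # binary softmax output
print(probs.shape)  # (2,)
```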

Results:

Experiment 1) Training on the UE dataset alone gave us an accuracy of 50%

Experiment 2) UE + Quora — 74%

Experiment 3) UE + Quora + QAdb — 80%

The results show that by combining datasets from different areas we can increase accuracy compared to training on each dataset separately.

Thoughts for future work:

In this work, we treated all question pairs as independent entities. But they are in fact related, and we might gain from this by running clustering and other techniques to extract features from their relatedness, for example latent semantic analysis, TF-IDF, latent Dirichlet allocation, graph-based methods, and more
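As a flavor of such relatedness features, here is a toy TF-IDF plus cosine-similarity sketch in plain Python; a real pipeline would use something like scikit-learn's TfidfVectorizer, and the example sentences are invented for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Smoothed TF-IDF vectors (as dicts) for a list of tokenized docs."""
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d))
    return [{w: c * (math.log((1 + n) / (1 + df[w])) + 1.0)
             for w, c in Counter(d).items()} for d in docs]

def cosine(a, b):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(v * b[w] for w, v in a.items() if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = ["how to open a bank account".split(),
        "i would like to open a bank account".split(),
        "what is the mortgage rate".split()]
v = tfidf(docs)
# The two account-opening queries are closer than the unrelated pair.
print(cosine(v[0], v[1]) > cosine(v[0], v[2]))  # True
```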
