Intent Detection using Sequence Models

Deepak Pandita

Published in Holler Developers, Jun 16, 2021

Holler AI Lab at Holler Technologies

Google Nest Mini — A smart speaker from Google which can be controlled by voice — Photo by Ben Kolde on Unsplash

Introduction

Text Classification in Natural Language Processing (NLP) is the task of assigning a class from a set of predefined classes to a given piece of text (document/paragraph/sentence). Popular examples of text classification include spam classification (classifying an email into spam or non-spam), sentiment classification (classifying polarity of text into positive, negative, or neutral), intent detection (identify the intention of the user like play music, book restaurant, book flight, etc.), and news categorization (classifying news articles into categories such as business, sports, politics, science, tech, etc.). If the number of classes is two, the task is called binary text classification and if the number of classes is more than two, then the task is called multi-class text classification.

In this article, we focus on the problem of intent detection. We first introduce the task of intent detection and the dataset used. Next, we introduce the motivation behind using sequence models and show how they can be used to solve the task of intent detection. We develop our solution in Python using the pandas, TensorFlow Keras, and scikit-learn libraries. In the interest of time, we won’t delve into technical explanations of the underlying machine learning concepts; instead, we will learn how to use them, with plenty of code.

Intent Detection

Intent detection aims to recognize the intention behind a user query, i.e., the action that the user wants to perform. For example, in “Play music on YouTube Music”, the intent is to “play music”. The user query is a single sentence and can originate from a written or spoken utterance. Intent detection is a crucial task in Natural Language Understanding and is usually modeled as a classification problem: given an input sentence, the objective is to predict an intent for the sentence from a set of predefined intents.

Dataset

To solve this problem, we need a dataset containing utterances that are labeled with intents. For this article, we use the Snips dataset, which is widely used for benchmarking intent detection. It consists of 14,484 utterances across 7 intent types. The dataset is stored in a JSON file, and we start by loading it into a pandas DataFrame and plotting the class distribution.
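The loading step can be sketched as follows. This is a minimal illustration, not the article's original code: the records, the field names "text" and "intent", and the intent labels are assumptions standing in for the real JSON file, whose layout may differ.

```python
import io
import json

import pandas as pd

# Hypothetical records standing in for the dataset file; the real file's
# field names and intent labels may differ.
records = [
    {"text": "play some jazz", "intent": "play_music"},
    {"text": "book a table for two", "intent": "book_restaurant"},
    {"text": "put on the new album", "intent": "play_music"},
    {"text": "will it rain tomorrow", "intent": "get_weather"},
]
json_file = io.StringIO(json.dumps(records))  # stands in for an opened JSON file

# Load the records into a DataFrame and count utterances per intent.
df = pd.read_json(json_file)
class_counts = df["intent"].value_counts()
print(class_counts)

# class_counts.plot(kind="bar") would draw the class distribution
# (requires matplotlib).
```

On the real dataset, roughly equal counts per intent would indicate a balanced class distribution.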

Data Preparation

As shown in Figure 1, the class distribution is balanced. If you’re working on a dataset with severe class imbalance then it might be helpful to balance the dataset. There are various techniques to tackle the class imbalance problem which are out of scope for this article. Now, we split our dataset into train and test sets to train a machine learning model and evaluate its performance. We split the dataset into 80% train and 20% test using the scikit-learn library. To make sure that we can replicate the results in the future, we also set the RANDOM_STATE variable.

(11587,) (2897,) (11587,) (2897,)
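The shapes above follow directly from an 80/20 split of 14,484 utterances. Here is a minimal sketch using placeholder arrays of the same size as the dataset; the value of RANDOM_STATE is an assumption, since the article does not state it, and the placeholder texts and labels are not the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

RANDOM_STATE = 42     # assumed value; any fixed integer makes the split repeatable
N_UTTERANCES = 14484  # dataset size reported in the article

# Placeholder data with the same size and number of classes as the dataset.
texts = np.array([f"utterance {i}" for i in range(N_UTTERANCES)])
labels = np.arange(N_UTTERANCES) % 7

# 80% train, 20% test.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=RANDOM_STATE
)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
# (11587,) (2897,) (11587,) (2897,)
```

scikit-learn rounds the test size up (20% of 14,484 is 2,896.8), which is why the test set has 2,897 examples and the training set has the remaining 11,587.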

Sequence Models

Sequence models deal with supervised learning tasks where either the model input or the model output is a sequence. Sequences can be text, audio, video, temporal data, or any other sequential data. Examples of sequence modeling include sentiment classification, music generation, and machine translation. Recent advances in deep learning, particularly in sequence models, have revolutionized NLP, and sequence models are now a dominant paradigm for language understanding tasks. Sequence models like recurrent neural networks (RNNs) and transformers have consistently achieved state-of-the-art performance on a number of benchmark NLP tasks. In this article, we use long short-term memory (LSTM), a type of RNN architecture, to solve the task of intent detection.

First, we need to prepare our input text for use in training. We tokenize the text and convert it into a sequence of integers using the Tokenizer from TensorFlow Keras. Then, we pad the sequences to the same length, since Keras expects fixed-length inputs.

(11587, 35) (2897, 35)
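To make the idea concrete, here is a plain-Python sketch of what tokenizing and padding do. It is an illustration of the technique, not the Keras Tokenizer itself: the helper names fit_word_index and texts_to_padded are hypothetical, and the three example sentences are made up. Like the Keras defaults, it indexes words by frequency starting at 1, reserves 0 for padding, drops unseen words, and pads on the left.

```python
from collections import Counter

import numpy as np


def fit_word_index(texts):
    """Map each word to an integer, most frequent first; 0 is kept for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}


def texts_to_padded(texts, word_index, max_len):
    """Convert texts to integer sequences, left-padded with zeros to max_len."""
    padded = np.zeros((len(texts), max_len), dtype="int32")
    for row, text in enumerate(texts):
        # Words not seen when fitting the index are skipped.
        seq = [word_index[w] for w in text.lower().split() if w in word_index]
        seq = seq[-max_len:]  # if too long, keep the last max_len tokens
        if seq:
            padded[row, -len(seq):] = seq
    return padded


train_texts = ["play music", "play the new album", "book a table"]
word_index = fit_word_index(train_texts)
X = texts_to_padded(train_texts, word_index, max_len=5)
print(X.shape)  # (3, 5)
print(X[0])     # [0 0 0 1 2]
```

In the article's setting the padded length is 35, which gives the (11587, 35) and (2897, 35) shapes shown above.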

Next, we prepare one-hot vectors for labels by using the LabelEncoder and to_categorical function.

(11587, 7) (2897, 7)
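A minimal sketch of the label encoding step, on a made-up list of intents (the label names are assumptions). To keep the example free of a TensorFlow dependency, indexing into an identity matrix stands in for to_categorical here; for integer class labels it produces the same kind of one-hot matrix.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy labels for illustration only.
intents = ["play_music", "get_weather", "play_music", "book_restaurant"]

# Map each intent string to an integer id (classes are sorted alphabetically).
encoder = LabelEncoder()
y_int = encoder.fit_transform(intents)

# One row per example, one column per class, with a single 1 in each row.
y_onehot = np.eye(len(encoder.classes_), dtype="float32")[y_int]

print(encoder.classes_)
print(y_int)           # [2 1 2 0]
print(y_onehot.shape)  # (4, 3)
```

With 7 intents, each label becomes a vector of length 7, which matches the (11587, 7) and (2897, 7) shapes above. The fitted encoder is worth keeping, since its inverse_transform method maps predicted class ids back to intent names.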

Training a Sequence Model

Let’s define and train our model now.

LSTM

We define a TensorFlow Keras Sequential model and add layers to it. The first layer is an Embedding layer that represents each word as a vector of fixed length 16. The next layer is an LSTM layer with 16 units and a ReLU activation, followed by a Dense layer with 7 units (one per intent) and a softmax activation for classification. We compile the model with the Adam optimizer and categorical cross-entropy loss, and track precision, recall, and accuracy as metrics. Finally, we fit the model on the training set with a batch size of 32 for 7 epochs. We hold out 10% of the training data for validation, on which the metrics are evaluated at the end of each epoch; this fraction is specified using VAL_SPLIT.
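The model described above can be sketched as follows. The layer sizes, optimizer, loss, batch size, epochs, and validation split come from the text; VOCAB_SIZE is an assumption, and the training arrays are random placeholders with the right shapes, used only to show the calls. This is not the article's original code.

```python
import numpy as np
from tensorflow.keras.layers import LSTM, Dense, Embedding
from tensorflow.keras.metrics import Precision, Recall
from tensorflow.keras.models import Sequential

VOCAB_SIZE = 1000  # assumed; in practice, the tokenizer's vocabulary size + 1
MAX_LEN = 35       # padded sequence length reported in the article
NUM_CLASSES = 7
VAL_SPLIT = 0.1
BATCH_SIZE = 32
EPOCHS = 7

model = Sequential([
    Embedding(input_dim=VOCAB_SIZE, output_dim=16),  # 16-dimensional word vectors
    LSTM(16, activation="relu"),
    Dense(NUM_CLASSES, activation="softmax"),        # one probability per intent
])
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=[Precision(name="precision"), Recall(name="recall"), "accuracy"],
)

# Random placeholder data, NOT the Snips dataset.
rng = np.random.default_rng(0)
X_train = rng.integers(1, VOCAB_SIZE, size=(64, MAX_LEN))
y_train = np.eye(NUM_CLASSES, dtype="float32")[rng.integers(0, NUM_CLASSES, size=64)]

history = model.fit(
    X_train, y_train,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    validation_split=VAL_SPLIT,  # last 10% of the training data is held out
    verbose=0,
)
probs = model.predict(X_train[:2], verbose=0)
print(probs.shape)  # (2, 7)
```

The history object returned by fit records the per-epoch loss and metrics for both the training and validation portions, which is what the learning curves are drawn from.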

As the training progresses, we can notice the precision, recall, and accuracy on the training set increasing. The same effect can be noticed in the accuracy on the validation data.

Plot Learning Curves

Once we fit the model, we look at the learning curves by plotting the loss function for the training data and the validation data.

In Figure 2, the loss on both the training and validation data decreases steadily and levels off, with only a small gap between the two curves. This suggests that the model is neither overfitting nor underfitting and is a good fit for the data.
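A learning-curve plot of this kind can be produced with a small helper like the one below. The function name plot_loss is hypothetical; it assumes a Keras-style history dictionary with "loss" and "val_loss" entries, one value per epoch.

```python
import matplotlib.pyplot as plt


def plot_loss(history_dict):
    """Plot training and validation loss per epoch from a Keras-style history dict."""
    epochs = range(1, len(history_dict["loss"]) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, history_dict["loss"], label="training loss")
    ax.plot(epochs, history_dict["val_loss"], label="validation loss")
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    ax.legend()
    return fig
```

With a history object returned by model.fit, the call would be plot_loss(history.history). A validation curve that starts rising while the training curve keeps falling would be a sign of overfitting.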

Evaluation

Performance on Test Data

Now, we evaluate the performance of our model on the test dataset.

91/91 [==============================] - 0s 3ms/step - loss: 0.1192 - precision: 0.9884 - recall: 0.9724 - accuracy: 0.9821

The model achieves a near state-of-the-art accuracy of 98.21% on the test set.

Classification Metrics

We also compute the performance metrics for each class to get a better estimate of model performance.

As shown in Figure 3, the model is performing well for all classes of intent.
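One common way to get per-class metrics is scikit-learn's classification_report; whether the original notebook used it is an assumption. The sketch below uses toy labels and toy predicted probabilities for three made-up classes, not the article's results.

```python
import numpy as np
from sklearn.metrics import classification_report

class_names = ["book_restaurant", "get_weather", "play_music"]  # toy subset

# Toy one-hot labels and predicted probabilities, for illustration only.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]])
y_prob = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.6, 0.3, 0.1],  # a mistake: the true class is play_music
])

# Convert one-hot rows and probability rows to class ids before scoring.
report = classification_report(
    y_true.argmax(axis=1),
    y_prob.argmax(axis=1),
    target_names=class_names,
    output_dict=True,
)
for name in class_names:
    print(name, report[name]["precision"], report[name]["recall"])
```

In this toy example the single mistake lowers recall for play_music to 0.5 and precision for book_restaurant to 0.5, something an overall accuracy of 0.75 would not reveal on its own.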

Inference using a Sequence Model

Finally, we can use our trained model to perform inference.

play_music

The model correctly predicts the intent “play_music” for the user query “Play music on YouTube Music”.
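The inference step chains the earlier pieces together: encode the query the same way as the training data, run the model, and map the most probable class back to its name. A minimal sketch, where predict_intent and its encode_fn argument are hypothetical names, and encode_fn is assumed to tokenize and pad a list of strings exactly as was done for training:

```python
import numpy as np


def predict_intent(query, encode_fn, model, class_names):
    """Return the most probable intent label for a single query string."""
    x = encode_fn([query])                # expected shape: (1, max_len)
    probs = np.asarray(model.predict(x))  # expected shape: (1, n_classes)
    return class_names[int(probs.argmax(axis=1)[0])]
```

Here class_names would be the classes_ attribute of the fitted LabelEncoder, so that column i of the model output corresponds to class_names[i].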

Summary

In this article, we learned to use sequence models for intent detection. We introduced the task of text classification with the example of intent detection. We also introduced sequence models and the motivation behind using them for sequence classification tasks. We demonstrated how sequence models can be used to solve the task of intent detection by implementing an LSTM model using TensorFlow Keras in Python.

A Jupyter Notebook containing the code can be found here.

References

Coucke, A., Saade, A., Ball, A., Bluche, T., Caulier, A., Leroy, D., Doumouro, C., Gisselbrecht, T., Caltagirone, F., Lavril, T. and Primet, M., 2018. Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces. arXiv preprint arXiv:1805.10190.
https://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/

Thanks for reading! If you have questions or if you would like to see us write on anything, please drop a comment or reach out on my website.
