NLP for Anyone Who Builds Products (Part 2 of 2)

Masoud
Data Science and Machine Learning at Pluralsight
9 min read · Aug 20, 2020

A big shout-out to Levi Thatcher for reviewing this post and providing feedback.

TL;DR: Through a non-technical explanation of NLP techniques, this post aims to enable leaders, product managers, and anyone who builds products to discover NLP opportunities in their business problems and be conversational about NLP.

In part 1 of this series I addressed the first two categories in my breakdown of NLP use cases. In this part, I will cover the remaining three.

3. User Intention and Requests

This section is related to part 1 in many ways — NLP is mostly about extracting knowledge from unstructured text 😃, so no matter how I categorize the topics there will be overlaps between them. The reason I separated this topic from the previous section is that the techniques I will talk about here revolve around “user intention” and “user requests” and the ways we can address those requests.

I already mentioned Sentiment Analysis in part 1, which is one of the use cases of user intent detection. Another NLP classification technique is Stance Detection, which extracts a user’s reaction to a claim made by a primary actor. Let’s say a user’s response to the tweet “Climate change is a real problem” is: “We need to work hard to address this issue”. A stance detection model can tell us that the stance of the response is “in favor” of the post and not “against” it. User reviews are amenable to this type of analysis, and we can use stance detection as a complement to sentiment analysis.
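To make this concrete, here is a minimal sketch that approximates stance detection with Hugging Face’s zero-shot classification pipeline. A production system would use a model fine-tuned on stance-labeled claim/response pairs; the pairing trick and the label set below are just illustrative.

```python
from transformers import pipeline

# A zero-shot classifier as a stand-in for a dedicated stance detection model.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

claim = "Climate change is a real problem"
response = "We need to work hard to address this issue"

# Frame the stance decision as classification over the claim/response pair.
result = classifier(
    f"Claim: {claim} Response: {response}",
    candidate_labels=["in favor", "against", "neutral"],
)
print(result["labels"][0])  # highest-scoring stance, e.g. "in favor"
```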

Intent Detection models try to capture user intent from a given text. While analyzing user reviews, say you want to identify the ones asking questions about the quality of products. An intent detection model can label each review with different predefined intents, and you can further analyze the reviews labeled “question.product_quality”. Intent detection is usually applied to text that has a “request” format and is considered the first step in building Dialogue Systems (chatbots). You might not want to build a whole chatbot experience in your organization from scratch, but it is good to know that chatbots categorize user utterances into classes using stance and intent detection and then provide appropriate responses based on the context of that class.
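As a sketch of the idea (not a production recipe), you could train a simple text classifier on reviews labeled with predefined intents. The tiny training set and the intent labels below are made up for illustration; real systems need many examples per intent.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled examples; a real system needs many examples per intent.
reviews = [
    "Is this jacket waterproof?",
    "Does the battery last a full day?",
    "My package arrived two weeks late",
    "Shipping took forever and tracking never updated",
]
intents = [
    "question.product_quality",
    "question.product_quality",
    "complaint.shipping",
    "complaint.shipping",
]

# Bag-of-words features plus a linear classifier: a classic baseline.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(reviews, intents)

print(model.predict(["Does this battery hold its charge?"]))
# expected: ['question.product_quality']
```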

Semantic Parsing is one of the main components of dialogue-based systems, but since I promised not to delve into the details of model building, let’s only touch the highlights and see where we can use it. A semantic parsing model translates unstructured information (text) into a machine-understandable representation on which a machine can act. This representation can be something like a SQL query that is run over structured data (a database). So, the goal is to retrieve data from a DB based on a user request in textual form. Here is an example. Let’s assume your organization owns a database with a lot of job data and your query text is “What jobs at Google require nltk skill?”. A semantic parsing model can convert the query to SQL like this: “SELECT * FROM jobs WHERE company = ‘google’ AND ‘nltk’ IN skills”. This query can be executed over a relational database and return the results. Based on your business problems, you can create different semantic parsing models to answer different queries such as “show me flights from Boston to Chicago” or “What Apple products do you sell?”.
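Real semantic parsers are learned sequence-to-sequence models trained on (question, query) pairs, but this toy pattern-based sketch shows the input/output contract: natural language in, executable SQL out. The table and column names are assumed, and the output mirrors the query format from the example above.

```python
import re

# Toy pattern-to-SQL mapping; real semantic parsers learn this translation
# from data instead of hand-written patterns.
JOBS_PATTERN = re.compile(r"What jobs at (\w+) require (\w+) skill\?", re.IGNORECASE)

def parse_to_sql(question):
    match = JOBS_PATTERN.match(question)
    if match:
        company, skill = match.group(1).lower(), match.group(2).lower()
        # Note: interpolating strings into SQL like this is unsafe in production.
        return f"SELECT * FROM jobs WHERE company = '{company}' AND '{skill}' IN skills"
    return None

print(parse_to_sql("What jobs at Google require nltk skill?"))
```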

The final technique in this section answers user requests based on knowledge buried in text. Question Answering (QA) models receive a question and a context that contains the information needed to produce the desired answer. The difference between a QA model and semantic parsing is that QA extracts information from free text given the query, whereas semantic parsing retrieves data from structured databases. As you see in the image below, the answer to the question is found in the given passage.

AllenNLP QA model
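Here is a minimal sketch using Hugging Face’s question-answering pipeline (a different library from the AllenNLP demo pictured above). The passage is illustrative; the model simply locates the answer span inside whatever context you supply.

```python
from transformers import pipeline

# Extractive QA: the model locates the answer span inside the given context.
qa = pipeline("question-answering")

context = (
    "Pluralsight is a technology workforce development company "
    "headquartered in Farmington, Utah."
)
result = qa(question="Where is Pluralsight headquartered?", context=context)
print(result["answer"])  # e.g. "Farmington, Utah"
```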

4. Text Generation and Translation

Text generation refers to the task of generating text based on given prompts. Auto-Complete in a search box is probably the most common use case of this technique. Completing the query can be done by a Language Model (LM), which tries to predict the probability of the next words after your query. These probabilities are learned during a training process in which the model is exposed to a large number of documents, seeing which words follow which. In the past, LMs were trained with statistical algorithms, but thanks to recent advances in deep learning, new LMs have much better accuracy than statistical models. LMs are an integral part of modern natural language processing and play a crucial role in building machine translation, speech recognition, question answering, and sentiment analysis models. So, if you are thinking about building Spelling Correction — which I am going to discuss in the next section — or auto-complete for your search engine, you are going to hear a lot about LMs.
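As a sketch, here is how you might peek at a language model’s next-word probabilities with GPT-2. An auto-complete feature would rank its suggestions using exactly these kinds of scores; the prompt is made up for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "how to learn machine"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# The logits at the last position score every possible next token.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, 5)
for p, idx in zip(top.values, top.indices):
    print(f"{tok.decode([int(idx)])!r}: {float(p):.3f}")
```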

Text generation is not just about predicting the next word in your sentence. These models can create a full article from a simple prompt. Language models like GPT-3 can write a whole piece of fiction from a single sentence! And if you fine-tune LMs for downstream tasks, they can do many other things, like Machine Translation or Summarization. Translation is one of the best-known NLP techniques, and everybody can get a feel for it by asking Google to translate a piece of text. However, general translators like Google Translate may have difficulty with domain-specific text, which has its own nuances and terminology. Translators can be adapted and fine-tuned on training data from your specific domain. So, if you are interested in building a context-sensitive translator, start with language models.
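For a quick feel, here is a sketch using a small general-purpose model through Hugging Face’s translation pipeline. Fine-tuning on parallel text from your own domain is what adapts a model like this to your terminology; the example sentence is illustrative.

```python
from transformers import pipeline

# A small general-purpose English-to-German model; domain adaptation
# would fine-tune it on parallel text from your own field.
translator = pipeline("translation_en_to_de", model="t5-small")

print(translator("Machine learning is changing how we build products."))
# [{'translation_text': '...'}]
```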

As I mentioned above, you can also fine-tune LMs to create a Summarization model. A good use case for this model is when your customers don’t have time to read a long text and you want to create a TL;DR version of it. Summarization can be extractive (identifying the important sections of the text and joining a subset of the sentences) or abstractive (generating a new, shorter text after examining the whole original with advanced NLP techniques). No matter which method you use, the goal is to get a more concise version of the original text. Automatic summarization can be applied to tasks such as generating a short description for a product, summarizing news feeds, or letting you skim a summary of your daily emails.
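A minimal sketch with Hugging Face’s summarization pipeline, which downloads a default pretrained abstractive model; the passage and length limits are illustrative.

```python
from transformers import pipeline

# Abstractive summarization with the pipeline's default pretrained model.
summarizer = pipeline("summarization")

article = (
    "Natural language processing lets machines read and act on text. "
    "Companies use it to route support tickets, analyze reviews, power "
    "chatbots, and translate documents. Recent deep learning models have "
    "dramatically improved accuracy across all of these tasks."
)
print(summarizer(article, max_length=30, min_length=5, do_sample=False))
```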

Another use case for text generation is Data-to-Text Generation, that is, converting structured data into natural language. Knowledge stored in databases or knowledge graphs is not easily consumable by humans and sometimes needs to be transformed into a human-readable format. One great use case for data-to-text models is the automatic generation of analysis reports for business intelligence and analytics platforms. Charts and dashboards are amazing ways to monitor metrics and interact with analysis results; however, they can also be overwhelming and hard to understand. As shown in the image below, an automatically generated data-to-text report can increase the readability of a dashboard. These models can also be used in news generation, such as converting stock data into human-readable text or creating short updates about sports results.

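In its simplest form, data-to-text is template filling. Here is a toy sketch that turns one row of metrics into a sentence; learned models go further and decide which facts to mention and how to phrase them. All names and numbers below are made up.

```python
# Toy template-based data-to-text; neural models additionally learn
# which facts to mention and how to phrase them.
def describe_metrics(row):
    direction = "up" if row["change_pct"] >= 0 else "down"
    return (
        f"{row['metric']} was {row['value']:,} in {row['period']}, "
        f"{direction} {abs(row['change_pct'])}% from the previous period."
    )

row = {"metric": "Monthly active users", "value": 128450,
       "period": "July", "change_pct": 4.2}
print(describe_metrics(row))
# Monthly active users was 128,450 in July, up 4.2% from the previous period.
```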

5. Text Improvement

Text improvement means modifying text so that it fits the needs of particular use cases. First, let’s talk about Grammatical Error Correction, which is an obvious prerequisite for better text understanding. Google does it on every single email we compose, helping us fix not only our grammatical errors but also our misspellings. In the past, rule-based models were used to identify grammatical or spelling errors in text; these days, language models outperform them. Many use cases beyond sending emails can benefit from these techniques. For example, when users enter queries in the search box, a Spelling Correction feature can help improve the relevancy of search results. Grammatical error correction can also improve the quality of information extraction tasks: your regex might fail to identify certain phrases because it is sensitive to the grammatical structure of the text, and fixing those errors in advance can solve the issue.
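To show the classic approach, here is a compact Norvig-style spelling corrector sketch. The tiny word-frequency table is illustrative (a real system builds it from a large corpus), and modern systems replace the frequency lookup with a language model.

```python
from collections import Counter

# Tiny vocabulary with counts; a real system builds this from a large corpus.
VOCAB = Counter({"quality": 120, "product": 300, "question": 90,
                 "the": 1000, "search": 150})

def edits1(word):
    """All strings one edit (delete, swap, replace, insert) away from `word`."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    swaps = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + swaps + replaces + inserts)

def correct(word):
    """Return the most frequent in-vocabulary candidate, else the word itself."""
    if word in VOCAB:
        return word
    candidates = [w for w in edits1(word) if w in VOCAB]
    return max(candidates, key=VOCAB.get) if candidates else word

print(correct("prodcut"))  # -> "product"
```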

Another technique for improving text is identifying Missing Information that is implicit in the text but important for sentence understanding. Humans fill in these missing elements effortlessly, but they pose a challenge for computers. Consider the sentences “I’m 35” and “It’s worth about 2 million”. It is clear to humans that the former refers to the person’s age, even though it is not stated explicitly, and that the latter is about the monetary value of something. But that missing information might not be clear to a machine. There are NLP models that can help you not only identify these missing elements but also fill them in with the proper words. In our examples, after the model fills in the missing information, the sentences will look like this: “I’m 35 years old” and “It’s worth about 2 million dollars”. As I mentioned earlier, text improvement techniques are usually used as preparation for other tasks such as information extraction, and if you are analyzing text with age-sensitive or currency-sensitive phrases, these techniques will improve the results.

Examples for text improvement
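There is no single standard library call for this, so here is a toy rule-based sketch covering just the two examples above; the patterns are made up for illustration, and learned models handle far more constructions and use context to decide what is missing.

```python
import re

# Toy pattern-based filler; real models learn these completions from data.
RULES = [
    (re.compile(r"\b(I'm|I am) (\d{1,2})\b(?! years)"),
     r"\1 \2 years old"),
    (re.compile(r"\bworth about (\d[\d,.]*) (thousand|million|billion)\b(?! dollars)"),
     r"worth about \1 \2 dollars"),
]

def fill_missing(text):
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

print(fill_missing("I'm 35"))                      # -> "I'm 35 years old"
print(fill_missing("It's worth about 2 million"))  # -> "It's worth about 2 million dollars"
```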

Text Normalization is the task of translating text from a noncanonical domain (e.g., short text messages) to a more canonical one (standard English). A common example is transforming “2moro”, “pix”, and “b4” into “tomorrow”, “pictures”, and “before”. This technique can be applied to noisy text such as comments, text messages, and product reviews, where abbreviations and misspellings are prevalent. Why would you care about this? Again, because improving text helps you do downstream tasks better. For example, if you want to do sentiment analysis on customer reviews, you could apply this technique first to improve the results; here is a paper showing that text normalization improves the accuracy of Twitter sentiment analysis by 4%.
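A minimal sketch of dictionary-based normalization using the examples above; real systems combine lookup tables like this with character-level models to handle variants they have never seen.

```python
# Minimal lookup-based normalizer; production systems combine dictionaries
# with character-level models to handle unseen variants.
NORMALIZATION_MAP = {"2moro": "tomorrow", "pix": "pictures", "b4": "before", "u": "you"}

def normalize(text):
    return " ".join(NORMALIZATION_MAP.get(token.lower(), token)
                    for token in text.split())

print(normalize("send me the pix b4 2moro"))
# -> "send me the pictures before tomorrow"
```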

And finally, another way to improve text is to simplify it. Text Simplification is different from summarization. Simplification tries to modify the content and structure of a text to make it easier to read and understand, while summarization focuses on length and tries to convey the message in a shorter text. A simplified version of a text can benefit people who are not experts in the domain or readers with lower literacy, such as children or people with reading disabilities. In the image above you can compare the simplified version of a text with the original.
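Full simplification models rewrite whole sentences, but this toy lexical substitution sketch conveys the flavor of the simplest form; the synonym table is made up for illustration.

```python
# Toy lexical simplification: swap complex words for simpler synonyms.
# Real simplification models also restructure whole sentences.
SIMPLER = {
    "utilize": "use",
    "commence": "begin",
    "approximately": "about",
    "terminate": "end",
}

def simplify(text):
    return " ".join(SIMPLER.get(word.lower(), word) for word in text.split())

print(simplify("We will commence testing at approximately noon"))
# -> "We will begin testing at about noon"
```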

Conclusion

In this two-post series (see part 1) I summarized some of the popular use cases of NLP and introduced them in a non-technical fashion. My hope is that these posts help you approach your business problems differently. If you enjoyed the post and want to learn more, here is an amazing repository that Sebastian Ruder has created to track the progress of NLP research.


I’m a Principal Data Scientist at Pluralsight, where we’re democratizing tech skills.