Comparing the Leading Natural Language Processing Technologies

Published in Version 1, Jan 6, 2021

In 2019, Version 1 Innovation Labs developed a Proof of Value (PoV) in the area of Document Analytics for a major public sector customer. Developed using open source software, this solution used Artificial Intelligence (AI) and Natural Language Processing (NLP) technology to gain insights into unstructured documents by analysing their content, without human intervention. An interactive demo of this solution is available on the Version 1 website.

However, as major cloud partners for both AWS and Azure, we are fully aware of the relevant NLP solutions provided by each. To that end, we created a report that details the “State of the Art” in NLP for Azure and AWS and provides a comparison between them and the solution developed by Version 1 — “Version 1 Smart Text”. Each solution was tested using the same test dataset, with the results of each detailed and compared from different perspectives: feature availability, accuracy, response times, cost, limitations and lastly data retention and usage policies.

The analysis is detailed in a report (available at this link) recently published by Version 1 Innovation Labs. The main findings are summarised in this post.

NLP definition and main terminology

NLP

Put simply, NLP is a technology with “the ability to turn text or audio speech into encoded, structured information, based on an appropriate ontology” (Gartner, 2020).

Optical Character Recognition (OCR)

To perform text analytics/NLP, OCR is quite often required to convert images of text or handwriting into text. This is an important precursor to language understanding. An important consideration for the quality of OCR is the resolution of the image in Dots Per Inch (DPI). The gold standard for this value is 300 DPI, which is considered the default for text documents (Cochran, 2014). Another consideration is that OCR works on images. If a PDF or Word document is provided, it may be necessary to first convert it to an image and then perform OCR. This can increase the processing time.
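
To make the DPI figure concrete, the short sketch below converts a page size in inches into the pixel dimensions of the image that would be handed to an OCR engine. The function name and the approximate A4 dimensions (8.27 x 11.69 inches) are our own illustrative assumptions, not part of any of the solutions tested.

```python
def page_pixels(width_in: float, height_in: float, dpi: int = 300) -> tuple[int, int]:
    """Return the (width, height) in pixels of a page rendered at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# An A4 page is roughly 8.27 x 11.69 inches (assumed here for illustration).
a4_at_300 = page_pixels(8.27, 11.69)           # 2481 x 3507 pixels
a4_at_600 = page_pixels(8.27, 11.69, dpi=600)  # 4962 x 7014 pixels
print(a4_at_300, a4_at_600)
```

Doubling the DPI quadruples the number of pixels per page, which is one reason why converting documents to images before OCR can add to the processing time.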

Sentiment Analysis

This is the ability of a computer program to judge the tone of the text to understand whether the general sentiment is Positive, Negative, Neutral, etc. This can be useful in many circumstances, such as call centre analysis and performance improvement, chatbots, and call/query prioritisation.
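
As a rough illustration of the idea, here is a minimal lexicon-based sketch that counts positive and negative words. The word lists and the function name are made up for this example. The services compared in this post rely on trained models rather than on a simple count like this one.

```python
import re

# Tiny illustrative lexicons (assumptions); real systems use trained models
# or far larger word lists.
POSITIVE = {"good", "great", "helpful", "excellent", "happy"}
NEGATIVE = {"bad", "slow", "rude", "terrible", "unhappy"}


def sentiment(text: str) -> str:
    """Label text Positive, Negative or Neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


print(sentiment("The agent was helpful and the service was great"))
print(sentiment("The response was slow and the agent was rude"))
```

A counter like this cannot handle negation ("not helpful") or sarcasm, which is where model-based approaches earn their keep.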

Entity Recognition

Entity Recognition is the ability of a computer program to automatically extract entities such as names, company names, countries, etc. This can also be extended to recognise your organisation-specific entities.
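
One simple way to picture organisation-specific entities is a lookup list (a "gazetteer") of known names. The sketch below is only that: the names, labels and function are hypothetical, and it is not how any of the compared services recognise entities, which is typically done with statistical models.

```python
import re

# Hypothetical gazetteer mapping known names to entity types.
GAZETTEER = {
    "Acme Ltd": "ORGANISATION",
    "Ireland": "COUNTRY",
    "Jane Murphy": "PERSON",
}


def find_entities(text: str) -> list[tuple[str, str]]:
    """Return (entity, type) pairs in order of appearance in the text."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((match.start(), name, label))
    # Sort by position so the result follows reading order.
    return [(name, label) for _, name, label in sorted(found)]


print(find_entities("Jane Murphy joined Acme Ltd in Ireland."))
```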

Regular Expression Extraction

This is an extension of Entity Recognition which allows you to extract entities which follow a custom pattern, for example tax numbers, emails and invoice numbers.
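
A minimal sketch of pattern-based extraction using Python's standard `re` module follows. The patterns are illustrative assumptions: the `INV-` invoice format is invented for this example, and the email pattern is deliberately simplified.

```python
import re

# Illustrative patterns only; real tax and invoice number formats vary.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "invoice_number": re.compile(r"\bINV-\d{6}\b"),
}


def extract(text: str) -> dict[str, list[str]]:
    """Return every match found in the text for each named pattern."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


sample = "Please send invoice INV-004217 to accounts@example.com by Friday."
print(extract(sample))
# {'email': ['accounts@example.com'], 'invoice_number': ['INV-004217']}
```

Adding a new entity type is then a matter of adding one more named pattern to the dictionary.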

Semantic Search

Semantic Search provides the ability to determine whether keywords, or synonyms of those keywords, are mentioned in a document/message. It is useful when you want to identify documents/messages which mention a concept.
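
The keyword-plus-synonyms idea can be sketched with a hand-written synonym table, as below. The table and function name are hypothetical. A production system would more likely draw synonyms from a thesaurus or from word embeddings.

```python
import re

# Hypothetical synonym table keyed by concept.
SYNONYMS = {
    "payment": {"payment", "remittance", "settlement"},
    "complaint": {"complaint", "grievance", "objection"},
}


def mentions(concept: str, text: str) -> bool:
    """True if the text contains the concept keyword or one of its synonyms."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Fall back to the bare keyword when no synonyms are listed for it.
    return bool(SYNONYMS.get(concept, {concept}) & words)


print(mentions("payment", "The remittance arrived on Monday."))    # True
print(mentions("complaint", "The remittance arrived on Monday."))  # False
```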

Key Phrase Extraction

Key phrase extraction is the ability to automatically extract the key phrases and terms mentioned in the given document/message.
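
As a simple illustration, the sketch below treats every run of consecutive non-stopwords as a candidate phrase and ranks candidates by how often they occur. The stopword list and function name are assumptions for this example, and the approach is far cruder than the models used by the solutions compared here.

```python
import re
from collections import Counter

# Small illustrative stopword list (assumption).
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "on", "the", "to", "with"}


def key_phrases(text: str, top_n: int = 3) -> list[str]:
    """Return the most frequent runs of consecutive non-stopwords."""
    counts = Counter()
    # Punctuation ends a phrase, so split the text into fragments first.
    for fragment in re.split(r"[.,;:!?\n]", text.lower()):
        run = []
        # The trailing None flushes the last run of each fragment.
        for word in re.findall(r"[a-z]+", fragment) + [None]:
            if word is None or word in STOPWORDS:
                if run:
                    counts[" ".join(run)] += 1
                run = []
            else:
                run.append(word)
    return [phrase for phrase, _ in counts.most_common(top_n)]


text = (
    "Document analytics is the analysis of unstructured documents. "
    "Natural language processing is used for document analytics."
)
print(key_phrases(text, top_n=1))  # ['document analytics']
```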

Topic Modelling

Topic Modelling is an extension of Key Phrase Extraction which identifies the key phrases which most accurately describe a topic. This allows you to classify your corpus of documents/messages into a set of topics and understand which key phrases most accurately represent each topic. This information can then be used to figure out the topics mentioned in new documents, thus allowing you to more accurately classify and understand these documents.

Document Classification

Where Topic Modelling uses an unsupervised approach to automatically determine the topics or classifications, Document Classification uses a supervised Machine Learning approach, whereby the model learns what a classification (or topic) looks like by being fed documents already labelled with that classification.
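
To show what "supervised" means in practice, here is a small from-scratch sketch of a Naive Bayes text classifier trained on labelled examples. Naive Bayes is our choice for illustration only; the post does not state which algorithm any of the compared solutions use, and the training data below is made up.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_docs: list[tuple[str, str]]):
    """Count words per label from (text, label) training pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in labelled_docs:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    return word_counts, doc_counts


def classify(text: str, model) -> str:
    """Return the label with the highest Naive Bayes log-probability."""
    word_counts, doc_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())

    def score(label: str) -> float:
        total = sum(word_counts[label].values())
        log_prob = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            log_prob += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        return log_prob

    return max(doc_counts, key=score)


# Made-up labelled training data for illustration.
training = [
    ("invoice payment due amount total", "finance"),
    ("payment received invoice number", "finance"),
    ("contract clause liability agreement", "legal"),
    ("agreement signed parties clause", "legal"),
]
model = train(training)
print(classify("please pay the invoice amount", model))
print(classify("liability clause in the agreement", model))
```

The labels come from the training data, not from the algorithm. That is the key difference from Topic Modelling, which has to discover the groupings itself.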

Language Detection

This is the ability to automatically determine the language or languages used in a text.
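
A toy way to picture this is to count how many common words ("stopwords") of each language appear in the text, as sketched below. The word lists and function name are illustrative assumptions. Real detectors typically rely on character n-gram models and cover many more languages.

```python
import re

# Small illustrative stopword lists (assumptions).
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in"},
    "french": {"le", "la", "et", "est", "de", "les"},
    "italian": {"il", "che", "e", "di", "per", "una"},
}


def detect_language(text: str) -> str:
    """Return the language whose stopwords appear most often in the text."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return max(STOPWORDS, key=lambda lang: sum(w in STOPWORDS[lang] for w in words))


print(detect_language("The report is available in the library"))
print(detect_language("Le rapport est disponible et la version finale est prête"))
```

Note that when no stopword matches at all, this sketch simply returns the first language in the table, so very short texts are unreliable.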

Syntax Analysis

This functionality categorises each word in the text according to its syntactical role: pronoun, verb, adjective, and so on.

The comparison

Version 1 Smart Text, Azure and AWS OCR and NLP technologies were tested with the same dataset, and where possible, the same configuration.

Full details of the comparison are presented in the report. Below is a quick summary of our findings.

1. Service Availability Comparison

The feature availability is summarised in the following table.

Winner: No clear winner. It depends on which capability you require.

2. OCR comparison

Winner: Azure Read API. By a long way the quickest, it also appears to perform best for many scenarios. However, if your requirement is for OCR of printed text, then all three solutions will perform competently.

3. Supervised Document Classification comparison

Winner: Version 1 Smart Text, once processing time is taken into consideration. Azure Cognitive Services does not offer any Document Classification functionality. AWS and Version 1 Smart Text performed similarly well, but the response time of Version 1 Smart Text is near real-time.

4. Entity Recognition comparison

Winner: Version 1 Smart Text on accuracy. Each service recognises a different set of entities, and Azure also provides sub-categories, so which service suits best depends on the requirements of the use case.

5. Sentiment Analysis comparison

Winner: AWS, which provided higher accuracy.

6. Limits Comparison

Winner: Version 1 Smart Text has no character, page or transaction limitations.

7. Cost comparison

A detailed cost analysis is presented in the report. Below, a table comparing the cost of OCRing and running the services previously introduced on the same set of documents.

Winner: No clear winner. All three solutions are relatively low cost and we do not see cost as being the primary driver for choice. This might change if you wanted to process particularly high volumes, in which case the Version 1 Smart Text solution would make sense.

8. Data Retention and Usage Policy Comparison

Winner: Version 1 Smart Text, as your data is not retained or used for other purposes such as training AI models.

Conclusions

This blog presents a very quick summary of the tests the Innovation Labs carried out to compare different OCR and NLP technologies. I highly recommend that anyone interested in the subject read the full report, available at this link.

The key takeaways are summarised below:

Advancements in Deep Learning have enabled NLP capabilities to come a long way in the last couple of years, enabling clear use cases for organisations to adopt.
AI models can perform better than humans in understanding and answering questions on documents.
This creates a massive opportunity for organisations to gain deep and actionable insights into documents and text, removing the need for staff to analyse.
There is no clear winner, no one-size-fits-all. Which solution you require depends on your requirements.
All solutions provide a set of overlapping capabilities; however, each also provides additional unique services.
The solution built by the Version 1 Innovation Labs, which uses open source frameworks, models and technology, is comparable in performance and features to the cloud providers.
An interactive demo of the Version 1 Smart Text solution is available on the Version 1 website.

Written by Filippo S.