Natural Language Processing: benchmarking providers

Published in Botfuel, Nov 29, 2017

At Botfuel, we have been working with large companies since early 2016 to design and build state-of-the-art, enterprise-grade chatbots. Our technology is now ready to go public:

our Natural Language Processing (NLP) web services, our QnA Builder and our Analytics platform are already available on our developer portal
our open-source bot-building SDK, Botfuel Dialog, is available on our GitHub page

As we are rolling out our NLP web services, we wanted to test and compare what we have built with what is already on the market, and share the results.

Our approach to NLP

Chatbots automate conversations, allowing users to use natural language to express needs, e.g. to buy a product or to search for information. Therefore, if we want chatbots to meet expectations, Natural Language Processing technology is fundamental. At Botfuel, we worked very early on two services providing spell checking and entity extraction.

Moreover, we believe that, for NLP to achieve the best possible results, it has to be tailored to the context in which it is used. This is why our spell checker allows easy customization of its dictionary to include business-specific vocabulary.
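As a rough illustration of why a customizable dictionary matters, here is a minimal sketch of a dictionary-based corrector. It is not Botfuel's implementation: the vocabulary, the banking example and the `correct` function are hypothetical, and the fuzzy matching simply relies on `difflib` from Python's standard library.

```python
import difflib
from typing import Set

# Hypothetical general-purpose vocabulary (a real one would be far larger).
BASE_VOCABULARY = {"i", "want", "to", "check", "my", "account", "balance"}

# Hypothetical business-specific terms added by a banking chatbot.
CUSTOM_VOCABULARY = {"iban"}


def correct(sentence: str, vocabulary: Set[str], cutoff: float = 0.7) -> str:
    """Replace each unknown word with its closest vocabulary entry, if any."""
    known = sorted(vocabulary)  # sorted for deterministic results
    corrected = []
    for word in sentence.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        matches = difflib.get_close_matches(word, known, n=1, cutoff=cutoff)
        # Leave the word untouched when nothing is close enough.
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


print(correct("check my ibna", BASE_VOCABULARY))
print(correct("check my ibna", BASE_VOCABULARY | CUSTOM_VOCABULARY))
```

With the base vocabulary alone, the unknown word is left as typed. Once the business term is added to the dictionary, the same input is corrected to "check my iban".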

The entity extractor supports the main European languages (English, French, German, Spanish, Italian) and is able to extract up to 31 different entities.

Test methodology

We ran 3 tests in July 2017.

Entity extraction
We compared our Entity Extractor with:

Microsoft LUIS
Snips
Recast.ai
Api.ai
Wit.ai
Amazon Lex
IBM Watson

To test them, we relied on 20 sentences (in French and English). These sentences assess the ability of the services to extract information about:

Time and time intervals
Addresses
City names
Forenames
Volumes/areas and distances
Large numbers
Percentages
Money count

Most providers expose an API that lets us test entity extraction directly, but Api.ai, Snips and Lex do not. For these three, we had to build an intent specific to each sentence in order to run the extraction. For Recast.ai, Wit.ai, IBM Watson and LUIS, we had to create a bot and add all the built-in entities to it.

Spell checking
We compared our spell checker with Bing’s and X-Spell’s. To do so, we relied on a few sentences in both English and French and used an error distance of 2. This means that each word processed by the spell checkers in the test may contain up to two mistakes.
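To make the notion of error distance concrete, here is a minimal sketch. It assumes the distance is an edit distance in which inserting, deleting or substituting a letter, or swapping two adjacent letters, each counts as one mistake (the optimal string alignment variant). The article does not state which variant was used in the benchmark, and the function name `osa_distance` is illustrative.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Insertions, deletions, substitutions and swaps of two adjacent
    characters each cost 1.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters, e.g. "amtch" -> "match".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("amtch", "match"))  # prints 1
```

Under this assumption, a misspelling such as "amtch" is one mistake away from "match". A plain Levenshtein distance, which has no swap operation, would count the same typo as two mistakes, so the choice of variant affects which words fall within a distance of 2.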

Combination of spell checking and entity extraction
We wanted to test the ability of chatbots to extract entities despite misspelled words. To do so, we had to choose an entity that every service can recognize. Ten sentences were tested in both languages (English and French), each containing a misspelled “time” entity. We removed IBM from the benchmark because it did not recognize any “time” entity in our previous tests. We tested the seven other solutions on English sentences.
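The idea behind this third test can be sketched as a tiny harness: feed sentences containing a misspelled time word to an extractor and count how many still yield a “time” entity. Everything below is hypothetical and only illustrates the principle: the two sentences are invented (they are not the benchmark sentences), the time vocabulary is a toy one, and both extractors are stand-ins for real services.

```python
import difflib
from typing import Callable, List, Set

# Toy time vocabulary (assumption, for illustration only).
TIME_WORDS = ["today", "tomorrow", "tonight", "yesterday"]


def naive_extractor(sentence: str) -> Set[str]:
    """Return {"time"} only if a time word is spelled exactly right."""
    tokens = sentence.lower().split()
    return {"time"} if any(t in TIME_WORDS for t in tokens) else set()


def tolerant_extractor(sentence: str) -> Set[str]:
    """Spell-correct each token against TIME_WORDS, then extract."""
    fixed = []
    for token in sentence.lower().split():
        close = difflib.get_close_matches(token, TIME_WORDS, n=1, cutoff=0.8)
        fixed.append(close[0] if close else token)
    return naive_extractor(" ".join(fixed))


def time_hits(extractor: Callable[[str], Set[str]], sentences: List[str]) -> int:
    """Count the sentences in which the extractor finds a "time" entity."""
    return sum(1 for s in sentences if "time" in extractor(s))


# Invented test sentences, each with one misspelled time word.
SENTENCES = ["book a table for tomorow", "what is the weather tonigth"]

print(time_hits(naive_extractor, SENTENCES))
print(time_hits(tolerant_extractor, SENTENCES))
```

In this toy setup the exact-match extractor misses both misspelled entities, while the version that corrects spelling first recovers them. That gap is what the combined test is meant to measure.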

Results
You can find the full results here.

Entity Extraction

Below is a table summarizing the results of our benchmark related to entity extraction.

Let’s now look at some of the sentences:

English
With the following English sentence: “I need to rent a car for two weeks near Berlin”, we expect to detect three entities:

Item-count
Time Interval
Location

The results are summarized in the following table. In this particular example, no provider gets a perfect score.

French
With the following French sentence: “Aujourd’hui le cours du dollar a chuté de 15%, pour atteindre 1 dollars et 27 cents pour 1 euro”, we expect to detect three entities:

Time
Percentage
Money

The results are summarized in the following table. In this particular example too, no provider gets a perfect score.
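The article does not spell out how a per-sentence score is computed. One plausible scheme, sketched here as an assumption, is the fraction of expected entity types that a provider detects. The `sentence_score` function and the detected set are hypothetical; the detected set is made up for illustration and is not a benchmark result.

```python
from typing import Set


def sentence_score(expected: Set[str], detected: Set[str]) -> float:
    """Fraction of the expected entity types that were detected."""
    if not expected:
        raise ValueError("expected must contain at least one entity type")
    return len(expected & detected) / len(expected)


# Expected entity types for the English example sentence.
EXPECTED_EN = {"item-count", "time interval", "location"}

# Made-up output of an imaginary provider (NOT a benchmark result).
detected = {"time interval", "location", "number"}

print(f"{sentence_score(EXPECTED_EN, detected):.2f}")  # prints 0.67
```

Note that this scheme only rewards expected entities that are found. It does not penalize extra detections such as the "number" entity above, which a stricter metric might count as an error.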

Spell checking

Below is a table summarizing the results of our benchmark related to spell checking.

Let’s now look at some more detailed results.

English
In English, Microsoft Bing and X-Spell always perform better than our own spell checker. This is not the case in French.

French
For example, the French sentence “Je veux parier sur un amtch amical Paris-Lyno” is correctly spell checked by Botfuel and X-Spell (“Je veux parier sur un match amical Paris-lyon”) but not by Bing, which introduces a new mistake by changing “parier” (to bet) into “parler” (to speak): “Je veux parler sur un match amical Paris-lyon”.

Spell checking and Entity Extraction

Below is a table summarizing the results of our benchmark combining spell checking and entity extraction.

Let’s now look at some of the sentences:

Conclusion

This benchmark was conducted to analyze how Botfuel’s technology compares with the other main providers of NLP services on the market.

Of course, these tests and the methodology used have some limitations. The sample dataset we used is small and specific to some use cases that we believe are frequent in chatbot building. A larger test with random sentences might have led to different results.

As NLP evolves quickly, we will keep testing our technology against others’ and will share the results with you.