Talking To Your Business Data? Babelfish Speaks Your Language.

Veer
babelfishing
3 min read · Sep 25, 2018


The Importance of Business-Centric NER

How do you get more people in your organization to work with data? Is it possible for non-technical personnel to query databases, generate reports and even assign tasks to machines? If people are able to communicate with data in simple English, think of the time and money saved by not having to learn complicated operating tools.

The answer lies in implementing Natural Language Processing (NLP): a system that converts natural language into a machine query and responds in natural language. However, for NLP to work effectively, it is important to consider how its components fit into the business ecosystem. Before meaning can be extracted from a sentence and converted to a database query, the machine must be trained to identify the named entities that are pertinent to the business (Named Entity Recognition, or NER), understand the semantic roles that matter for business data (semantic role labeling), and extract dependencies.

While several popular providers such as Google, Amazon, IBM Watson, Microsoft and AllenAI offer NER as a service, this post highlights the importance of custom NER over generic solutions, because achieving optimum results over business data sources requires a specialized skill set.

Let’s first take a look at how the usual suspects offer named entity detection for a given sentence.

The following URLs allow you to test a sentence for popular named entities like person, location, organization, etc.

https://cloud.google.com/natural-language/

http://demo.allennlp.org/named-entity-recognition

In our recent implementation, we found that the available off-the-shelf tools do not work well on business data, because named entities in a business context are not generic in nature.

For example, consider the following queries:

Example 1
Show me products purchased by Unique Services

Example 2
Show me products sold by Kevin Sanders

In the first example, ‘Unique Services’ is the name of a partner company, which goes undetected when ‘unique’ and ‘services’ are tokenised separately.

In the second example, although the APIs detect the name as a person, they do not detect whether that person is a customer or a sales employee. Without this detection, there is a chance that the sentence will be interpreted incorrectly.
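
To make the two examples concrete, here is a minimal sketch of a dictionary-based tagger that matches the longest known phrase first, so a multi-word name is not split into separate tokens. The gazetteer contents, the label names and the `tag_entities` function are all assumptions made for illustration; this is not how any of the services above, or Babelfish, actually work.

```python
# Hypothetical gazetteer: lowercased phrase -> (entity type, business role).
GAZETTEER = {
    "unique services": ("ORGANIZATION", "partner"),
    "kevin sanders": ("PERSON", "sales_employee"),
}


def tag_entities(sentence, gazetteer=GAZETTEER):
    """Return (phrase, type, role) tuples for the longest gazetteer matches."""
    tokens = sentence.split()
    max_len = max(len(key.split()) for key in gazetteer)
    found = []
    i = 0
    while i < len(tokens):
        match = None
        # Try the longest window first so multi-word names win over single words.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            label = gazetteer.get(phrase.lower())
            if label:
                match = (phrase, label[0], label[1])
                i += n  # skip past the matched phrase
                break
        if match:
            found.append(match)
        else:
            i += 1
    return found


print(tag_entities("Show me products purchased by Unique Services"))
print(tag_entities("Show me products sold by Kevin Sanders"))
```

The sketch splits on whitespace only, so punctuation attached to a name would need extra handling. The point is simply that the tagger returns a business role alongside the generic entity type.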

Off-the-shelf NER services work better on text databases, which in business scenarios can include customer chats, call center voice-to-text logs and similar sources, where you would like to detect the person and possibly the sentiment in a sentence.

However, there is so much more to business data. The named entity list needs to distinguish between customers, employees and partners, as well as maintain lists of product names, locations, email addresses, URLs and others. Each of these entities also needs to be detected by its sub-classification. For example, the NER needs to detect whether the name in the sentence is a customer name and, if so, further detect whether that customer is a prospect or a buying customer.

This is where Babelfish comes into its own. Babelfish’s proprietary NLP application automatically creates an NER list from all the connected data sources. It does this by extracting keyword metadata and organizing it semantically in a hierarchy. The process is customized to each business, as it requires detecting entities that are unique to every organization. Such a degree of custom classification might not be available in tools offering generic NER solutions.
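
As a rough illustration of the idea of deriving a hierarchical entity list from connected data, the sketch below walks a few in-memory tables and records a class and sub-class for every name it finds. Babelfish's actual method is proprietary and not described here; the table names, columns, sample rows and the `build_entity_index` function are all hypothetical.

```python
# Hypothetical connected data sources: table name -> rows.
SOURCES = {
    "customers": [
        {"name": "Acme Corp", "status": "prospect"},
        {"name": "Globex", "status": "buying"},
    ],
    "partners": [{"name": "Unique Services"}],
    "employees": [{"name": "Kevin Sanders", "department": "sales"}],
}

# Hypothetical schema hints:
# table -> (top-level class, name column, sub-class column or None).
SCHEMA = {
    "customers": ("customer", "name", "status"),
    "partners": ("partner", "name", None),
    "employees": ("employee", "name", "department"),
}


def build_entity_index(sources, schema):
    """Map each lowercased entity name to its (class, sub-class) path."""
    index = {}
    for table, rows in sources.items():
        top_class, name_col, sub_col = schema[table]
        for row in rows:
            path = [top_class]
            # Add the sub-classification when the table provides one.
            if sub_col and row.get(sub_col):
                path.append(row[sub_col])
            index[row[name_col].lower()] = tuple(path)
    return index


index = build_entity_index(SOURCES, SCHEMA)
print(index["kevin sanders"])    # ('employee', 'sales')
print(index["unique services"])  # ('partner',)
```

A real system would read these rows from live databases and would need to handle duplicate names across tables, which this sketch resolves by letting the last table win.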

Without a proper NER, the subsequent steps of assigning roles and parsing relationships become even more painful. In other words, if a keyword is not identified, its role may not be found, and the meaning of the sentence can change.

An NLP solution that uses a business-centric NER system is key to unlocking the power of business data and putting it in the hands of anyone who speaks simple English.
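
The effect of a missing role on the final query can be sketched as follows. Here the entity's business role, combined with the verb, decides which column the query filters on, and an unknown role stops the query from being built at all. The `orders` table, its column names and the `build_query` function are assumptions for illustration only.

```python
# Hypothetical mapping from (verb, entity role) to the column to filter on.
FILTER_COLUMN = {
    ("sold", "sales_employee"): "sales_rep_name",
    ("purchased", "customer"): "customer_name",
    ("purchased", "partner"): "partner_name",
}


def build_query(verb, entity_name, role):
    """Return a parameterized SQL string and its parameters."""
    column = FILTER_COLUMN.get((verb, role))
    if column is None:
        # Without a recognized role there is no safe way to pick a column.
        raise ValueError(f"No known role for {role!r} with verb {verb!r}")
    # The column comes from the fixed mapping above; the entity name is
    # passed as a parameter rather than interpolated into the SQL text.
    sql = f"SELECT product_name FROM orders WHERE {column} = ?"
    return sql, (entity_name,)


print(build_query("sold", "Kevin Sanders", "sales_employee"))
print(build_query("purchased", "Unique Services", "partner"))
```

If ‘Kevin Sanders’ were tagged only as a generic person, the lookup would fail, which is the situation described above: no role, and therefore no reliable meaning.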
