How Babelfish lets you use NLP-based search in any data environment

By Veer, published in babelfishing on Feb 9, 2018 (3 min read)

Babelfish Data Search combines data relationships and NLP to offer easy-to-use, natural-language report generation, so users no longer have to depend on a data analyst for every report request.

Babelfish helps businesses drastically reduce the time taken to create custom reports, even when the data is distributed across silos. Anyone within the organization can ask a question in plain English and get answers or metrics in seconds, limited to what their user permissions allow. The user can then pin these reports to a dashboard and share them with other business users.

Implementation
Babelfish Data Search can be simply and quickly integrated into the existing data environment in 3 easy steps:

Step 1 — Data Tagging
If the data sits in silos, i.e. is not unified, we connect all the data sources and tag the relevant data entities using the tag manager.

If data is unified, we can jump to the next step.

Step 2 — Data Preparation
Based on the relationships between the tagged data entities, we generate a structure and map it to the NLP keyword list created by the data prep team.

Step 3 — Access Control
Permissions are set for each user, so the answers a user gets depend on their access to key information.
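As a rough illustration of Step 3, the check below allows a question only if the user has access to every data entity it touches. This is a minimal sketch, not Babelfish's actual permission model: the `User` class, the `can_answer` function and the entity names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user record holding the data entities the user may read."""
    name: str
    allowed_entities: set = field(default_factory=set)


def can_answer(user: User, entities_needed: set) -> bool:
    """Return True only if the user may read every entity the question touches."""
    return entities_needed <= user.allowed_entities


# Illustrative users and entities (assumptions, not from Babelfish).
analyst = User("asha", {"sales", "customers"})
intern = User("ben", {"customers"})

print(can_answer(analyst, {"sales", "customers"}))
print(can_answer(intern, {"sales"}))
```

Checking the whole set of entities, rather than each one separately, means a report that joins a permitted entity with a restricted one is refused as a whole.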

Once Babelfish is enabled, the business can allow any user to ask questions, find answers and generate reports from the connected data. This saves a great deal of time by eliminating the need for specialists to create custom reports across multiple silos. As data sources grow, all that is needed is to add the new data entities to the relationship structure and update the keyword list.

Babelfish deployment is much faster than traditional BI tech, as it only requires mapping existing column names to the hierarchical definitions in the Babelfish data model.
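To make the mapping idea concrete, here is a minimal sketch in which existing column names are looked up against dotted paths that stand in for hierarchical definitions. The source names, column names, paths and helper functions are hypothetical; the real Babelfish data model is not described in this post.

```python
from typing import Optional

# Hypothetical mapping: (source, column) -> hierarchical definition.
COLUMN_MAP = {
    ("crm", "cust_nm"): "customer.name",
    ("billing", "inv_amt"): "invoice.amount",
    ("billing", "inv_dt"): "invoice.date",
}


def resolve(source: str, column: str) -> Optional[str]:
    """Return the hierarchical definition for a column, or None if unmapped."""
    return COLUMN_MAP.get((source, column))


def unmapped(columns):
    """List the (source, column) pairs that still need a mapping."""
    return [c for c in columns if c not in COLUMN_MAP]


print(resolve("billing", "inv_amt"))
print(unmapped([("crm", "cust_nm"), ("crm", "phone")]))
```

In a setup like this, onboarding a new data source amounts to adding entries to the mapping, which is consistent with the short deployment time described above.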

Babelfish’s proprietary Machine Translation Algorithm changes the way we approach data discovery. The algorithm runs over a modular semantic model, which can be tweaked based on incoming datasets. Based on the keywords in a question, the algorithm translates the natural language text into a SQL query.
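The general idea of keyword-driven translation can be shown with a toy example. This is not the proprietary Babelfish algorithm, only a sketch of the technique under simple assumptions: a single hypothetical `sales` table, a hand-written keyword list, and questions that contain one metric and at most one grouping dimension.

```python
import re

# Hypothetical keyword list: words in a question -> SQL fragments.
METRICS = {"revenue": "SUM(amount)", "orders": "COUNT(*)"}
DIMENSIONS = {"region": "region", "product": "product"}
TABLE = "sales"  # assumed single table, for illustration only


def translate(question: str) -> str:
    """Translate a plain-English question into a SQL string via keyword lookup."""
    words = re.findall(r"[a-z]+", question.lower())

    # Pick the first word that matches a known metric keyword.
    metric = next((METRICS[w] for w in words if w in METRICS), None)
    if metric is None:
        raise ValueError("no known metric keyword in question")

    # A dimension keyword, if present, becomes the GROUP BY column.
    dimension = next((DIMENSIONS[w] for w in words if w in DIMENSIONS), None)
    if dimension is None:
        return f"SELECT {metric} FROM {TABLE}"
    return f"SELECT {dimension}, {metric} FROM {TABLE} GROUP BY {dimension}"


print(translate("Show revenue by region"))
print(translate("How many orders?"))
```

Because the SQL fragments come only from the fixed keyword list and never from the raw question text, the generated query cannot contain arbitrary user input. Supporting a new dataset in this sketch means extending the keyword dictionaries, which mirrors the keyword list updates described earlier.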

This algorithm can virtually eliminate manual report customization, which is currently very expensive. Businesses using software applications collect data in silos and end up spending more on integrating data and customizing reports. Babelfish’s algorithm can deliver any kind of report instantly and save up to 80–90% of BI expenses.

Enterprises can implement Babelfish over their current data warehouses or data lakes, where data is voluminous and unstructured, and save millions of dollars spent on data discovery and visualization. Organizing the data and setting the rules takes 3–4 weeks and is a one-time effort. Whenever new data parameters are collected, the data admin manually tags them in the data model.

The Babelfish algorithm can be applied to large healthcare data, unstructured social data, telecom data, or any ad hoc data that lacks structure and needs to be queried using natural language.
