Data Modeling For Question Answering

Tim Tutt
May 18, 2018 · 4 min read

Liquid Gold

The idea of turning data into actionable intelligence has been around for quite some time. When properly leveraged, data can be used to effectively print money. Businesses know this, so they make tremendous investments in acquiring data and hiring scores of data scientists and data engineers, along with the various technology stacks to support them.

Religious wars are fought in the tech community over which technologies are best for dealing with data. One dev screams “Spark Forever!” while another lives and breathes by good old relational databases. Business leaders couldn't care less, as long as they can ask the right questions to derive value from their data.

Time and time again we see the latest technology stack fail to live up to the hype. And so for the business leader, the quest for the holy grail of data continues and they move to the next technology stack.

In most of these failure cases, however, the issue is not the tech stack at all. The issue is that the data was not properly modeled so that the right questions could be asked.

Sure, the performance was phenomenal and all of the data was in the system, but the questions that needed to be asked could not be asked because of choices made during the data modeling phase.

Start With The Questions

To succeed and reach the nirvana that business leaders covet, all data modeling efforts must start with the questions. What questions will business analysts and business leaders want to ask of the data to make decisions that drive the business forward? For example:

  • “Which region had the worst sales performance last quarter?”
  • “What marketing channels result in the highest number of conversions?”
  • “How many devices are on my network that do not meet compliance requirements?”
  • “What segment of customers spends the most in the winter months?”

When you start with the questions you can determine whether or not you even have all of the data you need. Are there reference datasets you need to acquire? Do you need to blend two different datasets? How do you join them together? Every question that may be asked influences how the data needs to be modeled.
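To make the blending point concrete, here is a minimal sketch built around the first question above. It assumes the sales records carry only a store ID, so a reference dataset mapping stores to regions has to be acquired and joined in before the question can be answered. All field names and values are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sales records: note they carry a store_id but no region.
sales = [
    {"store_id": "S1", "quarter": "2018Q1", "amount": 1200.0},
    {"store_id": "S2", "quarter": "2018Q1", "amount": 450.0},
    {"store_id": "S3", "quarter": "2018Q1", "amount": 300.0},
    {"store_id": "S1", "quarter": "2017Q4", "amount": 900.0},
]

# Hypothetical reference dataset mapping each store to a region.
store_regions = {"S1": "East", "S2": "West", "S3": "West"}


def sales_by_region(sales, store_regions, quarter):
    """Join sales to the region lookup and total the amounts for one quarter."""
    totals = defaultdict(float)
    for row in sales:
        if row["quarter"] != quarter:
            continue
        # Stores missing from the reference data are surfaced, not dropped.
        region = store_regions.get(row["store_id"], "UNKNOWN")
        totals[region] += row["amount"]
    return dict(totals)


def worst_region(sales, store_regions, quarter):
    """Return the (region, total) pair with the lowest total for the quarter."""
    totals = sales_by_region(sales, store_regions, quarter)
    return min(totals.items(), key=lambda item: item[1])


totals_q1 = sales_by_region(sales, store_regions, "2018Q1")
worst_q1 = worst_region(sales, store_regions, "2018Q1")
```

Without the store-to-region lookup, the question simply cannot be asked of this data, no matter how fast the underlying system is.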

Data should never be ingested into a system until you know what the questions are and how to model that data to answer those questions.
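One way to enforce this rule is a simple pre-ingestion check: list the fields each question needs and compare them against what the planned model provides. The sketch below assumes a hypothetical mapping of questions to required fields; the field names are illustrative and not drawn from any real system.

```python
# Hypothetical mapping from each business question to the fields it needs.
question_requirements = {
    "Which region had the worst sales performance last quarter?":
        {"region", "quarter", "amount"},
    "What marketing channels result in the highest number of conversions?":
        {"channel", "converted"},
}

# Hypothetical fields available in the planned model.
available_fields = {"store_id", "region", "quarter", "amount", "channel"}


def missing_fields(requirements, available):
    """Return, per question, the required fields the model does not provide."""
    gaps = {}
    for question, needed in requirements.items():
        missing = needed - available
        if missing:
            gaps[question] = missing
    return gaps


gaps = missing_fields(question_requirements, available_fields)
# Only ingest once every question can be answered by the model.
ready_to_ingest = not gaps
```

Here the conversions question is blocked because nothing in the planned model records whether a visit converted, which is exactly the kind of gap you want to find before ingestion rather than after.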

Technology Choices Matter

When deciding how data should be modeled, your technology choices actually do matter. As with everything, you need to choose the right tool for the job. Another set of questions arises when making these decisions, and they are all rooted in the business questions being asked:

  • “Does this answer need to be returned in real-time or overnight?”
  • “Should the answers be exactly right or approximate?”
  • “Do you need older data combined with new to get the right answers?”
  • “How often does your data change? Is it a transactional system?”
  • “Is disk storage a concern? How about compute power? Memory?”

Each of these questions can drive you down a particular path. Normalize or denormalize? Real-time or batch? Big or small? The list goes on, but these choices are all rooted in the business questions.
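As a rough illustration of the normalize-or-denormalize choice, the sketch below answers the same sales-by-region question against two layouts, using Python's built-in sqlite3 module. The table and column names are hypothetical. The normalized layout stores each region once and pays for a join at query time; the denormalized layout copies the region onto every sale row, trading extra storage and harder updates for a simpler read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: region is stored once per store.
    CREATE TABLE stores (store_id TEXT PRIMARY KEY, region TEXT);
    CREATE TABLE sales (store_id TEXT, amount REAL);
    INSERT INTO stores VALUES ('S1', 'East'), ('S2', 'West');
    INSERT INTO sales VALUES ('S1', 100.0), ('S1', 50.0), ('S2', 70.0);

    -- Denormalized: region is copied onto every sale row.
    CREATE TABLE sales_flat (store_id TEXT, region TEXT, amount REAL);
    INSERT INTO sales_flat
        SELECT s.store_id, st.region, s.amount
        FROM sales s JOIN stores st ON st.store_id = s.store_id;
""")

# Normalized layout: the question requires a join at query time.
normalized = conn.execute("""
    SELECT st.region, SUM(s.amount)
    FROM sales s JOIN stores st ON st.store_id = s.store_id
    GROUP BY st.region ORDER BY st.region
""").fetchall()

# Denormalized layout: the same question is a single-table aggregate.
denormalized = conn.execute("""
    SELECT region, SUM(amount)
    FROM sales_flat
    GROUP BY region ORDER BY region
""").fetchall()
```

Both layouts return the same answer. Which one is right depends on the questions above: how often the data changes, how quickly answers must come back, and whether storage is a concern.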

Follow The Path

By following the paths laid out by the business questions you can avoid (at least some of) the technology debate and you’ll be able to provide the answers that your business leaders are looking for in their data.

Remember, technology is a tool to get a job done. In the case of data, the job is question answering. If you model your data to answer the right questions, you can actually turn data into actionable intelligence with which you can print money.

Shameless Plug

ClearQuery was designed to translate natural language questions into queries. Those queries only work if the data is modeled in such a way that gives us the ability to get those answers out. We start with the questions and make sure the data is modeled to provide the answers businesses are looking for.

Interested in learning more about how ClearQuery can help your business get answers to your questions? Email us at info@clearquery.io or visit our website for more.
