DAAS Labs
Published in

DAAS Labs

Trying out Data QnA on BigQuery and Google Sheets

Watch as my natural language queries get translated into SQL

The BigQuery team dropped a surprise — natural language querying in BigQuery! Because it’s still in private alpha, you may not be able to try it out yourself. So … next best thing, watch me try it out:

How to use Data QnA

This is what I did:

  • Enable the Data QnA feature (I didn’t show you this, but it’s from the IAM console, similar to any other Google Cloud feature)
  • If the table you want to query is already in your own project, you are done. Otherwise, create a view in your project to reflect the table you want to query.
  • Go to the Data QnA part of the console and create a table.
  • Correct metrics/dimensions if needed (I didn’t need to)
  • Add synonyms (I added cases and county instead of confirmed_cases and county_name)
  • Click Save
  • On the BigQuery console, click “Ask a question”
  • Ask a question, generate sql, run the sql. Try a few natural language queries (see below).
  • Open sheets.new
  • Connect to the BigQuery table through the Data | Data Connectors menu
  • Ask a question or two in sheets.

Data QnA is in alpha, so if you want to try it out, ask your friendly Google Cloud rep to get your project whitelisted.

Autogenerated SQL

Data QnA takes natural language text and autogenerates the SQL. Here are a few examples on the Covid dataset. In each case, I am listing the natural language question and the autogenerated SQL.

Q: Which county has most number of cases

SELECT
county_name AS county_name,
(SUM(confirmed_cases)) AS SUM_confirmed_cases
FROM
`ai-analytics-solutions.advdata.covid19_usafacts`
GROUP BY county_name
ORDER BY SUM_confirmed_cases DESC
LIMIT 10;

Q: how many cases in kansas in March

SELECT
(SUM(confirmed_cases)) AS SUM_confirmed_cases
FROM
`ai-analytics-solutions.advdata.covid19_usafacts`
WHERE
state = 'KS' AND (date BETWEEN '2020-03-01' AND '2020-03-31');

Q: which state and county had most cases in last 7 days

SELECT
state AS state,
county_name AS county_name,
(SUM(confirmed_cases)) AS SUM_confirmed_cases
FROM
`ai-analytics-solutions.advdata.covid19_usafacts`
WHERE
(date BETWEEN '2020-07-03' AND '2020-07-09')
GROUP BY state, county_name
ORDER BY SUM_confirmed_cases DESC
LIMIT 10;

Q: how many counties are named Washington

SELECT
(COUNT(*)) AS COUNT__ROWS_
FROM
`ai-analytics-solutions.advdata.covid19_usafacts`
WHERE
county_name = 'Washington';

Q: which state’s washington county had least number of cases

SELECT
state AS state,
(SUM(confirmed_cases)) AS SUM_confirmed_cases
FROM
`ai-analytics-solutions.advdata.covid19_usafacts`
WHERE
county_name = 'Washington County'
GROUP BY state
ORDER BY SUM_confirmed_cases
LIMIT 10;

Q: cases by month in Kansas

SELECT
TIMESTAMP(FORMAT_DATE('%Y-%m-%d', DATE_ADD(DATE_ADD(DATE_TRUNC(SAFE_CAST(date AS DATE), MONTH), INTERVAL 1 MONTH),
INTERVAL - 1 DAY))) AS date,
(SUM(confirmed_cases)) AS SUM_confirmed_cases
FROM
`ai-analytics-solutions.advdata.covid19_usafacts`
WHERE
state = 'KS'
GROUP BY date
ORDER BY date;

Enjoy!

--

--

--

Daas labs focuses on Data analytics using emerging tech. It deals in the areas of natural languages, Artificial Intelligence, Computer vision & AR & VR. DAAS Labs is also building it’s smart suit of products. It works with leading companies across the globe.

Recommended from Medium

Excalidraw: How to Bring Complex Data Science Ideas to Life in Sketches

DataFest 19. My takeaways

Smart, Responsible, and Upper Caste Only: Measuring Caste Attitudes through Large-Scale Analysis…

Are you Democrat or Republican? Let your tweets define you …

Recognition and Counting of Microorganisms on Petri Dishes

The accelerated failure time (AFT) model

RockingData Lake for Social Good

Reactive event and Python backend driven Bokeh charts in Angular

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Lak Lakshmanan

Lak Lakshmanan

Operating Executive at a technology investment firm; articles are personal observations and not investment advice.

More from Medium

BigQuery: Soundex function and UTMs reducing hardcoding on spelling errors

Cloud Data Fusion: Update deployed pipelines through REST API

Materialized views in BigQuery

Merge on BigQuery tables with Nested & Repeated fields