BigQuery at Retail: Create an hourly sales summary from the raw sales data

Jun Kaneko
2 min readSep 10, 2018

--

I want to share some of my learning while implementing the data dashboard for a retail business using Google BigQuery and Data Studio.

Let’s assume that we have a set of following sales data.

  • transactions
  • transaction_items

The “transactions” table stores individual purchase transactions from customers, and the “transaction_items” table stores individual items in the transactions.

For example, when you buy the following items at the store, each of these items will be recorded as an entry in the “transaction_items” table.

  • Two bottles of water (£1 x 2 qty)
  • One pack of crisps (£1.5 x 1 qty)

and the total sales of £3.5 would be recorded as an entry of the “transactions” table.

I pulled these sales data from the Point Of Sales (EPOS / Till) system and set to sync with Google BigQuery periodically.

Big Query Schema for the transactions

Here is the example schema of the “transactions” table.

and “transaction_items” table.

You may have realised that these two tables are denormalised, for example, “shop_name” and “invoiced_at” columns exist in both tables. It is actually recommended to make BigQuery perform better.

Although BigQuery charges you with the size of data you query, you would not be penalised by the redundant columns as long as you select only the specific columns you need in the query.

Summarise the sales transactions

Uploading loads of data to BigQuery is a pain. But once you have done, it is amazingly easy and fast to run the query on the powerful Google data centre.

Let’s start by summarising all the transactions into hourly sales figures. BigQuery provides various SQL functions to compile the data.

Here, we use:

Continue to “Create and delete a table from the Node client”.

--

--

Jun Kaneko

Based in London and Devon, I love surfing and technology.