BigQuery at Retail: Create an hourly sales summary from the raw sales data
I want to share some of my learning while implementing the data dashboard for a retail business using Google BigQuery and Data Studio.
Let’s assume that we have a set of following sales data.
- transactions
- transaction_items
The “transactions” table stores individual purchase transactions from customers, and the “transaction_items” table stores individual items in the transactions.
For example, when you buy the following items at the store, each of these items will be recorded as an entry in the “transaction_items” table.
- Two bottles of water (£1 x 2 qty)
- One pack of crisps (£1.5 x 1 qty)
and the total sales of £3.5 would be recorded as an entry of the “transactions” table.
I pulled these sales data from the Point Of Sales (EPOS / Till) system and set to sync with Google BigQuery periodically.
Big Query Schema for the transactions
Here is the example schema of the “transactions” table.
and “transaction_items” table.
You may have realised that these two tables are denormalised, for example, “shop_name” and “invoiced_at” columns exist in both tables. It is actually recommended to make BigQuery perform better.
Although BigQuery charges you with the size of data you query, you would not be penalised by the redundant columns as long as you select only the specific columns you need in the query.
Summarise the sales transactions
Uploading loads of data to BigQuery is a pain. But once you have done, it is amazingly easy and fast to run the query on the powerful Google data centre.
Let’s start by summarising all the transactions into hourly sales figures. BigQuery provides various SQL functions to compile the data.
Here, we use:
- DATETIME_TRUNC and TIMESTAMP_TRUNC to truncate the time of the transactions and group into hourly figures.
- FORMAT_DATETIME to store the day of the transactions in a human-readable format.
- The reason for MOD() the day number is to bridge the difference between how BigQuery and DataStudio handle the numbers of the days in a week.
Continue to “Create and delete a table from the Node client”.