Understanding Self-Service Analytics on BigQuery with Cloud Dataprep

Bertrand Cariou
Google Cloud - Community
Sep 2, 2019

This article outlines the basic concepts I learned while exploring Google Cloud Platform to set up an analytics framework for my own reporting needs. I wanted to be able to quickly build ad hoc and recurring reports on large data sets, both to get the insight needed to inform business decisions (and share it with the team) and to feed data to the marketing campaigns I was running.

I am pretty happy with the result and want to share a summary of what I learned so that others can get to similar results faster. I hope it is useful to you too.

What is self-service analytics?

For me, the objective of self-service analytics is to empower data-driven professionals like me to create their own end-to-end analytics solution, respond to their decision-making requirements, and search for insight. Self-service means that you don't depend on others to get the job done, and that the technology in use is simple yet powerful and scalable enough to be adopted easily by anyone and to deliver results quickly.

What are the ingredients for self-service analytics on the Google Cloud Platform (GCP)?

To establish a self-service analytics practice, you need to rely on three fundamental constituents: data storage, data preparation, and data visualization.

Data Storage. You need a space where your data can live and grow. This data storage is the foundation on which you build your analytics. Within the GCP, the easiest way to store and retrieve your data for analytics at scale is BigQuery.

Data Preparation. From the data storage layer, you can prepare your data for reporting and dashboarding. You’ll be using Cloud Dataprep to clean, standardize, and combine data and to create various calculations and metrics, with the results stored in BigQuery so that you can design your visualization layer on top of them.

Data Visualization. The visualization layer shows your data in the form of various reports and dashboards containing graphs and tables that represent your data visually, giving you insight and supporting your data-driven decision process. You’ll be using Data Studio consuming BigQuery data to create your reports and dashboards.

With these fundamental elements in place, you’re ready to implement a scalable and comprehensive self-service analytics solution.
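To make the flow between the three layers concrete, here is a minimal Python sketch. It is not BigQuery, Cloud Dataprep, or Data Studio code: an in-memory SQLite database stands in for the storage layer, a plain function stands in for the preparation step, and an aggregate query stands in for what a report would chart. The table names, sample rows, and function names are all made up for illustration.

```python
import sqlite3

# Hypothetical raw rows, as they might land in the storage layer.
RAW_ORDERS = [
    ("  Alice ", "us", "120.50"),
    ("BOB", "US", "80"),
    ("alice", "Us", "n/a"),  # non-numeric amount: dropped during preparation
    ("Carol", "fr", "42.00"),
]


def prepare(rows):
    """Data preparation stand-in: clean and standardize raw rows."""
    cleaned = []
    for customer, country, amount in rows:
        try:
            value = float(amount)
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        cleaned.append((customer.strip().title(), country.strip().upper(), value))
    return cleaned


def build_report(conn):
    """Visualization stand-in: the per-country metrics a report would chart."""
    query = """
        SELECT country, COUNT(*) AS orders, ROUND(SUM(amount), 2) AS revenue
        FROM orders_prepared
        GROUP BY country
        ORDER BY country
    """
    return conn.execute(query).fetchall()


def run_pipeline():
    # Storage stand-in: an in-memory SQLite database (an assumption for this
    # sketch; the article itself uses BigQuery for this layer).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders_raw (customer TEXT, country TEXT, amount TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", RAW_ORDERS)

    # Read raw data, prepare it, and write the result back to storage.
    conn.execute("CREATE TABLE orders_prepared (customer TEXT, country TEXT, amount REAL)")
    raw = conn.execute("SELECT customer, country, amount FROM orders_raw").fetchall()
    conn.executemany("INSERT INTO orders_prepared VALUES (?, ?, ?)", prepare(raw))

    # The reporting layer only ever reads the prepared table.
    report = build_report(conn)
    conn.close()
    return report


if __name__ == "__main__":
    for country, orders, revenue in run_pipeline():
        print(country, orders, revenue)
```

The point of the sketch is the separation of concerns: raw data is stored once, preparation writes a cleaned table back to the same store, and the reporting layer reads only the prepared table.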

Understand GCP self-service analytics concepts

Here are the minimum concepts you need for an end-to-end self-service analytics solution built on the GCP. They are the foundation to understand and iterate on as you build your own solution.

[Figure: GCP concepts for self-service analytics]

Here is a visual representation of these concepts and their relations to create a self-service analytics solution within the GCP.

[Figure: flow between concepts]

Moving from theory to practice

Still a bit too theoretical? No problem! You can experience it for yourself by following this step-by-step guide to build your first end-to-end analytics solution. You will establish a scalable self-service analytics solution that uses BigQuery to store and retrieve data; Cloud Dataprep to clean, combine, and create metrics; and Google Data Studio to report visually on top of your data. By exploring each GCP service in more depth and iterating on these principles, you will be able to address most of your analytics requirements in a self-service manner.

Originally published on www.trifacta.com on July 24, 2019.
