BigQuery | CXL Review
Gain insights with real-time and predictive analytics
The course duration is two hours, and it’s divided into six lessons. The goal of this course is to teach you to consolidate your data in one place and automate your marketing reports. It will help you develop the skillset to become a data-driven marketer. You will learn more about Google Cloud and its possibilities and discover the full analytics process from data collection to data visualization.
Table of content:
- Introduction to the Google Cloud Platform
- Introduction to the Google BigQuery
- Get your data into BigQuery
- Work with data in BigQuery
- Query and analyze your Google Analytics data with BigQuery
- Build automated reports and dashboards using your BigQuery data
Introduction to the Google Cloud Platform
Google Cloud consists of a set of physical assets (computers and hard disk drives) and virtual resources (virtual machines) stored in Google’s data centers worldwide.
The Cloud Console is the graphical interface you can use to create and manage BigQuery resources and run SQL queries.
Google Cloud uses a pay-as-you-go pricing model, which means you only pay for the services you use.
The list of available Google Cloud services is too long to be listed here (also, they keep extending it). For data analytics, there are:
- Cloud Data Fusion, and others.
Introduction to the Google BigQuery
The BigQuery sandbox gives you free access to the power of BigQuery within the sandbox’s limits. The BigQuery sandbox is available to any Google Cloud customer.
Note that the BigQuery sandbox has some limitations:
- All BigQuery Quotas and limits apply.
- You have the same free usage limits as the free tier: 10 GB of active storage and 1 TB of processed query data per month.
- All datasets have the default table expiration time and the default partition expiration set to 60 days. Any tables, views, or partitions in partitioned tables automatically expire after 60 days.
Also, be aware that Sandbox projects do not support:
- Streaming data
- Data Manipulation Language (DML) statements
- The BigQuery Data Transfer Service
Upgrade your project and remove or update the expiration time for your resources to eliminate the sandbox limits. Let me remind you that after upgrading from the sandbox BigQuery Quotas and limits still apply.
Get your data into BigQuery
Google Services for BigQuery data import
- BigQuery Data Transfer Service: Google Play, Google Ads (2.5$ per month per customerID), Youtube (5$ per month per channel)
- Google Analytics 360
- Cloud Storage (no charges, only for storage)
- Google Cloud Dataflow (pipeline to write data to BigQuery)
- Google Cloud Function
Also, you can do BigQuery data import from the Local Machine.
Load data from a readable data source:
- You can load data manually
- You can use API or Client Libraries (create connectors or use paid connectors)
API is a set of clearly defined methods of communication among various components — applications, software
Batch task automation without manual work
BigQuery Data Transfer
Load data from Google (except Analytics) and Amazon sources:
- In BigQuery go to Transfers
- Choose data source
- Fill in all the data source and data destination information
- Connect to your Google Account
- You can delete/pause your data transfer at any time
- Check Run History to make sure that everything works well
BigQuery data streaming
Load data immediately without delaying (for real-time reports):
- You can create your custom solution
- Use Google solutions (Google Dataflow)
- Or use paid solutions
Use-cases: mobile application — sending errors events in real-time
BigQuery DataFlow and pub/sub
Pub/Sub is a fully-managed real-time messaging service that allows you to send and receive messages between independent applications. Cloud Pub/Sub Documentation
Google Cloud Dataflow is a cloud-based data processing service for both batch and real-time data streaming applications. It enables developers to set up processing pipelines for integrating, preparing, and analyzing large data sets, such as those found in Web analytics or big data analytics applications. What is Google Cloud Dataflow?
Streaming into BigQuery:
- Create Cloud Pub/Sub topic
- Read from Pub/Sub in Dataflow
- Compute windowed aggregates
- Stream into BigQuery
Work with data in BigQuery
To work with data in BigQuery, you will have to learn some basic SQL language
SQL is a programming language that gives you the possibility to communicate with your data.
If you are not familiar with SQL, you will have to dedicate some time to master it. Here is the video that can help you start:
Query and analyze your Google Analytics data with BigQuery
Google Analytics 360 and Google Analytics 4 give you the possibility to store your data in Google BigQuery. If you have a Google Analytics standard version (known as Universal Analytics), you will need to use external connectors to pull your data from Google Analytics to BigQuery.
A partitioned table is a table divided into segments, called partitions, that make it easier to manage and query your data. By dividing a large table into smaller partitions, you can improve query performance and control costs by reducing the number of bytes read by a query.
Denormalization is a strategy that database managers use to increase the performance of database infrastructure. It involves adding redundant data to a normalized database to reduce certain types of problems with database queries that combine data from various tables into a single table.
Build automated reports and dashboards using your BigQuery data
To create your reports in Google Spreadsheet, you have two possibilities:
- Use a natural data connector with BigQuery (available only for G Suite users)
- Use an add-on OWOX BI
If you have a paid version of Google Drive, you can connect to BigQuery directly from your Google Spreadsheet:
- Click on Connect to BigQuery,
- Select a project for which the billing is activated,
- You will see the query editor where you can edit a query,
- Click on Insert results.
Visualize your BigQuery data in Google Data Studio
Thanks to this course, you will master consolidating your data in one place and automating your marketing reports. By learning more about Google Cloud solutions you will become a competent data-driven marketer.
CXL BigQuery course will help you discover the full analytics process, from data collection to data visualization.