TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Understanding the GA4 BigQuery Export Schema and Structure

A qualitative investigation into one of the weirdest data structures ever forced upon millions of innocent, unsuspecting analysts

11 min read · Jun 21, 2024

The data you want is definitely there somewhere, you just have to figure out how to UNNEST it… Photo by Dean Ward on Unsplash

Introduction

Google Analytics 4 is currently estimated to be used by 15.6 million websites worldwide, which means the GA4 BigQuery export is possibly one of the most widely exported data schemas of all time. GA4 data is accessible through the web user interface or directly via the API into Looker Studio, but if you want to:

  • Own your data beyond Google’s retention policy,
  • Archive your data to prevent possible data loss,
  • Combine your data with other internal data sources,
  • Augment your data from additional external sources, APIs or LLMs, or
  • Build custom automation workflows

then the recommended approach is to enable the GA4 export to BigQuery, which is very simple to set up and configure.

Great! Job done!

Not quite. Once the data starts appearing in daily BigQuery exports, you may notice that the structure is somewhat irregular, making it very difficult to work with directly. This…

Published in TDS Archive

Written by Jim Barlow

Senior Data Engineer @ Data to Value
