GA4 with Snowflake is now ready for production.

Maeda Kentaro
Snowflake Engineering
4 min read · Feb 5, 2024

On January 29, 2024, Snowflake announced the release of a preview version of the official connector for GA4, making it easier to analyze GA4 data within Snowflake.


Introducing this connector greatly reduces the operational costs associated with ingesting GA4 data into Snowflake.

This release marks a significant improvement in how GA4 data can be used in Snowflake, making it suitable for production environments.

Before this, importing GA4 data into Snowflake was possible, but maintaining the process took time and effort.

This article compares the traditional data ingestion methods with the new official connector, focusing on four key improvements:

  • Simplified process for ingesting GA4 data.
  • Reduced redundant data transformations after ingestion.
  • Significantly lower costs for data ingestion.
  • Official support for ingesting GA4 data.

Additionally, we’ll explore some specifics of the connector’s preview version and highlight some important limitations.

1. Simplified Data Ingestion

Integrating GA4 data into Snowflake has become easier.

Setup now consists of following the steps in the Snowflake GUI wizard.

Before: Data ingestion required writing custom scripts in Python or using third-party SaaS solutions like Fivetran or Airbyte. These methods were often either complex, requiring significant infrastructure, failure handling, and credential management, or expensive.

After: The GA4 connector removes the need for custom scripts and infrastructure management. Authentication is handled within Snowflake, and there are no external SaaS costs. The connector runs within Snowflake, so the only charges are for virtual warehouse usage during ingestion.

2. Reduced Need for Data Transformations

Analyzing GA4 data in Snowflake is now more straightforward, with no need for manual parsing of semi-structured data.

Before: Manual parsing of large volumes of semi-structured data was required, which was a tedious and time-consuming task, even with tools like dbt.
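To give a sense of what that manual parsing involved, here is a minimal Python sketch that flattens one nested event into a flat row. It assumes the record follows the shape of the GA4 BigQuery export, where event_params is a list of key/value pairs and only one typed value field is populated. The sample record and the flatten_event helper are hypothetical and are not part of the connector.

```python
from typing import Any

# Hypothetical GA4 export record. event_params is a list of {key, value}
# pairs in which only one typed value field is populated.
raw_event = {
    "event_name": "page_view",
    "event_timestamp": 1706486400000000,
    "event_params": [
        {"key": "page_title", "value": {"string_value": "Home", "int_value": None}},
        {"key": "engagement_time_msec", "value": {"string_value": None, "int_value": 1200}},
    ],
}

VALUE_FIELDS = ("string_value", "int_value", "float_value", "double_value")


def flatten_event(event: dict[str, Any]) -> dict[str, Any]:
    """Turn the nested event_params list into flat param_* columns."""
    flat = {k: v for k, v in event.items() if k != "event_params"}
    for param in event.get("event_params", []):
        value = param.get("value", {})
        # Pick the first typed field that is actually populated.
        flat[f"param_{param['key']}"] = next(
            (value[f] for f in VALUE_FIELDS if value.get(f) is not None), None
        )
    return flat


row = flatten_event(raw_event)
```

Every event type carries its own set of parameters, so logic like this had to be written and maintained for each one.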

After: Users can access parsed tabular data directly. The GA4 connector provides a predefined view that parses the raw data, and the raw data table (VARIANT) is also available.

The table for raw data automatically created by the connector.
The predefined view for parsing the GA4 raw data table.

3. Significantly Lower Costs

The cost of ingesting GA4 data into Snowflake has been drastically reduced, offering up to 99.94% savings compared to previous methods.

Before: Using external SaaS for data ingestion was expensive. For instance, ingesting about 23 million rows per month with Fivetran cost around $4157.

After: The current cost is approximately $2.85 for processing around 24 million events, and it comes solely from the compute resources used by Snowflake tasks. In my environment, an XS warehouse can ingest about 24 million events per hour.
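The savings figure can be checked with a few lines of Python. This is a rough sketch that uses only the rounded numbers quoted in this article. Rows and events are not strictly the same unit, so the per-million comparison is an approximation, and the variable names are purely illustrative.

```python
# Figures quoted in this article (rounded).
fivetran_cost_usd = 4157.0   # about 23 million rows per month
fivetran_rows_m = 23.0
connector_cost_usd = 2.85    # about 24 million events
connector_events_m = 24.0


def savings_pct(before: float, after: float) -> float:
    """Percentage saved when the cost drops from `before` to `after`."""
    return (1 - after / before) * 100


# Savings on the total bill.
total_savings = savings_pct(fivetran_cost_usd, connector_cost_usd)

# Savings per million rows/events, to account for the slightly different volumes.
per_million_before = fivetran_cost_usd / fivetran_rows_m
per_million_after = connector_cost_usd / connector_events_m
unit_savings = savings_pct(per_million_before, per_million_after)

print(f"total: {total_savings:.2f}%, per million: {unit_savings:.2f}%")
```

With these rounded inputs both results come out at about 99.93%, which is in line with the headline figure. The small gap presumably comes from rounding in the quoted costs.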

4. Official Support

Direct integration with Snowflake reduces both the maintenance costs and the risks associated with relying on third-party services.

Before: Users had to deal with changes from three different sources: GA4, ETL SaaS, and Snowflake, which could complicate maintenance.

After: With Snowflake’s direct support for GA4, only two sources of changes remain, simplifying maintenance and reducing costs.

Key Points of the GA4 Connector’s Specifications

If you are new to Snowflake, please note that the GA4 connector:

  • Is not available on trial accounts, so it cannot be used during an account's first 30 days.
  • Is not available if the Snowflake account was created in a GCP region.

Currently, the data types that can be imported through the GA4 connector are as follows:

  • Raw data of all events
  • Report data after event aggregation

Specifications in Preview

  • Import of raw data is available for accounts on AWS.
  • Import of report data is available for accounts on AWS and Azure.
  • Creation of a GCP service account is required.
  • Raw data is imported via BigQuery.
  • Data transfer charges may apply depending on the region where the BigQuery data is located.
  • By default, synchronization is performed every 8 hours. The interval can be reduced to a minimum of 15 minutes.

Attention

  • Only one instance of the connector can be installed per Snowflake account.
  • Currently, one service account can be linked to one connector.
  • It is possible to import data from multiple websites (GA4 properties).
  • Universal Analytics is not supported.
  • It can take nearly 24 hours for GA4 data to reach BigQuery, so patience is important. Once the data is in BigQuery, it is synchronized to Snowflake at the configured interval, from 15 minutes to 8 hours.
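To see how these timings add up, the following sketch estimates the number of syncs per day and a rough worst-case delay between an event occurring in GA4 and it appearing in Snowflake. It assumes the "nearly 24 hours" BigQuery lag can be treated as an upper bound and ignores the time the sync itself takes. The sync_plan function is a hypothetical helper, not a connector API.

```python
from datetime import timedelta

MIN_INTERVAL = timedelta(minutes=15)       # minimum sync interval (from this article)
DEFAULT_INTERVAL = timedelta(hours=8)      # default sync interval (from this article)
GA4_TO_BIGQUERY_LAG = timedelta(hours=24)  # "nearly 24 hours", assumed as an upper bound


def sync_plan(interval: timedelta = DEFAULT_INTERVAL) -> tuple[int, timedelta]:
    """Return (syncs per day, rough worst-case delay from GA4 event to Snowflake)."""
    if not MIN_INTERVAL <= interval <= DEFAULT_INTERVAL:
        raise ValueError("interval must be between 15 minutes and 8 hours")
    syncs_per_day = int(timedelta(days=1) / interval)
    # An event can land in BigQuery just after a sync, so it waits one more interval.
    worst_case_delay = GA4_TO_BIGQUERY_LAG + interval
    return syncs_per_day, worst_case_delay


default_plan = sync_plan()
fastest_plan = sync_plan(timedelta(minutes=15))
```

Under these assumptions, the default setting gives 3 syncs per day with a delay of up to about 32 hours, while the 15-minute setting gives 96 syncs per day. Shortening the interval barely changes the end-to-end delay, because the BigQuery lag dominates.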

Summary

The year 2024 may see Snowflake becoming increasingly central in web analytics❄️

This article is based on the official documentation.

Maeda Kentaro
Snowflake Engineering

RAKUDEJI inc. CEO | SnowPro Advanced : Data Engineer❄️