Native Data Integration Series — Snowflake Connector for Google Analytics Aggregate Data

Kai Ni
5 min readJan 28, 2024

--

Snowflake continues to be at the forefront of innovation, and in this four-part series, I am going to discuss how Snowflake Connectors provide native integration of third-party applications and database systems in Snowflake. The connectors provide instant access to current data without the need to manually integrate against API endpoints. Data automatically refreshes, based on your desired frequency, in your Snowflake account. The connectors support both the initial load of historical data as well as incremental changes.

The latest addition to its suite of connectors is the Snowflake Connector for Google Analytics Aggregate Data, and the connector enables automatic ingestion of Google Analytics 4 (GA4) data into your Snowflake account. The Snowflake Connector for Google Analytics Aggregate Data extracts aggregated data using the GA4 Reporting API.

Google Analytics is a cloud-based tool that provides insight into how users interact with your website. You can use it to analyze user actions, track the number of visitors, page views, and analyze bounce rates for a page.

Key Features

  1. Cloud Support: Currently, the connector is supported in AWS and Azure clouds, ensuring flexibility for organizations with diverse cloud infrastructure.
  2. GA4 Compatibility: The connector exclusively ingests GA4 report data, aligning with the latest analytics standards.
  3. Date Dimension Requirement: The presence of the date dimension is essential in the report definition for successful data ingestion.
  4. Ingestion Modes: The connector operates in two modes — initial load and ongoing load. The initial load fetches data for synchronization, followed by regular incremental updates in the ongoing load phase.
  5. Parallel Report Updates: To enhance efficiency, the connector updates some reports in parallel, treating the ingestion of each report as an independent process.

Google Analytics Aggregate Data — Data Flow

The swim lanes in the diagram below represent the different systems the customer’s data will flow through. Starting in Google Analytics, Google customers typically set up one property per data source, for example, one property for their company’s website, one property for their company’s mobile app, etc. Whatever sources their users are interacting with will have a property in Google Analytics. All of their users’ interactions with the website or mobile app will be recorded as events in the corresponding Google Analytics property, all of which are put in an Event Store.

The Snowflake GAAD connector can then call this data from Google Analytics directly using the Google Analytics 4 API. In the Snowflake Connector, customers will create reports where they choose the aggregate metrics and dimensions that will act as columns in the downloaded data. Users can define multiple reports, each of which can tied to a different Google Analytics property.

GAAD Data Flow

Installation Step

The installation process is straightforward. Detailed instructions can be found here on Snowflake Documentation site, but here are some quick steps to get started:

To install the connector, do the following:

  1. Sign in to Snowsight as a user with the ACCOUNTADMIN role.
  2. In the left navigation, select Apps.
  3. In the Recently Shared with You section, select the tile for the Snowflake Connector for Google Analytics Aggregate Data.
  4. In the page for the Snowflake Connector for Google Analytics Aggregate Data, select Get.
  5. This displays a dialog box that you use to begin the initial part of the installation process.
  6. In the dialog box configure the following:
  7. In the Options->Application name field, enter the database to use as the database for the connector instance. This database is created for you automatically.
  8. In the Warehouse used for installation field, select the warehouse that you want to use for installing the connector. (Note: This is not the same warehouse that is used by the connector to synchronize data from Google Analytics. In a later step, you will create a separate warehouse for this purpose.)

Setting Up the Connector

Once installed, the connector requires configuration. You can refer to the setup instructions here to seamlessly integrate and optimize their data workflow, but to set up data ingestion using Snowsight, follow these steps:

  1. Sign in to Snowsight as a user with the ACCOUNTADMIN role.
  2. In the left navigation, select Apps.
  3. Search for the Snowflake Connector for Google Analytics Aggregate Data, then select the tile for the connector.
  4. In the Data sync section, click Add report.
  5. This displays a new dialog box. In this dialog box, fill in the following fields:
  • Report name: Identifier for a new report. Specify a name that is unique for your destination database, following the naming rules for unquoted object identifiers.
  • Property: Choose a Google Analytics property that holds the data you want to ingest. If a property is not available, ensure that the Google Analytics account used by the connector has access to it.
  • Dimensions: Select Google Analytics 4 dimensions that you want to appear in your report. Dimensions are attributes of your data.
  • Metrics: Choose Google Analytics 4 metrics that you want to appear in your report. Metrics are quantitative measurements of a report.
  • Sync data from: Set the start date for the initial load of data.
  • Sync schedule: Choose the sync frequency for the ongoing load of data.

6. Select Start Sync.

It can take a few minutes for the ingestion process to complete. The table and view with your report data will not be visible in the destination database until the data from GA is fully fetched.

Current Limitations (As of 2024–01–28)

  • Cloud Support and Regions: The connector is supported only in AWS and Azure clouds, with government regions excluded.
  • Single Instance Support: Only one instance of the connector is supported per Snowflake account. Multiple instances require separate accounts.
  • Role Requirements: Installation and configuration mandate the user to be logged in with the ACCOUNTADMIN role.
  • GA4 Exclusivity: The connector can retrieve data exclusively for GA4 properties, not supporting Universal Analytics (UA).
  • Data Latency: Due to Google Analytics data changes taking up to 72 hours to reflect, real-time updates are not currently supported.
  • Trial Account Exclusion: The connector is not supported with Snowflake trial accounts due to external access security concerns.

In conclusion, the Snowflake Connector for Google Analytics Aggregate Data unlocks a new realm of possibilities for organizations aiming to elevate their analytics game. As the connector evolves, it promises to bridge the gap between Google Analytics and Snowflake, ensuring a seamless and efficient data integration experience. Stay tuned for the next part of the Snowflake Connectors Data Integration series.

--

--