Export from Universal Analytics to BigQuery: Zero Cost, Full Control 🆓

ali izadi
8 min read · Jan 12, 2024
Universal Analytics data being removed in July 2024

Update 1: Enhanced Pagination and Data Volume Handling

We’ve made significant enhancements to the data retrieval process so that limitations such as page size and data volume no longer hinder your analysis. You can now set any start and end dates for your data; the updated code handles pagination for you, so no request runs into the API’s row limits.
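Under the hood, that pagination follows the standard Reporting API v4 pattern: each response may include a nextPageToken, which is passed back as pageToken on the next request until no token is returned. Here is a minimal sketch of that loop (the function and variable names are illustrative, not necessarily the exact ones used in backfill-UA.py):

def fetch_all_rows(analytics, request_body):
    # Keep requesting pages until the Reporting API v4 stops returning a nextPageToken
    all_rows = []
    page_token = None
    while True:
        if page_token:
            request_body['reportRequests'][0]['pageToken'] = page_token
        response = analytics.reports().batchGet(body=request_body).execute()
        report = response['reports'][0]
        all_rows.extend(report.get('data', {}).get('rows', []))
        page_token = report.get('nextPageToken')
        if not page_token:
            break
    return all_rows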

Update 2: Introducing a New Version of This Solution:
Continuous Analytics Bridge

Exciting news! We’ve recently released a groundbreaking solution that goes beyond just unifying data from Universal Analytics (UA) and Google Analytics 4 (GA4). Our Continuous Analytics Bridge (CAB) offers a more advanced and comprehensive approach to data integration, covering all backfills from UA and GA4, and even beyond. CAB is designed to cater to a wide range of analytical needs, providing you with unparalleled insights and opportunities. I believe you’ll find CAB to be a game-changer. To learn more about CAB and its capabilities, I encourage you to check out my Medium blog or visit our product page.

Did you know that the deadline for Universal Analytics (UA) is coming up? This deadline is like a ticking time bomb for your data! By July 2024, UA is saying a final goodbye, taking all its precious data with it. Imagine years of insights, customer journeys, and those sweet conversion stats just disappearing into thin air. Scary, right?

But fear not! We’re not about to let our data vanish like a ghost in the digital night. That’s why I’m here with a Python script that will save your data. Curious? You should be!

The Trouble with Manual Extraction

Ever tried filling a bucket with a tiny spoon? That’s what manual data extraction feels like. You’re there, clicking and downloading Excel files from Universal Analytics, one by one. It takes forever, and the worst part? You might miss out on some really important stuff. It’s like leaving behind pieces of a puzzle — you never get the full picture.

So, why isn’t manual extraction the best idea?

  • Too Slow: It’s like walking when you could be flying. You spend hours or even days and still don’t get all you need.
  • Missing Pieces: Important bits of your data can get lost. It’s like only hearing half a story.
  • Limited Analysis: With Excel, you can’t easily handle large-scale analysis or run complex queries on your data. It’s like having a sports car but only being able to drive in first gear.

The Downsides of Third-Party Services

Now, consider hiring someone else (such as Supermetrics or Datalayers) to do this task for you; that’s what third-party services are like. They promise to make things easier, but there’s always a catch. It’s like ordering a pizza but not being able to choose the toppings.

What’s not so great about these services?

  • Costly: They can be expensive, like buying a ticket to a concert that you’re not sure you want to go to.
  • Not Flexible: They have rules about what data you can move and how. It’s like being on a road trip but not being able to choose the route.
  • Security Worries: Letting someone else handle your data can be risky. It’s a bit like giving your house keys to someone you don’t know very well.

Setting Up and Using the Script: A Detailed Walkthrough

Step 1: Set Up Your BigQuery Account

First off, ensure you have a Google Cloud account with BigQuery enabled. Then choose a project and a dataset in a location that works for your export, so you can use the data alongside your other datasets. Keep both names handy and use them in the following steps within that project. This will be the new home for your Universal Analytics data.
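If the dataset does not exist yet, you can create it in the BigQuery console, or with a few lines of Python using the google-cloud-bigquery client. The project, dataset, and location below are placeholders borrowed from the sample configuration later in this post; substitute your own:

from google.cloud import bigquery

client = bigquery.Client(project='ga4-awz')        # your Google Cloud project ID
dataset = bigquery.Dataset('ga4-awz.UA_backfill')  # "project.dataset"
dataset.location = 'EU'                            # the location you want the exported data stored in
client.create_dataset(dataset, exists_ok=True)     # no-op if the dataset already exists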

Step 2: Enable the Google Analytics Reporting API

To allow the script to communicate with Google Analytics, you need to enable the relevant API:

  • Go directly to the Google Analytics Reporting API page in the Google Cloud Console.
  • Click to enable it. This action is akin to giving your script a key to access the Analytics data.
Enable the Analytics Reporting API in the Google Cloud Console

Step 3: Create a Service Account and Download the JSON Key

For secure access to BigQuery and Analytics, you’ll use a service account:

  • In Google Cloud Console, navigate to “IAM & Admin” > “Service Accounts.”
  • Either create a new service account or use an existing one, ensuring it has ‘Owner’ access.
  • In the service account details, go to the “Keys” section.
  • Click on “Add Key” and choose to create a new key. Select JSON as the key type and download it.

This JSON key is crucial and should be kept secret — it’s like the password for your script to access your data.

JSON key for the service account
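Inside the script, this key file is turned into credentials for the Reporting API. The usual pattern with google-auth and the Google API client looks roughly like this (the key path is a placeholder, and backfill-UA.py may organize it slightly differently):

from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE_LOCATION = '/path/to/your-key.json'  # the JSON key you just downloaded

# Build credentials from the key and create an Analytics Reporting API client
credentials = service_account.Credentials.from_service_account_file(
    KEY_FILE_LOCATION, scopes=SCOPES)
analytics = build('analyticsreporting', 'v4', credentials=credentials)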

Step 4: Grant Access to Google Analytics

Next, your service account needs permission to fetch data from Universal Analytics:

  • In your Google Analytics account, add the service account’s email (found in the service account details) to Property Access Management. “viewer” access is sufficient.
  • This step is like giving a trusted friend access to your valuable data.
Grant Access in UA

Step 5: Accessing and Configuring the Script

Accessing the Code

To begin, you’ll need to access the script. Head over to GitHub and navigate to the Universal-Analytics-to-BigQuery repository. Here, you’ll find the script named backfill-UA.py. This is the script you'll be using to migrate your data.

Filling in the Configuration

Once you’ve accessed the script, it’s time to configure it with your specific details. Look for the section in the script that starts with ‘Configuration variables for Google Analytics and BigQuery.’ Here’s a breakdown of what you need to fill in:

  1. SCOPES: This should already be set to ['https://www.googleapis.com/auth/analytics.readonly'] and typically does not need to be changed.
  2. KEY_FILE_LOCATION: Enter the file path where your downloaded JSON key file is located. This is the key you created earlier for your service account.
  3. VIEW_ID: Insert your Google Analytics View ID here. You can find this in your Google Analytics account under Admin → View Settings.
  4. BIGQUERY_PROJECT: Enter the ID of your Google Cloud Project. This is where your BigQuery resides.
  5. BIGQUERY_DATASET: Type in the name of your BigQuery Dataset where you wish to store the data.
  6. BIGQUERY_TABLE: Lastly, put in the name of the BigQuery Table where your data will be stored. If it does not exist, it will be created.

Make sure you input these details correctly to ensure the script functions properly. Here is a sample:

# Configuration variables for Google Analytics and BigQuery
SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE_LOCATION = '/content/ga4-awz-a6942c278eef.json'  # Path to your Google Cloud service account key file
VIEW_ID = '238782913'  # Your Google Analytics View ID
BIGQUERY_PROJECT = 'ga4-awz'  # Your Google Cloud Project ID
BIGQUERY_DATASET = 'UA_backfill'  # BigQuery Dataset name where the data will be stored
BIGQUERY_TABLE = 'backfill'  # BigQuery Table name where the data will be stored

Step 6: Running the Script

With your configuration set, you’re now ready to run the script. Execute backfill-UA.py, and the script will start its process:

  1. Initiating the Connection: The script begins by establishing a connection to the Google Analytics Reporting API using the credentials you provided.
  2. Fetching Data: Next, it retrieves the necessary data from Universal Analytics based on the metrics, dimensions, and date ranges you’ve specified.
  3. Data Conversion and Upload: The data is then converted into a format compatible with BigQuery (a Pandas DataFrame) and is subsequently uploaded to your BigQuery table.
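Condensed, those three stages look roughly like the following sketch. It assumes analytics is the Reporting API client built from your service-account key (as in Step 3) and that the configuration variables from Step 5 are in scope; the real script also paginates as described in Update 1.

import pandas as pd
from google.cloud import bigquery

# Fetch one page of data from the Reporting API
response = analytics.reports().batchGet(body={
    'reportRequests': [{
        'viewId': VIEW_ID,
        'dateRanges': [{'startDate': '365daysAgo', 'endDate': 'today'}],
        'metrics': [{'expression': 'ga:sessions'}, {'expression': 'ga:pageviews'}],
        'dimensions': [{'name': 'ga:country'}, {'name': 'ga:pageTitle'}],
    }]
}).execute()

# Flatten the response into a Pandas DataFrame
report = response['reports'][0]
dim_names = [d.replace('ga:', '') for d in report['columnHeader']['dimensions']]
met_names = [m['name'].replace('ga:', '') for m in
             report['columnHeader']['metricHeader']['metricHeaderEntries']]
rows = [r['dimensions'] + r['metrics'][0]['values']
        for r in report['data'].get('rows', [])]
df = pd.DataFrame(rows, columns=dim_names + met_names)

# Upload the DataFrame to the configured BigQuery table
client = bigquery.Client(project=BIGQUERY_PROJECT)
table_id = f'{BIGQUERY_PROJECT}.{BIGQUERY_DATASET}.{BIGQUERY_TABLE}'
client.load_table_from_dataframe(df, table_id).result()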

Once the script completes its execution, your Universal Analytics data will be securely transferred and stored in BigQuery, ready for you to analyze and utilize as needed. The following is an example of what the results look like:

Backfilled UA data to BigQuery

Customizing the Script for Specific Analytical Needs

Fine-Tuning Metrics and Dimensions

To get the most out of the script, it’s crucial to adapt it to your unique analytical requirements. This means tweaking the metrics and dimensions in the get_report function to align with what you're looking to analyze.

Exploring Metrics and Dimensions

Within the get_report function of the script, you'll find sections dedicated to metrics and dimensions. These are essentially the categories and types of data you want to pull from Universal Analytics. Think of metrics as the 'what' (like sessions, pageviews, users) and dimensions as the 'who', 'where', and 'how' (like country, browser, page title).

Leveraging Documentation for Choices

To make informed decisions about which metrics and dimensions to include, refer to the Universal Analytics to GA4 Reporting API documentation. This resource offers a comprehensive list of available options, providing you with a broad palette to paint your data picture.

Customization in Action

In the script, the sections for metrics and dimensions might look like this:

'metrics': [
    {'expression': 'ga:sessions'},
    {'expression': 'ga:pageviews'},
    # ... other metrics
],
'dimensions': [
    {'name': 'ga:country'},
    {'name': 'ga:pageTitle'},
    # ... other dimensions
],

Here lies your opportunity to tailor. To track different aspects, simply replace or add items in these lists.
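For example, if you also want daily user counts, a customized request might add ga:users and ga:date, as below. These particular additions are just an illustration; the Reporting API v4 accepts up to 10 metrics and 9 dimensions per request.

'metrics': [
    {'expression': 'ga:sessions'},
    {'expression': 'ga:pageviews'},
    {'expression': 'ga:users'},  # added metric
],
'dimensions': [
    {'name': 'ga:date'},         # added dimension for daily granularity
    {'name': 'ga:country'},
    {'name': 'ga:pageTitle'},
],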

The Trade-Off Between Detail and Granularity

It’s important to remember the trade-off here: every dimension you add splits the data into more combinations, giving you more rows of finer-grained data, while each row represents a smaller slice of your traffic. Fewer dimensions mean the API aggregates more heavily, returning fewer but broader rows. Pick the smallest set of dimensions that still answers your questions.

Adjusting the ‘pageSize’ parameter

The ‘pageSize’ parameter at the end of the get_report function is also adjustable. It controls how many rows of data you receive in each request:

'pageSize': 20000  # Change as needed

Adjusting Date Ranges

Another customizable aspect is the ‘dateRanges’ section. The default setting might be:

'dateRanges': [{'startDate': '365daysAgo', 'endDate': 'today'}],

Feel free to modify the ‘startDate’ and ‘endDate’ values to suit the time frame you’re interested in analyzing.
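For instance, to backfill the whole of 2023 you would use explicit dates in YYYY-MM-DD format:

'dateRanges': [{'startDate': '2023-01-01', 'endDate': '2023-12-31'}],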

Your Ticket to Easy Data Migration

Ready to shift your data from Universal Analytics to BigQuery? This script is your secret weapon — easy to use and customizable to your needs.

Got an idea to make it better? Spotted a bug? Every bit of feedback helps, and your contributions are key to making this tool the best it can be. Hats off to Rakesh Mohandas for starting this awesome journey with their initial script.

Feel like chatting about it or got a cool suggestion? Let’s connect on LinkedIn. Together, we can make data migration not just easy, but also fun!


ali izadi

Marketing Data Analyst and Engineer with 3+ years in tech. Specializes in data, automation, team collaboration, and continuous improvement.