Export from Universal Analytics to BigQuery: Zero Cost, Full Control
Update 1: Enhanced Pagination and Data Volume Handling
We've made significant enhancements to our data retrieval process so that limitations such as page size and data volume don't hinder your analysis. You can now set start and end dates for your data without restrictions, and the updated code handles pagination seamlessly, so no request exceeds the API's limits.
Update 2: Introducing a New Version of This Solution:
Continuous Analytics Bridge
Exciting news! We've recently released a groundbreaking solution that goes beyond just unifying data from Universal Analytics (UA) and Google Analytics 4 (GA4). Our Continuous Analytics Bridge (CAB) offers a more advanced and comprehensive approach to data integration, covering all backfills from UA and GA4, and even beyond. CAB is designed to cater to a wide range of analytical needs, providing you with unparalleled insights and opportunities. I believe you'll find CAB to be a game-changer. To learn more about CAB and its capabilities, check out my Medium blog or visit our product page.
Did you know that the deadline for Universal Analytics (UA) is coming up? This deadline is like a ticking time bomb for your data! By July 2024, UA is saying a final goodbye, taking all its precious data with it. Imagine years of insights, customer journeys, and those sweet conversion stats just disappearing into thin air. Scary, right?
But fear not! We're not about to let our data vanish like a ghost in the digital night. That's why I'm here with a Python script that will save your data. Curious? You should be!
The Trouble with Manual Extraction
Ever tried filling a bucket with a tiny spoon? That's what manual data extraction feels like. You're there, clicking and downloading Excel files from Universal Analytics, one by one. It takes forever, and the worst part? You might miss out on some really important stuff. It's like leaving behind pieces of a puzzle: you never get the full picture.
So, why isn't manual extraction the best idea?
- Too Slow: It's like walking when you could be flying. You spend hours or even days and still don't get all you need.
- Missing Pieces: Important bits of your data can get lost. It's like only hearing half a story.
- Limited Analysis: With Excel, you can't easily handle large-scale analysis or run complex queries on your data. It's like having a sports car but only being able to drive in first gear.
The Downsides of Third-Party Services
Now, consider hiring someone else (such as Supermetrics or Datalayers) to do this task for you; that's what third-party services are like. They promise to make things easier, but there's always a catch. It's like ordering a pizza but not being able to choose the toppings.
What's not so great about these services?
- Costly: They can be expensive, like buying a ticket to a concert that you're not sure you want to go to.
- Not Flexible: They have rules about what data you can move and how. It's like being on a road trip but not being able to choose the route.
- Security Worries: Letting someone else handle your data can be risky. It's a bit like giving your house keys to someone you don't know very well.
Setting Up and Using the Script: A Detailed Walkthrough
Step 1: Set Up Your BigQuery Account
First off, ensure you have a Google Cloud account with BigQuery enabled. Then pick a project and create a dataset in a location that works for your export and for any other data you plan to combine it with; you'll use this project and dataset in the steps that follow. This will be the new home for your Universal Analytics data.
Step 2: Enable the Google Analytics Reporting API
To allow the script to communicate with Google Analytics, you need to enable the relevant API:
- Go to the Google Analytics Reporting API page in the Google Cloud Console's API Library.
- Click to enable it. This action is akin to giving your script a key to access the Analytics data.
Step 3: Create a Service Account and Download the JSON Key
For secure access to BigQuery and Analytics, you'll use a service account:
- In Google Cloud Console, navigate to "IAM & Admin" > "Service Accounts."
- Either create a new service account or use an existing one, ensuring it has "Owner" access.
- In the service account details, go to the "Keys" section.
- Click on "Add Key" and choose to create a new key. Select JSON as the key type and download it.
This JSON key is crucial and should be kept secret: it's like the password for your script to access your data.
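You'll need the service account's email address again in the next step, and it lives inside this JSON key. As a minimal sketch using only the standard library (the field name client_email is part of the standard service-account key format), you can read it out like this:

```python
import json

def service_account_email(key_path: str) -> str:
    """Read a service-account JSON key and return its client_email field."""
    with open(key_path) as f:
        key = json.load(f)
    return key["client_email"]
```

Handy for copying the exact address into Property Access Management without opening the file by hand.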
Step 4: Grant Access to Google Analytics
Next, your service account needs permission to fetch data from Universal Analytics:
- In your Google Analytics account, add the service account's email (found in the service account details) to Property Access Management. "Viewer" access is sufficient.
- This step is like giving a trusted friend access to your valuable data.
Step 5: Accessing and Configuring the Script
Accessing the Code
To begin, you'll need to access the script. Head over to GitHub and navigate to the Universal-Analytics-to-BigQuery repository. There, you'll find the script named backfill-UA.py. This is the script you'll be using to migrate your data.
Filling in the Configuration
Once you've accessed the script, it's time to configure it with your specific details. Look for the section in the script that starts with "Configuration variables for Google Analytics and BigQuery." Here's a breakdown of what you need to fill in:
- SCOPES: This should already be set to ['https://www.googleapis.com/auth/analytics.readonly'] and typically does not need to be changed.
- KEY_FILE_LOCATION: Enter the file path where your downloaded JSON key file is located. This is the key you created earlier for your service account.
- VIEW_ID: Insert your Google Analytics View ID here. You can find this in your Google Analytics account under Admin → View Settings.
- BIGQUERY_PROJECT: Enter the ID of your Google Cloud Project. This is where your BigQuery resides.
- BIGQUERY_DATASET: Type in the name of your BigQuery Dataset where you wish to store the data.
- BIGQUERY_TABLE: Lastly, put in the name of the BigQuery Table where your data will be stored. If it does not exist, it will be created.
Make sure you input these details correctly to ensure the script functions properly. Here is a sample:
# Configuration variables for Google Analytics and BigQuery
SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE_LOCATION = '/content/ga4-awz-a6942c278eef.json'  # Path to your Google Cloud service account key file
VIEW_ID = '238782913'  # Your Google Analytics View ID
BIGQUERY_PROJECT = 'ga4-awz'  # Your Google Cloud Project ID
BIGQUERY_DATASET = 'UA_backfill'  # BigQuery Dataset name where the data will be stored
BIGQUERY_TABLE = 'backfill'  # BigQuery Table name where the data will be stored
Step 6: Running the Script
With your configuration set, you're now ready to run the script. Execute backfill-UA.py, and the script will start its process:
- Initiating the Connection: The script begins by establishing a connection to the Google Analytics Reporting API using the credentials you provided.
- Fetching Data: Next, it retrieves the necessary data from Universal Analytics based on the metrics, dimensions, and date ranges you've specified.
- Data Conversion and Upload: The data is then converted into a format compatible with BigQuery (a Pandas DataFrame) and is subsequently uploaded to your BigQuery table.
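To illustrate the conversion step, here is a sketch of how a Reporting API v4 response can be flattened into tabular records; the actual script loads these into a Pandas DataFrame before uploading, but the reshaping logic is the same (the sample response in the usage note below is invented for illustration):

```python
def report_to_rows(report: dict) -> list:
    """Flatten one Reporting API v4 report into a list of flat dicts."""
    header = report["columnHeader"]
    dim_names = header.get("dimensions", [])
    metric_names = [m["name"] for m in header["metricHeader"]["metricHeaderEntries"]]
    rows = []
    for row in report.get("data", {}).get("rows", []):
        # One record per row: dimension values first, then metric values
        record = dict(zip(dim_names, row.get("dimensions", [])))
        for values in row.get("metrics", []):
            record.update(zip(metric_names, values["values"]))
        rows.append(record)
    return rows
```

A response containing a single row for ga:country "Germany" with 42 sessions would come back as one dict with keys ga:country and ga:sessions, ready to be turned into a DataFrame.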
Once the script completes its execution, your Universal Analytics data will be securely transferred and stored in BigQuery, ready for you to analyze and utilize as needed. The following is an example of what the results look like:
Customizing the Script for Specific Analytical Needs
Fine-Tuning Metrics and Dimensions
To get the most out of the script, it's crucial to adapt it to your unique analytical requirements. This means tweaking the metrics and dimensions in the get_report function to align with what you're looking to analyze.
Exploring Metrics and Dimensions
Within the get_report function of the script, you'll find sections dedicated to metrics and dimensions. These are essentially the categories and types of data you want to pull from Universal Analytics. Think of metrics as the 'what' (like sessions, pageviews, users) and dimensions as the 'who', 'where', and 'how' (like country, browser, page title).
Leveraging Documentation for Choices
To make informed decisions about which metrics and dimensions to include, refer to the Universal Analytics to GA4 Reporting API documentation. This resource offers a comprehensive list of available options, providing you with a broad palette to paint your data picture.
Customization in Action
In the script, the sections for metrics and dimensions might look like this:
'metrics': [
{'expression': 'ga:sessions'},
{'expression': 'ga:pageviews'},
# ... other metrics
],
'dimensions': [
{'name': 'ga:country'},
{'name': 'ga:pageTitle'},
# ... other dimensions
],
Here lies your opportunity to tailor. To track different aspects, simply replace or add items in these lists.
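To see where those edits land, it helps to look at the whole request shape. The snippets above sit inside one nested dictionary that gets sent to the Reporting API's batchGet method; a sketch of that shape (the View ID here is a placeholder, not a real property):

```python
VIEW_ID = "123456789"  # placeholder; use your own View ID

report_request = {
    "reportRequests": [
        {
            "viewId": VIEW_ID,
            "dateRanges": [{"startDate": "365daysAgo", "endDate": "today"}],
            "metrics": [
                {"expression": "ga:sessions"},
                {"expression": "ga:pageviews"},
            ],
            "dimensions": [
                {"name": "ga:country"},
                {"name": "ga:pageTitle"},
            ],
            "pageSize": 20000,
        }
    ]
}
```

Swapping metrics or dimensions means editing those two lists in place; everything else in the request can stay as it is.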
The Trade-Off Between Detail and Granularity
It's important to remember that adding more dimensions slices your data into finer-grained rows: you get more detail, but also many more rows to store and scan. Conversely, fewer dimensions aggregate the data into fewer rows, offering a broader, summarized view. Choose the combination that matches the questions you actually want to answer.
Adjusting the "pageSize" parameter
The "pageSize" parameter at the end of the get_report function is also adjustable. It controls how many rows of data you receive in each request:
'pageSize': 20000 # Change as needed
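Whatever page size you pick, the pagination handling mentioned in Update 1 boils down to one loop: keep requesting pages and passing back the nextPageToken the API returns until no token comes back. A sketch of that loop, where fetch_page is a stand-in for the real batchGet call (not a function in the actual script):

```python
def fetch_all_rows(fetch_page) -> list:
    """Collect rows across pages; fetch_page(token) -> (rows, next_token)."""
    rows, token = [], None
    while True:
        page_rows, token = fetch_page(token)
        rows.extend(page_rows)
        if not token:  # the API omits nextPageToken on the last page
            return rows
```

This is why the start/end dates no longer have to be squeezed under a row limit: the loop simply keeps going until the report is exhausted.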
Adjusting Date Ranges
Another customizable aspect is the "dateRanges" section. The default setting might be:
'dateRanges': [{'startDate': '365daysAgo', 'endDate': 'today'}],
Feel free to modify the 'startDate' and 'endDate' values to suit the time frame you're interested in analyzing.
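If you prefer explicit dates over relative ones, a common trick for long backfills is to split the full period into smaller windows and issue one request per window, which keeps individual responses small. A standard-library sketch (the 30-day chunk length is an arbitrary choice, not something the script requires):

```python
from datetime import date, timedelta

def date_chunks(start: date, end: date, days: int = 30):
    """Yield (startDate, endDate) ISO-string pairs covering [start, end]."""
    cursor = start
    while cursor <= end:
        chunk_end = min(cursor + timedelta(days=days - 1), end)
        yield cursor.isoformat(), chunk_end.isoformat()
        cursor = chunk_end + timedelta(days=1)
```

Each yielded pair drops straight into a 'dateRanges' entry as its 'startDate' and 'endDate' values.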
Your Ticket to Easy Data Migration
Ready to shift your data from Universal Analytics to BigQuery? This script is your secret weapon: easy to use and customizable to your needs.
Got an idea to make it better? Spotted a bug? Every bit of feedback helps, and your contributions are key to making this tool the best it can be. Hats off to Rakesh Mohandas for starting this awesome journey with their initial script.
Feel like chatting about it or got a cool suggestion? Let's connect on LinkedIn. Together, we can make data migration not just easy, but also fun!