Export Google Analytics Raw Data

Sumit Mudliar
Published in Electrik.AI
8 min read · Oct 1, 2020

Why you need raw unsampled data from Google Analytics

Google Analytics has become the standard tool for web analytics, and it provides valuable insights about an online business. It is easy to set up and ships with a large variety of out-of-the-box reports and dashboards. Add to that the ability to build segments and funnels and to track goal conversions. Awesome, right?

However, there are limitations in Google Analytics data that prevent you from digging deeper. The Google Analytics API only makes aggregated data available, so with each request you are effectively downloading a Google Analytics report. This makes tasks like segmenting users by behavior with machine learning tools challenging, because you are limited by the granularity of the data the API exports.

Key limitations of Google Analytics data exported via the Google Analytics API:

Google Analytics Limitations

Data Sampling: Even Google’s processing servers can’t always handle endlessly large volumes of data in a finite amount of time, so Google Analytics applies sampling when you request a large amount of data.
This sampling is separate from the limit of 10 million hits (or events) per property per month.
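
To make sampling concrete: in the Reporting API v4, a sampled report carries `samplesReadCounts` and `samplingSpaceSizes` in its `data` object. The sketch below checks a response for those fields. The two response dictionaries are hand-written stand-ins with made-up numbers, not real API output, and the function name is hypothetical.

```python
def sampling_rate(report):
    """Return the fraction of sessions that were read, or None if unsampled."""
    data = report.get("data", {})
    read = data.get("samplesReadCounts")
    space = data.get("samplingSpaceSizes")
    if not read or not space:
        return None  # fields are absent when the report is not sampled
    # Values arrive as strings, one entry per requested date range.
    return int(read[0]) / int(space[0])


# Hand-written examples (illustrative numbers only).
sampled_report = {
    "data": {"samplesReadCounts": ["500000"], "samplingSpaceSizes": ["2000000"]}
}
unsampled_report = {"data": {"rows": []}}

rate = sampling_rate(sampled_report)
```

If the function returns a number, the report is based on only that share of sessions, and shrinking the request is one way to try to avoid it.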

Aggregated Data: The Google Analytics API does not provide access to hit-level data. While you can find out the number of visitors in a particular segment on a particular day and fetch a variety of dimensions and metrics for those users (such as source/medium, browser, or session duration), you cannot get the underlying user-level data to follow an individual user’s journey.

Fragmented Data: Google Analytics limits the number of dimensions and metrics you can include in a single API request, and not every metric can be combined with every dimension. Each dimension and metric has a scope (user-level, session-level, or hit-level), and in most cases only dimensions and metrics of the same scope make sense together.

Fortunately, the answer to all these problems is raw data collected at hit level. Having hit-level data means that you can access the underlying hits that were sent to Google Analytics, allowing you to do your own aggregation as you wish, based on any criteria or dimension.
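
As a small illustration of doing your own aggregation, the sketch below rebuilds the page journey of each session from a list of hit records. The field names (`session_id`, `hit_order`, `hit_type`, `page_path`) and the sample hits are assumptions made up for this example, not a fixed export schema.

```python
from collections import defaultdict


def session_journeys(hits):
    """Group pageview hits by session, ordered by their position in the session."""
    journeys = defaultdict(list)
    ordered = sorted(hits, key=lambda h: (h["session_id"], h["hit_order"]))
    for hit in ordered:
        if hit["hit_type"] == "pageview":
            journeys[hit["session_id"]].append(hit["page_path"])
    return dict(journeys)


# Made-up hits for two sessions, deliberately out of order.
hits = [
    {"session_id": "S1", "hit_order": 2, "hit_type": "pageview", "page_path": "/pricing"},
    {"session_id": "S1", "hit_order": 1, "hit_type": "pageview", "page_path": "/"},
    {"session_id": "S2", "hit_order": 1, "hit_type": "pageview", "page_path": "/blog"},
    {"session_id": "S1", "hit_order": 3, "hit_type": "event", "page_path": "/pricing"},
]

journeys = session_journeys(hits)
```

The same records could just as easily be grouped by client, campaign, or any other field, which is exactly what aggregated reports do not allow.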

How to export raw data from Google Analytics

As mentioned before, the Google Analytics API does not give you raw hit-level data out of the box. The premium version of Google Analytics (360) and its BigQuery export feature will get you closer, but at $150,000 it is an awfully expensive option.

What if I told you there is a way to access raw data for each hit from Google Analytics for free? Yes, that’s correct: it can be achieved by using the API together with custom dimensions in Google Analytics.

Custom dimensions can be used to capture, analyze, and visualize information that is not available in Google Analytics by default. You can use custom dimensions as keys for combining information from GA and other systems, as well as to enhance your reports with information that is relevant to your business. For example, you can save the User Login ID from your website and use it for integrating offline and online actions.

Some useful custom dimensions, that improve your Google Analytics data collection:

1. Hit Timestamp : a hit-scoped custom dimension that captures the exact timestamp when the hit happened, in the yyyy-mm-ddThh:mm:ss format with the timezone offset.

2. Session ID : a session-scoped custom dimension that collects a unique, random value, used to identify hits that belong to the same session.

3. Client ID : a session-scoped custom dimension that collects the unique value assigned to the client’s device from the _ga cookie.

4. User ID : a hit-scoped custom dimension that collects the value representing a user who has logged in to your website, allowing you to identify all the sessions and hits of a user.

All you need is a JavaScript snippet that sends these custom dimensions to Google Analytics along with the data your website is already sending. Easy, right?
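
The collection itself happens in JavaScript in the browser. The Python sketch below only illustrates what the values of two of these dimensions could look like: a hit timestamp with milliseconds and a timezone offset, and a random session ID. Both function names are hypothetical, and using a UUID for the session ID is an assumption; any sufficiently random value would do.

```python
import uuid
from datetime import datetime, timedelta, timezone


def hit_timestamp(moment):
    """Format a timezone-aware datetime as yyyy-mm-ddThh:mm:ss.mmm plus offset."""
    return moment.isoformat(timespec="milliseconds")


def new_session_id():
    """Return a random 32-character hex string to tag all hits of one session."""
    return uuid.uuid4().hex


# A fixed moment in a UTC-05:00 timezone, chosen only for illustration.
central = timezone(timedelta(hours=-5))
example = hit_timestamp(datetime(2020, 7, 27, 10, 15, 41, 267000, tzinfo=central))
# example == "2020-07-27T10:15:41.267-05:00"
```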

Check out this awesome post by Simo Ahava on improving data collection with custom dimensions and Google Tag Manager for more details.

Now comes the difficult part, exporting this data out of Google Analytics. You will need an understanding of how APIs work, a Google Analytics account, and a Data Warehouse of your choice. The steps involved in loading the data from Google Analytics to a data warehouse are as follows:

Export Raw Data from Google Analytics to Data Warehouse

Step 1: Identify Your Data

The first step is to identify the dimension and metric combinations allowed by the Google Analytics API. Luckily, Google provides a tool, the Dimensions & Metrics Explorer, that makes this easy. Along with the dimension and metric combinations, you also need to choose the time period for which you want to pull data. The trick is to make multiple requests for small time periods rather than one request for a longer period, since smaller requests are less likely to be sampled.
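
Splitting a long period into small windows is easy to script. Here is a minimal sketch, assuming one request per window; the function name and the default window size of one day are choices made for this example.

```python
from datetime import date, timedelta


def date_windows(start, end, days=1):
    """Split the inclusive range [start, end] into windows of at most `days` days."""
    windows = []
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=days - 1), end)
        # The API expects dates as YYYY-MM-DD strings.
        windows.append((cursor.isoformat(), window_end.isoformat()))
        cursor = window_end + timedelta(days=1)
    return windows


windows = date_windows(date(2020, 7, 1), date(2020, 7, 5), days=2)
# [("2020-07-01", "2020-07-02"), ("2020-07-03", "2020-07-04"), ("2020-07-05", "2020-07-05")]
```

Each tuple can then be used as the start and end date of one API request.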

Step 2: Extract Your Data

Now that you have identified the data you want to export, you can use the Google Analytics Reporting API to export data out of Google Analytics. This involves making multiple requests with different combinations of dimensions and metrics. However, the common dimensions in all your requests would be the custom dimensions we discussed earlier in this post.
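
To show what "multiple requests sharing the custom dimensions" can look like, the sketch below builds the JSON body for the Reporting API v4 `reports:batchGet` method without sending it. The custom dimension indices (`ga:dimension1` to `ga:dimension3`), the view ID, and the helper name are assumptions; use the indices and view ID from your own property.

```python
# Assumed slots for Hit Timestamp, Session ID and Client ID; yours may differ.
KEY_DIMENSIONS = ["ga:dimension1", "ga:dimension2", "ga:dimension3"]


def build_report_request(view_id, start_date, end_date, extra_dimensions, metrics,
                         page_size=10000):
    """Build one report request that always includes the shared key dimensions."""
    return {
        "viewId": view_id,
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
        "dimensions": [{"name": name} for name in KEY_DIMENSIONS + extra_dimensions],
        "metrics": [{"expression": metric} for metric in metrics],
        "pageSize": page_size,
    }


VIEW_ID = "123456789"  # placeholder, not a real view

body = {
    "reportRequests": [
        build_report_request(VIEW_ID, "2020-07-27", "2020-07-27",
                             ["ga:pagePath", "ga:pageTitle"], ["ga:hits"]),
        build_report_request(VIEW_ID, "2020-07-27", "2020-07-27",
                             ["ga:browser", "ga:deviceCategory"], ["ga:hits"]),
    ]
}
```

Because every request repeats the same key dimensions, the rows that come back can later be matched to each other hit by hit.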

Step 3: Transform Your Data

You must first transform your data into a format that your data warehouse accepts. For example, JSON is easy to use with Google BigQuery, but you may have to convert to CSV or SQL for more traditional relational databases like Microsoft SQL Server. The most important step, however, is to join the data exported by the multiple requests on the common custom dimensions, so that you get one composite record per hit.
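
The join can be sketched in a few lines, assuming each report has already been flattened into a list of dictionaries. The field names and sample rows below are made up for illustration.

```python
def join_reports(reports, key_fields):
    """Merge rows from several reports into one record per unique key."""
    merged = {}
    for rows in reports:
        for row in rows:
            key = tuple(row[field] for field in key_fields)
            # Rows with the same key contribute their columns to one record.
            merged.setdefault(key, {}).update(row)
    return list(merged.values())


KEYS = ["hit_timestamp", "session_id", "client_id"]

page_report = [
    {"hit_timestamp": "2020-07-27T10:15:41.267-05:00", "session_id": "S1",
     "client_id": "C1", "page_path": "/pricing"},
]
device_report = [
    {"hit_timestamp": "2020-07-27T10:15:41.267-05:00", "session_id": "S1",
     "client_id": "C1", "browser": "Chrome"},
]

records = join_reports([page_report, device_report], KEYS)
```

Here the two input rows collapse into a single record that holds both the page path and the browser for the same hit.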

Step 4: Create a Data Receiving Repository in Your Data Warehouse

Creating a staging area for your data makes it easier to transform the data before it is finally ingested for analysis and reporting. This is easy to set up in data warehouses like Google BigQuery or Snowflake.

Step 5: Load Your Data

It is advisable to design a schema for your chosen data warehouse and then map it to your Google Analytics data. In this way, you are almost ready to load your raw data from Google Analytics to a data warehouse after making sure that all the steps are completed to suit your needs.
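
Here is a minimal sketch of mapping Google Analytics fields to a schema and loading records into a staging table. It uses an in-memory SQLite database purely as a stand-in for a real data warehouse; the table name, the column mapping, and the custom dimension indices are all assumptions for this example.

```python
import sqlite3

# Assumed mapping from Google Analytics field names to warehouse columns.
COLUMN_MAP = {
    "ga:dimension1": "hit_timestamp",
    "ga:dimension2": "session_id",
    "ga:dimension3": "client_id",
    "ga:pagePath": "page_path",
}


def load_hits(conn, records):
    """Create the staging table if needed and insert the mapped records."""
    columns = list(COLUMN_MAP.values())
    column_defs = ", ".join(f"{name} TEXT" for name in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS ga_hits_stage ({column_defs})")
    rows = [tuple(record.get(field) for field in COLUMN_MAP) for record in records]
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO ga_hits_stage ({', '.join(columns)}) VALUES ({placeholders})",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_hits(conn, [{
    "ga:dimension1": "2020-07-27T10:15:41.267-05:00",
    "ga:dimension2": "S1",
    "ga:dimension3": "C1",
    "ga:pagePath": "/pricing",
}])
```

Keeping the mapping in one dictionary means a new dimension only needs one extra entry before it shows up as a column.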

Congratulations, you have now exported raw data from Google Analytics to a data warehouse. The raw data will look something like this:

Client Id : 1802577120.1595862941

Hit Timestamp : 2020-07-27T10:15:41.267-05:00

Hit Date : 2020-07-27

Session Id : SID-20200727-08324964

Visitor Id : VIDc7ee42d2-c22f-690f-548a-61c5efdbbddd

Hashed IP Address : 05d11e92511d7f7b1bcda8327b855aa79f30cf3631f309d98dc78b84d79e5c16

Hit Type : pageview

Hit Order : 1

Pageview Order : 1

Property Id : UA-34208182-4

View Id : 207093576

Ad Group : GA Data Extract

Ad Query Word Count : 5

Ad Slot : Google search: Top

Ad Targeting Type : Keyword

Ad Group Id : 85404108936

Ad Campaign Id : 8247077736

Ad Creative Id : 452094560289

Ad Criteria Id : 301206554611

Ad Customer Id : 9227698748

Channel Grouping : Paid Search

City : Elk River

Continent : Americas

Country : United States

Country ISO Code : US

Latitude : 45.3377

Longitude : -93.5691

Metro : Minneapolis-St. Paul MN

Region : Minnesota

Sub Continent : Northern America

Exit Pagepath : /google-analytics-hit-data-extractor

Hostname : electrik.ai

Landing Pagepath : /google-analytics-hit-data-extractor

Pagepath : /google-analytics-hit-data-extractor

Page Title : Google Analytics Hit Level Data Extractor | Electrik.AI

Previous Pagepath : (entrance)

Browser : Chrome

Browser Size : 1580x760

Browser Version : 84.0.4147.89

Data Source : web

Device Category : desktop

Operating System : Windows

Operating System Version : 10

Ad Content : Export Google Analytics Data

Campaign : Evergreen_GA Hit Data Extractor Cmpgn

Full Referrer : google

Social Source Referral : No

Keyword : +export +google +analytics +data

Medium : cpc

Source : google

Source Medium : google / cpc

Days Since Last Session : 0

User Type : New Visitor

and more…

Awesome, right? But it does not end here. You need to run this process daily to export hit-level data from Google Analytics, and you must keep in mind that technologies like Google Analytics keep evolving, so what worked yesterday might not work today. Trust me, we have been following this space closely.

So far we have just scratched the surface of how you can export raw data from Google Analytics. It gets even more complicated when you integrate data from different marketing sources with Google Analytics. So instead of building and maintaining your own solution or paying $150,000 for Google Analytics 360, try Electrik.AI’s Google Analytics Hit Data Extractor.

Wrapping Up…

Give hit-level data a try; you will be surprised by the depth it adds to your marketing data analysis.

Using Electrik.AI, marketing professionals with no programming experience can export raw data from Google Analytics at hit-level granularity in a few minutes. You can view the list of all dimensions/metrics exported from Google Analytics here.

Electrik.AI’s Google Analytics Hit Data Extractor uses Google Analytics to track raw hit-level data on your website and exports it to any data warehouse of your choice. Along with raw unsampled hit-level data, you also get the following in your exported data:

  1. Hashed IP Address of the Visitor on your website.
  2. Unique Visitor ID for each user on your website.
  3. Unique Session ID for each period a user is active on your site.
  4. Client ID created and assigned by the Google Analytics cookie.
  5. Order of Pages viewed by a user in a session.
