Google Analytics Multi-Channel Funnel Data Extraction using API in Python

Sheranga Gamwasam
Analytics Vidhya

--

In the previous article, I have explained on how to extract the google analytics data using python because Google Analytics is used by digital marketers, data analysts and digital analysts to get some insight by viewing reports to grow their business.

Google Analytics standard reports are based on last cookie basis which means they give credit to the last channel that user has converted. But a typical website visitor goes through multiple channel paths (organic, paid search, social media, banners, etc) before making a conversion. Therefore to view the journey of a website visitor we use Multi-Channel Funnel report.

Today I will address the steps to be followed to extract the data from Multi-Channel Funnel through an API in Python.

First step is to create a project in google developer console and obtain the client id, client secret and also enable the Google Analytics Reporting API and Analytics API.

Video link to demonstrate how to create a project (Create a project in developer console)

The next step after obtaining the client id and client secret is authentication. This step involves obtaining permission from the gmail account that has access to google analytics accounts. Thereafter Google will be providing a unique refresh token. Using the unique refresh token, Google analytics data belonging to the gmail account can be obtained.

import the following library

from oauth2client.client import OAuth2WebServerFlow
from oauth2client.tools import run_flow
from oauth2client.file import Storage
import json
import os
import re
import httplib2
from oauth2client import GOOGLE_REVOKE_URI, GOOGLE_TOKEN_URI, client
import requests
import pandas as pd

following function is used to check if the file exist in the directory

'''function check whether file exist in the path or not'''

def where_json(file_name):return os.path.exists(file_name)

following function is used to get the refresh token

''' function return the refresh token '''

def get_refresh_token(client_id,client_secret):
CLIENT_ID = client_id
CLIENT_SECRET = client_secret
SCOPE = 'https://www.googleapis.com/auth/analytics.readonly'
REDIRECT_URI = 'http:localhost:8080'

flow = OAuth2WebServerFlow(client_id=CLIENT_ID,client_secret=CLIENT_SECRET,scope=SCOPE,redirect_uri=REDIRECT_URI)
if where_json('credential.json')==False:
storage = Storage('credential.json')
credentials = run_flow(flow, storage)
refresh_token=credentials.refresh_token

elif where_json('credential.json')==True:
with open('credential.json') as json_file:
refresh_token=json.load(json_file)['refresh_token']

return(refresh_token)

After executing the above functions we can obtain the refresh token by calling the following code. Here we use the client id and client secret that was obtained from the project when we created

client_id = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
client_secret = 'XXXXXXXXXXXXXX'
refresh_token=get_refresh_token(client_id,client_secret)

Now we have the refresh token to access the google analytics data. However to access the data we need one more parameter which is the view id of the particular google analytics account.

Now we have the client id, client secret, refresh token and view id with us. Multi-Channel Funnel data can be obtained by using the following function given below

''' function return the multi channel data for given dimension, metrics, start data, end data access token, transaction_type, condition'''

def google_analytics_multi_channel_funnel_reporting_api_data_extraction(
viewID,dim,met,start_date,end_date,refresh_token,transaction_type,condition):


metric="%2C".join([re.sub(":","%3A",i) for i in met])
dimension="%2C".join([re.sub(":","%3A",i) for i in dim])

dimension=dimension+"&filters=mcf%3AconversionType%3D%3D"+transaction_type+"&samplingLevel=HIGHER_PRECISION&max-results=8000"

if where_json('credential.json')==True:
with open('credential.json') as json_file:
storage_data = json.load(json_file)

client_id=storage_data['client_id']
client_secret=storage_data['client_secret']
credentials = client.OAuth2Credentials(
access_token=None, client_id=client_id, client_secret=client_secret, refresh_token=refresh_token,
token_expiry=3600,token_uri=GOOGLE_TOKEN_URI,user_agent='my-user-agent/1.0',revoke_uri=GOOGLE_REVOKE_URI)


credentials.refresh(httplib2.Http())
rt=(json.loads(credentials.to_json()))['access_token']

api_url="https://www.googleapis.com/analytics/v3/data/mcf?access_token="


url="".join([api_url,rt,'&ids=ga:',viewID,'&start-date=',start_date,'&end-date=',end_date,'&metrics=',metric,'&dimensions=',dimension,condition])

try:
r = requests.get(url)
try:
data=pd.DataFrame(list((r.json())['rows']))
print("data extraction is successfully completed")

table=pd.DataFrame()
if data.shape[0]!=0:
for i in range(0,data.shape[0]):

#print(i)
data1=pd.DataFrame()
data1=(data.iloc[i,:]).tolist()

for k in range(0,len(data1)):
if 'conversionPathValue' in data1[k]:
value_list=[]
for k1 in range(0,len(data1[k]['conversionPathValue'])):
value_list.append(data1[k]['conversionPathValue'][k1]['nodeValue'])
table.loc[i,k]=('>').join(value_list)
elif 'primitiveValue' in data1[k]:
table.loc[i,k]=data1[k]['primitiveValue']


if table.shape[0]!=0:

table.columns=[re.sub("mcf:","",i) for i in dim+met]
table['date']=start_date
return(table)

except:
print(r.json())

except:
print(r.json())

Now we have all parameters to extract the data. Here we need to keep in mind while passing dimensions (dim) and metrics (met) is you have to pass them as a list.

There are 2 transaction type values that is Transaction or goal. The Condition is thereby sorted by a metric [eg: &sort=-mcf%3AtotalConversions (sort by totalConversions descending order), &sort=mcf%3AtotalConversions (sort by totalConversions ascending order), etc]

  • Scenario 1: When Transaction type is ‘Transaction’ and without a condition
viewID='XXXXXXXX'
dim=['mcf:sourceMediumPath','mcf:campaignPath','mcf:conversionType']
met=['mcf:totalConversions', 'mcf:totalConversionValue']
start_date='2019-04-10'
end_date='2019-04-20'
transaction_type='Transaction'
refresh_token=refresh_token
condition=''

data=google_analytics_multi_channel_funnel_reporting_api_data_extraction(viewID,dim,met,start_date,end_date,refresh_token,transaction_type,condition)

output will be as below

  • Scenario 2: When Transaction type is ‘Transaction’ and has a condition
viewID='XXXXXXXX'
dim=['mcf:sourceMediumPath','mcf:campaignPath','mcf:conversionType']
met=['mcf:totalConversions', 'mcf:totalConversionValue']
start_date='2019-04-10'
end_date='2019-04-20'
transaction_type='Transaction'
refresh_token=refresh_token
condition='&sort=-mcf%3AtotalConversions'

data=google_analytics_multi_channel_funnel_reporting_api_data_extraction(viewID,dim,met,start_date,end_date,refresh_token,transaction_type,condition)

output will be as below

  • Scenario 3: When Transaction type is ‘Goal’ and without a condition

If you have already setup the goals in your google analytics account, you can access.

viewID='XXXXXXXX'
dim=['mcf:sourceMediumPath','mcf:campaignPath','mcf:conversionType']
met=['mcf:totalConversions', 'mcf:totalConversionValue']
start_date='2019-04-10'
end_date='2019-04-20'
transaction_type='Goal'
refresh_token=refresh_token
condition=''

data=google_analytics_multi_channel_funnel_reporting_api_data_extraction(viewID,dim,met,start_date,end_date,refresh_token,transaction_type,condition)

output will be as below

  • Scenario 4: When Transaction type is ‘Goal’ and has a condition
viewID='XXXXXXXX'
dim=['mcf:sourceMediumPath','mcf:campaignPath','mcf:conversionType']
met=['mcf:totalConversions', 'mcf:totalConversionValue']
start_date='2019-04-10'
end_date='2019-04-20'
transaction_type='Goal'
refresh_token=refresh_token
condition='&sort=-mcf%3AtotalConversions'

data=google_analytics_multi_channel_funnel_reporting_api_data_extraction(viewID,dim,met,start_date,end_date,refresh_token,transaction_type,condition)

output will be as below

--

--