Adobe Analytics REST API — Part 1

Vivek Sasikumar
3 min read · Feb 27, 2024


Data Ingestion using Microsoft Fabric

This is the story of how I recently worked on a digital marketing analytics project for a major retail client. Adobe Analytics is a digital customer journey mapping tool that helps measure and provide insights for digital sales strategy.

Requirement

The mandate was to analyze their omni-channel (digital, call centre, chat, frontline and retail) sales conversion from the landing page all the way to a confirmed sale in their revenue management system. For all their digital, call centre, chat and frontline channels, the company uses Adobe Analytics on their apps and websites.

This story covers only my journey through understanding the Adobe Analytics API and creating an automated ETL Python workflow to pull Adobe data across a myriad of complex business hierarchies: channel, product type, customer type, buyflow journey, segment, promotional offer, Adobe metrics, etc.

Data Engineering journey with APIs

The first thing I did was deep-dive into the Adobe REST API documentation to understand how data can be retrieved. To pull data from Adobe Analytics via the REST API, you need an OAuth token for a project from the Adobe Analytics administrator at your organization. The following fields are required to create a config.json file:

  • org_id
  • client_id
  • token

Once we have these values, save them as a JSON file in your vault or as a file in your OneLake. Fabric workloads can be used to create a OneLake data lake with access-controlled, confidential file storage. The JSON file should look like this:

{
  "org_id": "ABC",
  "client_id": "DEF",
  "token": "abCd35"
}
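Once the file is in place, loading and sanity-checking it in Python is straightforward. A minimal sketch, assuming the field names from the example above (the path and the `load_config` helper are illustrative, not part of any library):

```python
import json

# Fields the rest of the workflow expects to find in config.json
REQUIRED_KEYS = {"org_id", "client_id", "token"}

def load_config(path: str) -> dict:
    """Load the Adobe credentials file and verify the expected keys exist."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"config.json is missing keys: {sorted(missing)}")
    return config

# Example (hypothetical OneLake path):
# config = load_config("/lakehouse/default/Files/config.json")
```

Failing fast on a missing key here is much easier to debug than an opaque authentication error later in the pipeline.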

Now we start with our Python development. Open the MS Fabric app and navigate to your workspace. Create a new Python notebook (you can also work in a Jupyter notebook on your desktop and import it into MS Fabric).

I found this wrapper for Adobe Analytics, which significantly reduces the code needed for an API data pull. Special thanks to Julien (https://github.com/pitchmuc/adobe-analytics-api-2.0/commits?author=pitchmuc), who created this library.

Since the MS Fabric Python environment only installs a library for the current session, I set up the following installation snippet.

# Check if the aanalytics2 library is installed; Fabric installs are session-scoped
try:
    import aanalytics2 as api2
except ImportError:
    !pip install aanalytics2
    import aanalytics2 as api2

API Authentication

The next step is authenticating the OAuth connection with your config.json file.

import aanalytics2 as api2

api2.importConfigFile('<your MS Fabric path>/config.json')

## Instantiating the Login class
login = api2.Login()

## Retrieving the company id
cids = login.getCompanyId()
ags = api2.Analytics(cids[0]['globalCompanyId'])

In my use case, I needed data from different applications (mobile, website, frontline tools, etc.) and buyflow journeys (new account creation, hardware upgrades, price plan changes, contract or service renewals, promotional offers, etc.). The metrics to bring in were unique customer visits, drop-off count, conversion percentage, channel data, etc.

Adobe Hierarchy Dictionary

My approach was to create dictionaries covering all the different applications within the Adobe organization (each identified by a report suite ID, or rsid), the buyflow and channel segments defined by the Product Owners, and the tech/business validation errors defined by the Product Owners. Note that these segments have to be built in Adobe Analytics before we can pull the data into our data lake. In my version, I intend to create one incrementally refreshing function that cycles through each rsid and segment to produce a single OneLake table with all the data for every day of the month for the last 3 years.

Rsid_list = {
    "aaa": "Website",
    "bbb": "Mobile App",
    "ccc": "Call Centre",
    "ddd": "Retail"
}

# Buyflow Segments
Buyflow_list = {
    "s...": "Buy flow = New Account",
    "s...": "Buy flow = Upgrade",
    "s...": "Buy flow = Price Change",
    "s...": "Buy flow = Renewal",
    "s...": "Buy flow = Promotions"
}

# Origination Channel Segments
callcentre = ["s...", "s...", "s..."]
retail = ["s...", "s...", "s..."]
app = ["s..."]
website = ["s...", "s...", "s..."]

# Error Segments
Segment_list = {
    "s...": "Tech Error",
    "s...": "Business Validation"
}
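With these dictionaries in place, the incremental-refresh function can cycle through every rsid/segment pair. A sketch of that loop, using sample placeholder IDs (the `build_requests` helper is my own illustration, not part of the aanalytics2 library; the actual API call comes in part 2):

```python
from itertools import product

# Sample placeholder IDs standing in for the real rsids and segment IDs
Rsid_list = {"aaa": "Website", "bbb": "Mobile App"}
Segment_list = {"s100": "Tech Error", "s200": "Business Validation"}

def build_requests(rsids: dict, segments: dict) -> list:
    """Create one request descriptor per (rsid, segment) combination."""
    return [
        {
            "rsid": rsid,
            "channel": channel,
            "segment_id": seg_id,
            "segment_name": seg_name,
        }
        for (rsid, channel), (seg_id, seg_name) in product(
            rsids.items(), segments.items()
        )
    ]

requests = build_requests(Rsid_list, Segment_list)
# 2 rsids x 2 segments -> 4 request descriptors
```

Generating the full cross-product up front makes it easy to log progress, retry individual failures, and append each result to the same OneLake table.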

Now we come to the JSON serialization part. The API pull uses a JSON query to retrieve specific data from Adobe Analytics. For any table we see in Adobe Analytics, there is a corresponding JSON request format we can use to pull it.

This will be discussed in part 2 dedicated to JSON serialization and data pull.
