Marketing Data Warehouse with CDAP

A Marketing Data Warehouse is a critical component for enterprises that want to drive their marketing analytics initiatives forward. These initiatives range from understanding the ROI on advertising spend to fully understanding the customer journey for more advanced machine learning use cases such as lifetime value prediction and churn prediction. All of them require siloed data from multiple platforms, systems and databases to be brought together in a central data warehouse so that technology and marketing teams can deliver on marketing use cases. Marketing data also comes in various shapes and forms, and needs to be transformed before it can be used downstream. Without a central platform, enterprises have to build and maintain multiple point-to-point integrations and tools to get marketing data sources into their data warehouse.

In this article, we will look at key business and technical challenges that enterprises face when trying to build a Marketing Data Warehouse. We will also look at how CDAP can help with those challenges and accelerate enterprises’ path to building a holistic Marketing Data Warehouse.

Business Challenges

  • Siloed marketing data: Marketing data comes from multiple on-prem systems, SaaS platforms and databases. This heavily siloed data also needs to be transformed and merged before it can be used to deliver business objectives such as personalization. Enterprises struggle to collect all of these critical datasets easily and quickly, delaying their strategic marketing initiatives.
  • Time to insights: Marketing use cases typically take longer to deliver because they depend heavily on getting data from multiple systems and on transforming and normalizing that data before it can be used. Multiple integration scripts and pipelines need to be designed, developed and maintained to keep data flowing into a marketing data warehouse. Such manual integrations are difficult to build and maintain, and they are error prone. Longer integrations affect business opportunities, resulting in slower time to market, media waste, and risk of churn among existing customers. They also hamper a business’s ability to identify insights and use those insights to develop targeted campaigns.
  • Single Customer View: Enterprises are exploring or already using Customer Data Platforms (CDPs) to build a 360-degree view of the customer, create audience segments and activate them across multiple channels. A Marketing Data Warehouse can accelerate an enterprise’s path to leveraging a Customer Data Platform. Enterprises can integrate their Marketing Data Warehouse with a CDP of their choice to make use of consolidated marketing data.

Technical Challenges

  • Heavy design and maintenance: Technology teams need to build and maintain multiple integration points (FTPs, APIs, databases etc.). These teams also need to transform and normalize the marketing datasets through various tools and products before the data can be used.
  • Data transformation challenges: Designing and developing ETL applications, and the transformations in particular, is complex. It requires either programming skills to develop DIY scripts or deep expertise in complex ETL products. Both options require specialized skills, which slows the delivery of initiatives.
  • Unstructured data: Marketing teams would like to use unstructured datasets such as customer voice interactions and documents to gain insights through ML APIs. However, it is difficult to build such transformation pipelines quickly and easily.

CDAP for Marketing Data Warehouse

CDAP is an integrated, open source application development platform. It provides developers with data and application abstractions to simplify and accelerate application development, address a broad range of real-time and batch use cases, and deploy applications into production while satisfying enterprise requirements. You can get started with CDAP with either on-prem or cloud deployment options.

CDAP can help enterprises accelerate their path to building a Marketing Data Warehouse by addressing the business and technical challenges above, specifically on two critical fronts: data ingestion and data transformation.

Data Ingestion

Customer Datasets: CDAP provides plugins to connect easily and securely to your on-prem or cloud databases such as SQL Server and Oracle. This allows enterprises to get customer data points into their Marketing Data Warehouse quickly. CDAP also has plugins for Salesforce that allow you to bring additional customer and/or sales data from Salesforce into your Marketing Data Warehouse.

SaaS Datasets: A number of plugins are in various stages of development that will allow enterprises to bring a wide variety of domain-specific datasets into a Marketing Data Warehouse. The following plugins will be available in the coming months:

  • Web Analytics: Google Analytics 360, Omniture, MixPanel
  • Advertising: Google Ads, Google Campaign Manager data, Facebook Ads
  • Marketing and Automation: HubSpot, SendGrid, Marketo
  • ERP: NetSuite, SAP ECC
  • Social: Instagram
  • Customer Success: Zendesk

CDAP lets you get data from these complex SaaS data sources with simple configurations. Once set up, CDAP maintains the data transfer process and alleviates the pain of multiple, manual point-to-point integrations. Point-and-click setup for these plugins also shortens the time it takes to get data flowing, which helps enterprises deliver more precise and personalized marketing and ads to their customers. Finally, automated data ingestion with CDAP allows technology teams to focus on other complex, value-add initiatives.

Data Harmonization

All of the marketing and customer data ingested above needs to be transformed and brought together so that it can be used to drive marketing initiatives and use cases such as customer lifetime value prediction and personalization. CDAP delivers two key capabilities for transforming and joining or merging siloed datasets.

Wrangler allows you to visually and interactively cleanse and prepare raw data, with the aim of making it consumable for further processing. With Wrangler and custom directives, business users can transform marketing data sets to make them ready to be joined and/or merged with other data sets in the Marketing Data Warehouse. For example, you might want to un-nest nested records coming from Google Analytics 360 click-stream data so that the flattened data can be used for reporting purposes.
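To make the un-nesting idea concrete, here is a minimal sketch in plain Python. It is not Wrangler code and does not use any CDAP API; it only illustrates what flattening a nested click-stream record means. The record shape and field names (fullVisitorId, visitId, hits, page.pagePath) are assumptions chosen for illustration, and the helper flatten_session is hypothetical.

```python
from typing import Any, Dict, Iterator


def flatten_session(session: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per hit, repeating the session-level fields.

    Assumes a hypothetical record shape: session-level fields plus a
    "hits" list whose items may contain one level of nested dicts.
    """
    session_fields = {k: v for k, v in session.items() if k != "hits"}
    for hit in session.get("hits", []):
        row = dict(session_fields)
        for key, value in hit.items():
            if isinstance(value, dict):
                # Promote nested fields, e.g. page.pagePath -> page_pagePath
                for sub_key, sub_value in value.items():
                    row[f"{key}_{sub_key}"] = sub_value
            else:
                row[key] = value
        yield row


# Illustrative input only; not real Google Analytics 360 data.
session = {
    "fullVisitorId": "123",
    "visitId": 1,
    "hits": [
        {"hitNumber": 1, "page": {"pagePath": "/home"}},
        {"hitNumber": 2, "page": {"pagePath": "/pricing"}},
    ],
}

rows = list(flatten_session(session))
for row in rows:
    print(row)
```

Each nested hit becomes its own row with the session identifiers repeated, which is the tabular shape that reporting tools usually expect.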

Pipelines allow business users to join or merge diverse marketing and CRM data sets (as long as the user has identified a join field). You can set up schedules to bring in incremental CRM and SaaS data as new data becomes available. Pipelines allow you to perform harmonization across these siloed data sets and build a richer 360-degree view of your customer.
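The join and incremental-load ideas can be sketched in a few lines of plain Python. This is not a CDAP pipeline definition; it only shows the logic that such a pipeline would perform. The data sets, the field names (email, click_date, segment) and the helpers incremental and left_join are all hypothetical, and the join field is assumed to be email.

```python
from datetime import date
from typing import Any, Dict, List

Record = Dict[str, Any]


def incremental(records: List[Record], field: str, since: date) -> List[Record]:
    """Keep only records whose date field is newer than the last run."""
    return [r for r in records if r[field] > since]


def left_join(left: List[Record], right: List[Record], key: str) -> List[Record]:
    """Attach matching right-side fields to each left record.

    If the right side repeats a key, the last record wins; left-side
    values take precedence when field names collide.
    """
    index = {r[key]: r for r in right}
    return [{**index.get(row[key], {}), **row} for row in left]


# Illustrative CRM and SaaS (ad click) data; not real records.
crm = [
    {"customer_id": 1, "email": "a@example.com", "segment": "gold"},
    {"customer_id": 2, "email": "b@example.com", "segment": "silver"},
]
clicks = [
    {"email": "a@example.com", "campaign": "spring", "click_date": date(2024, 1, 5)},
    {"email": "c@example.com", "campaign": "spring", "click_date": date(2024, 1, 6)},
    {"email": "b@example.com", "campaign": "winter", "click_date": date(2023, 12, 1)},
]

last_run = date(2024, 1, 1)
new_clicks = incremental(clicks, "click_date", last_run)
joined = left_join(new_clicks, crm, "email")
for row in joined:
    print(row)
```

Clicks without a matching CRM record are kept with only their own fields, which makes unmatched customers easy to spot before the data lands in the warehouse.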


Building a comprehensive Marketing Data Warehouse is a complex initiative with its own inherent challenges. CDAP is well positioned to accelerate the path to a Marketing Data Warehouse quickly, easily and securely. It not only allows marketers to get access to data much faster, but also unlocks new marketing initiatives for the business (such as insights from unstructured datasets). Finally, building a Marketing Data Warehouse with CDAP moves your focus away from infrastructure and toward customer and marketing insights.

We welcome you to start your journey to build a holistic Marketing Data Warehouse with CDAP. Feel free to get started with some of the available Marketing plugins, and reach out to us if you would like to see (or contribute) a SaaS integration for CDAP.




CDAP is a 100% open-source framework for building data analytics applications

Prateek Duble

Prateek is a Data Analytics specialist with Google Cloud who is passionate about making customers successful with Cloud Data Analytics products.
