Data Integration for SaaS applications using CDAP

Bhooshan Mogal
Sep 23, 2019 · 5 min read

The last few years have seen tremendous growth in the adoption of cloud-based and SaaS applications in enterprises, especially in the fields of Customer Relationship Management (CRM), Enterprise Resource Planning (ERP), Marketing Automation, Human Resource Management (HRM), Customer Service Management and Supply Chain Management (SCM). The variety, quality, and popularity of these applications continue to rise due to technological advances in this field.

According to a recent report by BusinessWire, the SaaS market is anticipated to grow at a compound annual growth rate (CAGR) of 21.2% during the forecast period 2018–2023. Prominent examples of these applications include Salesforce (CRM), Workday (HRM/HCM), SAP (ERP), and Marketo (Marketing Automation). Due to rapid adoption, these applications are fast becoming a source of valuable data that enterprises increasingly want to tap into. They want to use this data in tandem with their other data to make impactful business decisions in real time. Coupled with the separate, growing trend of cloud adoption, this leaves enterprises with a unique challenge: integrating data from SaaS applications with the rest of their data, whether in the cloud or in their own data centers. In this blog, I will discuss how CDAP is uniquely positioned to solve these challenges for enterprises and help them integrate SaaS applications with the rest of their data ecosystem.

Challenges in SaaS data integration


First and foremost, there is no single SaaS application that fits all customer needs. As we discussed earlier, sales and customer data tend to be in a CRM, employee data in HRM, business processes in ERP, and campaigns and leads in marketing automation applications. Any one of these datasets on its own is typically nowhere near as useful as all of them combined. You usually want to use these datasets in tandem and join them together to get the most benefit out of them. However, since each of these datasets lives in an application built by a different vendor, their formats, taxonomies, and interfaces differ. As a result, to tap into these applications, customers typically have to create custom integrations for each use case.
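To make the problem concrete, here is a minimal sketch in plain Python (not CDAP code) of the kind of hand-written glue a custom integration usually needs. The two record layouts, the field names, and the choice of email as the join key are all assumptions invented for illustration; real vendor exports will look different.

```python
from typing import Dict, List

# Hypothetical CRM export: vendor-specific field names and mixed-case emails.
crm_accounts = [
    {"AccountId": "001", "Email": "Ana@Example.com", "AnnualRevenue": 120000},
    {"AccountId": "002", "Email": "bo@example.com", "AnnualRevenue": 45000},
]

# Hypothetical marketing automation export: different names for similar concepts.
marketing_leads = [
    {"lead_id": 9001, "contact_email": "ana@example.com ", "campaign": "spring"},
    {"lead_id": 9002, "contact_email": "ana@example.com", "campaign": "summer"},
    {"lead_id": 9003, "contact_email": "cy@example.com", "campaign": "spring"},
]


def normalize_email(value: str) -> str:
    """Bring the join key into one canonical form across vendors."""
    return value.strip().lower()


def join_on_email(accounts: List[Dict], leads: List[Dict]) -> List[Dict]:
    """Left-join CRM accounts to marketing leads on the normalized email."""
    leads_by_email: Dict[str, List[Dict]] = {}
    for lead in leads:
        key = normalize_email(lead["contact_email"])
        leads_by_email.setdefault(key, []).append(lead)

    joined = []
    for account in accounts:
        email = normalize_email(account["Email"])
        matched = leads_by_email.get(email, [])
        joined.append({
            "email": email,
            "account_id": account["AccountId"],
            "annual_revenue": account["AnnualRevenue"],
            "campaigns": sorted(lead["campaign"] for lead in matched),
        })
    return joined


customer_view = join_on_email(crm_accounts, marketing_leads)
```

Even in this toy case, the field renaming and key normalization are specific to this one pair of applications, which is why such integrations tend to multiply as more SaaS sources are added.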

No direct access to data

Secondly, unlike traditional, legacy systems, enterprises do not have direct access to the databases that contain these datasets. Each vendor typically exposes an API for users to retrieve their data. While APIs have distinct advantages, they also pose a few challenges, such as unfamiliar (non-SQL) interfaces and rate limits imposed by vendors.
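As an illustration of what working against such an API involves, the sketch below pages through a rate-limited endpoint and backs off when the vendor rejects a call. It is plain Python, not CDAP code. `FakeVendorApi`, `RateLimitError`, the offset-based paging, and the `retry_after` value are hypothetical stand-ins so the example runs without a network; real vendors signal limits and paginate in their own ways.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class RateLimitError(Exception):
    """Raised by the fake API when the caller has exceeded its quota."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


class FakeVendorApi:
    """Stand-in for a SaaS REST API: paged reads, every second call is rejected."""

    def __init__(self, records: List[Dict], page_size: int = 2):
        self.records = records
        self.page_size = page_size
        self.calls = 0

    def fetch_page(self, offset: int) -> Tuple[List[Dict], Optional[int]]:
        self.calls += 1
        if self.calls % 2 == 0:
            raise RateLimitError(retry_after=0.5)
        page = self.records[offset:offset + self.page_size]
        next_offset = offset + self.page_size
        # Return None as the next offset once the last page has been served.
        return page, (next_offset if next_offset < len(self.records) else None)


def fetch_all(api: FakeVendorApi,
              sleep: Callable[[float], None] = time.sleep,
              max_retries: int = 5) -> List[Dict]:
    """Read every page, waiting with exponential backoff after each rejection."""
    results: List[Dict] = []
    offset: Optional[int] = 0
    while offset is not None:
        for attempt in range(max_retries + 1):
            try:
                page, next_offset = api.fetch_page(offset)
                break
            except RateLimitError as err:
                if attempt == max_retries:
                    raise
                # Double the vendor-suggested wait on each consecutive failure.
                sleep(err.retry_after * (2 ** attempt))
        results.extend(page)
        offset = next_offset
    return results


# Record the waits instead of actually sleeping, to keep the demo instant.
waits: List[float] = []
api = FakeVendorApi([{"id": i} for i in range(5)])
records = fetch_all(api, sleep=waits.append)
```

Passing `sleep` as a parameter is only a convenience for trying the sketch quickly; in a real client you would leave the default `time.sleep` in place.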

Cross-location data movement

Thirdly, integrating data from SaaS applications with other data sources almost always requires data movement across clouds and on-premises environments. For example, to get valuable insights into customer experience, you have to integrate customer profiles (located in the Salesforce cloud) with customer support data (located in Zendesk). To complicate matters further, your data integration and analytics software may run on yet another cloud or on-premises, requiring data to move across these varied environments.


A number of these SaaS applications (such as Facebook, Instagram, etc.) also provide access to unstructured data (such as emails, messages, posts, images, and videos). It is just as important for customers to be able to tap into these sources of unstructured data as it is to tap into structured data.

Integrating SaaS data with CDAP

CDAP provides a graphical interface for data integration, which allows users to quickly and easily build data integration pipelines. It gives users easy ways of integrating data, whether it is located in a public or private cloud, RDBMS, NoSQL, streaming services, file systems, mainframes, or anywhere else. It also provides plugins for a variety of SaaS services, allowing users to tap into their CRM, ERP, SCM, HRM, or other SaaS applications.

Using CDAP to integrate data from various SaaS applications into an analytics environment of your choice

Let’s explore a few key characteristics that make these integrations compelling:


CDAP provides a standardized interface to integrate with all kinds of SaaS sources, thereby shielding users from vendor-specific disparities. Users can connect to their data in these SaaS applications using an intuitive UI. CDAP also takes care of automatic schema mapping and provides users the same set of transformations that are available for other sources.

Batch and real-time

CDAP allows users to process their SaaS data in both real-time and batch pipelines, without the need to write any code. For reporting use-cases such as analyzing the performance of their advertising campaigns, users can process data from SaaS applications such as Google Ads in batch. For more real-time use-cases such as getting access to sales opportunities for real-time analytics, users can process data from Salesforce in real-time.


No matter where your data is located or processed, CDAP can provide you with a unified interface for integration. You can access data in various SaaS applications through simple, UI-driven interfaces. Using one of CDAP’s many modes, or through compute profiles, you can process the data wherever you would like. After integrating data with other sources, you can store it in a variety of systems, such as on-premises and cloud data warehouses, for analytics purposes. See a previous blog or meetup talk for more details on CDAP’s hybrid data integration capabilities.


We’ve discussed previously how enterprises typically use different SaaS applications for dedicated purposes. CDAP’s metadata features provide standardized mechanisms for data discovery and traceability, no matter which SaaS application your data belongs to. With metadata capabilities, CDAP delivers a single pane of glass over all your data, including that in SaaS applications.

Get started today

CDAP’s generic HTTP plugins are now available for you to connect to any SaaS application that provides a REST API. These plugins lay the foundation for building customized plugins for a variety of SaaS applications. Salesforce plugins give you easy access to your Salesforce data in batch or real-time. Additional customized plugins for SAP ECC, Marketo, Google Analytics, Google Ads, Facebook, LinkedIn, Bing Ads, Zendesk, Hubspot, and others are in various stages of development and will be available in the next couple of months. Feel free to get started with these plugins, and reach out to us if you would like to see (or contribute) a SaaS integration for CDAP. Also, stay tuned to this publication for dedicated blogs that will soon run through use cases with some of these plugins.

