Kapil Bahadur
Hevo Data Engineering
5 min read · Jan 20, 2021


Building SaaS Integrations — A PM’s perspective

At Hevo, integrating new data sources is an ongoing effort. We build integrations across databases, SaaS applications, SDKs, and storage systems that let users sync their data to a preferred data warehouse in near real-time, with just a few clicks.

With SaaS companies increasingly offering open APIs, integrating them requires a thorough understanding of how the application works, along with the underlying data architecture it is built on. Our goal is to deliver this data to our users in the most accurate and normalized form, a time-consuming process that cycles through application usage, API testing, and data review.

The entire project, from initial concept to final deployment, takes around 4–6 weeks and requires the complete dedication of one Software Development Engineer, supported along the way by an engineering manager, a code reviewer, a Quality Analyst, and a Technical Writer.

From a Product Manager’s perspective, there are five broad areas to cover to take these integrations from concept to deployment in the market.

Product Understanding

The first step in building a connector is to understand the purpose of the application and the core logic behind how it works and solves problems. The best ways to understand any product are to use it first-hand, study its documentation, and, most importantly, speak with users about how it solves their problems. Most applications provide sandbox environments for testing; otherwise, we create test accounts and use the product to work through different use cases.

In the field of SaaS, there are numerous applications for almost any task, whether it’s Marketing, Analytics, Cloud Applications, Retail, Payments, or Finance. Because different applications expose different kinds of data (transactional, analytical, core records, and so on), it is common for customers to arrive at very different use cases, each centered on their own business problems. Understanding these use cases is therefore a crucial phase.

Once familiar with the product or application, we can understand the various underlying entities and how they relate to each other. Just as we mentally map the relationships in a family or the chronology in a textbook, this study lays the foundation of what will later evolve into a normalized data schema that meets diverse use cases.

Schema Design

Once familiar with the product, the next step is to study its Developer Documentation and API architecture. We generally follow this up by testing the APIs through the framework the documentation defines.

Different applications have different API architectures and different means of authentication. Most companies support RESTful APIs, which communicate between a client (our users) and a server (the product or application we’re integrating): we request information and translate the responses under different circumstances. We also integrate with APIs built on other architectures, such as SOAP, SDKs, and Webhooks.
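As a rough sketch of that request-response loop, here is what a single authenticated REST call might look like in Python. The base URL, token, endpoint name, and pagination parameters are hypothetical stand-ins; every product names these differently.

```python
import requests

# Hypothetical endpoint and credentials -- every SaaS product names these differently.
BASE_URL = "https://api.example-saas.com/v1"
API_TOKEN = "your-api-token"

def fetch_contacts(page: int = 1, per_page: int = 100) -> dict:
    """Fetch one page of 'contacts' using bearer-token authentication."""
    response = requests.get(
        f"{BASE_URL}/contacts",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        params={"page": page, "per_page": per_page},
        timeout=30,
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing bad payloads
    return response.json()
```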

Once we start studying API responses, we can draft the entities and their relationships in an Entity Relationship Diagram (ERD), which is revised repeatedly over days until we finalize a universal format that fits all, or at least most, of the use cases without compromising the accuracy of the data. This final form is what we call a ‘Normalized Schema’, and it is what our users consume when they use Hevo.
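To make the idea of a normalized schema concrete, here is a simplified, hypothetical example of flattening one nested API response into relational rows. The entity and field names are illustrative, not taken from any specific product.

```python
# An illustrative nested response: one order embedding its customer and line items.
raw_order = {
    "id": "ord_1",
    "customer": {"id": "cus_9", "email": "jane@example.com"},
    "line_items": [
        {"sku": "SKU-1", "qty": 2},
        {"sku": "SKU-2", "qty": 1},
    ],
}

def normalize_order(order: dict) -> tuple[list[dict], list[dict]]:
    """Split a nested order into flat 'orders' and 'order_line_items' rows,
    joined on order_id, so each entity lands in its own warehouse table."""
    orders = [{
        "order_id": order["id"],
        "customer_id": order["customer"]["id"],
        "customer_email": order["customer"]["email"],
    }]
    line_items = [
        {"order_id": order["id"], "sku": item["sku"], "qty": item["qty"]}
        for item in order["line_items"]
    ]
    return orders, line_items
```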

We ensure that this Normalized Schema is published in our documentation so that our users know exactly what to expect in their data destination, which is generally a data warehouse.

All findings are summarized in a Product Requirement Document (PRD), which conveys the vision and scope of the integration at hand.

Development & Testing

Once we have a thorough understanding of the product and the schema we wish to create, the PRD is handed over to the development team, which then starts building the integration.

The development phase ensures that the business logic of the application is preserved while creating data schemas on our end. This means all entity relationships must stay up to date as records change, typically through new, updated, or archived entities. While most APIs provide a structured way to track these changes, quite often we need to design our own data increment, data update, and data refresh flows, as sketched below. The various types of data synchronization flows we support are covered in our documentation.
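One common pattern for an increment flow is cursor-based polling: on each run, request only the records modified since the last successful sync. The sketch below assumes a hypothetical updated_since query parameter and a paginated contacts endpoint; authentication is omitted for brevity.

```python
import requests

BASE_URL = "https://api.example-saas.com/v1"  # hypothetical

def incremental_sync(last_synced_at: str) -> list[dict]:
    """Poll all pages of records modified after the given ISO-8601 timestamp."""
    records, page = [], 1
    while True:
        response = requests.get(
            f"{BASE_URL}/contacts",
            params={"updated_since": last_synced_at, "page": page},
            timeout=30,
        )
        response.raise_for_status()
        batch = response.json().get("data", [])
        if not batch:  # an empty page means the cursor is drained
            break
        records.extend(batch)
        page += 1
    return records
```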

We also have to test unconventional flows, which often surface limitations that are not always listed in the official documentation. These range from API rate limits to data refresh triggers and data restructuring. We ensure that such limitations never compromise data availability or accuracy by building workarounds, even in the most complex cases.
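Rate limits are a typical example. A standard workaround, sketched here against the same hypothetical API, is to retry requests that return HTTP 429, honoring the server’s Retry-After header when present and falling back to exponential backoff otherwise.

```python
import time
import requests

def get_with_backoff(url: str, max_retries: int = 5, **kwargs) -> requests.Response:
    """GET a URL, retrying on HTTP 429 using Retry-After or exponential backoff."""
    for attempt in range(max_retries):
        response = requests.get(url, timeout=30, **kwargs)
        if response.status_code != 429:
            response.raise_for_status()
            return response
        # Honor the server's hint if given; otherwise back off 1s, 2s, 4s, ...
        wait = float(response.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError(f"Still rate-limited after {max_retries} retries: {url}")
```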

In parallel with the backend development, each connector is supported with a user-facing interface for configuring the source itself. Once done, the connector goes through multiple rounds of testing before moving to the next stage.

Documentation

We publish official documentation for each integration we build.

It is essential to anticipate how end-users will approach our documentation, as different people have different learning curves. With this in mind, we build thorough sections that explain the scope of the integration, give instructions on how to set it up, include the schema along with its ERD, and lay out any limitations it might have.

We also use screenshots and images throughout the documentation so that end-users can follow along easily and have a positive experience.

Release

Once the integration development is complete and the documentation is updated, a second developer reviews the codebase, and QA teams perform various tests, including a regression test that verifies the new connector’s integrity against the rest of the product.

After this, the integration is demoed within the organization to gather feedback from outside perspectives. People from teams such as Sales, Marketing, and Administration provide feedback, which is then taken into account.

Once all stakeholders have approved the robustness of the connector and confirmed the accuracy of its data, the integration is ready for deployment.

We usually deploy integrations staggered across regions, with one-week gaps, to observe how each deployment functions and to confirm the stability of the release.

Even after releasing an integration, we keep fine-tuning how it functions to meet custom requirements from our users. We often fold these minor changes into the product or the accompanying technical documentation to keep the entire product experience up-to-date.

To see how Hevo works, and how you can leverage near real-time data with the ease of a few clicks, sign up for a free trial or check out our documentation.
