Mining through the Pipelines: How We Accelerate Debugging

Sumant Patil
Hevo Data Engineering
5 min read · Nov 25, 2021

Hevo’s On-call Engineers strive to provide a smooth and seamless experience to customers by ensuring that their Pipelines run successfully. Debugging and fixing any issues faced by customers is, therefore, always treated as a priority.

A successfully configured Pipeline centers around the following key areas:

  • Source: Where the data is ingested from. This can be a Database (e.g., MySQL, MongoDB) or a SaaS application (e.g., Salesforce, HubSpot, Intercom).
  • Transformations: Steps that clean or enrich your data before it is loaded.
  • Schema Mapper: Maps your Source Schemas to your Destination tables.
  • Destination: Where all the data collected from the Source is stored.
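
To make the structure concrete, one could picture a configured Pipeline as tying these four parts together. This is a purely illustrative sketch, not Hevo’s actual data model:

```java
import java.util.List;
import java.util.Map;

// Purely illustrative sketch of the four parts of a configured Pipeline.
record Pipeline(
        String source,                    // e.g. a Database or a SaaS application
        List<String> transformations,     // steps that clean or enrich the data
        Map<String, String> schemaMapper, // Source Schema -> Destination table
        String destination                // where the ingested data is stored
) {}
```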

Debugging Database Sources was already a critical use case for us, and it became even more convoluted as the number of SaaS applications we support grew. The APIs of these applications change often, which can result in issues with our Data Pipelines.

To tackle these issues, we decided to design and incorporate a set of tools to enhance our Debugging capabilities. The primary aim was to speed up our issue-resolution process: a good framework would let our Support teams rapidly analyse the relevant data points and identify the root cause of an issue, help our On-call Developers get started and resolve issues in the nick of time, and improve the turnaround time of our Solutions & Sales teams so that they can deliver timely solutions to customers.

The Debuggability Framework was thus designed to provide us with the following capabilities to handle these scenarios efficiently:

A Clean Log of API Calls Made for a Particular Object

Since most of the issues we face are related to SaaS applications, a Log of all the requests made to the Source would help us pinpoint the data challenges with the specific Source. A Log contains the data used to make the API call, i.e. the URL of the call, headers, pagination fields along with their values, body of the request (if it’s a POST call), etc. However, the Logs should not contain any sensitive information like bearer tokens, passwords, or any other user credentials. Similarly, the responses received for the API calls made should not be Logged.
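
As an illustration, here is a minimal sketch of what such a sanitized Log entry could look like. The `ApiCallLog` class and the header names it redacts are hypothetical, not Hevo’s actual implementation:

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: a log record for an outgoing API call that
// captures the request details but redacts sensitive headers.
public class ApiCallLog {

    // Headers that must never appear in the Logs.
    private static final Set<String> SENSITIVE_HEADERS =
            Set.of("authorization", "x-api-key", "cookie");

    private final String method;
    private final String url;
    private final Map<String, String> headers;
    private final String body; // request body for POST calls, else null

    public ApiCallLog(String method, String url,
                      Map<String, String> headers, String body) {
        this.method = method;
        this.url = url;
        // Redact sensitive headers instead of logging them verbatim.
        this.headers = headers.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> SENSITIVE_HEADERS.contains(e.getKey().toLowerCase())
                                ? "<redacted>" : e.getValue()));
        this.body = body;
    }

    @Override
    public String toString() {
        return method + " " + url + " headers=" + headers
                + (body != null ? " body=" + body : "");
    }
}
```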

Time Taken by the API Calls

The difference between the timestamp at which the call was made and the one at which the response was received gives the total time taken to get the API response. This helps us understand how long it takes to fetch the data from the APIs and assists us in the rare scenarios where there is a delay in Replication or a Historical Load. This data can further be sent to a Time-series Database like InfluxDB for analysis of issues.
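
A minimal sketch of how such timing could be captured around an API call; the `callApi` method is a placeholder assumed for illustration:

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: record the timestamps around an API call and
// derive the total time taken, which could later be shipped to a
// Time-series Database such as InfluxDB.
public class ApiTimer {

    public static String fetchWithTiming() {
        Instant requestedAt = Instant.now();   // when the call was made
        String response = callApi();           // the actual API call (assumed)
        Instant respondedAt = Instant.now();   // when the response arrived

        long elapsedMs = Duration.between(requestedAt, respondedAt).toMillis();
        // This duration can be logged and/or written out as a time-series point.
        System.out.println("api_call took " + elapsedMs + " ms");
        return response;
    }

    // Placeholder for an actual HTTP call.
    private static String callApi() {
        return "{}";
    }
}
```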

Building the Framework

Why did we build a new framework instead of using an existing one?

Adding Logs Where the APIs are Being Called

Hevo supports integrations with a large number of SaaS applications, so API calls to pull data are made from a large number of places in the codebase. Writing Logs at each of these places would be a really cumbersome process: it would result in a lot of code duplication and would also interfere with the code that implements the business logic. Hence, we wanted to implement the Debuggability Framework as a “Cross-cutting Concern”.

Making Use of Existing Guice Interceptors

We use Google Guice throughout our codebase and have already implemented interceptors for caching function results and recording timings.

However, one major limitation of Guice Interceptors is that the instances must be created by Guice, via an `@Inject`-annotated or a no-argument constructor; method interception is not possible on instances that weren’t constructed by Guice. Since we use the `new` keyword at various places to create objects, especially for connectors, this approach was not usable for us.
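
For context, this is roughly how method interception is wired up with Guice; the `@Timed` annotation and `TimingInterceptor` are illustrative names, not our actual code:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

import com.google.inject.AbstractModule;
import com.google.inject.matcher.Matchers;
import org.aopalliance.intercept.MethodInterceptor;
import org.aopalliance.intercept.MethodInvocation;

// Hypothetical marker annotation for methods whose timing we record.
@Retention(RetentionPolicy.RUNTIME)
@interface Timed {}

// Interceptor that records how long an intercepted method takes.
class TimingInterceptor implements MethodInterceptor {
    @Override
    public Object invoke(MethodInvocation invocation) throws Throwable {
        long start = System.nanoTime();
        try {
            return invocation.proceed();
        } finally {
            System.out.println(invocation.getMethod().getName() + " took "
                    + (System.nanoTime() - start) / 1_000_000 + " ms");
        }
    }
}

class TimingModule extends AbstractModule {
    @Override
    protected void configure() {
        // Method interception only applies to instances that Guice itself
        // constructs; objects created with `new` bypass it entirely.
        bindInterceptor(Matchers.any(),
                Matchers.annotatedWith(Timed.class),
                new TimingInterceptor());
    }
}
```

The catch is in the last comment: an object obtained via `injector.getInstance(...)` gets intercepted methods, whereas one created with `new` does not.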

Implementation

1. Aspect-Oriented Programming

To implement the Cross-cutting Concern, we decided to use Aspect-Oriented Programming. It facilitates the implementation of Cross-cutting Concerns via a unit of modularisation called `Aspect`.

All we had to do was define the `Advice` in our Aspect module, i.e., the job to be done when it is invoked at our Pointcut. A Pointcut selects the join points, i.e., the places where the advice is invoked; for example, before calling a class function. So, in our case, the implementation of the Logging part for our Connectors package lies in this Aspect module, and the join points are selected via an annotation on the methods that actually implement the API calls.

Along with Logging, we also collect some statistics around these function calls. These are then loaded onto InfluxDB for further analysis if needed.
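
As a sketch, such an aspect could look roughly as follows with AspectJ-style annotations; the `@LogApiCall` annotation, its package, and the log messages are assumptions made for illustration:

```java
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical aspect: wraps every method annotated with @LogApiCall,
// logging the call and collecting timing statistics around it.
@Aspect
public class ApiCallLoggingAspect {

    private static final Logger log =
            LoggerFactory.getLogger(ApiCallLoggingAspect.class);

    // The pointcut selects join points via the (hypothetical) annotation.
    @Around("@annotation(com.example.LogApiCall)")
    public Object aroundApiCall(ProceedingJoinPoint joinPoint) throws Throwable {
        long start = System.nanoTime();
        log.info("API call started: {}", joinPoint.getSignature());
        try {
            return joinPoint.proceed(); // runs the method that makes the API call
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            // Statistics like this duration can be loaded onto InfluxDB.
            log.info("API call finished: {} in {} ms",
                    joinPoint.getSignature(), elapsedMs);
        }
    }
}
```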

2. Contextual Logging via MappedDiagnosticContext

As Hevo provides a Data Pipeline as a Service to customers, each customer can have multiple Pipelines, and each Pipeline can have multiple objects that help ingest the data. Hence, just Logging the API calls would be pointless if we couldn’t identify the Pipeline or the object name for a particular Log, and that context is not directly available inside the Aspect module.

Since we use ‘Logback’ for Logging, we decided to utilise MappedDiagnosticContext (MDC) to populate information related to the Pipeline. The MDC manages contextual information on a per-thread basis. It provides ways for developers to put information into it, which can then be accessed by certain Logback components.

The data polling for each Pipeline object happens via a thread created by Handyman (the task manager of Hevo). The relevant contextual information is added to the thread when the ExecutorService creates it via Handyman, so no extra effort is required from developers when extending Logging to other connectors.
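
A minimal sketch of how the MDC could be populated around a task; the key names match the `logFormat` below, but the wrapper itself is an assumption, and the actual wiring in Handyman may differ:

```java
import org.slf4j.MDC;

// Hypothetical sketch: populate the MDC before a Pipeline object's polling
// task runs, so every Log line written on this thread carries the Pipeline
// context, and clean the context up afterwards.
public final class MdcTaskWrapper {

    public static Runnable withPipelineContext(String pipelineId,
                                               String handymanTaskId,
                                               String objectName,
                                               Runnable task) {
        return () -> {
            MDC.put("pipelineId", pipelineId);
            MDC.put("handymanTaskId", handymanTaskId);
            MDC.put("objectName", objectName);
            try {
                task.run();
            } finally {
                MDC.clear(); // avoid leaking context to pooled threads
            }
        };
    }
}
```

Tasks handed to the ExecutorService can then be wrapped with `withPipelineContext(...)`, so the per-thread context is in place before polling starts and cleared once it ends.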

The `logFormat` has been modified to use these MDC variables when the Logs are written to files.

logFormat: %level [%date{ISO8601}] [%thread] %logger | %X{pipelineId}|%X{handymanTaskId}|%X{objectName}|%msg %n

Next Steps

The framework has helped us debug On-call issues related to Sources a lot faster. We now plan to extend it to other components, for example, Destinations supported by Hevo and our Reverse ETL solution, Hevo Activate.
