Why we made our Reverse ETL solution open source

Arun Thulasidharan
Published in Castled · 4 min read · Dec 6, 2021

Just two months after launching our cloud platform, and after numerous customer interviews and rounds of feedback, we are happy to announce that we have open-sourced our Reverse ETL solution. We believe this will enable more users to operationalise the data and dbt-transformed datasets in their cloud data warehouse for their sales, marketing and support use-cases.

But why open source?

After launching our cloud version in early October, we started talking to potential customers for our cloud solution. To our surprise, 90% of the customers we talked to immediately acknowledged the need for such a solution. They even told us about the unreliable ways in which the problem is solved right now in their organisations, and the data consistency issues they face on a day-to-day basis.

So it looks like a perfect product-market fit for us? Well, no! Not even close.

They would normally go on to ask us a few more routine questions.

  1. How many connectors do you have? How many do you plan to have in some timeframe (say, a year)?
  2. Do you support these connectors (followed by a long list of SaaS apps they use internally)?
  3. What is the SLA to build a new connector, after we raise a feature request?
  4. What is the SLA to fix issues in production?

Now, we knew that the answers to these questions were tricky. Let's see what the honest answers to these queries are with our closed-source cloud solution.

Number of supported connectors and mid-term roadmap

We currently support all the major data warehouses and 12 destination apps, and are targeting somewhere around 60–70 connectors by the end of 2022. This is nowhere near the number of connectors we will eventually have to support, which is in the tens of thousands, with new apps being added every day.

Support for all existing apps of an organisation

It's estimated that a mid-size organisation uses more than 40 SaaS apps. There is no way a closed-source solution was going to support all the apps used by an organisation, and ours did not either.

SLA to build a new connector, after a feature request

Now, for an early-stage startup like us, the SLA for adding a new connector is probably less than a week for our early customers. But we knew from our previous experience that once we scale up, it is not going to be that simple. In slightly more mature data pipeline solutions, feature requests are normally collected and evaluated at the end of the quarter, and the connectors in demand are prioritised for integration in the next quarter.

Fivetran gets around 20 new connector requests and around 75 feature requests in just a month, as indicated by their support console. So if you are not using one of those “popular apps”, then this SLA is probably in years.

SLA to fix production issues

Again, the SLA to fix production issues is in hours at early-stage startups, and in days or even weeks for more mature solutions.

What is the root cause of these customer concerns?

The customers were raising these concerns based on their previous experience with some closed source data pipeline solutions, mostly ETL.

  • They wait for months or even years for their connectors to be integrated into their paid data integration solutions.
  • As the organisation scales up, their cloud solution fails to keep up with the exponential growth of the SaaS apps in use, and they mostly end up subscribing to multiple vendors to cover even half of their use-cases.
  • They eventually get frustrated waiting for feature requests and stop using solutions from the third party vendor and start building in-house solutions to support their use-cases.

They also mentioned that open-source ETL solutions like Airbyte were able to solve most of these issues, for the following reasons.

  • A seamless and swift way to integrate any custom connectors into the solution in just a matter of a few days.
  • Flexibility to customise an existing connector to fit their requirements.
  • Flexibility to fix a production issue in an existing connector, which might be a burning issue for them.
  • Community-led growth resulted in a huge number of connectors being added to these open-source solutions.

So open source our cloud solution then?

After a few rounds of customer feedback, it was clear that a closed-source solution will not be sustainable for any organisation beyond a certain point. If you think this is not the case with your organisation now, think about the scenario when you scale up. Do you think a closed-source solution will be able to keep up with your fast-growing data integration use-cases?

We did not take much time to come to the conclusion that open source is the only way to solve the key customer pain points. We had to rewrite a large chunk of our code, focusing heavily on the ease of adding new connectors. We will be writing about how we achieved that in more detail in a future blog post.

Update: Castled has pivoted away from a pure-play Reverse ETL solution to provide a better way to solve the same problem for non-technical folks without involving the data teams. Our Audiences feature enables marketers and product teams to create customer segments directly on top of the data warehouse and sync them to marketing and advertising platforms.

Arun Thulasidharan
Co-founder & CEO @ Castled.io - Warehouse Native Customer Engagement Platform built natively on top of your cloud data warehouse