Headless and reverse [data] products

Ivan Begtin
4 min read · May 29, 2022

Last month I read a lot about the modern data stack (MDS), which is well described by Fivetran [1].

The modern data stack is a combination of data products for data integration. The idea is to use your business data to open new possibilities and make the business more effective.

These products include:

  • ELT (Extract-Load-Transform) and data transformation tools
  • data warehouse/lake/cloud storage
  • data visualization / BI
  • data science and machine learning

There are several interesting publications about the products inside the modern data stack. I would recommend Emerging Architectures for Modern Data Infrastructure [2] and The Modern Data Stack [3].

I would like to mention two major trends. On the one hand they are expected; on the other hand they are what distinguishes modern data stacks from not-so-modern ones.

These are the headless and reverse data product types: first of all, headless BI and reverse ETL.

Headless BI

Headless BI is a new idea for how to work with data metrics. It separates the analytical data from its visualization. The idea of a headless product is that the same data can be used and represented in many ways: a web UI, a desktop app, a Jupyter Notebook, the command line, or something else.

These visualization tools can be created by different teams, but they all need standard, unified sources of data and metrics. Headless BI solves this by acting as a shared metrics store.

Supergrain Headless BI. Source https://supergrain.com

Examples include Supergrain [4] and the very similar startup GoodData [5]. Lots of experts write about it too [6].
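To make the idea of a shared metrics store more concrete, here is a minimal sketch in Python. It is not how Supergrain or GoodData work; the `ORDERS` data, the `Metric` class, the `METRICS` registry and both render functions are hypothetical names invented for illustration. The point is only that each metric is defined once and several "heads" read the same definition.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical order records standing in for data from a warehouse.
ORDERS = [
    {"region": "EU", "amount": 120.0},
    {"region": "EU", "amount": 80.0},
    {"region": "US", "amount": 200.0},
]


@dataclass(frozen=True)
class Metric:
    """A metric definition that lives in one place (assumed structure)."""
    name: str
    description: str
    compute: Callable[[List[dict]], float]


# The "headless" part: metric definitions with no UI attached.
METRICS: Dict[str, Metric] = {
    "total_revenue": Metric(
        "total_revenue",
        "Sum of all order amounts",
        lambda rows: sum(r["amount"] for r in rows),
    ),
    "average_order_value": Metric(
        "average_order_value",
        "Mean order amount",
        lambda rows: sum(r["amount"] for r in rows) / len(rows) if rows else 0.0,
    ),
}


def query_metric(name: str, rows: List[dict] = ORDERS) -> float:
    """Single entry point that every interface uses to read a metric."""
    return METRICS[name].compute(rows)


# Two different "heads" built on top of the same metric definitions.
def render_cli(name: str) -> str:
    """Plain-text representation, e.g. for a command-line tool."""
    return f"{name}: {query_metric(name):.2f}"


def render_json(name: str) -> str:
    """JSON representation, e.g. for a web UI or a notebook."""
    return json.dumps({"metric": name, "value": query_metric(name)})
```

Calling `render_cli("total_revenue")` and `render_json("total_revenue")` gives two presentations of the same number, and changing the metric definition changes both at once.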

The idea of headless BI is not a completely new concept. There is a similar trend toward headless CMS. A lot of content management systems are now built with only an admin interface for managing news, web pages, and media files. Company content can then be presented as a public website, a corporate website, an event website, a mobile app, and so on. So headless CMS is about centralized content management and multiple ways of representing the content.

There are a lot of headless CMS products; a good list of them is available at Jamstack [7]. Other headless products exist among e-commerce platforms [8].

So headless BI seems very logical, and it will be interesting to see how it develops and how other products that combine business logic with visual presentation could use the same concept.

Reverse ETL/ELT

Classic ETL (Extract-Transform-Load) tools are well known and easy to understand and use. For some tasks in the modern data stack, ETL is often replaced by ELT, with the load and transform stages swapped.

If you use ELT, you load your raw extracted data into the data storage/lake and start the transformation task after that.
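The ordering of the ELT steps can be sketched with Python's built-in `sqlite3` module playing the role of the data storage. This is only an illustration under stated assumptions: the `RAW_EVENTS` rows, the table names and the `run_elt` function are made up, and a real warehouse or lake would replace the in-memory database.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system.
# Amounts are still strings: nothing has been cleaned before loading.
RAW_EVENTS = [
    ("alice", "2022-05-01", "19.90"),
    ("bob", "2022-05-01", "5.00"),
    ("alice", "2022-05-02", "30.10"),
]


def run_elt(conn: sqlite3.Connection) -> list:
    # Load: store the extracted data untouched.
    conn.execute("CREATE TABLE raw_events (customer TEXT, day TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", RAW_EVENTS)

    # Transform: runs inside the storage, after the load has finished.
    conn.execute(
        """
        CREATE TABLE customer_spend AS
        SELECT customer,
               ROUND(SUM(CAST(amount AS REAL)), 2) AS total_spend
        FROM raw_events
        GROUP BY customer
        """
    )
    return conn.execute(
        "SELECT customer, total_spend FROM customer_spend ORDER BY customer"
    ).fetchall()


conn = sqlite3.connect(":memory:")
result = run_elt(conn)
```

In a classic ETL pipeline the cast and the aggregation would happen in the pipeline code before anything is written to storage; here the raw table stays available for other transformations later.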

In parallel, a new product type has appeared: reverse ETL. While ETL/ELT is built to collect data for later centralized processing, reverse ETL takes processed data from the data storage and transfers it back to operational databases or the original data sources.

It may sound strange, but it is quite a common task. For example, you calculate personalized metrics through data analysis or machine learning algorithms, and then you would like to provide personalized offers to your clients.
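A minimal sketch of that flow in Python is shown below. It does not use the API of Census, Hightouch or any real CRM: `FakeCRM`, `upsert_contact`, the `customer_scores` table and the `threshold` parameter are all hypothetical, and an in-memory SQLite database stands in for the data storage.

```python
import sqlite3
from typing import Dict


class FakeCRM:
    """Hypothetical stand-in for the API client of an operational tool."""

    def __init__(self) -> None:
        self.contacts: Dict[str, dict] = {}

    def upsert_contact(self, email: str, fields: dict) -> None:
        # Create the contact if it is missing, then merge in the new fields.
        self.contacts.setdefault(email, {}).update(fields)


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny 'warehouse' holding already computed metrics."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_scores (email TEXT, churn_risk REAL)")
    conn.executemany(
        "INSERT INTO customer_scores VALUES (?, ?)",
        [("a@example.com", 0.82), ("b@example.com", 0.10)],
    )
    return conn


def sync_scores(conn: sqlite3.Connection, crm: FakeCRM, threshold: float = 0.5) -> int:
    """Reverse ETL step: read processed data and push it to the operational tool."""
    rows = conn.execute("SELECT email, churn_risk FROM customer_scores").fetchall()
    for email, risk in rows:
        offer = "retention_discount" if risk >= threshold else "none"
        crm.upsert_contact(email, {"churn_risk": risk, "offer": offer})
    return len(rows)


crm = FakeCRM()
synced = sync_scores(build_warehouse(), crm)
```

After the sync, the personalized offer lives in the tool that the marketing or sales team already uses, not only in the data storage.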

There are many reverse ELT/ETL products, and more detailed explanations are provided by Census [9] and Hightouch [10], for example.

Reverse ETL pipeline. Source https://hightouch.io/blog/reverse-etl/

The idea of reverse ETL is interesting in itself for anyone who creates new data products. Most ETL/ELT products work with multiple data sources and several destinations. Reverse ETL products require much more knowledge about two-way integration with the APIs of marketing, CRM, and other customer-related products.

Is it popular somewhere else?

The ideas of headless (no UI/GUI) and reverse are not unique to IT; they are popular outside it too.

For example, shadow kitchens are also a kind of “headless” (no UI) product. You can have a single kitchen with multiple interfaces for different types, categories, or interests of clients.

Could we apply these ideas to other data or IT-related software to create new business models or products?

I feel that the headless concept is applicable to products that we think of as UI/GUI-only. For example, could we imagine headless editors, or headless games with multiple interfaces? Some might ask how this differs from command-line tools, but the command line is just one more interface for interacting with a user. So, headless games: does it sound good, or too strange?

As for reverse, it could be applied to products whose task pipeline runs from a finished product back to raw or well-prepared data. For example, could we imagine a reverse CMS that reconstructs website content from its HTML sources, or maybe reverse data analytics that reconstructs data from published PDF/image/JS reports?

Personally, I don’t have answers yet, but Writing is Thinking [11], so I am writing.

References:

[1] https://fivetran.com/blog/what-is-the-modern-data-stack

[2] https://future.a16z.com/emerging-architectures-modern-data-infrastructure/

[3] https://www.moderndatastack.xyz/

[4] https://www.supergrain.com/

[5] https://medium.com/gooddata-developers/the-future-of-bi-is-headless-e3949bb0bf2

[6] https://basecase.vc/blog/headless-bi

[7] https://jamstack.org/headless-cms/

[8] https://github.com/notrab/awesome-headless-commerce

[9] https://blog.getcensus.com/what-is-reverse-etl/

[10] https://hightouch.io/blog/reverse-etl/

[11] https://blog.stephsmith.io/learning-to-write-with-confidence/

Ivan Begtin

I am the founder of APICrafter. I write about Data Engineering, Open Data, Data, Modern Data Stack and Open Government. Join my Telegram channel https://t.me/begtin