Headless BI vs. Self-service BI

Patrick Pichler · Creative Data · Jan 10, 2023

A brief introduction to Headless BI and where Power BI fits into this context


Introduction

Over the last couple of years, various new terms and inventions have come up around data and analytics, and Headless BI is doubtlessly one of them.

In this article, we will first briefly explore what Headless BI is all about, followed by a conceptual comparison with self-service BI. In particular, we will check whether Power BI meets the requirements of the Headless BI category and when it is better to choose one over the other.

Headless BI

To start with, Headless BI is essentially an architectural style that decouples your persistent backend storage, be it a database or a data lake, from the user-facing frontends that consume data. The main idea is to create a loosely coupled layer in between that gives you access to all information through a single abstracted data model representing the business or use case, without being tied to any specific frontend. Instead, this so-called semantic layer is exposed via a rich set of standardized interfaces and protocols, making it "headless". The primary focus is therefore entirely on data delivery, providing fast and easy access to a curated single source of truth, rather than on any kind of data visualization. This data delivery usually covers standardized dimensions, metrics, hierarchies, and KPIs, and includes features such as data access control (e.g. row-level security), aggregation capabilities, caching mechanisms, etc.
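
To make this abstracted data model a bit more tangible, below is a purely conceptual TypeScript sketch of what a semantic layer definition typically contains (dimensions, measures, and a row-level security rule). The model, column, and security-context names are hypothetical, and the structure is only loosely inspired by tools such as Cube, not the actual API of any specific product.

```typescript
// Conceptual sketch only: a hypothetical "Orders" model illustrating what a
// headless semantic layer typically declares (not any tool's real API).
type MeasureType = "count" | "sum" | "avg";

interface Dimension {
  sql: string;                            // column or expression in the backend store
  type: "string" | "time" | "number";
}

interface Measure {
  sql?: string;                           // source expression; omitted for plain counts
  type: MeasureType;
}

interface SemanticModel {
  name: string;
  sqlTable: string;                       // physical table behind the model
  dimensions: Record<string, Dimension>;
  measures: Record<string, Measure>;
  rowLevelSecurity?: string;              // RLS filter applied per user/tenant
}

// One curated definition, reusable by every frontend that queries the layer.
const orders: SemanticModel = {
  name: "Orders",
  sqlTable: "public.orders",
  dimensions: {
    status: { sql: "status", type: "string" },
    createdAt: { sql: "created_at", type: "time" },
  },
  measures: {
    count: { type: "count" },
    totalAmount: { sql: "amount", type: "sum" },
  },
  rowLevelSecurity: "tenant_id = ${SECURITY_CONTEXT.tenantId}", // placeholder syntax
};

console.log(`Model '${orders.name}' exposes measures:`, Object.keys(orders.measures));
```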

In terms of use cases, Headless BI is very often mentioned in the context of embedded analytics. This has also always been the primary objective of one of today's most popular Headless BI technologies, Cube, which we will use as a kind of benchmark throughout this article. Cube was initially written entirely in JavaScript, THE programming language for building interactive web applications. Nowadays, however, much of the code has been rewritten in Rust, partly to expand its functionality towards connecting self-service BI tools such as Power BI or Tableau. In any case, its rich connectivity, together with seamless integration with modern development languages and frameworks (SDKs), makes it quite easy to embed analytics natively into (web) applications, as sketched below. Besides easing this embedding process, its centralized concept also ensures data consistency across multiple applications. Another very popular use case is real-time analytics: Headless BI tools tend to facilitate combining streaming and batch data at query processing time, in addition to mechanisms for automatically updating the frontend with the most current data available.
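
As an illustration of how such a layer is consumed from application code, here is a minimal sketch of a request against Cube's REST API. The deployment URL, the auth token, and the member names (Orders.totalAmount, Orders.status, Orders.createdAt) are placeholders and assume a data model similar to the conceptual sketch above.

```typescript
// Minimal sketch: fetching aggregated data from a Cube REST API endpoint.
// URL, token, and member names are placeholders for your own deployment.
const CUBE_API_URL = "https://analytics.example.com/cubejs-api/v1/load";
const CUBE_API_TOKEN = "<signed-jwt>"; // Cube expects a JWT in the Authorization header

interface CubeLoadResponse {
  data: Array<Record<string, string | number | null>>;
}

async function loadOrdersByStatus(): Promise<CubeLoadResponse["data"]> {
  const response = await fetch(CUBE_API_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: CUBE_API_TOKEN,
    },
    body: JSON.stringify({
      query: {
        measures: ["Orders.totalAmount"],
        dimensions: ["Orders.status"],
        timeDimensions: [
          { dimension: "Orders.createdAt", dateRange: ["2023-01-01", "2023-01-31"] },
        ],
      },
    }),
  });
  if (!response.ok) {
    throw new Error(`Cube API request failed: ${response.status}`);
  }
  const result = (await response.json()) as CubeLoadResponse;
  return result.data; // ready to render in any chart library or UI component
}

loadOrdersByStatus().then((rows) => console.log(rows));
```

The same query could be issued through Cube's SQL or GraphQL interfaces instead, which is precisely the frontend-agnostic flexibility the article refers to.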

Traditional BI

Now, if this decoupled, central semantic layer concept of Headless BI sounds familiar to you, then you have probably been in touch with traditional BI architectures based on OLAP (cube) technologies such as Microsoft Analysis Services or IBM Cognos. While the centralized idea is indeed very similar, those OLAP technologies usually come with their own analytical engines, which are mostly limited to a few interfaces and frontends, whereas Headless BI seeks to provide a kind of universal access without any such dependencies.

However, this centralized semantic layer approach changed quite a bit with the rise of self-service BI tools as the next generation back then. The semantic layer, including the dedicated analytical engines, started to become tightly coupled with the visualization layer. Could this trend be a driver of today's popularity of Headless BI? Let's find out.

Power BI

First things first: Power BI obviously falls under the category of self-service BI tools. It has everything you need for conducting BI packed into one single tool, even data integration capabilities. At first sight, this sounds very tempting indeed, and organizations increasingly started to adopt such self-service BI tools to be more flexible and agile in the face of new requirements rather than depending on central IT departments. Business users themselves were suddenly able to build and design their own reports, including the semantic layer behind them, or at least parts of it. Of course, such a decentralized approach allows individual domains to move much faster, but it also requires a set of rules to follow together with a certain sense of responsibility, otherwise things can get chaotic very quickly. Declining data quality is often just a matter of time due to emerging data inconsistencies and diverging interpretations or definitions. In the end, the decision-making process might be faster, but is it also better?

Probably not. But here is the point: with Power BI you can still stick to the central semantic layer concept, similar to the idea behind traditional and Headless BI architectures. You just need to follow a commonly known architectural development concept called Power BI shared datasets. In practice, this simply means not always publishing a new dataset attached to each report; instead, you centrally host well-designed, pre-built datasets with just empty reports attached. Business users can then live connect to these datasets and build multiple reports on top of them, always reusing the same semantics. This way, your Power BI reports (visualization layer) are fully decoupled from your Power BI dataset (semantic layer), which also makes your semantic layer basically "headless". Nevertheless, even when following this development concept, it is often difficult to completely prevent users from creating their own datasets (semantic layers), simply because of the way self-service BI tools work. That is why you additionally have the option to certify and promote Power BI datasets. While certification helps to highlight high-quality content that meets the organization's standards, promotion allows users to mark datasets as valuable on their own.

All right, so far so good, but does this now mean that Power BI meets all the requirements of a Headless BI tool just because you can uphold the idea of this central semantic data model?

Headless Power BI

Not quite, since the main point of Headless BI is to be frontend-agnostic by providing universal compatibility for integrating with any tool or framework through easy query formats and a broad range of standardized APIs such as SQL, REST, GraphQL, WebSockets, etc. This is where Power BI cannot keep pace: it only supports the SOAP-based XMLA interface plus a REST API for executing DAX queries. This REST API even comes with throttling and size limitations if you are not using Power BI Premium. For this reason, it is difficult to query the semantic layer of Power BI and render the query results natively within (web) applications; instead, you would rather end up embedding Power BI content directly (reports or tiles). This would also support real-time scenarios when embedding content based on Power BI real-time datasets. However, this embedding functionality not only comes with extra costs (Power BI Embedded or Power BI Premium) but also causes typical iframe usability issues, making the entire look and feel somewhat odd compared to a native integration.
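
For comparison, this is roughly what querying a Power BI dataset from application code looks like via the executeQueries REST endpoint. The dataset ID, the Azure AD access token, and the table and column names in the DAX query are placeholders, and the snippet is only a sketch of the general shape of the call.

```typescript
// Sketch of Power BI's "Datasets - Execute Queries" REST API, which runs a DAX
// query against a published dataset. IDs, token, and DAX names are placeholders.
const DATASET_ID = "<dataset-guid>";
const ACCESS_TOKEN = "<azure-ad-access-token>"; // requires appropriate Power BI API permissions

async function executeDaxQuery(): Promise<unknown> {
  const url = `https://api.powerbi.com/v1.0/myorg/datasets/${DATASET_ID}/executeQueries`;
  const response = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${ACCESS_TOKEN}`,
    },
    body: JSON.stringify({
      queries: [
        {
          // DAX, not SQL: one more hurdle for developers used to REST/GraphQL/SQL
          query: "EVALUATE SUMMARIZECOLUMNS('Orders'[Status], \"Total Amount\", SUM('Orders'[Amount]))",
        },
      ],
      serializerSettings: { includeNulls: true },
    }),
  });
  if (!response.ok) {
    throw new Error(`executeQueries failed: ${response.status}`);
  }
  return response.json(); // rows come back nested under results[].tables[].rows[]
}

executeDaxQuery().then((result) => console.log(result));
```

Note that the query language here is DAX rather than SQL, and the endpoint is subject to the throttling and size limits mentioned above unless you are on Premium, which is exactly the interoperability gap compared to a Headless BI tool.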

At this point, it needs to be mentioned that we have so far only discussed arguments in favor of using a Headless BI tool over Power BI, but not the other way around. The capabilities of Power BI are tremendous, and it is arguably the best self-service BI tool on the market today. For instance, just comparing the performance of processing and returning query results, Power BI's integrated VertiPaq engine is almost unbeatable. Headless BI tools, on the other hand, pass queries on to the backend for processing (if not cached) and then retrieve the results via a REST or GraphQL API, which obviously does not provide the same performance, but offers more flexibility. This is generally one of the main tradeoffs between tightly and loosely coupled systems.

Conclusion

To conclude, there is no question that we have to deal with an increasing amount of data each day. Organizations are generally moving towards a more data-centric approach by modernizing or re-laying out their (enterprise) IT architectures. Furthermore, analytics is no longer just reporting and BI. Likewise, building applications is no longer just about supporting transactional (OLTP) workloads, and at the same time, the demand for data streaming and real-time dashboards is on the rise. These are all plausible arguments for adopting a Headless BI tool, especially if the main focus is on embedding analytics natively into applications and on generally supporting multiple APIs/SDKs. On the contrary, if reporting and BI are of prime importance for your organization, then you should definitely choose a self-service BI tool such as Power BI, or maybe even run both side by side, pointing to the same backend. In any case, if Power BI decided to relax its REST API limitations and provide a simple SQL-like interface on top of DAX, it would drastically improve its interoperability, which would also mean coming another step closer to the Headless BI category. Let's see what the future will bring.

Patrick Pichler · Creative Data · Promoting sustainable data and AI strategies through open data architectures.