The Modern Data Stack for Embedded Analytics

Jonathan E Cowperthwait
Published in Cube Dev
6 min read · May 12, 2022

Here’s how the embedded analytics data stack has changed.

We’ve spent the past few weeks discussing the composition of headless BI and Cube’s approach to open source data modeling. Let’s briefly talk about where this technology meets the real world: embedded analytics.

What is embedded analytics?

Embedded analytics means much more than using iframes to place charts in a dashboard. The real promise of embedded analytics is bringing rich data experiences to users where they already are. As Gartner puts it, “data analysis occurs within a user’s natural workflow, without the need to toggle to another application.”

Financial analysis may be built right into a personal finance application. Personal health metrics can be displayed alongside one’s medical history. Every member of a team can see the business’s KPIs and trends without logging into a BI tool.

For businesses, embedded analytics affords full control over what data is available and how it’s presented. For data consumers, embedded analytics means finding useful insights within the sites or apps they already use — not in the same old dashboards.

At Cube, we’ve been on the embedded analytics front lines for years. Here are some recent technological and behavioral shifts we’ve observed.

The rise of the cloud data warehouse

In the past, collecting and processing massive amounts of data was so technically complex and expensive that only a few companies did so. Now, cloud data warehouses make it possible for organizations in every industry and of every size to collect and process more data easily and affordably, which has driven their wide adoption.

But it is not enough to just collect lots of data. Businesses need to use the data, too. This has increased the need for customizable, powerful data experiences. As a result, companies’ adoption of cloud data warehouses has led to greater interest and investment in embedded analytics, too.

Data consumers are more diverse

As more businesses have taken an interest in embedded analytics, the number and variety of data consumers have grown as well. Business intelligence was once the purview of specialized engineers and analysts, but now every member of an organization is expected to use and understand data to do their job.

Who is the modern data consumer? For starters, they may be less technical. The modern data consumer may not be familiar with data engineering concepts like OLAP — and they shouldn’t have to be. The modern data consumer may not even be familiar with legacy BI tools — and, again, they shouldn’t need to be.

To make data accessible and actionable for these data consumers, companies need to build customized user interfaces that move beyond mere iframed graphs. Embedded analytics applications have grown more polished and interactive.

Front-end development is easier

A major change in the world of embedded analytics is the rise of modern front-end development tools. In the past decade, there have been incredible advances in powerful, simple front-end technologies — particularly React. It’s now easier than ever to design customized, polished user experiences.

This change hasn’t just made it easier to develop custom data experiences. It also has altered users’ expectations to make such experiences essentially mandatory. Both internal data consumers and end users have new, high standards for usability, richness, and responsiveness.

Since these expectations can’t be met with generic dashboards or out-of-the-box BI platforms, more companies are building customizable and performant embedded analytics features using modern front-end tools.

There are new tools

It’s become easier for front-end development teams to build highly custom native UIs, but these aren’t necessary for every project. Sometimes, standard visualizations are enough and speed is key.

A new generation of collaborative data tools, including Hex, Observable, and Streamlit, makes it possible for data analysts to quickly select the appropriate chart for a dataset and share reports both throughout a company and with end customers.

Organizations need the freedom to choose the right tool for each job — whether it’s a custom application built by front-end teams or shareable dashboards built by data teams.

Rethinking the data stack

Thinking about these developments, what are the technical requirements of a modern embedded analytics data stack?

Data warehouse-centric architecture

As others have noticed, the cloud data warehouse is a place not only to store and process data, but also to make it interactive. To actually be a backend for data analytics applications, including embedded analytics, the data warehouse needs to be at the core of your data stack.

Having data in a single place significantly reduces operational overhead. On one end, ELT is replacing ETL, and on the other, we see a move away from moving data into “serving” databases. Instead, the trend is towards powering applications with data directly from warehouses. This architecture is simpler and less error-prone because it removes additional pipelines and storage layers.

Warehouses still predominantly work with batch data, but there have been many exciting recent developments in supporting real-time data. Firebolt, ClickHouse, and Materialize are actively innovating in the space.

Finally, this architecture also opens an opportunity for bring-your-own-data-warehouse applications. This may be beneficial for both vendors and consumers because it simplifies the security architecture and streamlines new software onboarding.

Headless BI in the middle

We’ve previously described the four essential components of headless BI. To specifically support embedded analytics, a headless BI solution must have these attributes:

First-class support for cloud data warehouses

A major piece of this is access control integrated with the data warehouse’s security controls, because embedded analytics always requires multitenancy. A second piece is advanced caching: the data warehouse is a great candidate for a backend, but on its own it does not serve high-concurrency small queries with the low latency that modern data consumers expect.
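To make the two pieces above concrete, here is a minimal sketch of a middle layer that scopes every query to one tenant and caches results per tenant. Everything in it is an assumption for illustration: the TenantScopedCache class, the tenant_id column, and the in-memory SQLite database standing in for a cloud warehouse. It enforces the tenant filter in the application rather than through the warehouse’s own security controls, so treat it as a simplified picture of the idea and not as how any particular product does it.

```python
import sqlite3
import time


class TenantScopedCache:
    """Hypothetical middle layer: adds a tenant filter and caches results per tenant."""

    def __init__(self, run_query, ttl_seconds=60.0):
        self._run_query = run_query  # callable(sql, params) that hits the warehouse
        self._ttl = ttl_seconds
        self._entries = {}           # (tenant_id, sql, params) -> (stored_at, rows)

    def fetch(self, tenant_id, sql, params=()):
        key = (tenant_id, sql, params)
        now = time.monotonic()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # served from cache, no warehouse round trip

        # Wrap the caller's query so only this tenant's rows come back.
        # The tenant id is a bound parameter, never interpolated into the SQL.
        scoped_sql = f"SELECT * FROM ({sql}) AS q WHERE q.tenant_id = ?"
        rows = self._run_query(scoped_sql, params + (tenant_id,))
        self._entries[key] = (now, rows)
        return rows


# Stand-in "warehouse": an in-memory SQLite table shared by two tenants.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (tenant_id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 10.0), ("acme", 5.0), ("globex", 99.0)],
)

warehouse_calls = []


def run_query(sql, params):
    warehouse_calls.append(sql)  # record each real warehouse hit
    return conn.execute(sql, params).fetchall()


cache = TenantScopedCache(run_query, ttl_seconds=60.0)
base_sql = "SELECT tenant_id, amount FROM orders"
acme_rows = cache.fetch("acme", base_sql)
acme_again = cache.fetch("acme", base_sql)    # repeated query, answered from cache
globex_rows = cache.fetch("globex", base_sql)  # different tenant, separate cache entry
```

Because the cache key includes the tenant, a repeated dashboard query from one customer avoids a warehouse round trip without ever being answered from another customer’s results.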

Data modeling

Diverse as they are, the new data consumers should all work from the same data: “net sales” in your dashboard should mean the same thing as “net sales” in my mobile CRM. To achieve this consistency, data modeling and metrics definition should be handled once, up-stack from every application or dashboard.
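As a rough sketch of defining a metric once, the snippet below keeps a single definition of “net sales” and compiles it into SQL for two different consumers. The Metric class, the compile_query helper, the orders table, and the formula itself are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    sql_expression: str  # aggregation over the base table
    table: str


# Single source of truth. The formula for net sales is an assumption for illustration.
METRICS = {
    "net_sales": Metric(
        name="net_sales",
        sql_expression="SUM(gross_amount - discounts - returns)",
        table="orders",
    ),
}


def compile_query(metric_name, group_by=()):
    """Build a SQL string for a metric, optionally grouped by dimensions."""
    metric = METRICS[metric_name]
    dims = ", ".join(group_by)
    select_dims = f"{dims}, " if dims else ""
    sql = (
        f"SELECT {select_dims}{metric.sql_expression} AS {metric.name} "
        f"FROM {metric.table}"
    )
    if dims:
        sql += f" GROUP BY {dims}"
    return sql


# Two consumers ask different questions but share one definition.
dashboard_sql = compile_query("net_sales", group_by=["region"])
crm_sql = compile_query("net_sales", group_by=["account_id"])
```

If the business later changes what counts as net sales, only the entry in METRICS changes, and both the dashboard and the CRM pick up the new definition.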

Diverse APIs

Different data consumers have different expectations and requirements, so it’s inevitable that a company will end up supporting a whole class of data apps. For all of these applications to share the same metrics, a headless BI layer needs to be accessible via multiple APIs, e.g., SQL, GraphQL, and REST.
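One way to picture this is a set of thin adapters, one per API style, that all normalize into the same internal query before anything is computed. The sketch below is an assumption-heavy illustration: the MetricQuery type, the payload shapes for the REST-style and GraphQL-style requests, and the in-memory rows are invented here and do not reflect any specific product’s API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricQuery:
    """Internal query shape that every API front door normalizes into."""
    metric: str
    dimensions: tuple


def from_rest_body(body):
    # Hypothetical REST payload: {"metric": ..., "dimensions": [...]}
    payload = json.loads(body)
    return MetricQuery(payload["metric"], tuple(payload.get("dimensions", [])))


def from_graphql_variables(variables):
    # Hypothetical GraphQL variables: {"metric": ..., "groupBy": [...]}
    return MetricQuery(variables["metric"], tuple(variables.get("groupBy", [])))


def execute(query, rows):
    """Single execution path: sum the metric per combination of dimension values."""
    totals = {}
    for row in rows:
        key = tuple(row[d] for d in query.dimensions)
        totals[key] = totals.get(key, 0) + row[query.metric]
    return totals


ROWS = [
    {"region": "EU", "net_sales": 120},
    {"region": "EU", "net_sales": 30},
    {"region": "US", "net_sales": 200},
]

rest_query = from_rest_body('{"metric": "net_sales", "dimensions": ["region"]}')
graphql_query = from_graphql_variables({"metric": "net_sales", "groupBy": ["region"]})

rest_result = execute(rest_query, ROWS)
graphql_result = execute(graphql_query, ROWS)
```

Since both adapters produce an identical MetricQuery, the two applications cannot drift apart in how the metric is computed; only the request format differs.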

A hybrid presentation layer

To support different data consumers, use cases, and teams, an embedded analytics presentation layer should be diverse.

When high customization is required and front-end teams are looped in, different charting libraries can be used, ranging from D3 to Chart.js and Highcharts. These will most likely be natively integrated with front-end application frameworks like React or Angular.

When requirements are less custom, tools like Observable or Dash come into the picture. It’s faster and cheaper to build with these tools, and they still can be natively integrated into an app and combined with more custom interfaces.

Data analysts and engineers can quickly build appealing data interfaces with modern BI tools like Superset and Metabase. These are less customizable, but they don’t require front-end knowledge. This usually enables even faster building and iteration. They too can be combined with other presentation layers within a single application.

Finally, there is the emerging category of no-code/low-code tools for internal tooling. This includes Appsmith and Retool. They too can be used to build analytics interfaces.

Conclusion

We’ve moved past the era when embedded analytics was considered a part of a single integrated BI solution. The latest generation of BI vendors has already innovated to run “live” queries directly on top of data warehouses, which allows for cloud data warehouse-centric architectures.

The next evolutionary step is to decapitate BI tools and give organizations the freedom to switch heads (front ends) or use many of them at the same time.

This architecture will be enabled by several products, including Cube. As the category matures, and more enterprises adopt this architecture, we’ll see more exciting developments. Buckle up.

Cube is an API‑first business intelligence platform for data engineers and application developers to make data accessible and consistent across every application. Get started with Cube for free today, or get in touch to discuss your next embedded analytics project!

Jonathan E Cowperthwait
Cube Dev

Marketing at Cube Dev. Past lives at @npmjs, @awesm, and making bad movies.