Architect Metrics Store Computation and Serving Layer

Metrics Store in Action #5: A 2-Minute Tech Talk on a 2-Year Implementation

Lori Lu
Kyligence
4 min read · Jan 29, 2022


Kyligence — Metric Platform Case Study

In this blog, let’s dive deep into the design of the metric computation and serving layer. In case you’ve missed the previous articles, here is where the story begins.

The Challenges — Thinking in Dimensions

The metric computation and serving layer is key to the success of their metric platform. Why is it so important? Let’s understand it in dimensions.

The Initial Challenge: Business is always about PEOPLE

In a standard business case, business stakeholders initiate a project, the IT department delivers on the promise, and business users onboard by default. This metric platform project, by contrast, is an initiative from the IT team, so it is NOT a must-have from the business perspective. In addition, adopting the new tool requires people to leave their comfort zone and learn something new, which makes rolling the platform out to a large user base even more challenging.

The Follow-Up Challenge: How to Make it Go Viral Internally

Now, the roadblock becomes how to get business buy-in for this idea, and even financial support for it. Deeply influenced by B2C marketing strategy, the product team believes that how responsive a product is determines how users feel when interacting with it, which in turn determines customer stickiness — whether or not they will keep coming back in a competitive environment.

The Final Challenge: Architecting a Low-Latency, High-Throughput, Petabyte-Scale Computation Engine

Now, it all comes down to the technical challenge: how to make this data product as responsive and user-friendly as possible? Is it even possible to deliver a consistent, second-level query response time at scale?

Further breaking down their technical challenges:

Fact No. 1: Petabytes of Data — data volume is always a big pain.

Fact No. 2: High Throughput — they have thousands of concurrent users.

Fact No. 3: Low Latency — query response time must stay under 3 seconds, especially for count distinct queries on high-cardinality columns.

In conclusion, architecting a low-latency, high-throughput, petabyte-scale computation engine is the final technical challenge to be solved.
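Fact No. 3 is the thorny one: unlike SUM, distinct counts computed per partition cannot simply be added up, which is why precomputation engines such as Apache Kylin store a mergeable sketch (a bitmap or HyperLogLog) per data segment and union the sketches at query time. A minimal sketch of the idea, using plain Python sets as a stand-in for the mergeable structure:

```python
# Why COUNT(DISTINCT) is harder to precompute than SUM: per-partition
# distinct counts do not roll up by addition, but mergeable per-partition
# structures do roll up by union. Plain sets stand in for bitmap/HLL sketches.

daily_user_ids = {
    "2022-01-01": [1, 2, 3],
    "2022-01-02": [2, 3, 4],
}

# Naive rollup: summing per-day distinct counts overcounts repeat users.
naive = sum(len(set(ids)) for ids in daily_user_ids.values())  # 3 + 3 = 6

# Correct rollup: precompute a mergeable sketch per day, union at query time.
sketches = {day: set(ids) for day, ids in daily_user_ids.items()}
exact = len(set().union(*sketches.values()))  # 4 distinct users

print(naive, exact)  # 6 4
```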

Pandora Evaluates Hive, Spark, Impala, Druid, ClickHouse, Flink, and Kyligence

After weighing different solutions, including Hive, Spark, Impala, Druid, ClickHouse, and Flink, they decided to integrate with Kyligence Enterprise. The reason is quite simple: technically, they believe Dimensional Modeling is a perfect fit for the metrics computation layer, and Kyligence’s signature technology, precomputation, matches this concept perfectly. In addition, Kyligence’s offering provides better scalability and flexibility.

Dimensional Modeling — the World is Dimensional

Dimensional Modeling is a data structure optimized to select, retrieve, and summarize large sets of related data for analytics workloads. It is a form of denormalization that combines tables, reducing the number of tables and joins. The whole point of dimensional design, or denormalization, is to simplify and speed up analytics queries. This is why Airbnb’s Minerva chose Apache Spark and Apache Druid to perform data denormalization for the desired query latency.
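To make the denormalization trade-off concrete, here is a toy sketch using in-memory SQLite; the table and column names are illustrative, not from the case study:

```python
# A toy star schema vs. its denormalized wide table, in in-memory SQLite.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales  (product_id INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
    INSERT INTO fact_sales  VALUES (1, 10.0), (1, 5.0), (2, 7.5);

    -- Denormalization folds the dimension into the fact rows,
    -- trading storage for join-free analytic queries.
    CREATE TABLE sales_wide AS
        SELECT f.amount, d.category
        FROM fact_sales f JOIN dim_product d USING (product_id);
""")

# The same aggregate, with and without the join:
star = con.execute("""SELECT d.category, SUM(f.amount)
                      FROM fact_sales f JOIN dim_product d USING (product_id)
                      GROUP BY d.category ORDER BY d.category""").fetchall()
wide = con.execute("""SELECT category, SUM(amount)
                      FROM sales_wide GROUP BY category
                      ORDER BY category""").fetchall()
print(star == wide)  # True — identical results, but the wide table skips the join
```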

The Solution — Kyligence AI-Augmented Engine

Here are the key factors considered in their technical evaluation process:

  • Modernized Dimensional Models ticks the Dimensional Modeling box

Kyligence data models are a variation of dimensional models, modernized by incorporating current Big Data technologies such as Spark 3, Parquet, ClickHouse, and Alluxio. Kyligence data models are distributed and support incremental data loading and dynamic file caching.

  • Kyligence Open-Source Engine ticks the Low-Latency & High-Throughput box

Kyligence’s open-source engine, Apache Kylin, provides low-latency, high-throughput OLAP capabilities at petabyte scale. For more technical details on how, please read this blog.

  • Kyligence Smart Pushdown ticks the Ad-Hoc Query box

“The premise for a dimensional model, like it or not, is that the questions are reasonably predetermined. Many ad-hoc queries do not fit into the category of predefined facts and dimensions.”

- Dimensional Modeling, Star Schemas and Snowflakes by Stephen C. Folkerts

Query pushdown is designed to route ad-hoc queries to the source engines while a new metric is waiting to be backfilled in Kyligence. This is one of the key features for creating a close-to-zero time-to-insight experience for end users.
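As a hypothetical illustration of that routing logic (the names below are mine, not Kyligence’s API), a pushdown router answers from the precomputed index when one exists and falls back to the source engine otherwise:

```python
# Hypothetical sketch of smart pushdown routing. A real engine inspects the
# query plan; here a metric name stands in for the whole decision.

PRECOMPUTED = {"daily_active_users"}  # metrics whose index is already built

def route_query(metric: str) -> str:
    """Pick the execution path for a metric query."""
    if metric in PRECOMPUTED:
        return "precomputed-index"      # fast path: second-level latency
    return "source-engine-pushdown"     # slower, but zero wait for new metrics

print(route_query("daily_active_users"))  # precomputed-index
print(route_query("brand_new_metric"))    # source-engine-pushdown
```

Once the new metric’s backfill completes, it simply joins the precomputed set and future queries take the fast path.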

  • Kyligence AI-Augmented Engine ticks the EVERYONE box

The Kyligence AI engine wins everyone over because it automates all the hard work: data modeling and indexing, data backfills, incremental cubing, and data refresh. The other key benefit is that the engine generates an index matched to EACH metric’s query pattern, returning query results in seconds.

The metrics platform only needs to give the Kyligence AI engine the SQL query for each metric, the backfill window, and the data refresh policy. The Kyligence engine handles the rest: generating indexes (similar to materialized views in other engines), running the initial batched backfill, and performing incremental data refresh and loading.
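A metric declaration along those lines might look like the following; the field names are assumptions for illustration, not Kyligence’s actual API:

```python
# Illustrative metric registration payload: the platform declares the metric
# SQL, backfill window, and refresh policy, and the engine derives indexes
# and schedules backfills from this declaration.

metric_spec = {
    "name": "weekly_revenue",
    "sql": "SELECT order_week, SUM(amount) FROM fact_orders GROUP BY order_week",
    "backfill": {"start": "2020-01-01", "end": "2022-01-01"},
    "refresh": {"mode": "incremental", "schedule": "daily"},
}

def has_required_fields(spec: dict) -> bool:
    """Check the declaration carries everything the engine needs."""
    return {"name", "sql", "backfill", "refresh"} <= spec.keys()

print(has_required_fields(metric_spec))  # True
```

The key design choice is that the declaration is purely descriptive: nothing in it says how to build indexes or partition the backfill, which is exactly the work the engine automates.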

Summary

Over the last two years, the Pandora team has invested heavily in iterating toward the right compute infrastructure. It lays a solid foundation for a great user experience and, eventually, a viral metric product adopted by the entire company — real data democratization.

Stay tuned for more details on this case study!

Click this link if you want to know more about Kyligence!

Thanks for reading! Please share your thoughts in the comments.

If you are interested in this case study, please share, subscribe to my email list, or follow me on Medium for upcoming blogs.

💚 Special thanks to Rachel Beddor for proofreading the article!!! 💚



Data, Strategy & Planning | Restaurant Industry