Why are Google BigQuery, Snowflake, Redshift and other cloud data warehouses slower than most expect? — Part 3

Lori Lu
4 min read · Dec 3, 2021

Now it is time for the juicy part of this blog series: how Apache Kylin bends the curve and keeps cost and query performance independent of exponential data growth.

If you have not read the previous posts in this series, please start with Part 0 and Part 2. (Yep, there is no Part 1 🥁)

Precomputation Shrinks Big Data

Design Principles 9: Precomputation
Photo by Sociomedia

In an MPP (massively parallel processing) query engine, a typical query goes through five steps: data scanning, joining, filtering, aggregation, and sorting.
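To make the five steps concrete, here is a minimal single-machine sketch in Python. The tables `sales` and `products` and the function `run_query` are hypothetical names invented for illustration; a real MPP engine distributes each step across many nodes, which this toy version does not attempt to show.

```python
from collections import defaultdict

# Hypothetical toy tables; names and values are illustrative only.
sales = [
    {"product_id": 1, "amount": 10.0},
    {"product_id": 2, "amount": 5.0},
    {"product_id": 1, "amount": 7.5},
    {"product_id": 3, "amount": 2.0},
]
products = {1: "books", 2: "toys", 3: "books"}


def run_query(min_amount):
    """Total sales per category for rows with amount >= min_amount."""
    # 1. Scan: read every row of the fact table.
    scanned = list(sales)
    # 2. Join: attach the category from the dimension table.
    joined = [{**row, "category": products[row["product_id"]]} for row in scanned]
    # 3. Filter: keep only the rows the query asks for.
    filtered = [row for row in joined if row["amount"] >= min_amount]
    # 4. Aggregate: sum the amounts per category.
    totals = defaultdict(float)
    for row in filtered:
        totals[row["category"]] += row["amount"]
    # 5. Sort: order the result by total, largest first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


result = run_query(min_amount=5.0)
```

Every call to `run_query` repeats all five steps over the full `sales` table, so the work grows with the size of the raw data.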

Precomputation, simply put, moves the heavy lifting offline, namely joining and aggregation. These two steps are the most time-consuming and resource-intensive parts of query processing. At query runtime, rather than computing over the raw data on the fly, only a minimal portion of post data processing is expected on…
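The idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Apache Kylin's actual implementation: the tables `orders` and `categories` and the functions `build_aggregate` and `query_total_by_category` are hypothetical names made up for this example.

```python
from collections import defaultdict

# Hypothetical toy data; names and values are illustrative only.
orders = [
    {"product_id": 1, "region": "EU", "amount": 10.0},
    {"product_id": 2, "region": "US", "amount": 5.0},
    {"product_id": 1, "region": "US", "amount": 7.5},
    {"product_id": 3, "region": "EU", "amount": 2.0},
]
categories = {1: "books", 2: "toys", 3: "books"}


def build_aggregate(fact_rows, dimension):
    """Offline step: join and aggregate once, ahead of any query."""
    agg = defaultdict(float)
    for row in fact_rows:
        # Join the fact row to its category, then aggregate by (category, region).
        key = (dimension[row["product_id"]], row["region"])
        agg[key] += row["amount"]
    return dict(agg)


def query_total_by_category(agg, category):
    """Query-time step: read only the small pre-aggregated table."""
    return sum(total for (cat, _region), total in agg.items() if cat == category)


precomputed = build_aggregate(orders, categories)
books_total = query_total_by_category(precomputed, "books")
```

In this sketch the query never touches `orders`. It reads one entry per (category, region) pair, so its cost depends on the number of distinct dimension combinations rather than on the number of raw rows.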
