Building petabyte-scale analytics with BigQuery and HLL

Paulius Imbrasas
Aug 6, 2019 · 11 min read

One of Permutive’s most valuable features is the analytics we provide to our publishers, ranging from the simple number of unique visitors to a page to multi-axis filtering by things like country, visited page and intersections with multiple segments. Think of it as advanced Google Analytics.

As we’ve developed this functionality, we’ve found that providing performant, cost-efficient analytics while maintaining high accuracy is really hard. We’ve learned a lot in the process, particularly about BigQuery and HyperLogLog (HLL).