Pinned · Published in Data Engineer Things · Bauplan: Operate your lakehouse with zero infrastructure · FaaS data pipelines on S3 · Mar 20
Pinned · Published in Data Engineer Things · I spent 8 hours learning Parquet. Here’s what I discovered · I finally sat down and learned about it. · Aug 24, 2024
Pinned · Published in Data Engineer Things · How does Uber build real-time infrastructure to handle petabytes of data every day? · All insights from the paper: Real-time data infrastructure at Uber · Mar 23, 2024
Published in Data Engineer Things · Why is dbt So Popular? · The motivation behind dbt and why it’s becoming a transformation standard(?) · 1d ago
Published in Data Engineer Things · Why Walmart Chose Apache Hudi for Their Lakehouse · What can we learn · Apr 10
Published in Data Engineer Things · Bufstream: Stream Kafka Messages to Iceberg Tables in Minutes · 8x cheaper than Kafka + native support for data quality and seamless transformation of Kafka topics into Iceberg tables. · Mar 27
Published in Data Engineer Things · How Meta Solves Data Lineage At Scale · Meta’s Approach to Data Lineage: How They Did It and What We Can Learn · Mar 6
Published in Data Engineer Things · Kimball Dimensional Modeling Overview · Is it still valid? · Feb 27
Published in Data Engineer Things · 8 minutes to understand Presto · Uber, Netflix, Airbnb, and LinkedIn use this query engine. · Feb 20
Published in Data Engineer Things · Let’s build a data platform like Spotify! · How they build and what can we learn. · Feb 6