Pinned · Published in Data Engineer Things · Bauplan: Operate your lakehouse with zero infrastructure · FaaS data pipelines on S3 · 2d ago
Pinned · Published in Data Engineer Things · I spent 8 hours learning Parquet. Here’s what I discovered · I finally sat down and learned about it. · Aug 24, 2024 · 23
Pinned · Published in Data Engineer Things · How does Uber build real-time infrastructure to handle petabytes of data every day? · All insights from the paper: Real-time data infrastructure at Uber · Mar 23, 2024 · 22
Published in Data Engineer Things · How Meta Solves Data Lineage At Scale · Meta’s Approach to Data Lineage: How They Did It and What We Can Learn · Mar 6 · 1
Published in Data Engineer Things · Kimball Dimensional Modeling Overview · Is it still valid? · Feb 27 · 2
Published in Data Engineer Things · 8 minutes to understand Presto · Uber, Netflix, Airbnb, and LinkedIn use this query engine. · Feb 20 · 3
Published in Data Engineer Things · Let’s build a data platform like Spotify! · How they build and what can we learn. · Feb 6 · 2
Published in Data Engineer Things · My Uncensored Guide To Saving on Cloud Data Warehouse Costs · If you follow and burn your billing, it’s not my fault. · Jan 30 · 1
Published in Data Engineer Things · I spent 6 hours learning AWS Glue. Here is what I found · The cloud-native and robust data integration tool. · Jan 23 · 5
Published in Data Engineer Things · The History of Data Engineering · The most comprehensive one you’ve ever found on the internet · Jan 18 · 9