Pinned | Published in Data Engineer Things | AutoMQ: Achieving Auto Partition Reassignment In Kafka Without Cruise Control | AutoMQ’s stateless brokers and its self-balancing feature | Nov 20
Pinned | Published in The Deep Hub | How AutoMQ Reduces Nearly 100% of Kafka Cross-Zone Data Transfer Cost | Producing data with the broker in the same availability zone with S3 WAL, self-balancing, and leveraging Kafka rack-awareness | Oct 22
Pinned | Published in The Deep Hub | How do we run Kafka 100% on the object storage? | Let’s see how AutoMQ makes this dream come true. | Aug 27
Pinned | Published in Data Engineer Things | I spent 8 hours learning Parquet. Here’s what I discovered | I finally sat down and learned about it. | Aug 24
Pinned | Published in Data Engineer Things | How does Uber build real-time infrastructure to handle petabytes of data every day? | All insights from the paper: Real-time data infrastructure at Uber | Mar 23
Published in Google Cloud - Community | How does Vortex, the BigQuery storage engine work behind the scenes? | Vortex: The BigQuery’s Stream-Oriented Storage Engine (Part 2) | 2d ago
Published in Google Cloud - Community | I spent 4 hours learning the architecture of BigQuery’s storage engine | Vortex: The BigQuery’s Stream-Oriented Storage Engine (Part 1) | 6d ago
Published in Data Engineer Things | I spent 4 hours learning how Netflix operates Apache Iceberg at scale. | Iceberg The Backbone At Netflix Data Platform Architecture | Nov 30
Published in Data Engineer Things | How does Netflix ensure the data quality for thousands of Apache Iceberg tables? | The Write-Audit-Publish pattern with Iceberg Branches | Nov 26
Published in Data Engineer Things | I spent 8 hours relearning the Delta Lake table format | The format, Read/Write process, Concurrency, Data Mutation and more | Nov 23