Vu Trinh in Data Engineer Things · The Architecture of Apache Druid · When Hadoop can solve every problem · 8 min read · Jun 15, 2024
Vu Trinh in The Deep Hub · How does LinkedIn process 4 Trillion Events every day? · Key insights on how LinkedIn leverages Apache Beam for real-time processing · 7 min read · Jun 10, 2024
Vu Trinh in The Deep Hub · All you need to know about the Google File System · How did Google build its large-scale file system? · 16 min read · May 12, 2024
Vu Trinh in Data Engineer Things · How does Uber build real-time infrastructure to handle petabytes of data every day? · All insights from the paper: Real-time data infrastructure at Uber · 19 min read · Mar 23, 2024
Vu Trinh · How does Uber handle petabytes of Spark shuffle data every day? · The Remote Shuffle Service (RSS) · 10 min read · 5 days ago
Vu Trinh in Data Engineer Things · Everything you need to know about MapReduce · All the key insights from the paper MapReduce: Simplified Data Processing on Large Clusters from Google · 10 min read · Jun 1, 2024
Vu Trinh in Data Engineer Things · How Twitter processes 4 billion events in real-time daily · From Lambda to Kappa · 6 min read · May 25, 2024
Vu Trinh in Data Engineer Things · The Hadoop Distributed File System · Everything you need to know about the HDFS · 14 min read · May 25, 2024
Vu Trinh in Data Engineer Things · I spent 5 hours understanding more about the Delta Lake table format · All insights from the paper: Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores · 17 min read · May 4, 2024
Vu Trinh · GroupBy #33: Data Gateway — A Platform for Growing and Protecting the Data Tier at Netflix, The… · Plus: Solving RevenueCat’s data ingestion challenges into Snowflake, From ZooKeeper to KRaft: How the Kafka migration works · 6 min read · May 3, 2024