Pinned · Vu Trinh in The Deep Hub · How does LinkedIn process 4 Trillion Events every day? · Key insights on how LinkedIn leverages Apache Beam for real-time processing · 7 min read · 2 days ago
Pinned · Vu Trinh in The Deep Hub · All you need to know about the Google File System · How did Google build its large-scale file system? · 16 min read · May 12, 2024
Pinned · Vu Trinh in Data Engineer Things · How does Uber build real-time infrastructure to handle petabytes of data every day? · All insights from the paper: Real-time data infrastructure at Uber · 19 min read · Mar 23, 2024
Vu Trinh in Data Engineer Things · Everything you need to know about MapReduce · All the key insights from the paper MapReduce: Simplified Data Processing on Large Clusters from Google · 10 min read · Jun 1, 2024
Vu Trinh in Data Engineer Things · How Twitter processes 4 billion events in real-time daily · From Lambda to Kappa · 6 min read · May 25, 2024
Vu Trinh in Data Engineer Things · The Hadoop Distributed File System · Everything you need to know about the HDFS · 14 min read · May 25, 2024
Vu Trinh in Data Engineer Things · I spent 5 hours understanding more about the Delta Lake table format · All insights from the paper: Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores · 17 min read · May 4, 2024
Vu Trinh · GroupBy #33: Data Gateway — A Platform for Growing and Protecting the Data Tier at Netflix, The… · Plus: Solving RevenueCat’s data ingestion challenges into Snowflake, From ZooKeeper to KRaft: How the Kafka migration works · 6 min read · May 3, 2024
Vu Trinh · GroupBy #32: Canva — Scaling to Count Billions, Ensuring Precision and Integrity: A Deep Dive into… · Plus: LLM fine-tuning and evaluation in BigQuery, How We Built Slack AI To Be Secure and Private · 7 min read · Apr 28, 2024
Vu Trinh in Towards Data Science · The Stream Processing Model Behind Google Cloud Dataflow · Balancing correctness, latency, and cost in unbounded data processing · 14 min read · Apr 27, 2024