Pinned | Vu Trinh | I spent 8 hours learning Parquet. Here’s what I discovered | I finally sat down and learned about it. | 2d ago
Pinned | Vu Trinh in Data Engineer Things | How does Uber build real-time infrastructure to handle petabytes of data every day? | All insights from the paper: Real-time data infrastructure at Uber | Mar 23
Vu Trinh in Data Engineer Things | How did Discord evolve to handle trillions of data points | From in-house solutions to the modern data stack | 6d ago
Vu Trinh in Data Engineer Things | How did Facebook design their Real-Time Processing ecosystem | Hundreds of GBs per Second | Aug 17
Vu Trinh in Data Engineer Things | How Did LinkedIn Handle 7 Trillion Messages Daily With Apache Kafka? | Was adding more machines enough? | Aug 14
Vu Trinh in Data Engineer Things | I spent 4 hours learning Apache Iceberg. Here’s what I found. | The table format’s overview and architecture | Aug 10
Vu Trinh in Data Engineer Things | How does Notion handle 200 billion data entities? | From PostgreSQL → Data Lake | Aug 6
Vu Trinh in Data Engineer Things | Diving Deep into LinkedIn’s Data Infrastructure: My 6-Hour Learning & Key Takeaways | Things I distill after reading the paper: Data Infrastructure at LinkedIn | Aug 3