Venkatakrishnan · Choosing the Right Data Lake: Iceberg vs. Hudi for Transitioning from Data Warehouses to Data Lakes · May 19
Venkatakrishnan · Eliminating Duplicates In Real-Time with AWS Kinesis and Lambda · When dealing with real-time data on high-traffic websites, duplicates can be a significant issue, often leading to inefficiencies and… · May 10
Venkatakrishnan · Part 2 — Understanding Snowflake’s Architecture · In our first article, we looked at how Snowflake Elastic Data Warehouse has become important in today’s world of data. We talked about how… · Dec 6, 2023
Venkatakrishnan · Part 1: Snowflake: The Cloud-Native Solution for Modern Data Warehousing · Nov 13, 2023
Venkatakrishnan · Hive’s Evolution: From Append-Only to ACID Support · Oct 17, 2023
Venkatakrishnan · Understanding HDFS: A Simple Guide to How Hadoop Stores Data · Hadoop’s HDFS (Hadoop Distributed File System) is a robust and scalable file system specifically designed for distributed storage and big… · Oct 13, 2023
Venkatakrishnan · How Apache Spark decides on the join strategy · Apache Spark uses a cost-based optimizer to decide on the join strategy. The optimizer takes into account a number of factors, including… · Sep 18, 2023
Venkatakrishnan · Designing a Scalable, De-coupled Multi-tenant Architecture using CDP · Sep 16, 2023
Venkatakrishnan · Data Modelling: Techniques, Importance and Implementation · Sep 11, 2023
Venkatakrishnan · Apache Spark: Query Plans and Under-the-Hood Operations · Aug 25, 2023