Venkatakrishnan – Medium

Choosing the Right Data Lake: Iceberg vs. Hudi for Transitioning from Data Warehouses to Data Lakes

May 19

Eliminating Duplicates In Real-Time with AWS Kinesis and Lambda

When dealing with real-time data on high-traffic websites, duplicates can be a significant issue, often leading to inefficiencies and…

May 10

Part 2 — Understanding Snowflake’s Architecture:

In our first article, we looked at how Snowflake Elastic Data Warehouse has become important in today’s world of data. We talked about how…

Dec 6, 2023

Part 1: Snowflake: The Cloud-Native Solution for Modern Data Warehousing

Nov 13, 2023

Hive’s Evolution: From Append-Only to ACID Support

Oct 17, 2023

Understanding HDFS: A Simple Guide to How Hadoop Stores Data

Hadoop’s HDFS (Hadoop Distributed File System) is a robust and scalable file system specifically designed for distributed storage and big…

Oct 13, 2023

How Apache Spark decides on the join strategy

Apache Spark uses a cost-based optimizer to decide on the join strategy. The optimizer takes into account a number of factors, including…

Sep 18, 2023

Designing a Scalable, De-coupled Multi-tenant Architecture using CDP

Sep 16, 2023

Data Modelling: Techniques, Importance and Implementation

Sep 11, 2023

Apache Spark: Query Plans and Under-the-Hood Operations

Aug 25, 2023

Experienced Lead Data Engineer with expertise in SAS Products, SQL, Python, Spark, Hadoop Ecosystem, AWS, Kafka, Data Warehouse, and Agile Methodologies.
