Vindhya G – Medium

Vindhya G

Exploring Spark Thrift JDBC/ODBC Server: Purpose and Overview

Exploring Data Engineering can feel overwhelming with its array of technologies and concepts. Today, let’s focus on one key aspect within…

Mar 23

Apache Flink Vs Apache Spark: Design Distinctions and their Implications

As I started learning about Flink after becoming quite skilled with Spark, a key question bothered me: What sets Flink apart from Spark…

Aug 13, 2023

Apache Flink 1.17.1: Stream and Process Kafka Events using Table API

As promised in the earlier article, I attempted the same use case of reading events from Kafka in JSON format, performing data grouping…

Jul 20, 2023

Apache Flink 1.17.0: Streaming JSON Events from Kafka -Complete Sample Code

When I initially delved into Flink, I faced a challenge in comprehending the process of running a basic streaming job. My goal was to read…

Jul 19, 2023

Optimizing Shuffle Operations in Apache Spark Structured Streaming: Key Considerations

One of the most frequently asked questions about Spark is why such a high memory-to-data-size ratio is observed. It is not uncommon for a batch size of 1GB to…

Jun 25, 2023

Stateful processing in Spark Structured Streaming — Troubleshooting Java OOM heap space error

In my earlier days of working with Spark Structured Streaming, be it an application using flatMapGroupsWithState or an application with just an…

Jan 6, 2023

How aggregation works end to end in Spark Structured Streaming

While using Spark, I learnt a lot of concepts related to distributed processing, starting right from MapReduce. While there are so many great…

Nov 5, 2022

Dataset.IsEmpty vs rdd.isEmpty() in Apache Spark 2.x.x

Having an efficient Spark application with a huge dataset plus multiple joins and aggregations is always tricky, especially if you have window…

Jun 12, 2022

Vindhya G

Lead Software Engineer at OutSystems
