Shantanu Tripathi · Database Isolation Levels (`I` in ACID) · Isolation in ACID means that each transaction appears to be executed in isolation from other transactions, even though they may be… · Feb 3
Shantanu Tripathi in Dev Genius · Behind the Scenes: What Happens After You spark-submit? · What exactly happens when you submit a spark job from your terminal in cluster mode? Let’s dive into the steps through which the job goes… · Jan 19
Shantanu Tripathi · Troubleshooting Slow Spark Job: 5 Key Areas to Investigate · Spark is supposed to reduce ETL time by leveraging the concept of efficient parallelism. If your job isn’t doing so, let’s discuss 5… · Jan 5
Shantanu Tripathi in Dev Genius · Adaptive Query Execution: What problem does it solve? · Where does AQE kick in? · Dec 31, 2023
Shantanu Tripathi · Why Shuffle is best served as External Service in Spark · Shuffle? · Dec 21, 2023
Shantanu Tripathi · How objects travel during Garbage Collection in Spark · Ask yourself this question: When garbage collection process starts, does it go through every object in memory to identify which one to… · Dec 14, 2023
Shantanu Tripathi · Redis HyperLogLog Implementation: Measure Unique Visitors on Your Website/Post · Ever wondered how LinkedIn counts unique visitors on posts (or profile) · May 24, 2023
Shantanu Tripathi · Easy Docker Setup for PySpark with Jupyter Notebook · Just like Databricks but without autocompletion · Jan 15, 2023
Shantanu Tripathi · Streaming Dummy Data to Kafka · Streaming data is becoming increasingly popular as more and more companies look for ways to process and analyze large amounts of… · Jan 14, 2023