Shubhodaya Hampiholi | Monitor costs using System Catalog tables in Databricks | Cloud costs can grow significantly if there are no proper monitoring mechanisms in place to ensure timely action on ill-performing jobs… | Oct 6
Shubhodaya Hampiholi | Why Writes in Cassandra are fast? | In this article we will deep dive into the steps involved as part of a write operation in Cassandra and the features which enable it to… | Oct 5
Shubhodaya Hampiholi | What is Databricks LakeFlow? | A unified, intelligent solution for data engineering | Jul 2
Shubhodaya Hampiholi | Implement Stream Data Processing using Databricks Autoloader and continuous workflow | Introduction: This article provides an end-to-end guide to implement a continuous streaming data intake and processing workflow using… | Mar 20
Shubhodaya Hampiholi | Comprehensive Guide on Databricks Performance Optimization | As part of this article I have tried to cover various Spark and Databricks performance optimization strategies. This article is to provide… | Jan 16
Shubhodaya Hampiholi | Streaming Data Ingestion with Databricks Auto Loader | Use case: As part of our Data Ingestion framework we wanted to adopt a robust, scalable and reusable ingestion mechanism which can cater… | Nov 27, 2023
Shubhodaya Hampiholi | Data Cataloging using PyApacheAtlas and Microsoft Purview | Use case: As part of our Data landscape, we wanted to have a unified and centralized capability which would allow searching for a… | Oct 23, 2023
Shubhodaya Hampiholi | Data Reconciliation using Apache Spark on Azure Databricks | Use case: As part of Platform Modernization and migration from Azure Gen1 to Gen2, we wanted to have a data reconciliation tool which would… | Oct 11, 2023
Shubhodaya Hampiholi | Configuration driven Data Lifecycle Management policies for Azure Storage Accounts | Use case: We wanted to have a configuration driven data lifecycle management policy framework which could be used to apply different… | Oct 5, 2023