Arvind Pant | Masking PII (Personally Identifiable Information) | As a data engineer, you’ve likely encountered databases or tables to which you do not have access without obtaining multiple approvals… | May 29
Arvind Pant | Spark Job Performance Issue: Debugging & Solution | Introduction: In the realm of big data processing, Spark jobs are often hailed for their efficiency and scalability. However… | May 1
Arvind Pant in Towards Dev | Clickhouse Ingestion | Introduction to Clickhouse: ClickHouse stands out as a high-performance columnar database tailored for rapid data retrieval and efficient… | Apr 6
Arvind Pant | Exploring Delta Lake Optimization Parameters | Delta Lake offers several optimization parameters that enhance the performance and efficiency of data storage and processing. In this blog… | Mar 30
Arvind Pant in DevOps.dev | Spark Streaming Jobs: Monitoring and Alerting for Silent Failure Part 2 | This is the second part of the series. In the first part, we explored how to integrate metrics into our Spark streaming code. In this… | Mar 17
Arvind Pant | Spark Streaming Jobs: Monitoring and Alerting for Silent Failure Part 1 | Introduction: In the fast-paced world of data streaming, ensuring the reliability of your streaming jobs is crucial. However, there are… | Mar 9
Arvind Pant | Databricks Snowflake Connector: Snowflake to Delta Transfer Issue | This is a very interesting issue we faced during data transfer from Snowflake to Delta using the Snowflake Connector. | Mar 2