Thomas Lawless · Error Handling with Apache Spark Structured Streaming · In today’s data-driven world, real-time data processing is a critical requirement for many businesses. Apache Spark Structured Streaming… · Jul 25
Thomas Lawless · Apache Spark Structured Streaming in PySpark with Apache Iceberg & Kafka · In modern data architectures, integrating streaming and batch processing with efficient data storage and retrieval is critical. Apache… · Jul 16
Thomas Lawless · Apache Iceberg: Spark SQL vs. Spark DataFrames · Apache Iceberg is a table format designed for huge analytic datasets, providing efficient data storage and retrieval. When working with… · Jun 26
Thomas Lawless · Apache Iceberg Table Maintenance using PySpark · Apache Iceberg has emerged as a powerful table format for managing large analytical datasets. Its features like schema evolution, time… · Jun 19
Thomas Lawless · Branching & Tagging Apache Iceberg Tables · Apache Iceberg is revolutionizing the way data is managed. With its robust architecture, Iceberg supports features that were traditionally… · Jun 18
Thomas Lawless · Developing with Apache Iceberg & PySpark · Apache Iceberg and PySpark are powerful tools for managing and analyzing large datasets. Setting up a local development environment is… · Jun 17
Thomas Lawless · PySpark Development with Poetry & PEX · Managing dependencies for PySpark applications can be challenging, especially when you want to maintain a clean development environment. · Jun 9
Thomas Lawless · PySpark and Software Engineering Best Practices · May 19
Thomas Lawless · Enhancing the Developer Experience for Data Scientists and Engineers · Within the practices of data science and engineering, the quest for insights and innovation hinges on the efficiency and effectiveness of… · May 14