Pierre COURGEON
How to run Pandas code on Spark
Since Spark 3.2, the pandas API on Spark has been integrated into PySpark. We will see why and when it should be used.
Jan 31, 2022
Pierre COURGEON
End-to-end data pipeline tests on Databricks
Data pipeline testing is evolving quickly, but there is still a big gap to bridge with software engineering best practices. In this article…
Oct 10, 2022