I Asked ChatGPT to Build a Data Pipeline, and Then I Ran It: Your job might be safe. For now… (Better Programming, Apr 5, 2023)
5 Books that Make You a Better Data Engineer: And one book that you didn’t know you needed (Jul 1, 2022)
Things I Enjoyed Last Week (#1): I deeply enjoy the process of collecting and then sharing little bits of what I’ve learned. The result of collecting these little pieces of… (Jan 27)
Use Rust to Write Spark Apps: Until Spark 3.4, developing and deploying a Spark application was sometimes a big hassle. Getting Spark running locally for development… (Jul 3, 2024)
Writing PySpark UDFs in Rust: Performance comparison of different UDF methods (Better Programming, May 2, 2023)
Passing the Databricks Professional Data Engineer Exam: How to prepare for the Databricks Data Engineer Professional Exam (Mar 27, 2023)
4 Delta Lake Metadata Queries You Should Know: Get all the metadata you could want about your Lakehouse (Better Programming, Oct 26, 2022)
Build a FastAPI on the Lakehouse: Create a FastAPI on the Databricks Lakehouse with full CRUD capabilities (Better Programming, Jul 18, 2022)
How to Integrate Great Expectations with Databricks: Get better data quality metrics with one change to Great Expectations (Better Programming, Jul 7, 2022)
3 Methods for Dealing with Bad Data Quality: Design to handle bad data before it poisons your platform! (Jun 30, 2022)