The ultimate test of your Docker Image: Running in GitHub Actions. I thought it would be simple… (Published in Data Engineer Things, Oct 9)
Delta Lake Data Generation. Easy way to generate production-like records and save them into Delta Lake format. (Oct 2)
Data Pipeline Testing: Iceberg Data Generation. Generate millions of production-like records and load them into Iceberg tables with ease. (Oct 2)
Parquet Data Generation. Generate millions of production-like records and save them into Parquet files with ease. (Oct 1)
Data Contracts in Action: Tools. Some people have asked me “Are data contracts really a thing?”, soon followed up by “What tools are available when using data contracts?” (Published in Data Engineer Things, Sep 30)
Data Contracts in Action: Testing. There has been a lot of talk about data contracts, but not much action. Let us see them in action when applied to testing data pipelines. (Jul 17)
insta-infra: Single command quickstart for any tool. The simple way to spin up any tool on your local laptop. (Published in Data Engineer Things, Jun 12)
File Formats Unveiled: Exploring the World of File Format Differences. So you want to store data in a file. But which file format suits your use case? What are the differences between Parquet and ORC? Delta… (May 11)
As a data engineer, I was overwhelmed with the number of open-source technologies. Compare the differences between technologies and tools such as file formats, job orchestrators and more at Tech Diff. (May 6)
Annoying Bug of The Day: HTML Form Submit. TLDR: Beware of buttons in forms in HTML. Set <button type="button">. (May 5)