Self-hosting Airbyte OSS on AWS Elastic Kubernetes Service
Airbyte is an open-source data integration platform. This article takes you through hosting Airbyte on Kubernetes for your large workloads.
Nov 5
Self-hosting Prefect on AWS EC2. Managed via Terraform and prefect.yaml
Hosting Prefect Server on AWS EC2. Deploying scheduled workloads to Docker workers
Apr 21
3 techniques to write highly optimized queries for BigQuery
In the first part of this series on how to optimize BigQuery for faster (and less costly) querying, we looked at how BigQuery separates…
Apr 2, 2023
Why You Shouldn’t Use Kafka as a Data Lake, and What To Do Instead
How do you design your data architecture such that both real-time and historical data are available when and as needed?
Mar 21, 2023
4 ways to optimize your BigQuery tables for faster queries
BigQuery is a popular analytical database (OLAP) on the Google Cloud Platform. It’s designed (and optimized) for heavy analytical queries…
Mar 9, 2023
The physical structure of an SQL index
The data structures used to implement an SQL index
Nov 29, 2021
Building my own (recursive descent) SQL parser in Python
A validating parser for a subset of the SQL syntax
Nov 19, 2021
What happens when you join two tables in SQL?
Understanding SQL JOIN strategies; exploring what happens under the hood when you JOIN tables in Postgres
Nov 12, 2021