At Opendoor, data is central to everything we do. It follows, then, that the platform that manages all of that data is mission critical. Our analytics platform is built on Snowflake and runs hundreds of thousands of queries per day, answering questions and crunching through terabytes of data for our users.
Snowflake is a great tool for this job, as it manages this load with ease and can instantly scale up and down to handle spiky demand. But how do you effectively manage Snowflake itself?
This post outlines how and why we moved from configuring our Snowflake account manually to…
If you’re interested in using Airflow to orchestrate workflows, there are a number of good introductory tutorials out there, including the one in the Airflow documentation itself.
However, most of these tutorials focus on the fundamental concepts of the tool (e.g., What is a DAG? How do I define dependencies between tasks?) and skip a critical part of getting started with Airflow: how do you actually get your own Airflow deployment running and ready to handle real-world workloads?
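For readers new to those two fundamentals: a DAG is just a set of tasks plus dependency edges, and the scheduler's job is to run tasks in an order that respects those edges. A minimal sketch in plain Python (illustrative only, using the standard library's `graphlib` rather than Airflow's API; the task names are made up):

```python
from graphlib import TopologicalSorter

# A tiny ETL-style DAG: each key lists the tasks it depends on.
# "transform" can't run until "extract" finishes, and "load" waits on "transform".
deps = {
    "transform": {"extract"},
    "load": {"transform"},
}

# static_order() yields tasks so that every task's dependencies come first,
# which is exactly the guarantee an orchestrator like Airflow provides.
order = list(TopologicalSorter(deps).static_order())
print(order)  # ['extract', 'transform', 'load']
```

Airflow expresses the same idea with operators and the `>>` dependency syntax, but the underlying model is this topological ordering.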
This series aims to guide you through the answer to that question. By the end, you’ll have: