Part 1 — Rebuilding at Scale

Authors: Jonathan Parks, Vaughn Quoss, Paul Ellwood

Introduction

At Airbnb, we’ve always had a data-driven culture. We’ve assembled top-notch data science and engineering teams, built industry-leading data infrastructure, and launched numerous successful open source projects, including Apache Airflow and Apache Superset. Meanwhile, Airbnb has transitioned from a startup moving at light speed to a mature organization with thousands of employees. During this transformation, Airbnb experienced the growth challenges typical of most companies, including those that affect the data warehouse. This post explores the data challenges Airbnb faced during hypergrowth and the steps we took to overcome them.

Background

As Airbnb grew from a small startup to the company it is today, many things changed. The company ventured into new business areas, acquired numerous companies, and significantly evolved its product strategy. Meanwhile, the requirements on our data also changed. For instance, leadership set high expectations for data timeliness and quality and increased its focus on cost and compliance. To continue meeting these expectations, we needed to make sizable investments in our data. These investments centered on ownership, data architecture, and governance. …
