Tracking the Money — Scaling Financial Reporting at Airbnb
At Airbnb, the Payments team is responsible for everything related to moving money in Airbnb’s global marketplace. We build technology that powers Airbnb’s massive daily transaction volume to collect payments from guests and distribute payouts to hosts. Our goal is to make the payment experience on Airbnb delightful, magical, and intuitive.
Historically, the payments team’s focus was to implement new features, currencies, and payment methods to make payments local in a global business. Our sphere has grown to include compliance (sales taxes, earnings taxes, licenses, and more) as well as reconciliation and financial accounting according to generally accepted accounting principles.
Currently, Airbnb’s payment and financial accounting system is a complex ecosystem that transacts in 191 countries, with 70+ currencies and 20+ processors. Not only has Airbnb’s transaction volume grown exponentially every year, we have also rapidly added features and products to our platform. Airbnb hopes to become a premier end-to-end travel service, helping people not only with accommodations but also with trip experiences.
Maintaining the existing financial accounting system while supporting new products and an ever-increasing data volume had become a “mission impossible” sort of task.
Airbnb’s Finance Infrastructure engineering team is responsible for delivering accurate, reliable, and comprehensive business/financial data to our stakeholders. In this blog post, we’ll talk about how we keep track of where all of our money is and how it moves, in a scalable way, in the face of exploding data size and complexity, while also supporting new Airbnb initiatives and payment products. We’ll share the workflow of our deprecated finance system, illustrate its challenges and issues, and then describe the new system that we built to replace it.
The prior financial system: a MySQL-based data pipeline
Built in early 2012 and retired in late 2016, our previous financial system was a MySQL data pipeline. It was a parameterized MySQL ETL that ran nightly to provide financial reporting, and served us faithfully for the past few years. The workflow was as follows:
- We enabled MySQL database triggers for all our main tables to be able to capture each change to Airbnb reservation and payment records as they happen in real time on a per row basis. This way, we enforced immutability, and ensured all financially relevant events would be captured. This was meant to be a temporary solution to deal with the inherent mutability of the production data. We acknowledge that it wasn’t a great pattern, but it was necessary at the time.
- By being able to replay the history of events with database triggers, we built a set of intermediate helper tables that assist in calculating different reports. For example, we kept track of recognized revenue, guest receivables, future host payouts (liabilities), and other essential components of financial reporting.
- Based on those helper tables, we built all the financial reporting.
The Scaling Challenge
There were some advantages to this approach. Since we relied only on MySQL DB triggers and MySQL guarantees data accuracy, engineers had lots of flexibility to change the business logic and to move very fast. SQL-based reports can be written very quickly when the logic is simple.
However, the SQL-based ETL approach was not scaling well:
It wasn’t the right language for the programming model
- SQL is good for lightweight data transformation. It is not designed to handle complicated business data flow. Modern software design principles cannot be easily applied to decompose the complexity.
- The original logic was tightly coupled with our core reservation logic. As Airbnb grew, the company needed to support product logic other than the original reservation flow, e.g. we needed to pay professional photographers, translators, and so on. Every product change had a unique money flow, and thus, financial and accounting impacts. As a result, we had to build many additional reports to meet each business need. Because the MySQL logic was structured around reservation logic, it was hard to modify and was extremely error prone when adding new logic for other products.
- Validation and testing became impossible. To get a comprehensive result, we pulled the data separately for all the different reports and our finance team combined them at their discretion. Because we built all the reports separately, it could be very difficult to tell which change in which report caused issues if there were number mismatches. This was not scalable when the company wanted to change the product or add new products more frequently.
- As we added more and more reporting logic, and as our transaction volume grew, developing reports using SQL scripts became more complex, and prohibitively so. It became extremely difficult and time consuming to test the logic for accuracy.
The nightly runs were taking too much time
As our transaction volume grew and our transformation logic became more complex, the nightly pipeline took more time. A relational database is hard to scale up. It is difficult to shard the data and difficult to leverage a distributed system to process a massive amount of data. Towards the end of its life, we were only able to run it every other day due to its runtime of over 24 hours.
As we grew, we needed to be able to cope with dramatically increasing data size and the frequent addition of new Airbnb products and payment channels. Thus, we had two goals for our new system:
- The new system should give us enough flexibility to support more products as well as deal with product changes or accounting logic changes. To do this, we need to decouple the financial logic from the product logic. The representation of those product behaviors can be very generic in our financial system, e.g. how to book the receivable / payable / revenue / tax / etc. Thus we can build very sustainable financial reports based on highly normalized data.
- The new system should scale horizontally. As volume grows, we should be able to scale out our system by just adding machines.
Introducing our new financial reporting pipeline
Our event-based financial reporting system is designed to:
- support all current and future product types in our platform
- have a holistic view of all events with financial impact on our platform
- horizontally scale as our transaction volume grows
It is powered by Apache Spark, stored on our HDFS cluster, and written in Scala. Spark is a fast and general engine for large-scale data processing, with implicit parallelism and fault-tolerance. We chose Scala as the language because we wanted the latest features of Spark, as well as the other benefits of the language, like types, closures, immutability, lazy evaluation, etc.
This is huge considering our previous language was SQL.
A brief overview of how it works
Our new financial reporting system has a concept of different product types, of which reservations are only one. Each product type has its own set of platform and payment events, and a corresponding set of financial events. Thus we can address each product type individually and systematically build up to a holistic report.
The system can be thought of as many event handlers that calculate the accounting impact of different products at different points in their life cycle. Because Scala has a strong static type system, while providing full support for functional programming, it is easy to design and write handlers about different products and how to process them.
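As a rough illustration of the handler idea, here is a minimal sketch. It is written in Python for brevity rather than the Scala used in the real system, and every class, field, and handler name in it is hypothetical. Each handler is registered under a (product type, event kind) pair, and events with no registered handler are treated as having no accounting impact.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class PlatformEvent:
    product_type: str   # e.g. "reservation" or "photography"
    product_id: int
    kind: str           # e.g. "booking"
    amount_cents: int


@dataclass(frozen=True)
class AccountingEvent:
    product_type: str
    product_id: int
    description: str    # hypothetical label for the accounting impact
    amount_cents: int


Handler = Callable[[PlatformEvent], List[AccountingEvent]]


def handle_reservation_booking(e: PlatformEvent) -> List[AccountingEvent]:
    return [AccountingEvent(e.product_type, e.product_id, "contract_created", e.amount_cents)]


def handle_photography_booking(e: PlatformEvent) -> List[AccountingEvent]:
    return [AccountingEvent(e.product_type, e.product_id, "vendor_payable_created", e.amount_cents)]


# One handler per (product type, event kind); supporting a new product
# means registering new handlers, not rewriting existing ones.
HANDLERS: Dict[Tuple[str, str], Handler] = {
    ("reservation", "booking"): handle_reservation_booking,
    ("photography", "booking"): handle_photography_booking,
}


def process(events: List[PlatformEvent]) -> List[AccountingEvent]:
    accounting_events: List[AccountingEvent] = []
    for e in events:
        handler = HANDLERS.get((e.product_type, e.kind))
        if handler is None:
            continue  # not every platform event has a financial impact
        accounting_events.extend(handler(e))
    return accounting_events


sample = [
    PlatformEvent("reservation", 1, "booking", 10000),
    PlatformEvent("reservation", 1, "message_sent", 0),  # ignored: no handler
]
result = process(sample)
```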
Below is a diagram of how the data flows through the system. Don’t worry, we’ll explain everything.
Platform events are events that provide information about product related changes, like reservations, reservation alterations, photography, cancellations, etc. These usually have some financial expectation associated with them, but it is not always the case. Each time a product is created or updated, we derive an event for that product. For example, when a reservation is booked, we emit a booking event for the reservation product type. A day after the reservation starts, we consider the reservation to have “services rendered”. When the service is delivered to the customer (the guest in this case), we can then recognize revenue. These events are important because they have financial accounting implications, and we will talk about those more below.
Payment events describe money movement. They can be events where real money moves in and out of Airbnb bank accounts. Payment events also describe stored value, like when someone buys and uses gift cards. There are other kinds of payment events where no money actually moves, but we still have to account for the transaction. This can be when someone sends someone else a gift card, and that person claims it. We consider those to be balance transfers, or virtual movement. Another example of no money movement is when a guest uses a coupon. The coupon is funded from the marketing budget, but no money has actually moved between accounts; we just need to account for it somewhere. Money in must equal money out. Because coupons apply on the guest side and don’t change the original host payout amount or other host-side operations, we need to record them so that the money equation balances.
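The coupon case can be made concrete with a little arithmetic. The sketch below is an illustration only: the function, the account names, and the numbers (a $100 reservation, a $90 host payout, a $20 coupon) are all assumptions, not figures from our system.

```python
def money_equation(guest_price_cents, host_payout_cents, coupon_cents):
    """Split a reservation into money in and money out (all amounts in cents)."""
    coupon_applied = min(coupon_cents, guest_price_cents)
    money_in = {
        "guest_payment": guest_price_cents - coupon_applied,  # real cash from the guest
        "marketing_budget": coupon_applied,                   # virtual movement, no cash
    }
    money_out = {
        "host_payout": host_payout_cents,                     # unchanged by the coupon
        "airbnb_fees": guest_price_cents - host_payout_cents,
    }
    return money_in, money_out


money_in, money_out = money_equation(10000, 9000, 2000)
# Money in must equal money out, with or without a coupon.
assert sum(money_in.values()) == sum(money_out.values())
```

Without the marketing-funded portion, the guest side would show only $80 coming in against $100 going out, which is why the coupon has to be recorded even though no cash moved.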
These events are currently generated from examining the aforementioned accounting audit rows for changes. If the change has an accounting impact, then a platform or payment event is generated. This system was designed as a central place for all the data to pass through from different systems. The financial reporting system processes these platform and payment events with event handlers, and produces accounting events that describe the accounting impact of those events.
Accounting events are generated by event handlers that build them from the payment and platform events. We introduced this layer of abstraction to represent the relationship between the different platform and payment events for each product, as well as the accounting logic. Sometimes, as you’ll see below, a single platform event can generate more than one accounting event, because it has multiple accounting impacts at different times. These events keep track of what happened by assigning a unique identifier, made up of the product type and the product id, to a set of activities. We consider the product type and id together to be the smallest accounting unit that we operate on.
From accounting events, we generate the subledger, which is the basis for all of our financial accounting. Each entry is a detailed accounting record that includes information about the time a transaction occurred (payment or reservation booking), the amount, the currency, the direction of the monetary impact (credit or debit) and the account that it impacts.
The subledger is generated using double entry accounting. Double entry accounting allows us to be sure that everything is accounted for properly in the system. This means no money appears or disappears without a source.
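To show what a double entry subledger record might look like, here is a minimal sketch. The field names follow the description above, but the account names (including `deferred_revenue`) and the amounts are assumptions for illustration. The key property is that, per currency, debits and credits always sum to the same total.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SubledgerEntry:
    product_type: str
    product_id: int
    occurred_at: str    # ISO date of the payment or booking
    account: str
    direction: str      # "debit" or "credit"
    amount_cents: int
    currency: str


def booking_entries(product_id, occurred_at, total_cents, host_cents, currency="USD"):
    """Entries for a confirmed reservation; account names are hypothetical."""
    def entry(account, direction, amount_cents):
        return SubledgerEntry("reservation", product_id, occurred_at,
                              account, direction, amount_cents, currency)
    return [
        entry("guest_receivable", "debit", total_cents),
        entry("future_host_payable", "credit", host_cents),
        entry("deferred_revenue", "credit", total_cents - host_cents),
    ]


def is_balanced(entries):
    """True when debits equal credits in every currency."""
    totals = defaultdict(int)
    for e in entries:
        totals[e.currency] += e.amount_cents if e.direction == "debit" else -e.amount_cents
    return all(total == 0 for total in totals.values())


entries = booking_entries(1, "2017-03-01", 10000, 9000)
```

If any one of the three entries were dropped, `is_balanced` would return `False`, which is how money appearing or disappearing without a source would show up.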
Even though each product type may behave differently, we’ve found a generic life cycle that all product types share. Let’s walk through how a reservation would look in this framework.
An event happens that introduces some accounting liability, and a contract is created. No money has moved at this point, but expectations for future money flow are set up here. The accounting events at this point describe the contract that has been created.
- Here, a guest has confirmed a two night reservation for $100. We refer to this time as booking date, and treat it as the contract start date. Of the $100, $90 is the price of the stay, $10 are fees. (This price breakdown is for example purposes only, and does not reflect any real reservation. We’ve also left the host fees out of this to simplify the explanation.)
- The guest and the host have entered into a contract with each other, with Airbnb acting as the platform and payment collection agent. In exchange for staying at this listing for two nights, the guest will pay Airbnb $100, and the host will receive $90 some time after check-in. On the date of the reservation confirmation, we have a guest receivable of $100, and a future host payable of $90, due to the host when we consider the reservation to be fulfilled. These expectations are set up as soon as the reservation is confirmed.
An event’s money flow occurs. This can happen anytime after the contract has been created. The accounting events here describe the direction and for what liability the amount is fulfilling.
- Here, the guest successfully pays $100 for the reservation. We would consider the guest receivable then fulfilled.
The event happens and so the contracted service is fulfilled.
- This is after the check-in time. This is the point at which we would recognize revenue and losses, if there are any. In this example, there are no losses. This is also the time that the scheduled $90 payout for the host should be delivered to the host. This means that the future host payable is now a host payable because it is now due.
More money flows may occur.
- This is when we successfully deliver the $90 payout to the host. Now Airbnb’s host payable is $0 for this particular host.
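The four stages above can be traced as a small journal. This is a sketch under assumptions: the $100 / $90 / $10 amounts come from the example, while the stage names and the `cash` and `deferred_revenue` accounts are hypothetical labels chosen for illustration.

```python
from collections import defaultdict

STAGES = ["booking", "guest_payment", "service_rendered", "host_payout"]

# (stage, debit account, credit account, amount in USD cents)
JOURNAL = [
    ("booking", "guest_receivable", "future_host_payable", 9000),
    ("booking", "guest_receivable", "deferred_revenue", 1000),
    ("guest_payment", "cash", "guest_receivable", 10000),
    ("service_rendered", "future_host_payable", "host_payable", 9000),
    ("service_rendered", "deferred_revenue", "revenue", 1000),
    ("host_payout", "host_payable", "cash", 9000),
]


def balances(through_stage):
    """Net balance per account after a stage (debits positive, credits negative)."""
    cutoff = STAGES.index(through_stage)
    totals = defaultdict(int)
    for stage, debit, credit, amount in JOURNAL:
        if STAGES.index(stage) <= cutoff:
            totals[debit] += amount
            totals[credit] -= amount
    return dict(totals)


after_booking = balances("booking")
final = balances("host_payout")
```

After booking, the guest receivable is $100 against a $90 future host payable. After the payout, every receivable and payable is back to zero, leaving $10 of cash matched by $10 of recognized revenue.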
Sometimes, there are alterations on a product or a payment. An example would be a guest adding dates to a reservation, which changes the reservation price. To properly account for the price differences from an alteration, we “unbook” and “rebook” the reservation entirely at the time of alteration.
- For example, if the reservation was previously booked for $100, and now it is $150 because the guest extended their stay at a later date, we could either book an additional $50 on the later date, or unbook the reservation for $100 and rebook it for $150 on that day. Why do you think we chose to do the latter? It’s because when alterations add up, it’s much easier to just unbook and rebook, instead of computing the delta every time. It’s the cleanest way we can deal with product alterations and data backfills. Just imagine how alterations on a long term reservation would look in our system!
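A minimal sketch of unbook and rebook, using the $100 to $150 alteration from the example. The dates, function names, and the single offsetting `booked_obligations` account are simplifying assumptions; a real booking would be split across several accounts.

```python
from collections import defaultdict


def book(date, price_cents):
    """Book a reservation at a price (debits positive, credits negative)."""
    return [
        (date, "guest_receivable", price_cents),
        (date, "booked_obligations", -price_cents),
    ]


def unbook(date, price_cents):
    """Reverse a previous booking by negating every entry."""
    return [(d, account, -amount) for d, account, amount in book(date, price_cents)]


def alter(date, old_price_cents, new_price_cents):
    """Unbook at the old price and rebook at the new one on the alteration date."""
    return unbook(date, old_price_cents) + book(date, new_price_cents)


def net(entries):
    totals = defaultdict(int)
    for _, account, amount in entries:
        totals[account] += amount
    return dict(totals)


ledger = book("2017-03-01", 10000) + alter("2017-03-05", 10000, 15000)
```

The net effect is the same as booking the $50 delta, but the handler never has to work out what changed: it only needs the old terms and the new terms, however many alterations pile up.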
Now that we’ve built the subledgers from the platform and payment events, and their handlers, we can easily query for the financial impact to different accounts that any event generates.
An example of how revenue would be queried from the subledger is as follows:
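As an illustration only, a query of this kind could look like the sketch below. It uses an in-memory SQLite table from Python; the table name, column names, account names, and sample rows are all assumptions for the sake of the example and are not our production schema or query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE subledger (
           product_type TEXT, product_id INTEGER, occurred_at TEXT,
           account TEXT, direction TEXT, amount_cents INTEGER, currency TEXT)"""
)
conn.executemany(
    "INSERT INTO subledger VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("reservation", 1, "2017-03-02", "deferred_revenue", "debit", 1000, "USD"),
        ("reservation", 1, "2017-03-02", "revenue", "credit", 1000, "USD"),
        ("reservation", 2, "2017-03-10", "deferred_revenue", "debit", 2500, "USD"),
        ("reservation", 2, "2017-03-10", "revenue", "credit", 2500, "USD"),
        ("reservation", 3, "2017-04-01", "deferred_revenue", "debit", 700, "USD"),
        ("reservation", 3, "2017-04-01", "revenue", "credit", 700, "USD"),
    ],
)


def revenue_cents(conn, start, end, currency):
    """Net revenue (credits minus debits) with start <= occurred_at < end."""
    row = conn.execute(
        """SELECT COALESCE(SUM(CASE direction
                                   WHEN 'credit' THEN amount_cents
                                   ELSE -amount_cents END), 0)
           FROM subledger
           WHERE account = 'revenue'
             AND currency = ?
             AND occurred_at >= ?
             AND occurred_at < ?""",
        (currency, start, end),
    ).fetchone()
    return row[0]


march_revenue = revenue_cents(conn, "2017-03-01", "2017-04-01", "USD")
```

Because every entry already carries its account, direction, and currency, the query is a filter and a sum rather than a chain of joins across product tables.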
It scales while maintaining quality and performance
Our financial reporting pipeline scales both on a product basis, and on a runtime basis. We can easily support new products on the financial engineering side because we’ve built a framework around the right abstractions, instead of tying it too closely with one specific product’s life cycle. We can also scale horizontally. This is much better than being limited by an Amazon RDS instance, no matter how beefy it may be. Our nightly runtime is 4–5 hours and has not been growing too much as of March 2017.
It made troubleshooting much simpler
Before, when our Finance team needed comprehensive reports, we pulled the data separately for all the different reports and our finance team combined them at their discretion. Because we built all the reports separately, it could be very difficult to tell which change in which report caused unexpected deviances. This was not scalable when the company wanted to change the product or add new products more frequently. Now when there’s an issue, we investigate data from a single source of truth, significantly simplifying the troubleshooting process.
It made coding much simpler
Going from declarative programming to functional programming has been a powerful paradigm shift in how we think about financial processing and accounting. We can now think of this system as a straightforward actor/handler system rather than getting mired in complicated SQL-join logic.
The nightly runs are timely and well monitored
Originally, the MySQL ETL was scheduled via a crontab, and was dependent on data arriving via a different pipeline. Instead of taking upwards of a day, the nightly run now takes around 4–5 hours to complete. We no longer have to babysit a legacy system that quite frequently encountered snags caused by upstream changes, taking hours of developer time each week to resolve.
We built a comprehensive test framework
This is perhaps the most important part of the picture. Because our financial processing and reporting is no longer in SQL, we are now also able to write extensive suites of unit tests against specific handlers. Together with integration tests and smoke tests, we can easily identify regressions and other errors. Smoke tests are rules we expect our data to follow; when rule violations occur, they are logged and addressed. This gives us a high degree of confidence in the quality of our data and lets us quickly vet new changes and roll them out. We have built an extensive (and ever expanding) test suite of real transactions, in which we check how we expect them to look in the system individually as well as in aggregate. This test framework is critical when we have time sensitive requests from our various partners in Finance and Legal, as well as from our audit partners, and need to be confident in our reporting.
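A smoke test of this kind can be sketched as a named rule that returns the records violating it. The rule shown, the logger name, and the sample entries are assumptions for illustration; the real rules are not listed in this post.

```python
import logging
from collections import defaultdict

logger = logging.getLogger("finance.smoke_tests")  # hypothetical logger name


def unbalanced_products(entries):
    """Rule: for every product, debits must equal credits."""
    totals = defaultdict(int)
    for product_type, product_id, direction, amount_cents in entries:
        signed = amount_cents if direction == "debit" else -amount_cents
        totals[(product_type, product_id)] += signed
    return sorted(key for key, total in totals.items() if total != 0)


def run_smoke_tests(entries, rules):
    """Run every rule, log each violation, and return them grouped by rule name."""
    report = {}
    for name, rule in rules.items():
        violations = rule(entries)
        for violation in violations:
            logger.warning("smoke test %s violated by %s", name, violation)
        report[name] = violations
    return report


RULES = {"debits_equal_credits_per_product": unbalanced_products}

entries = [
    ("reservation", 1, "debit", 10000),
    ("reservation", 1, "credit", 10000),
    ("reservation", 2, "debit", 5000),
    ("reservation", 2, "credit", 4000),  # deliberately off by $10
]
report = run_smoke_tests(entries, RULES)
```

Adding a new rule means adding one more function to `RULES`, so checks can grow alongside the products they cover.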
In the future, we will be moving towards an entirely event-based system. The financial reporting system will consume events emitted from other systems. Stay tuned to read about that in a future blog post. This will help us with even greater financial integrity and a richer vocabulary with which we can express different products and payment flows.
In the end, what we want most is complete, accurate, and extensible financial reporting for all of our current and future products at Airbnb. We believe that we have designed the financial reporting system to be a strong foundation for all financial processing at Airbnb. Because of the clean decoupling of business logic and accounting logic, this system is product agnostic, extensible, and future-proof, which gives us confidence it will serve us well for many years to come. This is just the start of our back office financial systems at Airbnb.
If you enjoyed reading this and thought this was an interesting challenge, the payments team is always looking for talented people to join the team, whether you are a software engineer or a data scientist.
Please stay tuned for more on the Payments ecosystem at Airbnb!
Many thanks to Sarah Hagstrom, Lou Kosak, Shawn Yan, Brian Wey, Jiangming Yang, and Ian Logan for reading through many drafts and helping me to write this post.