Building Services at Airbnb, Part 4

Junjie Guan
Sep 30, 2020 · 11 min read

In the fourth post of our series on scaling service development, we dive into building Schema Based Testing Infrastructure for service development in Airbnb.

By Junjie Guan, Rui Dai and Xing An


While this new architecture solves some fundamental problems of a monolithic behemoth code base, it imposes new challenges that hinder engineering productivity and quality. Here we are happy to share some work we shipped in 2019.

Why schema-oriented infrastructure is critical for service testing

1. Breaking API Changes

“Adding a new data field broke the listing availability service’s API xxx before the corresponding service code changes got deployed. The problematic deploy got automatically rolled back after the errors shot up.”

“Schema field deprecation should only happen after we are sure the fields are no longer used. However, a change to remove fields that are used in xxx was deployed first with no gem upgrade on the xxx side later. The backward incompatible change broke xxx service.”

2. Lack of Usable API Mock Data for Service Consumer

It also forces engineers into unnecessarily heavyweight test setups, for example, spinning up dependent services and even dummy databases. Those heavy efforts are counterproductive: tests are supposed to be lightweight and run frequently.

3. Lack of Assurance for Mock’s Semantic Correctness

4. Lack of Validation for Service Owner

However, without framework-level validation support, service owners must do their own heavy lifting to set up validations against their API endpoints, and those validations can differ from the ones used by the API consumers.

5. Lack of Real Time Metrics for API Test Quality

6. Most Importantly: Test in a Lightweight and Scalable Manner

To tackle the above challenges in a lightweight and scalable manner, we find that service schema is the keystone.

Note that there are many critical components needed for a successful testing story: organization-wide education and best practices, a scalable test runner, continuous integration (CI) infrastructure, continuous delivery (CD) infrastructure, testing environments, etc. In this post, we will focus only on the schema aspect of service API testing.

How service owners and consumers benefit

For service owners:

  1. Static API Schema Validation: an automated backwards-compatibility check and schema linter. It catches breaking API changes as early as possible and enforces schema best practices at scale.
  2. API Integration Testing Framework (AIT): verifies API endpoint behavior without boilerplate and provides real-time API integration test coverage metrics.

For service consumers:

  1. API Mocking Framework: provides near-real-world API mocks transparently in both unit tests and lightweight integration tests. There is no need to create mock clients or endpoints for upstream services.
  2. API Integration Testing Framework (AIT): uses the same API mock data used by service owners, ensuring the semantics stay in sync with the service implementation.

Let’s dig a little deeper into how each component works.

I. Static API Schema Validation

  • Backwards Compatibility Check focuses on detecting potential API breaking changes.
  • Schema Linter aims at enforcing our IDL best practices.

The two checks are very similar in that they both require construction and traversal of an Abstract Syntax Tree (AST) for all changed schemas. We created a linter binary that can be easily plugged into different build systems (e.g., Java and Ruby).

Figure: Flow charts that describe how Schema Validation works

The activity diagram for the check itself is shown below. Note that we’ve consolidated both checks into one diagram to better reflect their difference.

Figure: Implementation of Schema Validation

With the capability to analyze schema ASTs, we are now able to capture both breaking API changes and updates that do not follow our best practices.

Benefits of Schema Validation

Before the emergence of the automated tooling, both checks were conducted via manual code review, which does not scale and could cause issues.

1. Detect breaking API changes before merge

Now with static schema validation, we are able to detect bad API changes (e.g., field type changes, field ID changes) and prevent them before code merge.

Figure: Dashboard of detected API breaking change

2. Enforce schema best practices at scale

We are able to enforce in-house API best practices with zero human intervention. One simple example is creating a new enum type without explicit values, which is a ticking time bomb: serialization errors can occur if the enum names and values shift unintentionally.

This kind of error is very hard and expensive to detect in testing. Thanks to Static Schema Validation, it now can be captured early and easily in CI.
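The enum hazard above can be demonstrated in a few lines. This sketch models implicit, position-based enum numbering (as many IDLs do by default); the status names are invented for illustration.

```python
# Why an enum without explicit values is a ticking time bomb: members
# numbered implicitly by position shift when a new member is inserted,
# so previously serialized integers decode to the wrong name.

def implicit_enum(*names):
    """Model an IDL enum whose values default to member position."""
    return {i: name for i, name in enumerate(names)}

v1 = implicit_enum("ACTIVE", "SUSPENDED")
wire_value = 1  # an integer serialized by a v1 writer, meaning SUSPENDED

# A later change inserts PENDING in the middle, shifting SUSPENDED to 2.
v2 = implicit_enum("ACTIVE", "PENDING", "SUSPENDED")

print(v1[wire_value], "->", v2[wire_value])  # SUSPENDED -> PENDING
```

The old wire value now silently decodes to a different member, which is why the linter requires explicit enum values up front.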

II. Schema-based API Data Factory

If schema is the silhouette of an API, then API Data Factory is the figure. It materializes the API with meaningful data. It provides two fundamental capabilities:

  • Mocking dependent service API in unit tests and integration tests.
  • Validating API endpoint correctness using the materialized API request/response.
Figure: With these fundamental abilities, we built frameworks that help both API consumers and API producers

Fixture File Implementation

Figure: The structure of fixture files under a service

To give our readers an intuitive understanding of the fixtures data, below is a simple example.

Figure: simple example of what a fixture file looks like
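For illustration, a hypothetical fixture for a listing-lookup endpoint might look like the following; the endpoint, fixture, and field names here are invented, not taken from a real service.

```yaml
# Hypothetical fixture file for a "load_listing" endpoint.
load_listing:
  simple_listing:
    request:
      listing_id: 42
    response:
      listing:
        id: 42
        city: "San Francisco"
        # assertion annotation (illustrative): must be non-negative
        bathrooms: 1.5
```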

API Data Factory Features

1. Language-Agnostic API Data

It stores language-agnostic API data as YAML that can be embedded in and consumed easily by any program. It also supports extended features for various usages (e.g., shared fixture data, flexible matching, data assertion annotations).

2. Automated Schema Validation

3. Shared API Data for Tests

Handcrafting API data for testing is hard and tedious for both service owners and consumers, especially when API data is deeply nested. Creating a shared data factory enables data reuse between producers and consumers.

4. Automated Data Generation

Furthermore, we built tools to generate fixture data easily from anonymized production traffic, eliminating manual data creation and boosting productivity. This also helps ensure the mock data reflects real-world usage.
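One small but essential step in that pipeline is anonymization. The sketch below hashes sensitive fields in a captured record before it is saved as fixture data; the field names and the hash-truncation rule are illustrative assumptions, not Airbnb's actual anonymization policy.

```python
# Sketch: turn a captured production record into anonymized fixture data
# by replacing sensitive fields with a short, stable hash.
import hashlib

SENSITIVE_FIELDS = {"guest_email", "phone"}  # illustrative list

def anonymize(record):
    """Return a copy of record with sensitive values hashed."""
    out = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            digest = hashlib.sha256(str(value).encode()).hexdigest()
            out[key] = digest[:8]  # short token, stable across runs
        else:
            out[key] = value
    return out

captured = {"listing_id": 42, "guest_email": "alice@example.com"}
fixture_request = anonymize(captured)
print(fixture_request["listing_id"])          # non-sensitive data kept
print(fixture_request["guest_email"] != captured["guest_email"])  # True
```

Hashing (rather than randomizing) keeps the anonymized data deterministic, so regenerated fixtures stay diff-friendly in code review.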

5. API Test Data Quality Metrics

Because data factories are schema based, we can potentially measure the quality of API Test Data, for example, how many API fields are validated. This gives us a sense of mock data sufficiency and API Test Data coverage.
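As a sketch of such a metric, the fraction of schema fields exercised by at least one fixture can be computed directly from the schema and the fixture set. The flat field representation here is a simplifying assumption; a real implementation would walk nested schema ASTs.

```python
# Sketch of an API test data quality metric: what fraction of a
# schema's fields appear in at least one fixture?

def field_coverage(schema_fields, fixtures):
    """Return the ratio of schema fields covered by the fixtures."""
    covered = set()
    for fixture in fixtures:
        covered.update(f for f in fixture if f in schema_fields)
    return len(covered) / len(schema_fields)

schema_fields = {"id", "city", "country", "bathrooms"}
fixtures = [
    {"id": 1, "city": "San Francisco"},
    {"id": 2, "bathrooms": 1.0},
]
print(field_coverage(schema_fields, fixtures))  # 0.75: "country" untested
```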

III. API Mocking Framework

It is used in these two places:

  • Unit test: white-box testing a piece of logic that calls dependent service APIs, without setting up those dependent services.
  • Shallow integration test: black-box testing a service, without setting up its dependent services.
Figure: How API Mocking is utilized in various kinds of tests

API Mocking Framework Features

1. Transparent API Mocking

In addition to embedding mock clients, you can also interact with a live service application in “mock mode” to get fake API data in a reliable and cheap manner.

This kind of interaction is often needed when an RPC client is not available. For example, in the frontend/mobile domain, mock data can be used to test page rendering instead of going through the pain of making a chain of live API calls.

Whenever an API mock is needed (in unit tests or integration tests), there is no need to create a fake client or fake service. A test program interacts with our in-house auto-generated RPC clients as if it were making real service calls; mocking mode is enabled under the hood with a config flag or request context header.

Figure: API Mocking transparently wraps both the RPC client and the service API layer, so that mocking can be enabled anywhere

2. Dependency Isolation for Shallow Integration Tests

To re-emphasize our goal, we are trying to avoid a hidden monolith in our microservices architecture: When testing a service, we do not want to test components beyond the service itself. We call this a shallow integration test.

API Mocking Framework provides a clean solution for dependency isolation. It is currently used in places like functional integration tests and isolated load testing to capture possible regressions in a single service.

3. Various Mocking Options as Needed

Mocking mode can be turned on either via config or HTTP header.

  • Matching: When mock mode is on, from the user's perspective it is the same as interacting with a real service. Under the hood, the mocking framework finds the first fixture whose request is a superset of the submitted request.
  • Stubbing: Before submitting the request, you can stub which fixture response should be returned.
  • Latency Simulation: You can configure a simulated latency spike for each mock interaction.
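The matching option above can be sketched in a few lines: walk the fixtures in order and return the response of the first one whose request fields include everything in the submitted request. The fixture shape is the hypothetical one used earlier in this post, not the framework's real wire format.

```python
# Minimal sketch of "matching" mock mode: the first fixture whose
# request is a superset of the submitted request wins.

def match_fixture(fixtures, request):
    """Return the mocked response for a submitted request."""
    for fixture in fixtures:
        # Every key/value in the submitted request must be present
        # (with an equal value) in the fixture's request.
        if all(fixture["request"].get(k) == v for k, v in request.items()):
            return fixture["response"]
    raise LookupError("no fixture matches request")

fixtures = [
    {"request": {"listing_id": 42, "locale": "en"},
     "response": {"listing": {"id": 42, "city": "San Francisco"}}},
]

# A partial request still matches, because the fixture is a superset.
print(match_fixture(fixtures, {"listing_id": 42}))
```

Superset matching is what lets a test submit only the fields it cares about while the fixture carries a full, realistic request.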

IV. API Integration Testing Framework (AIT)

What is AIT?

Figure: work flow of AIT

The simple flow works as below:

  1. Service owners define a few configurations that associate validations with predefined test data (defined in the API Data Factory).
  2. The targeted running service exposes an API Validation Gateway as a system endpoint, which routes to the corresponding auto-generated endpoint validation requests.
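The two steps above can be sketched as a simple loop: for each configured (endpoint, fixture) pair, replay the fixture request against the live endpoint and check the response against the fixture's expectations. `call_endpoint` here is a hypothetical stand-in for the real RPC through the validation gateway.

```python
# Hypothetical sketch of the AIT flow: replay fixture requests against
# live endpoints and compare responses with the fixtures' expectations.

def run_ait(config, call_endpoint):
    """Return a list of (endpoint, field) pairs that failed validation."""
    failures = []
    for endpoint, fixture in config:
        actual = call_endpoint(endpoint, fixture["request"])
        for key, expected in fixture["response"].items():
            if actual.get(key) != expected:
                failures.append((endpoint, key))
    return failures

def fake_call(endpoint, request):
    # Stands in for a real service call via the validation gateway.
    return {"listing_id": request["listing_id"], "available": True}

config = [
    ("get_availability",
     {"request": {"listing_id": 42},
      "response": {"available": True}}),
]
print(run_ait(config, fake_call))  # [] means all endpoint validations passed
```

In the real framework the comparison is driven by assertion annotations rather than plain equality, but the shape of the loop is the same.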

Benefits of AIT

Service endpoint correctness is the building block of other, more complicated end-to-end testing in the test pyramid. Ideally, in this schema-driven world, the only information needed to test an API endpoint should be a pure API test data representation, without extra test setup overhead.

1. Lightweight endpoint validation

Note that in producer service validation, the dependencies can still be mocked out as an option. Therefore, we can run endpoint validations in a super lightweight way.

2. Mocking data is semantically validated

By leveraging API Data Factory, the same test data defined for unit tests or other mocking scenarios can be reused.

At the same time, producer services validate themselves using the same predefined test data with embedded assertion logic, which further ensures semantic and logical correctness. If an unexpected live response is caught in the deploy pipeline, the provider service is blocked from deploying until the issue is resolved.

Figure: A simplified example of AIT

This mechanism helps us kill two birds with one stone:

  • In the fixture response above, the annotated field always obeys the assertion, which keeps this piece of data valid for mocking purposes.
  • In real responses, the same assertion logic must also pass, keeping production and mock behavior consistent.

3. Provide realtime API test coverage instrumentation

AIT instruments real-time API test coverage metrics, without the manual and repeated compilation that would otherwise be required. This not only helps people understand the real-time health of their tests, but also provides clear insights and incentives for provider service team members to improve their API test coverage over time.

Above is a real example for one of the teams at Airbnb, where we can see the real time stats and a super clear trajectory towards better API testing coverage. With such instrumentation, we can clearly understand not only which services but also which endpoints are acting correctly in different environments.

4. Pluggability and Lightweight

As a consequence of its lightweight and self-contained nature, AIT can be easily integrated with test runners at different CI/CD stages. Such abstraction makes API validation less coupled to the external world and makes it easier to evolve other parts of our infrastructure.

Summary and looking forward

