Taming Service-Oriented Architecture Using A Data-Oriented Service Mesh

Adam Miskiewicz
Nov 10, 2020

Introducing Viaduct, Airbnb’s data-oriented service mesh

By: Raymie Stata, Arun Vijayvergiya, Adam Miskiewicz

At Hasura’s Enterprise GraphQL Conf on October 22, we presented Viaduct, which we call a data-oriented service mesh. We believe it will bring a step-function improvement in the modularity of our microservices-based Service-Oriented Architecture (SOA). In this blog post, we describe the philosophy behind Viaduct and provide a rough sketch of how it works. Please watch the presentation for a more detailed look.

Massive SOA Dependency Graphs

This particular dependency graph happens to be from Airbnb, but it’s not uncommon. Amazon, Netflix, and Uber are examples of companies that have shared similarly tangled dependency graphs.

These dependency graphs are reminiscent of spaghetti code, just at the microservices level. Similar to how spaghetti code becomes harder and harder to modify over time, so does spaghetti SOA. To help manage the larger number of services inherent in a microservices-based architecture, we need organizing principles as well as technical measures to implement those principles. At Airbnb, we undertook an effort to find such principles and measures. Our investigations led us to the concept of a data-oriented service mesh, which we believe brings a new level of modularity to SOA.

Procedure- vs Data-Oriented Design

Starting in the ’80s, the paradigm shifted to organizing software primarily around data, not procedures. In this approach, modules define classes of objects, each encapsulating an internal representation that is accessed via a public API of methods on the object. Languages such as Simula and Clu pioneered this form of organization.

SOA is a step back to more procedure-oriented designs. Today’s microservice is a collection of procedural endpoints — a classic, 1970s-style module. We believe that SOA needs to evolve to support data-oriented design, and that this evolution can be enabled by transitioning our service mesh from a procedural orientation to a data orientation.

Viaduct: A Data-Oriented Service Mesh

At Airbnb, we are using GraphQL™ to build a data-oriented service mesh called Viaduct. A Viaduct service mesh is defined in terms of a GraphQL schema consisting of:

  • Types (and interfaces) describing data managed within your service mesh
  • Queries (and subscriptions) providing means to access that data, which is abstracted from the service entry points that provide the data
  • Mutations providing ways to update data, again abstracted from service entry points

The types (and interfaces) in the schema define a single graph across all of the data managed within the service mesh. For example, at an eCommerce company, a service mesh’s schema may define a field productById(id: ID) that returns results of type Product. From this starting point, a single query allows a data consumer to navigate to information about the product’s manufacturer, e.g., productById { manufacturer }; reviews of the product, e.g., productById { reviews }; and even the authors of those reviews, e.g., productById { reviews { author } }.

The data elements requested by such a query may come from many different microservices. In a procedure-oriented service mesh, the data consumer would need to take these services as explicit dependencies. In our data-oriented service mesh, it is the service mesh, i.e., Viaduct, not the data consumer, that knows which services provide which data element. Viaduct abstracts away the service dependencies from any single consumer.
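To make the routing idea concrete, here is a minimal Python sketch of a mesh that owns the mapping from schema fields to services. It is a toy under stated assumptions, not how Viaduct is implemented: the four service functions, the RESOLVERS and FIELD_TYPES tables, the execute function, and the sample data are all hypothetical, and a nested dict stands in for a parsed GraphQL query.

```python
from typing import Any, Dict, List, Optional

# Hypothetical stand-ins for four separate microservices.
def product_service(product_id: str) -> Dict[str, Any]:
    return {"id": product_id, "name": "Kettle", "manufacturer_id": "m1"}

def manufacturer_service(manufacturer_id: str) -> Dict[str, Any]:
    return {"id": manufacturer_id, "name": "Acme"}

def review_service(product_id: str) -> List[Dict[str, Any]]:
    return [{"id": "r1", "author_id": "u1", "text": "Great"}]

def user_service(user_id: str) -> Dict[str, Any]:
    return {"id": user_id, "name": "Dana"}

# The mesh, not the consumer, knows which service backs each (type, field).
RESOLVERS = {
    ("Query", "productById"): lambda parent, args: product_service(args["id"]),
    ("Product", "manufacturer"): lambda parent, args: manufacturer_service(parent["manufacturer_id"]),
    ("Product", "reviews"): lambda parent, args: review_service(parent["id"]),
    ("Review", "author"): lambda parent, args: user_service(parent["author_id"]),
}

# Return type of each service-backed field, used to continue the traversal.
FIELD_TYPES = {
    ("Query", "productById"): "Product",
    ("Product", "manufacturer"): "Manufacturer",
    ("Product", "reviews"): "Review",
    ("Review", "author"): "User",
}

def execute(type_name: str, parent: Dict[str, Any],
            selection: Dict[str, Any],
            args: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Resolve a nested selection; a value of None marks a leaf field."""
    result: Dict[str, Any] = {}
    for field, sub_selection in selection.items():
        key = (type_name, field)
        if key in RESOLVERS:
            value = RESOLVERS[key](parent, args or {})
        else:
            value = parent[field]  # plain field already present on the parent
        if sub_selection is None:
            result[field] = value
        elif isinstance(value, list):
            result[field] = [execute(FIELD_TYPES[key], item, sub_selection) for item in value]
        else:
            result[field] = execute(FIELD_TYPES[key], value, sub_selection)
    return result

# Roughly: productById(id: "p1") { name manufacturer { name } reviews { author { name } } }
query = {"productById": {"name": None,
                         "manufacturer": {"name": None},
                         "reviews": {"author": {"name": None}}}}
response = execute("Query", {}, query, {"id": "p1"})
```

The consumer only writes the query; nothing in it names product_service, review_service, or user_service, so those dependencies can change without touching consumer code.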

Putting Schema at the Center

Among other things, using the central schema to define our APIs and database schemas will solve one of the bigger challenges of large-scale SOA applications: data agility. In today’s SOA applications, a change to a database schema often needs to be manually reflected in the APIs of two, three, and sometimes even more layers of microservices before it can be exposed to client code. Such changes can require weeks of coordinating among multiple teams. By deriving service APIs and database schemas from a single, central schema, a database schema change like this can be propagated to client code with a single update.
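As an illustration of what deriving both artifacts from one definition could look like, the following Python sketch generates a table definition and an API stub from a single entity description. This post does not describe the actual derivation mechanism, so everything here is assumed for the example: the Field and EntityType classes, the type mappings, and the to_ddl and to_api_stub helpers are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Field:
    name: str
    type: str  # a schema scalar name: "ID", "String", or "Int"

@dataclass(frozen=True)
class EntityType:
    name: str
    fields: List[Field]

# Assumed mappings from schema scalars to storage and API types.
SQL_TYPES = {"ID": "VARCHAR(64)", "String": "TEXT", "Int": "INTEGER"}
PY_TYPES = {"ID": "str", "String": "str", "Int": "int"}

def to_ddl(entity: EntityType) -> str:
    """Derive a CREATE TABLE statement from the central definition."""
    columns = ",\n".join(f"  {f.name} {SQL_TYPES[f.type]}" for f in entity.fields)
    return f"CREATE TABLE {entity.name.lower()} (\n{columns}\n);"

def to_api_stub(entity: EntityType) -> str:
    """Derive a service API signature from the same definition."""
    params = ", ".join(f"{f.name}: {PY_TYPES[f.type]}" for f in entity.fields)
    return f"def update_{entity.name.lower()}({params}) -> None: ..."

# Adding one Field here changes both derived artifacts in the same update.
product = EntityType("Product", [Field("id", "ID"), Field("name", "String")])
ddl = to_ddl(product)
stub = to_api_stub(product)
```

Because both outputs are computed from the same EntityType, a new field cannot appear in the database schema without also appearing in the API, which is the property the paragraph above relies on.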

Going Serverless

Viaduct has a mechanism for computing what we call “derived fields” using serverless cloud functions that operate on top of the graph without knowledge of the underlying services. These functions allow us to move transformational logic out of the service mesh and into stateless containers, keeping our graph clean and reducing the number and complexity of services we need.
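A derived field of this kind can be pictured as a pure function that declares which graph data it needs and computes a value from it. The Python sketch below is an assumption-laden illustration, not Viaduct's API: the derived_field decorator, the DERIVED_FIELDS registry, resolve_derived, and the averageRating field are all hypothetical.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Registry of derived fields: (type, field) -> (required selection, function).
DERIVED_FIELDS: Dict[Tuple[str, str], Tuple[Dict[str, Any], Callable]] = {}

def derived_field(type_name: str, field_name: str, requires: Dict[str, Any]):
    """Register a stateless function along with the graph data it needs."""
    def register(fn: Callable) -> Callable:
        DERIVED_FIELDS[(type_name, field_name)] = (requires, fn)
        return fn
    return register

@derived_field("Product", "averageRating", requires={"reviews": {"rating": None}})
def average_rating(product: Dict[str, Any]) -> Optional[float]:
    # Sees only graph-shaped data; it has no idea which service stores reviews.
    ratings = [review["rating"] for review in product["reviews"]]
    return sum(ratings) / len(ratings) if ratings else None

def resolve_derived(type_name: str, field_name: str,
                    fetch: Callable[[Dict[str, Any]], Dict[str, Any]]) -> Any:
    """Fetch the required selection through the graph, then run the function."""
    requires, fn = DERIVED_FIELDS[(type_name, field_name)]
    return fn(fetch(requires))
```

Here fetch stands in for whatever executes a selection against the graph. Since the function holds no state and never calls a service directly, it can run in any stateless container.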

Conclusion

Viaduct started powering production workflows at Airbnb over a year ago. We started from scratch with a clean schema consisting of a handful of entities and have grown it to include 80 core entities that are able to power 75% of our modern API traffic.

As mentioned in the introduction, more details on the motivation and technology behind Viaduct can be found in our presentation.

Airbnb Engineering & Data Science