Supporting Multi-Region Deployments in the Hybrid Cloud

Releasing Microservices Efficiently and Reliably at Scale

Here at HomeAway, we strive to provide a highly available hybrid cloud platform to ease the operations burden for product-focused developers. The platform currently supports three distinct runtime environments (test, stage, production), each containing a set of physically isolated data centers defined as regions. While maintaining an ecosystem with such a large amount of physical and logical isolation enables important features such as high availability deployments and geo-aware routing, it is difficult to interact with such a distributed set of services.

As a quick end-user example, consider how one might handle deploying an active-active setup across six availability zones within two regions. Each scheduler is logically restricted to its resource pool within its availability zone, so six calls to six unique container schedulers are required to deploy the requested application. Doing this manually for every versioned release would quickly lose viability, especially for development teams that release updates multiple times per day. Additionally, having each development team roll its own automation around this kind of management rapidly erodes the stability of the platform as a whole.

In order to centralize some of the deployment-level management inherent to a cloud platform, we created the Ministry of Truth.

University of London’s Senate House, architectural inspiration for Orwell’s 1984 Ministry of Truth.

The Ministry of Truth (MoT) consists of a collection of microservices running in every region responsible for distributing and consolidating events pertinent to deployments and container orchestration within a hybrid-cloud platform. I’ll provide insight into the rules of the game and how we accomplish this task in a production-isolated infrastructure.

Problem space

Given a central datastore, a central API, and several collections of microservices, provide conventions so that messages may be relayed between all three parts in an eventually consistent manner. This post will go into how MoT forwards user requests to regional agents, sends messages between microservices, and pushes the data to be stored through a persistence flow. We seek to avoid implementation specifics or application details, but may include them for the sake of providing a comprehensive example.

Agents

In general, we strive to build our agents to perform one simple, lightweight action, triggered by an event from a data source and potentially publishing the result to a corresponding data sink. For example, we gather data from Consul, our platform’s service-discovery component. In order to accomplish a part of the data gathering, the InstanceConsulStateAgent does the following:

  1. Read in ConsulAppState from the ConsulStateAgent
  2. Convert the ConsulAppState data into a collection of InstanceInfo records
  3. Compare each InstanceInfo record with the previous record by key
  4. Publish the latest record downstream on creation, update, or deletion by key
Partial view of MoT consul flow
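
The diff-and-publish behavior of the four steps above can be sketched in a few lines. This is a simplified in-memory illustration, not the actual implementation: the InstanceInfo fields, the publish callback (a stand-in for a Kafka producer), and the tombstone-on-delete convention are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass(frozen=True)
class InstanceInfo:
    """Hypothetical shape of an InstanceInfo record, keyed per instance."""
    key: str
    host: str
    port: int

class InstanceConsulStateAgent:
    """Publishes a record downstream only when it was created, updated,
    or deleted by key relative to the previous ConsulAppState snapshot."""

    def __init__(self, publish: Callable[[str, Optional[InstanceInfo]], None]):
        self._publish = publish                  # stand-in for a Kafka producer
        self._previous: Dict[str, InstanceInfo] = {}

    def on_consul_app_state(self, records: Dict[str, InstanceInfo]) -> None:
        # Creations and updates: the record differs from the previous one by key.
        for key, record in records.items():
            if self._previous.get(key) != record:
                self._publish(key, record)
        # Deletions: a key seen before has disappeared; publish a tombstone (None).
        for key in set(self._previous) - set(records):
            self._publish(key, None)
        self._previous = dict(records)
```

Downstream consumers then see only changes, which is what keeps the filtering concern out of the persistence components.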

In short, this lets us know if the Consul service data has changed, separating the filter from the downstream persistence components. We have similar flows for some of the other primary platform services, Marathon and Mesos. The MoT microservices are collections of semantically similar agents bundled into Dropwizard apps and grouped with other utilities or models shared among the agents. Some examples include:

  • mot-deployment-agents: responsible for sending deployment requests to the scheduler, grouping data from other sources into deployment state, mending deployments, etc.
  • mot-consul-agents: responsible for collecting data from Consul, enabling traffic to route to services in the catalog, etc.

The microservices themselves can be bundled as well. Let’s take a look at a high level overview of the three major MoT layers.

A Meal in 3 Courses

You can think of MoT as having three primary components with special communication channels between them:

  1. The centralized API / service layer that handles all database reads and some writes in addition to any API requests from users.
  2. The regional component specific to the datacenter. These are the local microservices that handle implementation details. In the case of MoT, this is primarily multi-region container orchestration and multi-source data aggregation.
  3. The persistence layer, consisting of archival agents that store any data surfaced via the API endpoints.
A three part view of MoT

Some flows, such as configuration updates, only ever reside in the service layer, as there is no need to interact with the regional services or deployments. Others, such as enabling traffic for an existing deployment, heavily involve all three components of MoT.

Kafka Etiquette

In order to transfer data between the channels, we use Kafka topics as the primary event bus. Let’s take a brief glimpse at the rules of thumb, and then perform a deep dive into the flow of a deployment request to get an example of how it all integrates.

In a multi-region architecture with a centralized API, we must forward information across regions. We opted to incur the cross-region penalty when consuming from topics in a remote cluster. More specifically, we follow the three rules below.

Never produce to a topic outside of your datacenter

When an event has been processed by an agent, we want to push the result record to a topic as quickly as possible, and move on to the next record. This rule keeps us from making unnecessary connections to foreign regions and adding latency between each processed event.

Consume from the local Kafka cluster whenever possible

There are many topics within the MoT architecture that service communication between microservices or agents in the regional Kafka cluster. If the data stream doesn’t require outside information from other regions, stay local.

Put an identifying suffix on topics to be consumed from a foreign datacenter

Some messages must be propagated to the other regions. In this particular instance, we append a suffix of

-<appEnvironment>-<region>

to the topic name.

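A minimal sketch of the naming convention (the function name is hypothetical; the suffix format follows the convention above):

```python
def foreign_topic(base: str, app_environment: str, region: str) -> str:
    """Hypothetical helper: name a topic that a foreign datacenter will
    consume, following the -<appEnvironment>-<region> suffix convention."""
    return f"{base}-{app_environment}-{region}"
```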
Tying it all together with an example:

Let’s take a look at how creating a new deployment in test-us-east-1 takes place throughout the system.

Requesting a deployment:

  1. A user sends a POST to mot.homeawaycorp.com/deployments to deploy an app to the test-us-east-1 region.
  2. The MoT service layer will validate the structure of the deployment request, producing it to the region-specific topic mot-deployment-launch-events-test-us-east-1 in the production-us-east-1 Kafka cluster.
  3. The central→regional mirrormakers will consume the region-specific topics from the production-us-east-1 Kafka cluster then produce records to the stage-us-east-1 Kafka cluster.
  4. The MoT Regional Layer deployed in test-us-east-1 will consume records from the mot-deployment-launch-events-test-us-east-1 topic in the stage-us-east-1 Kafka cluster, perform some business logic, then produce to the mot-deployment-complete-events topic in the test-us-east-1 Kafka cluster.
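
The four steps above can be sketched end to end with in-memory stand-ins for the Kafka clusters. The topic names follow the flow; the record shape and the DEPLOYED status are invented for illustration and are not the platform's actual schema.

```python
from collections import defaultdict

# In-memory stand-ins for the three Kafka clusters: cluster -> topic -> records.
clusters = {
    "production-us-east-1": defaultdict(list),
    "stage-us-east-1": defaultdict(list),
    "test-us-east-1": defaultdict(list),
}

def produce(cluster: str, topic: str, record: dict) -> None:
    clusters[cluster][topic].append(record)

def mirror(src: str, dst: str, topic: str) -> None:
    """Stand-in for a central-to-regional mirrormaker copying one topic."""
    clusters[dst][topic].extend(clusters[src][topic])

# Steps 1-2: the service layer validates the request and produces it to the
# region-specific topic in the central production cluster.
launch_topic = "mot-deployment-launch-events-test-us-east-1"
produce("production-us-east-1", launch_topic, {"app": "my-app", "version": "1.2.3"})

# Step 3: mirror the region-specific topic to the pseudo-central stage cluster.
mirror("production-us-east-1", "stage-us-east-1", launch_topic)

# Step 4: the regional layer consumes from the stage cluster, performs its
# business logic, and produces completion events only to its local cluster.
for record in clusters["stage-us-east-1"][launch_topic]:
    produce("test-us-east-1", "mot-deployment-complete-events",
            {**record, "status": "DEPLOYED"})
```

Note that every produce call targets either the producer's own cluster or, in the mirrormaker's case, the single sanctioned exception discussed next.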

As a short aside, let us clarify the reasoning for including the stage-us-east-1 Kafka cluster at all. We elected to mirror topics from the central production cluster to a pseudo-central cluster in the non-production environments. This limits the number of firewall exceptions that violate the production | non-production boundary. Furthermore, using mirrormakers at all violates our first rule of Kafka etiquette, as we must mirror the source record to a foreign region’s cluster. Breaking this rule from a central cluster to the non-prod central cluster limits the number of foreign regions for which we break Kafka etiquette.

Great! So now we’ve successfully taken a launch request from a user hitting an API in production-us-east-1 and piped it to a launch request against the regional scheduler in test-us-east-1. The record produced to the mot-deployment-complete-events topic marks the end of the deployment request flow as triggered by the user’s request. However, the story does not stop there. If we were to take a look at the deployment’s dashboard for the app, there would not be much to see. The only data persisted so far was a metadata shell storing some information pulled from the initial request in the DeploymentOperationAgent’s business logic. Let’s take a look at the persistence of instance data and deployment state based on data collected from the regional systems and services.

Persisting deployment state to the datastore:

  1. Marathon, Mesos, and Consul are continually polled for data about the state of an instance. For example, this state data might include the runtime host and port of an AppInstance, the most recent HealthCheckResult blob, or the set of Consul tags associated with the service. The MoT regional layer aggregates and transforms these data streams into MoT models meant to be persisted. The records to be persisted are produced to specific topics, for example: mot-deployment-state-change in the test-us-east-1 Kafka cluster.
  2. Preconfigured regional→central mirrormakers on the same hosts as the central→regional mirrormakers mirror the topics from the regional Kafka clusters to the production-us-east-1 Kafka cluster. The destination topic has a suffix of .<appenv>-<region> and thus is named mot-deployment-state-change.test-us-east-1 in the production-us-east-1 Kafka cluster.
  3. The MoT persistence layer has a MultiRegionAgentFactory pattern that spins up an ArchivalAgent for each region in MultiPaaS. For our example, we will have a DeploymentStateArchivalAgent with a target region of test-us-east-1. The DeploymentStateArchivalAgent consumes from the mot-deployment-state-change.test-us-east-1 topic in the production-us-east-1 Kafka cluster and persists the data to the Cassandra cluster in the production-us-east-1 region.
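
The factory pattern in step 3 can be sketched with an in-memory map standing in for the central Kafka cluster and a callback standing in for the Cassandra writer; class names beyond those mentioned above, and the method names, are assumptions.

```python
from typing import Callable, Dict, List

class DeploymentStateArchivalAgent:
    """Consumes a single region's suffixed topic from the central cluster
    and persists each record via a callback (a Cassandra-writer stand-in)."""

    def __init__(self, region: str, persist: Callable[[str, dict], None]):
        self.region = region
        self.topic = f"mot-deployment-state-change.{region}"
        self._persist = persist

    def run_once(self, central_cluster: Dict[str, List[dict]]) -> None:
        for record in central_cluster.get(self.topic, []):
            self._persist(self.region, record)

class MultiRegionAgentFactory:
    """Spins up one archival agent per region, as described in step 3."""

    def __init__(self, regions: List[str],
                 persist: Callable[[str, dict], None]):
        self.agents = [DeploymentStateArchivalAgent(r, persist)
                       for r in regions]
```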

Awesome! Now if a user were to do a GET on the supported deployment API endpoints, they would expect to see the aggregated result of any instance and deployment state collected and persisted by the above flow. Now we’ve seen patterns of both dispersing information from our central region to our regional services and collecting it from the regional services back into the central region.

The Design is Simpler in Production

Now that we’ve looked at how MoT propagates data over the production | non-production boundary, we can take a look at the simpler flow of requests between two production regions.

Production-to-production communication no longer necessitates the use of mirrormakers. Each time a record needs to be pulled between Kafka clusters, the destination region’s consumer can poll the source cluster and localize the data for further processing.
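
The consume-and-localize step can be sketched with the same in-memory stand-ins used earlier; the localize function and cluster names here are hypothetical.

```python
from collections import defaultdict

def localize(source_cluster: dict, local_cluster: dict, topic: str) -> int:
    """The destination region's consumer polls the source cluster and
    re-produces the records to its own local cluster, so only consumers
    (never producers) cross the region boundary."""
    records = list(source_cluster[topic])
    local_cluster[topic].extend(records)
    return len(records)

us_east = defaultdict(list)   # source production cluster
us_west = defaultdict(list)   # destination production cluster
us_east["mot-deployment-complete-events"].append({"app": "my-app"})
moved = localize(us_east, us_west, "mot-deployment-complete-events")
```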

The data persistence flow between production regions is likewise simplified.

Conclusions

I hope this blog has served you well. In decoupling the persistence, service, and regional layers, we’ve allowed for reduced blast radii in outage scenarios and minimized the amount of data we must send between regions. Best of luck with your journeys in the hybrid cloud!