Google Cloud’s Traffic Director — What is it and how is it related to the Istio service-mesh?

Iftach Schonbaum
Published in CloudZone
Apr 16, 2019

If you have followed Google Cloud’s roadmap lately, you may have heard of Traffic Director. For those who know Istio, it might sound overlapping and confusing (especially if you have used the latest Istio add-on for GKE).

In this post I’ll go over what Traffic Director is, how it is related to the Istio service mesh, and what it means for those who already run a production Istio mesh on GKE.

In this post, I will not cover what Istio or a service mesh is.

Traffic Director is:

“Enterprise-ready traffic management for open service mesh”…

It is a fully managed control plane for a service mesh that lets you control traffic globally, across Kubernetes clusters (managed or not) and virtual machines, with smart traffic control policies. Like any service mesh control plane, it controls the configuration of the service proxies inside a mesh.

Traffic Director will have a 99.99% SLA once it reaches GA (it is currently in beta), which means you can manage your mesh configuration without worrying about the control plane’s health and maintenance. Traffic Director also scales in the background to fit the size of your mesh, so you don’t have to worry about that either.
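To put that figure in perspective, here is a rough sketch of the downtime budget a 99.99% availability target implies. The function name and the 30-day month are my own illustrative assumptions; the actual SLA terms define how downtime is measured.

```python
def downtime_budget_minutes(availability_percent, period_days):
    """Minutes of downtime allowed over a period for a given availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# Illustrative periods only: a 30-day month and a 365-day year.
monthly = downtime_budget_minutes(99.99, 30)
yearly = downtime_budget_minutes(99.99, 365)
print(f"{monthly:.2f} minutes per 30-day month")
print(f"{yearly:.2f} minutes per year")
```

In other words, the budget is a handful of minutes per month, which is hard to match with a self-operated control plane.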

At a high-level you can do the following with Traffic Director:

1. Sophisticated Traffic Management
  • Traffic manipulation such as splitting, mirroring and fault injection
  • Smart deployment strategies such as A/B and canary, in an easy way
  • Request manipulation like URL rewrites
  • Content-based routing by headers, cookies and more

2. Build Resilient Services: global, cross-region-aware load balancing with a single IP, together with service proxies, enables low-latency access to the closest endpoint, with failover to another endpoint in case of an issue, including application-level issues. The closest endpoint can be in another cluster in the same zone, a different zone or a different region. Additionally, you can configure resiliency features between services, like circuit breaking and outlier detection, offloading that work from developers.

3. Health Checks at Scale: offload the proxies’ health checking inside the mesh to GCP-managed health checks, reducing mesh-sized health-check traffic.

4. Modernise Non-cloud Native Services: since it works with VMs as well, it allows you to introduce advanced capabilities to legacy applications too.
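The traffic splitting mentioned in the first item boils down to weighted selection between backends. The snippet below is only a conceptual sketch in plain Python, not Traffic Director or Envoy configuration; the backend names and the 90/10 weights are hypothetical.

```python
import random
from collections import Counter


def pick_backend(weights, rng):
    """Pick a backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical canary rollout: roughly 90% of requests to v1, 10% to v2.
split = {"reviews-v1": 90, "reviews-v2": 10}
rng = random.Random(42)  # fixed seed so the run is repeatable
counts = Counter(pick_backend(split, rng) for _ in range(10_000))
print(counts)
```

In a real mesh the proxy makes this choice per request, and the control plane only pushes the weights, which is why a canary can be shifted without redeploying the application.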

Traffic Director in a global load balancing deployment (cloud.google.com)

The Istio admins among us might jump up and say “well, this is a managed Istio control plane”. That’s because Istio supports lots of the features above (more precisely, the Envoy proxy used in Istio does). So yes, with Istio you can achieve lots of the above, but it will involve a lot of admin work (especially when extending to more than one Kubernetes cluster and to VMs). Also, the maintenance of the control plane and the entire mesh can take its toll.

So is it indeed some kind of managed Istio control plane? Well, not exactly… Overlapping in some ways, maybe.

Let me simplify it...

Istio and Google Cloud’s Traffic Director differ in several categories.

SLA & Management

Istio is an open-source project that has some production-grade support when included in products such as OpenShift or IBM Cloud Private, but there is currently no fully managed Istio service on a public cloud. Most public cloud deployments of Istio are plain open-source, unmanaged deployments with no SLA, usually installed with the official Istio Helm chart.

In contrast, Traffic Director has a 99.99% SLA and is a fully managed service.

Control plane

Istio has three core components: Pilot for Traffic Management, Mixer for Observability and Citadel for Service-to-Service Security.

Traffic Director delivers a GCP-managed Pilot along with the additional capabilities mentioned above, such as global load balancing and centralised health checking.

Scaling the Control Plane

In Istio, control plane components such as Citadel, Mixer and Pilot are delivered with HPAs (HorizontalPodAutoscalers, a Kubernetes resource in charge of autoscaling deployments) with default settings. You need to tweak these settings to fit your mesh if the need arises. You also need to specify podAntiAffinity rules to ensure the control plane spans multiple Kubernetes nodes.
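To see why those HPA defaults matter, here is a minimal sketch of the scaling rule described in the Kubernetes documentation: the desired replica count is the current count scaled by the ratio of the observed metric to the target, rounded up and clamped to the configured bounds. The numbers are hypothetical, and the sketch ignores the tolerance and stabilization behaviour of the real autoscaler.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas):
    """Simplified HPA rule: ceil(current * observed / target), clamped to bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical Pilot deployment: 2 replicas at 90% CPU against a 60% target,
# allowed to scale between 1 and 5 replicas.
print(desired_replicas(2, 90, 60, min_replicas=1, max_replicas=5))
```

If the maximum is too low for a large mesh, the control plane simply stops scaling at that ceiling, which is the kind of tuning you own when you run Istio yourself.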

With Traffic Director the control plane scales with the mesh and you don’t need to worry about it.

API

As of the beta release, Traffic Director cannot be configured using Istio APIs; you use GCP APIs for configuration instead. Both Traffic Director and Pilot use open standard APIs (xDS v2) to communicate with service proxies. Configuring Traffic Director with Istio APIs is on Traffic Director’s roadmap.

Data plane proxy

Traffic Director uses the open xDSv2 APIs to communicate with the service proxies in the data plane, which ensures that you are not locked into a proprietary interface. This means Traffic Director can work with xDSv2-compliant open service proxies like Envoy. It is important to mention that Traffic Director is tested only with the Envoy proxy, and in the current beta release supports only Envoy versions 1.9.1 or later.

Istio, on the other hand, currently ships with Envoy alone. There are projects like nginMesh, which ships an Istio control plane with nginx as the sidecar proxy, but that’s a separate project.

It’s worth mentioning that Envoy has a reputation as a leading mesh proxy, designed for service meshes, with high performance and a low memory footprint.

Sidecar Injection & Deployment

In both Istio and Traffic Director the proxy can run alongside both Kubernetes deployments (eventually, pods) and VMs. In both cases, for deployment on VMs you are provided with several scripts and files to install the proxy and configure it to work with the control plane.

As for Kubernetes workloads, Istio ships out of the box with an automatic injection mechanism (built on a Kubernetes mutating admission webhook), which automatically injects the sidecar proxy into a pod when it is created in a namespace labeled for automatic injection, or when the pod carries a dedicated annotation.

With Traffic Director you currently need to inject the sidecar manually, and also create a NEG (see GCP Network Endpoint Groups) from the Service using annotations, so that it can be added as a service in Traffic Director.

As creating a MutatingAdmissionWebhook and an injecting service is relatively easy, I am sure automatic injection will come sooner or later to Traffic Director…
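As a rough sketch of the NEG part, the snippet below builds a Kubernetes Service manifest carrying the cloud.google.com/neg annotation that GKE uses to create standalone NEGs. The service name, selector and port numbers are hypothetical, and the exact steps for your setup should be taken from the GCP documentation rather than from this example.

```python
import json


def service_with_neg(name, port, target_port):
    """Build a Service manifest whose annotation asks GKE to create a NEG for `port`."""
    # The annotation value is itself a JSON string keyed by the exposed port.
    neg_annotation = json.dumps({"exposed_ports": {str(port): {}}})
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "annotations": {"cloud.google.com/neg": neg_annotation},
        },
        "spec": {
            "selector": {"app": name},  # assumes pods are labeled app=<name>
            "ports": [{"port": port, "targetPort": target_port, "protocol": "TCP"}],
        },
    }


# Hypothetical service name and ports, for illustration only.
manifest = service_with_neg("service-test", 80, 8000)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl, which accepts JSON as well as YAML manifests.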

Multi-cluster Mesh

To span an Istio mesh over more than one Kubernetes cluster, Istio provides a dedicated chart, named istio-remote. I will not go through that here.

Since Traffic Director is a control plane that lives outside the Kubernetes clusters, and Kubernetes workloads are added to it in the same way regardless of which cluster they run in, there is no specific walkthrough for spanning the mesh over multiple clusters.

Mesh Observability

Today, Istio ships with Kiali, a great mesh observability tool that has helped our customers greatly in debugging application issues within microservices applications. Kiali is evolving all the time, releasing new versions rapidly.

Traffic Director can be observed with more than one tool, including Apache SkyWalking.

$ Pricing $

Istio is open source and free. Traffic Director is currently offered without charge for the beta release.

“What if I already operate a production mesh with Istio on GKE?”

As mentioned, Traffic Director is a managed Pilot (with extra capabilities) which will support Istio APIs for management. Thus, it should offer an easy opt-in replacement in case you want to swap your on-cluster, unmanaged Pilot for a fully managed one with a high SLA. As far as I was told, there will be proper instructions for opting in.

Traffic Director is a recent announcement by Google Cloud. As it is based on the core patterns of Istio, of which Google is among the main contributors, I forecast a great future for it. It arrives at a time when all public cloud providers are announcing their own mesh solutions.

The Roadmap For Traffic Director Currently Includes:

  • Support Istio’s Security Features such as mTLS, RBAC (Istio RBAC)
  • Observability Integration
  • Hybrid and Multi Cloud Support
  • Management with Istio APIs
  • Anthos Integration (See my post on Anthos)
  • Federation with other service-mesh control planes

I hope this post clears up any confusion or questions; if not, contact me!

Iftach Schonbaum (Linkedin).
