Adopting Istio in Your Kubernetes Clusters
Istio gets a lot of buzz these days. The service mesh platform recently hit a 1.0 ready-for-production milestone. Google has been teasing a managed Istio option on Google Cloud. There are plenty of resources for getting it running within a Kubernetes cluster.
You’ve probably progressed through the various stages of the tech hype cycle:
- What even is a service mesh?
- Wow. A service mesh is kinda complicated. Do I really need one?
- Ok. I can see some use-cases. How many otherwise productive days will it take me to run through this “Istio Up & Running in 15 Minutes” demo?
And now you’ve arrived at actually setting up Istio for your existing Kubernetes-managed applications. That’s going to take quite a bit longer than 15 minutes. This post will give you a picture of what you’ll need to do to introduce Istio into your Kubernetes clusters.
What even is an Istio?
If nothing else, you’ve probably heard that Istio is a service mesh. “Service mesh” is a fancy term for tooling that handles common communication challenges between a collection of connected services.
In real-world terms, Istio builds on the power of Kubernetes by adding:
- Mutual TLS for identification, authorization, and encrypted communication between services.
- Outbound traffic restriction with selective whitelisting.
- Dynamic traffic distribution patterns such as concurrent application versions and gradual canary-style roll-outs.
- Improved resiliency with circuit breakers, retry handling, failover, and support for “Chaos Monkey” style fault injection testing.
- A ton of additional metrics that illuminate communication patterns and performance.
Istio accomplishes all of this by running an individual proxy sidecar container inside each of your pods. A set of core services runs in your cluster and communicates with these proxy sidecars to enable the features described above.
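To make the traffic distribution item in the list above concrete, here is a minimal sketch of a weighted canary split. It assumes the networking.istio.io/v1alpha3 API and a hypothetical service named my-app whose pods carry a version label of v1 or v2. The names, labels, and weights are placeholders for illustration, not values from a real deployment.

```yaml
# Hypothetical example: service name, labels, and weights are placeholders.
# The DestinationRule groups pods of one service into named subsets by label.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-app
spec:
  host: my-app
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
# The VirtualService sends 90% of requests to v1 and 10% to the v2 canary.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - route:
    - destination:
        host: my-app
        subset: v1
      weight: 90
    - destination:
        host: my-app
        subset: v2
      weight: 10
```

Shifting the weights over time is what turns this into a gradual roll-out.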
Computer: Install me an Istio
The preferred Istio installation method on Kubernetes is a Helm chart. Note that the Istio Helm chart is embedded in the Istio GitHub repo. It is not available through the official community Helm Charts repo.
The chart installs a collection of core services required for Istio to function. It provides for extensive customization through chart values. It also supports upgrading to new Istio versions going forward.
Before you start installing Istio, though, read on to learn about some of the sharp edges that are specific to installing and running it on Kubernetes.
How do you want to install sidecar proxies in all your pods?
All of your pods will require a sidecar proxy container. Istio can inject this proxy container automatically into new pods without any changes to deployments. This feature is enabled by default but it requires Kubernetes version 1.9 or later.
You will need to label namespaces to enable sidecar injection:
$ kubectl create namespace my-app
$ kubectl label namespace my-app istio-injection=enabled
Existing pods will not be impacted by Istio chart installation. Pods launched in labelled namespaces will receive the sidecar while those launched in other namespaces will not. This provides a path to install Istio’s core services and selectively transition pods into the mesh.
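If you manage namespaces declaratively, the two kubectl commands above can be expressed as a single manifest. This is a minimal sketch that reuses the my-app name from the commands above:

```yaml
# Declarative equivalent of creating and labelling the namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    # Pods created in this namespace after the label is set receive the sidecar.
    istio-injection: enabled
```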
How will you transition services to mutual TLS?
Mutual TLS is an attractive security feature because it limits which services can communicate with each other and it encrypts the communication. It can prove to be a stumbling block, though, while you are transitioning services that still need to communicate with resources outside the mesh.
To ease this transition, consider enabling a PERMISSIVE mode authentication policy. Permissive mode enables both plaintext HTTP and mTLS traffic for a service. The policy can be applied cluster-wide in a mesh policy, per namespace using a default policy, or using more granular policies to target individual services. Once your applications have been fully migrated over to Istio, you can disable permissive mode and enforce mutual TLS.
Another thing to note is that typical Kubernetes HTTP liveness and readiness probes will not work in mTLS-only mode because the probes come from outside the service mesh. These checks will work with PERMISSIVE mode enabled. Once you remove PERMISSIVE mode, you will need to either convert the probes to exec checks that poll the health check from inside the pod or establish health check endpoints on a separate port with mTLS disabled.
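As a rough illustration of the first option, an HTTP probe can be rewritten as an exec probe that calls the health endpoint over localhost from inside the application container. The fragment below is a sketch only: the /healthz path and port 8080 are hypothetical, and it assumes curl is available in the container image.

```yaml
# Fragment of a container spec. Path, port, and the presence of curl are assumptions.
livenessProbe:
  exec:
    command:
    - curl
    - --fail
    - --silent
    - http://localhost:8080/healthz
  initialDelaySeconds: 10
  periodSeconds: 10
```

Because the command runs inside the pod, the kubelet never has to speak mTLS to the service.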
Here’s how you can create a mesh policy that enables mTLS in permissive mode for the entire cluster to get started:
cat <<EOF | kubectl apply -f -
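The manifest piped into kubectl apply for a mesh-wide permissive policy might look like the following sketch. It assumes the authentication.istio.io/v1alpha1 API from the Istio 1.0 era, in which the mesh-wide policy is named default. Check the API version against the release you have installed.

```yaml
# Sketch of a mesh-wide policy that accepts both plaintext and mTLS traffic.
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls:
      mode: PERMISSIVE
```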
Do you want to start with egress filtering enabled?
By default, Istio restricts outbound traffic from your pods. This means that if your pods communicate with a service external to the mesh, such as a cloud provider API, third-party API, or managed database, that traffic will be blocked. You can then selectively enable access to external services using ServiceEntries.
On its face, egress filtering looks like an attractive security feature. However, the way it is implemented provides no real security benefit. As explained in an Istio article, no mechanism ensures that the IP being filtered matches the name. This means the filtering can be bypassed simply by setting a Host header in your HTTP request.
In addition, egress filtering can be a difficult starting point for existing applications. As an alternative, you can disable egress filtering at the cluster level using the global includeIPRanges setting. Setting it to the internal IP space of your cluster bypasses the Istio proxy for all external traffic. Once you have services running in Istio, you can identify the external services they use and build up the necessary ServiceEntries before turning on egress filtering.
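For reference, a ServiceEntry that opens access to one external host might look like the sketch below. It assumes the networking.istio.io/v1alpha3 API, and api.example.com is a placeholder host. HTTPS destinations may need additional configuration depending on your Istio version.

```yaml
# Hypothetical example: allows plain HTTP traffic from the mesh to one external host.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: external-api
spec:
  hosts:
  - api.example.com
  location: MESH_EXTERNAL
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
```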
The following Helm install will disable egress filtering entirely for all pods. You should add your internal cluster IP CIDR to the includeIPRanges setting to route traffic from pods to internal resources through Istio while bypassing it entirely for external endpoints. With either of these configurations you will not need to define any ServiceEntries for accessing external endpoints.
$ helm upgrade istio install/kubernetes/helm/istio --install \
--namespace istio-system \
--set global.tag="1.1.0.snapshot.1" \
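The includeIPRanges setting can also be kept in a values file rather than passed with --set. The fragment below is a sketch that assumes the chart exposes the setting as global.proxy.includeIPRanges. The CIDR shown is a placeholder, so substitute the internal IP range of your own cluster.

```yaml
# values-istio.yaml (hypothetical file name)
global:
  proxy:
    # Only traffic to these ranges is routed through the sidecar proxy;
    # traffic to all other destinations bypasses it.
    includeIPRanges: "10.0.0.0/8"  # placeholder for your cluster's internal CIDR
```

A file like this would be passed to the Helm command with the -f flag.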
Istio is not fully Kubernetes native by design
There’s no doubt that Istio provides a lot of resources for deployment on Kubernetes. There is a Helm chart with a slate of subcharts for installing numerous services with related CRDs. There is plenty of documentation and example chart configurations. Still, Istio aims to be platform independent. With that in mind, there are a few stark differences involved in deploying an app with Istio support on Kubernetes.
One of the most impactful differences lies in exposing services outside the Istio mesh. A typical Kubernetes application exposes an external interface using an Ingress tied to an ingress controller. Istio instead relies on a Gateway object to define protocol settings such as port and TLS. Gateway objects are paired with VirtualService objects to control routing details.
This likely affects patterns you may already have established for Ingress objects, such as domain name and TLS certificate management. The external-dns project recently introduced support for Istio Gateway objects, making the transition from Ingress objects much easier for automatic domain management.
The TLS certificate story for Gateways is still somewhat complicated. Istio’s IngressGateway does not support multiple certificates in a way that is compatible with cert-manager, a popular way to automatically provision and install TLS certificates from Let’s Encrypt in Kubernetes. Certificate updates require rolling the IngressGateway pods, which further complicates things. This makes short-lived Let’s Encrypt certificates difficult to use without manually rolling pods on a schedule. Even with static certificates, the gateway mounts all of the certificates at start time, so adding or removing certificates for additional services also requires a restart. All of the services sharing a gateway will also have access to each other’s certificates.
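A minimal sketch of that pairing is shown below. It assumes the networking.istio.io/v1alpha3 API and the default ingress gateway deployment, which carries the label istio: ingressgateway. The hostname, service name, and port are hypothetical.

```yaml
# The Gateway defines which port, protocol, and hosts are accepted at the mesh edge.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-app-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "my-app.example.com"
---
# The VirtualService binds to the Gateway and routes requests to a service.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - "my-app.example.com"
  gateways:
  - my-app-gateway
  http:
  - route:
    - destination:
        host: my-app
        port:
          number: 8080
```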
Currently the easiest options to get started with TLS for external services under a Gateway are:
- Terminate TLS outside the mesh at something like an Amazon Elastic Load Balancer and let AWS Certificate Manager manage certificates.
- Avoid dynamic certificate providers like Let’s Encrypt and install a multi-domain or wildcard TLS certificate on the Istio IngressGateway.
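The second option could look roughly like the sketch below. It assumes the networking.istio.io/v1alpha3 API and that the wildcard certificate and key are stored in a secret mounted into the ingress gateway pods at /etc/istio/ingressgateway-certs. Verify that path against your chart version. The domain is a placeholder.

```yaml
# Hypothetical example: one wildcard certificate serving every host under example.com.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: wildcard-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      # Paths assume the certificate secret is mounted at this location.
      serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
      privateKey: /etc/istio/ingressgateway-certs/tls.key
    hosts:
    - "*.example.com"
```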
Find your path to production grade Istio on Kubernetes
Now that you’re aware of some of Istio’s sharp edges on Kubernetes, you can move forward with installing Istio in your clusters. This post is far from offering a production-grade configuration. It will, however, get you going with minimal disruption so that you can discover your own path to a working production configuration.