Episode-III + ½ What a Mesh!

Fatih Nar · Published in Open 5G HyperCore · 7 min read · Sep 17, 2021

Authors: Ingo Meirick, Master Systems Designer @ Ericsson; Doug Smith, Principal SW Engineer @ Red Hat; Fatih Nar, Chief Architect @ Red Hat.

Preface

In the previous episode (Episode III: Meshville) we presented some of the shortcomings of Istio-based service mesh and offered some thoughts/prayers on the way forward. While the CNCF Network Plumbing WG works on possible solutions, we have run into a new challenge: how the sidecar container gets plugged into the pod, init-container vs istio-cni. In this episode we cover that area as thoroughly as we can.

1.0 Introduction

Istio recently reached version 1.11.2, and by default it still injects an init container, istio-init, into pods deployed in the service mesh. The istio-init container sets up redirection of the pod's network traffic (all TCP traffic) to/from the Istio sidecar proxy. This requires the user or service account deploying pods to the mesh to have elevated Kubernetes (k8s) role-based access control (RBAC) permissions. Requiring tenants to have elevated k8s RBAC permissions can be problematic for some organizations' security compliance and, as a matter of fact, is likely to be a deal breaker for certain industry security certifications.
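For illustration, below is a trimmed sketch of the initContainer that sidecar injection adds to a pod spec; exact image tags and istio-iptables arguments vary by Istio release, so treat the values as representative rather than authoritative.

  # Trimmed sketch of the initContainer added by sidecar injection.
  # Arguments vary by release; inspect an injected pod for exact values.
  initContainers:
  - name: istio-init
    image: docker.io/istio/proxyv2:1.11.2  # same image as the sidecar proxy
    args:
    - istio-iptables
    - -p
    - "15001"   # outbound traffic is redirected to Envoy on this port
    - -z
    - "15006"   # inbound traffic is redirected to Envoy on this port
    - -u
    - "1337"    # traffic from the proxy's own UID is excluded from redirection
    securityContext:
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]      # required to program iptables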

Figure-1 Pod network namespace initialization options

The Istio container network interface (CNI) plugin is a replacement for the istio-init container that performs the same networking functionality but without requiring k8s tenants to have elevated Kubernetes RBAC permissions.
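Where the upstream IstioOperator API is used for installation, switching from istio-init to the CNI plugin is a one-field change; a minimal sketch (field names per upstream Istio documentation, verify against your release):

  # Minimal IstioOperator sketch enabling the Istio CNI component, so that
  # sidecar injection stops adding the privileged istio-init container.
  apiVersion: install.istio.io/v1alpha1
  kind: IstioOperator
  spec:
    components:
      cni:
        enabled: true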

In this article we delve into the details of each approach, istio-init versus istio-cni, identify the pros and cons, and articulate a recommendation.

2.0 Overview

An init container is a dedicated container that runs before the application containers are launched; it typically holds utilities and/or setup scripts that do not exist in the main application container image. Multiple init containers can be specified in a pod, and if more than one is specified, they run sequentially: each init container must complete successfully before the next one starts. Kubernetes initializes the pod and starts the application containers only after all the init containers have completed.
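The sketch below (pod, image, and service names are hypothetical) shows that sequencing contract: wait-for-db must exit successfully before fetch-config runs, and the app container starts only after both have completed.

  apiVersion: v1
  kind: Pod
  metadata:
    name: init-demo
  spec:
    initContainers:
    - name: wait-for-db        # runs first; blocks until the database answers
      image: busybox:1.36
      command: ["sh", "-c", "until nc -z db 5432; do sleep 2; done"]
    - name: fetch-config       # runs second, only after wait-for-db exits 0
      image: busybox:1.36
      command: ["sh", "-c", "wget -O /work/cfg http://config-svc/cfg"]
      volumeMounts: [{name: work, mountPath: /work}]
    containers:
    - name: app                # starts only after all init containers succeed
      image: nginx:1.25
      volumeMounts: [{name: work, mountPath: /etc/app}]
    volumes:
    - name: work
      emptyDir: {}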

Figure-2 podrama

During pod startup, the init containers start sequentially after the network and data volumes are initialized. Each init container must exit successfully before the next one can start; if one exits with an error, the container startup fails and it is retried according to the policy specified in the pod's restartPolicy. The pod does not become ready until all init containers have succeeded, and each init container terminates automatically once it has run.

Istio is implemented (by default) with an injected initContainer called istio-init that creates iptables rules before the other containers in the pod can start. This requires the tenant (user), or service account deploying pods in the mesh, to have sufficient privileges to deploy containers with the CAP_NET_ADMIN and CAP_NET_RAW capabilities. Capabilities are a Linux kernel feature that provides finer-grained permissions than the traditional UNIX-like split into privileged and unprivileged. In general, we want to avoid giving application pods capabilities that are not required by the workloads (the applications running within the pods). Specifically, CAP_NET_ADMIN grants the ability to configure interfaces, IP firewall rules, masquerading, and promiscuous mode, among other permissions. For a deeper look at capabilities, pull up the man page with "man capabilities" at your nearest Linux terminal. In some scenarios, allowing these capabilities can open up container-escape situations that may end up compromising the node and, later, the whole cluster. One way of mitigating this issue is to take the responsibility for iptables configuration out of the pod itself, by using a CNI plugin.
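To make the contrast concrete, here is a minimal sketch of the restrictive security context an application pod can keep once iptables configuration is moved out of the pod and into a CNI plugin; with istio-init, the injected container has to add NET_ADMIN and NET_RAW instead.

  # With istio-cni handling traffic redirection on the node, the application
  # pod itself can drop all capabilities (a sketch, not a complete pod spec).
  securityContext:
    runAsNonRoot: true
    capabilities:
      drop: ["ALL"]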

CNI (Container Network Interface), a Cloud Native Computing Foundation project, consists of a specification and libraries for writing plugins to configure network interfaces in Linux containers, along with a number of supported network plugins. CNI concerns itself only with network connectivity of containers and removing allocated resources when the container is deleted.

Figure-3 ISTIO-CNI Detailed Topology

The Istio CNI plugin replaces the istio-init container, providing the same functionality of configuring iptables in the pod's network namespace, but without requiring the pod to have the CAP_NET_ADMIN capability. Istio requires traffic redirection to be set up in order to properly capture network data, in both the istio-cni and init-container modes, and it needs this to happen in the setup phase of the Kubernetes pod's lifecycle, so that redirection is ready when the workloads start. CNI plugins run after pod sandboxes are created, and typically run as binaries on the host system, or the work is performed by a process separate from a given workload (see the Istio CNI DaemonSet in Figure-3); this segregates the responsibility from the workload into a separate, dedicated process. The pod therefore no longer needs any additional capabilities for Istio to function, reducing the exposed attack surface.

Istio can be operated so that the service mesh is effectively a multi-tenant platform capability: a single Istio installation serves multiple tenant namespaces in a controlled manner, without sacrificing pod-level configuration to enable/disable the mesh. As a platform capability it is implemented once, ready to be used across the cluster, and maintained as part of the platform's life cycle management. Via daemon sets, each node in the cluster can accommodate tenant workloads with seamless service mesh availability.

Figure-4 Istio-CNI Node Daemons (Test Cluster: 3 Masters + 3 Workers)

When using istio-cni in concert with Multus CNI, a CNI meta-plugin used for attaching multiple networks to pods, istio-cni is configured to run using a NetworkAttachmentDefinition (net-attach-def), the de facto standard for multiple network attachments for pods.

Istio functionality can be enrolled/enabled per tenant workload namespace (i.e., service mesh membership can be added/removed at the tenant namespace level) and used by pods that are marked with the sidecar.istio.io/inject: "true" annotation in the pod deployment manifest (i.e., the service mesh can also be enabled/disabled at the individual workload level).
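As a minimal sketch of namespace-level enrollment, upstream Istio keys automatic sidecar injection off a namespace label (the namespace name here is hypothetical):

  # Enrolling a tenant namespace in the mesh via the upstream injection label.
  apiVersion: v1
  kind: Namespace
  metadata:
    name: tenant-a
    labels:
      istio-injection: enabled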

Figure-5 net-attach-def per namespace
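A representative manifest along the lines of Figure-5, following the pattern used by some distributions (e.g., OpenShift Service Mesh), where an intentionally empty NetworkAttachmentDefinition named istio-cni in the tenant namespace signals Multus to invoke the Istio CNI plugin for pods there:

  apiVersion: k8s.cni.cncf.io/v1
  kind: NetworkAttachmentDefinition
  metadata:
    name: istio-cni      # the name matters; no spec is needed for this pattern
    namespace: tenant-a  # hypothetical tenant namespace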

Even when a tenant namespace is enrolled in the service mesh, some of the workloads within that namespace may not want to be enrolled, for example to avoid the latency tax of the Istio sidecar. To handle this, Istio implements a pod deployment annotation selector (to be replaced by a label selector in coming releases) that offers exactly such control.

Figure-6 pod deployment manifest with multus and istio
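A sketch along the lines of Figure-6 (names are hypothetical): a Deployment whose pod template requests an extra Multus network attachment and opts the workload into the mesh; flipping sidecar.istio.io/inject to "false" would opt it back out while the namespace stays enrolled.

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: demo-app
    namespace: tenant-a
  spec:
    replicas: 1
    selector:
      matchLabels: {app: demo-app}
    template:
      metadata:
        labels: {app: demo-app}
        annotations:
          k8s.v1.cni.cncf.io/networks: macvlan-net  # extra Multus attachment
          sidecar.istio.io/inject: "true"           # opt this pod into the mesh
      spec:
        containers:
        - name: demo-app
          image: nginx:1.25
          ports: [{containerPort: 8080}]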

3.0 Key Differences

Table-1 Comparison Summary Table
  • Security: The istio-init container requires the CAP_NET_ADMIN and CAP_NET_RAW capabilities in order to alter iptables rules in the pod network namespace in which it resides. These capabilities are a significant container-escape concern. Istio-cni performs the iptables alterations via a daemon set (see Figure-4) deployed and maintained by cluster administrators. For details on the risks of using privileged containers, see NIST Special Publication 800-190, Application Container Security Guide, especially section 3.4.3.
  • Life Cycle Management (LCM): Upgrading the platform (k8s) separately from add-on capabilities (Istio) has been very problematic; throughout various Istio releases we have observed backward compatibility break multiple times. What enterprises seek for better LCM and supportability is instead a homogeneous application platform with capabilities embedded inside it. The istio-init approach also depends on yet another container image, which is in itself one more operations-and-maintenance duty, as well as an additional attack surface to worry about. Moreover, the istio-init approach appears to be heading toward obsolescence, as more distro vendors and k8s software-as-a-service (SaaS) providers (such as AWS, Azure, GCP, etc.) move to the CNI approach.
  • Extensibility: CNI is a vendor-agnostic, constructive approach to implementing and maintaining container orchestration engines (such as k8s); hence it offers workload portability across different distros and k8s SaaS platforms. Also, the surrounding service mesh ecosystem for traces (Zipkin, Jaeger), log queries (Elasticsearch), visualisation (Kiali), etc. needs to be integrated into the service mesh, and that integration is best done at the platform level by platform administrators.
  • Complexity: The CNI way looks more complex at first, as the price of having a common way of doing things; however, the complications of using the init-container methodology for setting network rules bring irregularity and inconsistency across different distros and SaaS platforms.
  • Multi-vendor: As mentioned in the LCM bullet above, more and more distros and SaaS platforms are moving towards the CNI way in order to offer multiple network interfaces (Multus) to tenant workloads; placing istio-cni within the network implementation at the same time brings consistency to the software stack.

4.0 Summary

Service mesh is a rapidly evolving area, and within it networking has not been well addressed: it is highly environment-specific and subject to manual intervention to make it work. We believe that many mesh deployment mechanisms/installers will seek to solve the same problem of making the network layer pluggable and consistent. To avoid duplication, we think it prudent to define a common interface between the existing, already-utilised network plugins and the service mesh, and that interface is CNI.

In particular, for Istio mesh deployment and use, the approach that implements the service mesh in the most secure and abstracted way, guaranteeing that control plane components use the fewest privileges possible, is istio-cni; therefore we kindly recommend use of the Istio CNI plugin.
