Intel® Clear Containers and CRI-O*

Published in cri-o, Aug 30, 2017

In this post, I’ll provide some background on the Clear Containers project and describe how Clear Containers works with CRI-O and Kubernetes*. Then, I’ll walk you through creating a Kubernetes configuration which uses a mix of runc and Clear Containers to secure workloads with varying levels of trust.

Introducing: Clear Containers

The Clear Containers project looks to bridge the gap between traditional virtual machine security and the lightweight benefits of traditional Linux containers. Traditional containers use Linux* control groups (cgroups) to manage and allocate resources, and namespaces to provide container isolation. Security isolation comes from further mechanisms provided by the host kernel: dropping Linux capabilities, read-only mount points, MAC security measures such as SELinux and AppArmor, filtering syscalls with seccomp, and so on. But, as shown below, all of these containers share the same underlying Linux kernel.

In a multi-tenant environment where workloads run with unknown levels of trust, significant effort is required to keep the environment secure. Protecting against security breaches in these environments was one motivating factor for creating Clear Containers.

Clear Containers has an OCI-compatible runtime, cc-runtime, which launches a hypervisor secured by Intel® VT-x to provide container isolation.

For Clear Containers, each container is booted as a lightweight virtual machine with its own unique kernel instance. Because each container now runs in its own VM, it no longer has access to the host kernel and gets the full security benefits of a virtual machine.

One performance-enhancing feature is the use of KSM (Kernel Samepage Merging), a Linux kernel feature that lets the VMs on a host share identical memory pages. Another is the use of an optimized Clear Containers mini-OS, which pairs a Clear Linux kernel with a Clear Containers userspace, both tuned for container boot performance.

Since Clear Containers is OCI compatible, changing your local Docker* to use cc-runtime in addition to runc is as simple as:

dockerd --add-runtime cc-runtime=/usr/bin/cc-runtime --default-runtime=cc-runtime

This allows the user to choose the appropriate Docker runtime for each container, based on the type of workload it hosts.
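As a rough illustration of that per-container choice, here is a minimal sketch that maps a workload's trust level to a runtime name and prints the matching `docker run --runtime=...` command instead of executing it. The `trusted`/`untrusted` labels and the helper names `runtime_for` and `docker_cmd` are made up for this example; only the `--runtime` flag and the two runtime names come from Docker and the setup above.

```shell
#!/bin/sh
# Map a workload trust level to a Docker runtime name.
# "trusted" and "untrusted" are labels invented for this sketch, not Docker terms.
runtime_for() {
    case "$1" in
        trusted)   echo "runc" ;;
        untrusted) echo "cc-runtime" ;;
        *)         echo "unknown trust level: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the docker command, so the sketch works without a daemon.
# Usage: docker_cmd TRUST_LEVEL IMAGE [ARGS...]
docker_cmd() {
    rt=$(runtime_for "$1") || return 1
    shift
    echo "docker run --rm --runtime=$rt $*"
}

docker_cmd trusted busybox uname -r
docker_cmd untrusted busybox uname -r
```

On a host where cc-runtime is registered, running the two printed commands should report different kernel versions: the host kernel under runc, and the VM's own kernel under cc-runtime.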

How CRI-O changed Clear Containers with Kubernetes

Kubernetes, a project that originated at Google and is now hosted by the CNCF with collaborators across many companies, is the dominant container orchestration engine. Kubernetes clusters run pods, and all the containers in a pod share resources such as networking and storage. Each pod in a cluster has its own IP address.

High level overview of Kubernetes with Docker being used in the Kubernetes node.

By default, Kubernetes uses Docker to start pods and the containers within a pod. Under Docker's control, Clear Containers starts one VM per container. Providing Kubernetes pod semantics with one VM per container is very challenging, especially from a networking standpoint.

The recent addition of the Container Runtime Interface (CRI) to Kubernetes means Clear Containers can be controlled by any OCI-compatible CRI implementation, CRI-O being the main one. Clear Containers can now receive container annotations that tell it when and how to run pod VMs or container workloads within those pods. In Kubernetes clusters with CRI-O and cc-runtime as the default container runtime, the launch of a pod results in the creation of a VM. Then, when a container is added to that pod, it is launched as a container inside the pod’s VM.

A view of a Kubernetes node which is using CRI-O and runc to create a pod with three containers.

A view of a Kubernetes node which is using CRI-O and Clear Containers to create a pod with three containers.

The CRI-O project supports a secondary runtime for handling untrusted workloads. In CRI-O this is called the untrusted runtime. This means that, in an environment with workloads of various levels of trust, CRI-O allows your Kubernetes cluster to be composed of a mix of runc and cc-runtime based pods. Note that if the untrusted runtime is not provided in the CRI-O configuration, all workloads will use the trusted runtime, which defaults to runc.
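To make the two-runtime setup concrete, the sketch below writes an illustrative `[crio.runtime]` fragment to a scratch file and prints it. The key names (`runtime`, `runtime_untrusted_workload`, `default_workload_trust`) are given as I recall them from CRI-O releases of this period, and the runc path is an assumption, so check both against the crio.conf shipped with your version.

```shell
#!/bin/sh
# Write an illustrative [crio.runtime] fragment to a scratch file.
# Key names and the runc path are assumptions; verify against your crio.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[crio.runtime]
# Runtime used for trusted workloads.
runtime = "/usr/bin/runc"
# Runtime used for untrusted workloads; if left empty, everything uses the trusted runtime.
runtime_untrusted_workload = "/usr/bin/cc-runtime"
# Trust level applied to workloads that do not declare one.
default_workload_trust = "trusted"
EOF

cat "$conf"
```

The fragment is only printed here; merging it into the real CRI-O configuration and restarting the service is left to the reader.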

Two trust levels… two runtimes… one cluster

Depending on your configuration, the default trust level for workloads is either trusted or untrusted. When the default workload type is set to trusted, any workload launched in the cluster runs in a runc container unless it is explicitly declared untrusted. To mark a workload as untrusted, use a Kubernetes annotation. The following annotation does this:

io.kubernetes.cri-o.TrustedSandbox: "false"
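For context, here is a minimal sketch of where that annotation would sit in a pod definition. The pod name, container name, and image are placeholders chosen for this example; only the annotation key and value come from the text above. The script writes the manifest to a scratch file and prints it.

```shell
#!/bin/sh
# Write a pod manifest that marks the whole pod as untrusted.
# Pod name, container name, and image are hypothetical placeholders.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-nginx
  annotations:
    io.kubernetes.cri-o.TrustedSandbox: "false"
spec:
  containers:
  - name: nginx
    image: nginx
EOF

cat "$manifest"
# On a cluster configured as described, the manifest could then be submitted with:
#   kubectl apply -f "$manifest"
```

The value is quoted because Kubernetes annotation values must be strings; an unquoted false would be parsed as a YAML boolean.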

When the default workload type is set to untrusted, the provided untrusted runtime in the CRI-O configuration will be used for all non-privileged containers regardless of the value of io.kubernetes.cri-o.TrustedSandbox.

This rule ensures all workloads can be run using Clear Containers without any changes to default payload definitions. The result could be running all non-infrastructure pods in Clear Containers with relative ease.

In the event that an untrusted runtime is not defined when configuring CRI-O, all containers will fall back to the trusted runtime, which is configured by default as runc.
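The selection rules described in this section can be summarized in a small decision function. This is a paraphrase of the prose above, not CRI-O's actual code: the function name `select_runtime` and its argument order are invented here, and how a privileged pod with an explicit untrusted annotation is treated is not covered by the text, so the sketch simply follows the annotation in that case.

```shell
#!/bin/sh
# select_runtime DEFAULT_TRUST ANNOTATION PRIVILEGED UNTRUSTED_RUNTIME
#   DEFAULT_TRUST      "trusted" or "untrusted"
#   ANNOTATION         value of io.kubernetes.cri-o.TrustedSandbox, or "" if unset
#   PRIVILEGED         "true" or "false"
#   UNTRUSTED_RUNTIME  name of the configured untrusted runtime, or "" if none
# Hypothetical helper that mirrors the rules in the text; not CRI-O's implementation.
select_runtime() {
    default_trust=$1
    annotation=$2
    privileged=$3
    untrusted_runtime=$4
    trusted_runtime="runc"

    # No untrusted runtime configured: everything falls back to the trusted runtime.
    if [ -z "$untrusted_runtime" ]; then
        echo "$trusted_runtime"
        return 0
    fi

    # Default untrusted: every non-privileged workload uses the untrusted runtime,
    # whatever the annotation says.
    if [ "$default_trust" = "untrusted" ]; then
        if [ "$privileged" = "true" ]; then
            echo "$trusted_runtime"
        else
            echo "$untrusted_runtime"
        fi
        return 0
    fi

    # Default trusted: only workloads explicitly marked untrusted leave runc.
    if [ "$annotation" = "false" ]; then
        echo "$untrusted_runtime"
    else
        echo "$trusted_runtime"
    fi
}

select_runtime trusted ""      false cc-runtime   # unannotated pod, default trusted
select_runtime trusted "false" false cc-runtime   # pod explicitly marked untrusted
select_runtime untrusted ""    true  cc-runtime   # privileged pod, default untrusted
```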

To see a Kubernetes cluster using two runtimes depending on trust level, watch the demonstration below. The screencast shows a Kubernetes cluster brought up with CRI-O and configured with runc for trusted workloads and cc-runtime for untrusted workloads. The default workload type is trusted, so workloads will use runc unless they are marked as untrusted.