Accelerating Open Source: Pluggable Clouds in Kubernetes (Part 2)

Sidhartha Mani
4 min read · Jun 21, 2017


This is part II of a series of blog posts about making the cloud provider in Kubernetes pluggable. This blog post covers how the kubelet was made pluggable with respect to the cloud provider. For kube-controller-manager, refer to my previous blog post in this series. Also stay tuned for my next blog post, where I will discuss the new architecture of kube-apiserver.

In the previous blog post, I discussed my motivations behind making the cloud providers in Kubernetes pluggable. In this post, I’ll dive into how one specific component, the kubelet, was made pluggable.

When I envision the kubelet, I see it as analogous to the engineers in an organization: they take direction from managers and write the code that makes things happen. The kubelet does exactly this. It runs on every node of the Kubernetes cluster and performs almost all of the tasks on the node, as directed by the kube-controller-manager and the kube-apiserver. Since we will be talking about how the kubelet was made pluggable, let's begin by understanding how the kubelet and the cloud interact.

Kubelet and Cloud

Kubelet is responsible for almost every action performed on the nodes managed by Kubernetes. Kubelet registers nodes, queries for information, starts pods, mounts volumes, monitors health, removes volumes, deletes pods, sets up the network, and a lot more.

Here’s a diagram depicting the cloud provider integration points of Kubernetes.

Cloud Controller Manager and Kubelet cloud integration points

Kubelet and kube-proxy are the only two "non-master" components. Since they run locally on a node, they have access to node-level resources. These resources are otherwise inaccessible from the master components (kube-apiserver, kube-controller-manager, etc.). For instance, the kubelet has access to a node's local metadata service, whereas the other components do not.

Since the Cloud Controller Manager (or CCM) runs as a master service, it doesn’t have access to the node’s local metadata service either. This property is important and interesting because almost all of the cloud provider operations performed by Kubelet rely on the local metadata service.

All of the kubelet's cloud provider interactions obtain information only about the node on which it is running. It queries the cloud for three pieces of information.

  1. The instance type of the node it is running on.
  2. The node's zone and region.

The above two calls are done during node registration.

After node registration succeeds, the kubelet starts a periodic loop to obtain the third piece of information:

3. The node’s external IP addresses from the cloud provider.

This is done because in clouds like AWS, external addresses cannot be obtained by querying the OS (`ip addr` will not show them). It is done periodically because external addresses can change after node restarts.
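
To make the flow concrete, here is a minimal Go sketch of these three lookups. The `CloudMetadata` interface and `NodeCloudInfo` type are illustrative stand-ins, not the actual Kubernetes cloudprovider interfaces; only the structure (two one-time lookups at registration, one periodic loop for addresses) mirrors what is described above.

```go
package kubeletcloud

import (
	"fmt"
	"time"
)

// NodeCloudInfo is an illustrative stand-in for the per-node data the
// kubelet asks the cloud for; it is not a real Kubernetes type.
type NodeCloudInfo struct {
	InstanceType string   // e.g. "m4.large"
	Zone         string   // e.g. "us-west-2a"
	Region       string   // e.g. "us-west-2"
	ExternalIPs  []string // addresses that `ip addr` on the node won't show
}

// CloudMetadata abstracts the node-local metadata service (on AWS this
// would be backed by the 169.254.169.254 endpoint). The interface is
// hypothetical and simplified for illustration.
type CloudMetadata interface {
	InstanceType() (string, error)
	Zone() (zone, region string, err error)
	ExternalIPs() ([]string, error)
}

// registerNode gathers the two pieces of information needed at node
// registration time: the instance type and the zone/region.
func registerNode(md CloudMetadata) (NodeCloudInfo, error) {
	var info NodeCloudInfo

	it, err := md.InstanceType()
	if err != nil {
		return info, fmt.Errorf("instance type lookup failed: %v", err)
	}
	info.InstanceType = it

	zone, region, err := md.Zone()
	if err != nil {
		return info, fmt.Errorf("zone lookup failed: %v", err)
	}
	info.Zone, info.Region = zone, region
	return info, nil
}

// syncExternalIPs is the periodic loop that refreshes the third piece of
// information, since external addresses can change across node restarts.
func syncExternalIPs(md CloudMetadata, info *NodeCloudInfo, interval time.Duration) {
	for range time.Tick(interval) {
		ips, err := md.ExternalIPs()
		if err != nil {
			continue // transient errors: retry on the next tick
		}
		info.ExternalIPs = ips
	}
}
```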

Pluggable Kubelet

Moving these three steps into the cloud-controller-manager required some care. It was not straightforward for two reasons: first, the CCM cannot access the metadata service or other node-local services; second, the cloud provider calls happen early, during node creation. Since the CCM can only process a node after it has been created, a node that the kubelet registers without the essential instance type and zone information has to be marked unusable until the CCM processes it.

To address the first challenge, we needed a mechanism to identify nodes uniquely from a remote service, so that the CCM could still query for node-specific information without running on the node.

We decided that the best approach was to use the node's provider ID. The provider ID allows a node to be identified remotely, so its information can be looked up in the cloud without needing access to the node itself.

Note: Clouds like OpenStack can only obtain this information from the local metadata service, so we added a new kubelet flag, `--provider-id`, which can be used to pass in the provider ID obtained on the node. This passed-in provider ID is then used by the CCM.
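
To illustrate why the provider ID is enough to find a node remotely, here is a hedged Go sketch that splits a provider ID into its scheme and instance path. The AWS-style `aws:///us-west-2a/i-0abc123` format is shown only as an example; real formats vary by provider and are parsed by provider-specific code.

```go
package kubeletcloud

import (
	"fmt"
	"strings"
)

// ParseProviderID splits a provider ID of the loose form
// "<provider>://<path>" into a scheme and an instance path. For example,
// "aws:///us-west-2a/i-0abc123" yields ("aws", "us-west-2a/i-0abc123"),
// which a remote service like the CCM can use to look the node up through
// the cloud's APIs without touching the node itself.
func ParseProviderID(providerID string) (provider, instancePath string, err error) {
	parts := strings.SplitN(providerID, "://", 2)
	if len(parts) != 2 || parts[0] == "" || strings.Trim(parts[1], "/") == "" {
		return "", "", fmt.Errorf("unrecognized provider ID %q", providerID)
	}
	return parts[0], strings.Trim(parts[1], "/"), nil
}
```

On clouds where this ID can only be read from inside the node, the `--provider-id` flag described in the note above carries it onto the node object so the CCM can use it.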

For the second challenge, we needed a way to inform the kube-apiserver that a newly registered node is incomplete. If this could be achieved, then pods would not be scheduled on a node that isn't ready.

We addressed this problem by adding a taint to the node whenever a node is created. This taint informs the kube-apiserver (and thus the scheduler) that the node is not yet ready for pods. The CCM processes the node when it sees the taint and removes the taint once the correct information has been added.
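
For concreteness, here is a sketch of that taint using the Kubernetes API types. The taint key matches the one used upstream for externally managed cloud providers, but the helper around it is illustrative and not the actual CCM code.

```go
package kubeletcloud

import (
	v1 "k8s.io/api/core/v1"
)

// uninitializedTaint is the taint the kubelet places on a newly registered
// node when it runs with an external cloud provider. The key follows the
// upstream convention; treat the surrounding helper as a sketch.
var uninitializedTaint = v1.Taint{
	Key:    "node.cloudprovider.kubernetes.io/uninitialized",
	Value:  "true",
	Effect: v1.TaintEffectNoSchedule,
}

// removeUninitializedTaint returns the node's taints with the cloud
// provider taint filtered out, roughly what the CCM does once it has
// filled in the instance type, zone, and addresses for the node.
func removeUninitializedTaint(node *v1.Node) []v1.Taint {
	kept := make([]v1.Taint, 0, len(node.Spec.Taints))
	for _, t := range node.Spec.Taints {
		if t.Key == uninitializedTaint.Key {
			continue // drop the taint so pods can now be scheduled here
		}
		kept = append(kept, t)
	}
	return kept
}
```

While the NoSchedule taint is present, the scheduler will not place pods on the node, which is exactly the "mark unusable until the CCM processes it" behavior described above.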

These two architectural decisions, namely the node taint and using the provider ID to address nodes instead of relying on the metadata service, are what made the kubelet's cloud provider pluggable.

Now that we’ve made the Kubelet pluggable, this leaves the cloud provider integration points in the kube-apiserver.

Stay tuned for my next blog post on making the kube-apiserver pluggable!

