Kubernetes storage on Azure (Part 1)

Kubernetes storage basics and Azure disk CSI Driver

Krishnakumar R
Microsoft Azure
Sep 19, 2020


In this part of Kubernetes storage on Azure we explore basic Kubernetes storage concepts and how to use Azure managed disks on Kubernetes. We then delve into the internals of the various components of the CSI ecosystem on Kubernetes and the Azure disk CSI driver implementation.

Introduction

Azure supports various forms of storage: disks, file shares, blob storage and so on. In the first two parts of this series we look at how Kubernetes on Azure interacts with disks in a Linux environment and at the code behind the scenes. In part 3 we will look at Azure Files in the context of Kubernetes. We conclude the series in part 4 by looking at the internals of how all this works in a Windows environment.

Kubernetes storage basics

At a high level there are a few concepts we need to internalize in order to use disks (and storage in general) in Kubernetes: PVC, PV, StorageClass, volumes and volumeMounts. A Persistent Volume Claim (PVC) is a generic object which represents a storage request. The Persistent Volume (PV) object represents the underlying storage entity. PV objects can be dynamically provisioned by means of a PVC or statically provisioned by explicitly creating a PV object. The StorageClass object specifies properties such as the storage driver to be used, the type of storage and so on. The PVCs a pod's containers need access to are listed in the pod specification under the volumes field, and per-container choices are specified using the volumeMounts key. More details can be found in the Kubernetes persistent volume documentation.

A typical user workflow for utilizing disks on Kubernetes involves creating a PVC and associating it with the containers in a pod. Here is an example PVC spec used to create disks on Azure:
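(Sketch; the claim name pvc-azuredisk and the 10Gi size are illustrative, and storageClassName refers to the managed-csi class discussed next.)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-azuredisk        # illustrative name
spec:
  accessModes:
    - ReadWriteOnce          # a managed disk attaches to a single node at a time
  resources:
    requests:
      storage: 10Gi          # illustrative size
  storageClassName: managed-csi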

The above can be used to create the PVC with the size specified. The PVC is a generic object in Kubernetes. The connection between this generic object and the properties of the corresponding storage is defined in another object called StorageClass. The storageClassName field connects the PVC to the storage class. In the example above we refer to the managed-csi storage class. Let's have a look at that:
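(Sketch; the class name managed-csi matches the PVC above, the provisioner disk.csi.azure.com is the Azure disk CSI driver, and skuName StandardSSD_LRS requests a standard SSD.)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi
provisioner: disk.csi.azure.com   # Azure disk CSI driver
parameters:
  skuName: StandardSSD_LRS        # standard SSD managed disk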

The StorageClass defines the type of storage to be created via the skuName entry under parameters; in this case we are requesting a standard SSD disk. The provisioner field specifies which driver will be used to provision the underlying storage. Kubernetes storage drivers perform all the control plane operations required to make storage available to the pods. For a long time Kubernetes storage drivers were part of the core Kubernetes code. With the CSI (Container Storage Interface) implementation on Kubernetes becoming GA in 1.13, a storage/cloud provider can release its drivers independent of the core Kubernetes code churn. Here we refer to the Azure disk CSI driver in the provisioner field.

Let's go through the spec of the Pod which uses the PVC. The volumes field specifies the PVCs accessed by the containers in this Pod. Under each container's volumeMounts field we then define where in the container's file system the storage is made accessible.
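(Sketch; the pod and container names, the nginx image, the /mnt/azuredisk mount path and the command writing to outfile are illustrative, while the claim name matches the PVC above.)

apiVersion: v1
kind: Pod
metadata:
  name: nginx-azuredisk
spec:
  containers:
    - name: nginx-azuredisk
      image: nginx
      command:
        - "/bin/sh"
        - "-c"
        - "while true; do echo $(date) >> /mnt/azuredisk/outfile; sleep 10; done"
      volumeMounts:
        - name: azuredisk          # refers to the volume defined below
          mountPath: /mnt/azuredisk
  volumes:
    - name: azuredisk
      persistentVolumeClaim:
        claimName: pvc-azuredisk   # the PVC created earlier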

Now let's look at how an Azure disk can be used by creating the objects above.

Using Azure disk

Before we start creating the PVC, let's install the CSI driver. The Azure disk CSI driver can be installed by following the instructions in the azuredisk-csi-driver installation link.
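At the time of writing, the linked instructions document a script-based install roughly along these lines (this is a sketch; verify the exact command and version against the installation guide):

curl -skSL https://raw.githubusercontent.com/kubernetes-sigs/azuredisk-csi-driver/master/deploy/install-driver.sh | bash -s master --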

After this let’s deploy the StorageClass and the PVC.
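Assuming the manifests above are saved as storageclass.yaml and pvc.yaml (illustrative file names), this is simply:

kubectl apply -f storageclass.yaml
kubectl apply -f pvc.yaml
kubectl get pvc pvc-azuredisk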

Let’s look at the underlying PV which gets created. This PV represents the disk in the cloud.
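The dynamically provisioned PV can be listed and inspected as follows (the PV name is generated, so list first and then describe it):

kubectl get pv
kubectl describe pv <pv-name>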

Next we deploy the Pod which uses the storage.
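With the Pod spec saved as pod.yaml (illustrative file name):

kubectl apply -f pod.yaml
kubectl get pod nginx-azuredisk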

Once the pod has been created, let’s check out the location where the disk can be accessed from the container.
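A quick way to check is to exec into the container and list the mount path used in the Pod spec (names as in the sketch above):

kubectl exec -it nginx-azuredisk -- sh -c "ls /mnt/azuredisk"
# expect to see the outfile written by the container's command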

In the exec session we can see a file named outfile, generated by the command running in the container (see the command field in the Pod spec), indicating that the disk created via the PVC is accessible within the container.

Now that we know how to use Azure disk on Kubernetes, let us explore the internals of Azure disk CSI driver implementation.

CSI driver implementation

There are several pieces of the CSI implementation on Kubernetes which come together to provide access to storage for applications. In the context of the Azure disk driver, we consider the details of two essential components: the controller plugin and the node plugin.

At a high level, the controller plugin is responsible for the interactions with the cloud (in this case Azure), such as creating a disk and attaching it to a VM. The node plugin is responsible for the operations on the nodes, such as formatting and mounting the disk.

Deployment

Let's take the example of an aks-engine cluster where the CSI drivers have been deployed. You can find the CSI controller plugin deployed in the kube-system namespace in a leader election configuration, with one active and one passive instance, as follows:
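(Sketch; the controller pods are typically named csi-azuredisk-controller-*.)

kubectl get pods -n kube-system -o wide | grep csi-azuredisk-controller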

The node plugins are deployed as DaemonSets:
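(Sketch; the Linux node plugin DaemonSet is typically named csi-azuredisk-node.)

kubectl get daemonset -n kube-system | grep csi-azuredisk-node
kubectl get pods -n kube-system -o wide | grep csi-azuredisk-node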

In the cluster I was using there is a single management node and three worker nodes, so there are four instances of the Linux node plugin driver (one on each node).

Controller plugin

The CSI controller plugin is responsible for the creation and deletion of disks in the cloud. It is also responsible for attaching the disks to the VMs (when pods referring to the disk are run) and detaching them (when the pods are removed). The core CSI RPC calls implemented by the controller plugin that we cover here are CreateVolume and DeleteVolume (disk creation and deletion) and ControllerPublishVolume and ControllerUnpublishVolume (attach and detach).

For brevity we only go into the details of these calls, even though the controller plugin also implements other calls for snapshots, volume resizing and so on.

The kube-controller-manager runs controllers which invoke the controller plugin related operations. The PersistentVolumeController runs a sync loop to ensure that every PVC has a backing PV. If one does not exist, it adds the volume.beta.kubernetes.io/storage-provisioner annotation to the PVC. The external CSI provisioner sidecar looks for this annotation to identify that a PV has to be created. The provisioner then makes the CreateVolume gRPC call to the vendor-specific part of the controller plugin (in this case the Azure disk controller). The Azure disk controller then invokes the Azure calls to create the disk. We will get into more details of the Azure-specific implementation in Part 2 of this series.
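You can observe this hand-off on a live cluster by checking the PVC for the annotation mentioned above (the PVC name is the illustrative one used earlier):

kubectl get pvc pvc-azuredisk -o yaml | grep storage-provisioner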

When a pod is deployed with the PVC specified in volumes, the attach/detach controller creates a VolumeAttachment object which connects the node and the PV associated with it. The external CSI attacher sidecar detects the change in the VolumeAttachment object and sends a ControllerPublishVolume gRPC call to the Azure disk controller code. The code to perform the attach/detach runs as a result of the ControllerPublishVolume call. We will go into the details of the implementation in Part 2 of this series.
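The VolumeAttachment objects are visible at the cluster level and show the attacher, the PV and the node involved:

kubectl get volumeattachment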

Node plugin

The node plugin for the storage provider (e.g. the Azure disk CSI driver) registers with the kubelet via the node driver registrar container packaged within the node plugin pod. The kubelet on each node in the cluster watches for pods landing on that node and performs the operations required to make the associated PVs available to them. The core CSI calls implemented by the node plugin that we cover here are NodeStageVolume and NodeUnstageVolume (format the device and mount it at a global mount path, and undo that) and NodePublishVolume and NodeUnpublishVolume (bind mount the global path into the pod's directory, and undo that).

There are other calls, such as getting the capabilities and expanding the volume, which are also implemented by this plugin, but we only go into the details of the calls above.
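You can check which CSI drivers have registered with the kubelet on a given node by inspecting its CSINode object (the node name is a placeholder):

kubectl get csinode <node-name> -o yaml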

The kubelet's volume manager invokes the node plugin to perform the NodeStageVolume operation. The node plugin identifies the device under /dev (e.g. /dev/sdc) on the VM by referring to the LUN number used by the controller plugin while attaching. This device is then formatted (in the case of first-time use) and mounted on a global mount path. Here is an example location on the VM where the global mount is performed (with the PV name as a placeholder):
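/var/lib/kubelet/plugins/kubernetes.io/csi/pv/<pv-name>/globalmount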

Then, during the NodePublishVolume call, the global mount path is bind mounted to the pod-specific directory, such as (with the pod UID and PV name as placeholders):
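/var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/<pv-name>/mount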

The kubelet then communicates with the container runtime to pass on this location to be mounted into the container.

Project details

The code for the Azure disk CSI driver resides in the azuredisk-csi-driver GitHub repo. The Azure disk CSI driver development guide has details on how to get started with development. Currently the project is in the beta stage, and the supported features can be found here.

Acknowledgement

A big thanks to Andy, Chitkala, Balaji, Anish and Sundeep for reviews and valuable feedback provided on earlier drafts of this article.

What next?

In this part of Kubernetes storage on Azure we explored how to use Azure disks with Kubernetes and how various CSI components work together to provision and use disks, and we also delved into the Azure disk CSI driver internals. In Part 2 of this series we will dig deeper into the code and concepts of what goes on behind the scenes.

Krishnakumar is a Senior Software Engineer in the Azure Data team. Follow him on Twitter at https://twitter.com/kkwriting.

Originally published at http://kkwriting.com on September 19, 2020.
