Early access to Kubernetes Alpha features on Azure using Cluster API

Jay Lee · Published in Microsoft Azure
9 min read · Sep 15, 2023

Undoubtedly, Kubernetes is one of the biggest software projects in the history of IT, and this massive project moves like a bullet train, shipping roughly three new versions every year. To put this into perspective, let me throw out some numbers. Looking at the Kubernetes 1.28 project velocity, contributions came from 911 companies and 1,440 individuals in a matter of just 14 weeks (May 15 to August 15). Version 1.27 brought a total of 60 enhancements, 18 of which are completely new features.

With this continuous influx of new features, I always find something interesting that I'd like to try, such as ContainerCheckpoint or SelfSubjectReview. Azure Kubernetes Service (AKS) is a great platform, but it doesn't provide a way to test new alpha features from upstream Kubernetes. As a managed platform, it makes a lot of sense for AKS to minimize risk by exposing only safe, verified features to its users. The only way to try alpha features is to have a cluster I can customize my own way, and ideally one I can set up easily, unlike the famous Kubernetes the Hard Way. That is the main motivation for this article: to explain another way to test alpha features on Azure. There are a variety of ways to stand up Kubernetes on Azure, but I'm going to use the open source project Cluster API, which I personally use often.

NOTE: The scripts and YAML files used in this article are here.

What is Cluster API?

The official project describes Cluster API as follows:

“Started by the Kubernetes Special Interest Group (SIG) Cluster Lifecycle, the Cluster API project uses Kubernetes-style APIs and patterns to automate cluster lifecycle management for platform operators. The supporting infrastructure, like virtual machines, networks, load balancers, and VPCs, as well as the Kubernetes cluster configuration are all defined in the same way that application developers operate deploying and managing their workloads. This enables consistent and repeatable cluster deployments across a wide variety of infrastructure environments.”

In short, Cluster API is built on the Kubernetes operator pattern to provision and manage Kubernetes clusters on various environments like Azure, VMware, and others. If you're using VMware Tanzu Kubernetes Grid or AKS on Azure Stack HCI, Cluster API (CAPI) is the very component working under the hood.

Cluster API on various infrastructures

Cluster API leverages kubeadm to bootstrap Kubernetes on Azure, and the CNCF upstream team provides the base images for Cluster API for both Linux and Windows. You can see the full list of images published by the CNCF upstream team using the command below.

$ az vm image list --publisher cncf-upstream --offer capi --all -o table
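If you only care about a specific Kubernetes version, you can filter the fairly long listing. The exact image naming convention can vary, so treat this as a rough sketch and verify against the full output:

$ az vm image list --publisher cncf-upstream --offer capi --all -o table | grep -E '1\.27|127'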

Preparing the Management Cluster

As mentioned above, Cluster API is a Kubernetes operator that manages a set of Custom Resource Definitions (CRDs) designed to simplify the management of target (workload) clusters across various cloud platforms. The management cluster is the cluster where we install the Cluster API management plane. To bootstrap the management plane, Cluster API provides a handy CLI named clusterctl. You can find download instructions for the different operating systems here. This article was written using clusterctl 1.5.1.
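If you don't have clusterctl installed yet, a typical installation on Linux looks roughly like the following; the binary name follows the Cluster API release layout, so adjust the OS and architecture (and check the official quick start) for your machine:

$ curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.5.1/clusterctl-linux-amd64 -o clusterctl
$ chmod +x clusterctl
$ sudo mv clusterctl /usr/local/bin/clusterctl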

$ clusterctl version
clusterctl version: &version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"db17cb237642881cde3e3f61e77eca55e2583883", GitTreeState:"clean", BuildDate:"", GoVersion:"go1.21.0", Compiler:"gc", Platform:"darwin/amd64"}

Since Cluster API creates all the required resources (resource groups, VNets, subnets, VMs, and so on) to provision Kubernetes on the cloud platform, it needs an identity that is powerful enough to do so. On Azure, we can use either a Service Principal or a Managed Identity. I will use a Service Principal and assign it the Contributor role, as that is the least privilege required to create a resource group within a subscription.

$ az ad sp create-for-rbac --name clusterctlsp --role Contributor --scope "/subscriptions/{SUBSCRIPTION_ID}"

NOTE: By default, workload clusters use the same Service Principal assigned to the management cluster. clusterctl init installs an operator that needs the Service Principal details. Take a look at the script below.

$ cat ./init.sh
#!/bin/bash

source ./cred.sh

# Azure cloud settings
# To use the default public cloud, otherwise set to AzureChinaCloud|AzureGermanCloud|AzureUSGovernmentCloud
export AZURE_ENVIRONMENT="AzurePublicCloud"

export AZURE_SUBSCRIPTION_ID_B64="$(echo "$AZURE_SUBSCRIPTION_ID" | base64 | tr -d '\n')"
export AZURE_TENANT_ID_B64="$(echo "$AZURE_TENANT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_ID_B64="$(echo "$AZURE_CLIENT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_SECRET_B64="$(echo "$AZURE_CLIENT_SECRET" | base64 | tr -d '\n')"

clusterctl init --infrastructure azure

cat <<EOF > azureidentity.yml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureClusterIdentity
metadata:
  name: cluster-identity
spec:
  type: ServicePrincipal
  tenantID: "$AZURE_TENANT_ID"
  clientID: "$AZURE_CLIENT_ID"
  clientSecret:
    name: cluster-identity-secret
    namespace: default
  allowedNamespaces:
    list:
    - default
---
apiVersion: v1
kind: Secret
metadata:
  name: cluster-identity-secret
type: Opaque
data:
  clientSecret: $AZURE_CLIENT_SECRET_B64
EOF

kubectl apply -f azureidentity.yml
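The script sources ./cred.sh, which isn't included in the article; a minimal sketch, assuming it simply exports the values from the Service Principal created earlier (the file name and contents here are illustrative):

$ cat ./cred.sh
#!/bin/bash

# Hypothetical cred.sh: map the `az ad sp create-for-rbac` output to the
# variables init.sh expects (appId -> client ID, password -> client secret).
export AZURE_SUBSCRIPTION_ID="<your-subscription-id>"
export AZURE_TENANT_ID="<tenant from the sp output>"
export AZURE_CLIENT_ID="<appId from the sp output>"
export AZURE_CLIENT_SECRET="<password from the sp output>"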

Running the shell script initializes the management plane on your current Kubernetes cluster. I'm using my own AKS cluster as the management cluster.

$ ./init.sh
Fetching providers
Skipping installing cert-manager as it is already installed
Installing Provider="cluster-api" Version="v1.5.1" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v1.5.1" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v1.5.1" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-azure" Version="v1.10.3" TargetNamespace="capz-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

clusterctl generate cluster [name] --kubernetes-version [version] | kubectl apply -f -

It creates a few namespaces, as shown under "TargetNamespace" in the output.

$ kubectl get ns
NAME                                STATUS   AGE
...
capi-kubeadm-bootstrap-system       Active   15d
capi-kubeadm-control-plane-system   Active   15d
capi-system                         Active   15d
capz-system                         Active   15d
...

There are two prefixes in the namespaces: capi and capz. capi refers to Cluster API, and capz refers to the Cluster API Provider for Azure. CAPI brings Kubernetes-native cluster management, and CAPZ enables provisioning and managing Azure resources. Before we move on, let's check one thing quickly. If you look at what's running inside the capz-system namespace, there are nmi pods running alongside the CAPZ controller, which tells us that CAPZ still relies on AAD Pod Identity. You can verify this by listing the pods in the namespace and describing the capz controller pod, as shown below.
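Listing the pods in the namespace is a quick way to spot them; pod names and counts will differ in your environment:

$ kubectl get pods -n capz-system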

$ kubectl describe po capz-controller-manager-855dc6746c-mr2rt -n capz-system
Name:             capz-controller-manager-855dc6746c-mr2rt
Namespace:        capz-system
Priority:         0
Service Account:  capz-manager
Node:             aks-userpool1-37271884-vmss000000/10.0.1.4
Start Time:       Thu, 14 Sep 2023 00:23:35 +0800
Labels:           aadpodidbinding=capz-controller-aadpodidentity-selector
...

Spinning up Kubernetes on Azure VMs using CAPI

CAPI, as a Kubernetes operator, uses a number of CRDs. You can check what they are with kubectl.

$ kubectl api-resources | grep cluster.x-k8s
clusterresourcesetbindings              addons.cluster.x-k8s.io/v1beta1            true   ClusterResourceSetBinding
clusterresourcesets                     addons.cluster.x-k8s.io/v1beta1            true   ClusterResourceSet
kubeadmconfigs                          bootstrap.cluster.x-k8s.io/v1beta1         true   KubeadmConfig
kubeadmconfigtemplates                  bootstrap.cluster.x-k8s.io/v1beta1         true   KubeadmConfigTemplate
clusterclasses               cc         cluster.x-k8s.io/v1beta1                   true   ClusterClass
clusters                     cl         cluster.x-k8s.io/v1beta1                   true   Cluster
machinedeployments           md         cluster.x-k8s.io/v1beta1                   true   MachineDeployment
machinehealthchecks          mhc,mhcs   cluster.x-k8s.io/v1beta1                   true   MachineHealthCheck
machinepools                 mp         cluster.x-k8s.io/v1beta1                   true   MachinePool
machines                     ma         cluster.x-k8s.io/v1beta1                   true   Machine
machinesets                  ms         cluster.x-k8s.io/v1beta1                   true   MachineSet
providers                               clusterctl.cluster.x-k8s.io/v1alpha3       true   Provider
azureclusteridentities                  infrastructure.cluster.x-k8s.io/v1beta1    true   AzureClusterIdentity
azureclusters                           infrastructure.cluster.x-k8s.io/v1beta1    true   AzureCluster
azureclustertemplates                   infrastructure.cluster.x-k8s.io/v1beta1    true   AzureClusterTemplate
azuremachinepoolmachines     ampm       infrastructure.cluster.x-k8s.io/v1beta1    true   AzureMachinePoolMachine
azuremachinepools            amp        infrastructure.cluster.x-k8s.io/v1beta1    true   AzureMachinePool
...
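Since these custom resources publish an OpenAPI schema, kubectl explain works on them too, which is handy for exploring the spec of any of the types above; for example:

$ kubectl explain clusters --api-version=cluster.x-k8s.io/v1beta1
$ kubectl explain machinedeployments.spec --api-version=cluster.x-k8s.io/v1beta1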

clusterctl has a generate subcommand that produces the entire YAML manifest with the custom resources needed to spin up a Kubernetes cluster. Here is an example of how to use it.

$ cat create_cluster.sh
#!/bin/bash

# Name of the Azure datacenter location.
export AZURE_LOCATION="eastus"

# Select VM types.
export AZURE_CONTROL_PLANE_MACHINE_TYPE="Standard_D4_v3"
export AZURE_NODE_MACHINE_TYPE="Standard_D4_v3"

export AZURE_CLUSTER_IDENTITY_SECRET_NAME="cluster-identity-secret"
export CLUSTER_IDENTITY_NAME=${CLUSTER_IDENTITY_NAME:="cluster-identity"}
export AZURE_CLUSTER_IDENTITY_SECRET_NAMESPACE="default"

clusterctl generate cluster k8s1273 \
--kubernetes-version v1.27.2 \
--control-plane-machine-count=1 \
--worker-machine-count=2 \
--target-namespace=default \
> k8s1273.yaml

I’ve provided the machine types for the control plane and the worker nodes, along with the count for each. The version is set to 1.27.2. This script generates the full YAML manifest with resources including Cluster, AzureCluster, KubeadmControlPlane, AzureMachineTemplate, and more. Creating a cluster is then only a single kubectl apply away.
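Before applying, it's worth a quick peek at what was generated; a simple sketch is to list the kinds in the file, which should roughly match the resources shown in the apply output below:

$ grep -E '^kind:' k8s1273.yaml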

$ kubectl apply -f k8s1273.yaml
cluster.cluster.x-k8s.io/k8s1273 created
azurecluster.infrastructure.cluster.x-k8s.io/k8s1273 created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/k8s1273-control-plane created
azuremachinetemplate.infrastructure.cluster.x-k8s.io/k8s1273-control-plane created
machinedeployment.cluster.x-k8s.io/k8s1273-md-0 created
azuremachinetemplate.infrastructure.cluster.x-k8s.io/k8s1273-md-0 created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/k8s1273-md-0 created
azureclusteridentity.infrastructure.cluster.x-k8s.io/cluster-identity created
$ kubectl get cluster
NAME      PHASE         AGE   VERSION
k8s1273   Provisioned   1h
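While the machines are coming up, clusterctl describe shows the cluster's condition tree, which is handy for watching provisioning progress (output omitted here):

$ clusterctl describe cluster k8s1273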

After the cluster is provisioned, go to the Azure portal and check the newly created resource group.

CAPZ created the necessary resources on Azure

The workload cluster stood up by CAPZ is barebones Kubernetes that lacks basic components like a CNI plugin and a cloud provider, so it is not yet ready to run any pods. We will add a CNI and the cloud provider to make it usable. Let's get the kubeconfig first. clusterctl provides a subcommand that generates it for us so we can access the cluster. I've created a simple script that fetches the kubeconfig and merges it with the existing one.

$ cat merge_kubeconfig.sh
#!/bin/bash

clusterctl get kubeconfig $1 > config
export KUBECONFIG=~/.kube/config:./config
kubectl config view --flatten > kubeconfig.yaml
mv kubeconfig.yaml ~/.kube/config
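Usage is a single argument, the cluster name. The kubeconfig generated by CAPI typically names its context <cluster>-admin@<cluster>, but check kubectl config get-contexts if yours differs:

$ ./merge_kubeconfig.sh k8s1273
$ kubectl config use-context k8s1273-admin@k8s1273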

Since no CNI is installed yet, all the nodes are in NotReady status.

$ kubectl get nodes
NAME                          STATUS     ROLES           AGE   VERSION
k8s1273-control-plane-xwjhc   NotReady   control-plane   8h    v1.27.2
k8s1273-md-0-p95vg            NotReady   <none>          8h    v1.27.2
k8s1273-md-0-vjntw            NotReady   <none>          8h    v1.27.2

I will use Calico in this article, but there are many other choices such as Flannel, Cilium, and even Azure CNI.

$ helm repo add projectcalico https://docs.tigera.io/calico/charts
$ helm install calico projectcalico/tigera-operator --version v3.26.1 -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-azure/main/templates/addons/calico/values.yaml --set-string "installation.calicoNetwork.ipPools[0].cidr=192.168.0.0/16" --namespace tigera-operator --create-namespace
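The Tigera operator rolls Calico out into its own namespace; a quick way to watch it come up (namespace name per the default operator install):

$ kubectl get pods -n calico-system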

Check the node status again to confirm that the nodes become Ready.

$ kubectl get nodes
NAME                          STATUS   ROLES           AGE   VERSION
k8s1273-control-plane-xwjhc   Ready    control-plane   8h    v1.27.2
k8s1273-md-0-p95vg            Ready    <none>          8h    v1.27.2
k8s1273-md-0-vjntw            Ready    <none>          8h    v1.27.2

Installing the external (out-of-tree) cloud provider is also easy using a Helm chart.

$ helm install --repo https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo cloud-provider-azure --generate-name --set infra.clusterName=k8s1273 --set "cloudControllerManager.clusterCIDR=192.168.0.0/16"
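The chart deploys the cloud-controller-manager and cloud-node-manager components into kube-system; a quick sketch of how to confirm they are running:

$ kubectl get pods -n kube-system | grep cloud-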

Once the cloud provider components are fully up and running, the nodes are fully initialized. You can remove this cluster at any time by simply running kubectl delete -f k8s1273.yaml, and this is the main reason I like CAPI over other options: I can manage Kubernetes clusters with an operator, declaratively.

Enabling an Alpha Feature with CAPI

Kubernetes has a mechanism called feature gates, which toggles the various experimental alpha/beta features of Kubernetes. The full list of feature gates and their alpha/beta status can be found in the Kubernetes documentation.

It is a massive list of features, but I'm going to enable ValidatingAdmissionPolicy, which I'm particularly interested in. Here is the introduction to Validating Admission Policy from the Kubernetes documentation.

Validating admission policies offer a declarative, in-process alternative to validating admission webhooks.

Validating admission policies use the Common Expression Language (CEL) to declare the validation rules of a policy. Validating admission policies are highly configurable, enabling policy authors to define policies that can be parameterized and scoped to resources as needed by cluster administrators.

A sample policy will give you a clear idea of how this can be helpful for production users. The example below mandates at least three replicas for production Deployments, which helps ensure high availability of pods.

$ cat admissionpolicy.yml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: "force-ha-in-prod"
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
  - expression: "object.spec.replicas >= 3"
    message: "All production deployments should be HA with at least three replicas"
---
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: "force-ha-in-prod-binding"
spec:
  policyName: "force-ha-in-prod"
  validationActions: [Deny]
  matchResources:
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: default

According to the official documentation, two things need to be done to turn this feature on.

  1. Ensure the ValidatingAdmissionPolicy feature gate is enabled.
  2. Ensure that the admissionregistration.k8s.io/v1alpha1 API is enabled.

You can open the previous workload cluster template k8s1273.yaml and add the feature gates and runtime configuration to the KubeadmControlPlane to enable it.

apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: k8s1273-control-plane
  namespace: default
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-provider: external
          feature-gates: "ContainerCheckpoint=true,ValidatingAdmissionPolicy=true"
          runtime-config: "admissionregistration.k8s.io/v1alpha1=true"
        timeoutForControlPlane: 20m
      controllerManager:
        extraArgs:
          allocate-node-cidrs: "false"
          cloud-provider: external
          cluster-name: k8s1273
          feature-gates: "ContainerCheckpoint=true"
      etcd:
        local:
          dataDir: /var/lib/etcddisk/etcd
          extraArgs:
            quota-backend-bytes: "8589934592"
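After the control plane picks up these settings (either re-create the cluster or let CAPI roll the control plane machines when the KubeadmControlPlane spec changes), you can confirm the alpha API group is being served before applying the policy; you should see admissionregistration.k8s.io/v1alpha1 listed alongside v1:

$ kubectl api-versions | grep admissionregistration.k8s.io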

Once the cluster is up and the post-installation steps are done, apply the ValidatingAdmissionPolicy and ValidatingAdmissionPolicyBinding, then create a Deployment with fewer than three replicas. The request will be denied with the error message we set in the policy.
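The whereami.yml manifest isn't included in the article; a minimal sketch of a Deployment that would violate the policy might look like this (the name, image, labels, and port are illustrative):

$ cat whereami.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: whereami
spec:
  replicas: 1               # fewer than three, so the policy should deny it
  selector:
    matchLabels:
      app: whereami
  template:
    metadata:
      labels:
        app: whereami
    spec:
      containers:
      - name: whereami
        image: nginx        # placeholder image
        ports:
        - containerPort: 80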

$ kubectl apply -f whereami.yml
The deployments "whereami" is invalid: : ValidatingAdmissionPolicy 'force-ha-in-prod' with binding 'force-ha-in-prod-binding' denied request: All production deployments should be HA with at least three replicas

Wrapping Up

I have personally been using Cluster API quite often for tests and demos, and I can confidently recommend it if you need to provision vanilla Kubernetes on Azure in cases where AKS isn't an option. The lack of federated identity support is the only caveat at the moment, but work is already in progress, so it should land soon.

If you liked my article, please leave a few claps or start following me. You can get notified whenever I publish something new. Let's stay connected on LinkedIn, too! Thank you so much for reading!
