Cilium: Installing Cilium in AKS (BYOCNI) with no Kube-Proxy

Amit Gupta
9 min read · Aug 3, 2023

Source: docs.cilium.io

☸️ Introduction

kube-proxy is the Kubernetes component that routes traffic for Services within the cluster. Upstream kube-proxy offers two backends for Layer 3/4 load balancing: iptables and IPVS.

Why replace kube-proxy?

iptables and Netfilter are the two foundational technologies kube-proxy uses to implement the Service abstraction. They carry more than 20 years of legacy from traditional networking environments that are far more static than a typical Kubernetes cluster. In cloud-native environments they are no longer the best tool for the job, especially in terms of performance, reliability, scalability, and operations.

Cilium to the rescue

Cilium’s kube-proxy replacement offers advanced configuration modes for different needs. Client source IP preservation keeps the original client address visible to backends, while Maglev consistent hashing improves load balancing and resiliency when backends change. With support for Direct Server Return (DSR) and hybrid DSR/SNAT modes, you can also shorten the return path for traffic and improve performance.

🎯 Goals & Objectives

In this article you will learn how to disable the AKS-managed kube-proxy DaemonSet entirely on a BYOCNI (Bring Your Own CNI) cluster and install Cilium as the CNI with kube-proxy replacement enabled.

Pre-Requisites

  • Install the aks-preview Azure CLI extension, then update it to the latest version
az extension add --name aks-preview
The installed extension 'aks-preview' is in preview.

az extension update --name aks-preview
Latest version of 'aks-preview' is already installed.
Use --debug for more information
  • Ensure you have enough quota resources to create an AKS cluster. Go to the Subscription blade, navigate to “Usage + Quotas”, and make sure you have enough quota for the following resources:
    -Regional vCPUs
    -Standard Dv4 Family vCPUs
  • Register the ‘KubeProxyConfigurationPreview’ feature flag
az feature register --namespace "Microsoft.ContainerService" --name "KubeProxyConfigurationPreview"
Once the feature 'KubeProxyConfigurationPreview' is registered, invoking 'az provider register -n Microsoft.ContainerService' is required to get the change propagated
{
"id": "/subscriptions/8dbd2563-77eb-41a1-917b-5a1344da9767/providers/Microsoft.Features/providers/Microsoft.ContainerService/features/KubeProxyConfigurationPreview",
"name": "Microsoft.ContainerService/KubeProxyConfigurationPreview",
"properties": {
"state": "Registering"
},
"type": "Microsoft.Features/providers/features"
}

  • Check the registration status and repeat until the state changes from Registering to Registered
az feature show --namespace "Microsoft.ContainerService" --name "KubeProxyConfigurationPreview"
{
"id": "/subscriptions/8dbd2563-77eb-41a1-917b-5a1344da9767/providers/Microsoft.Features/providers/Microsoft.ContainerService/features/KubeProxyConfigurationPreview",
"name": "Microsoft.ContainerService/KubeProxyConfigurationPreview",
"properties": {
"state": "Registering"
},
"type": "Microsoft.Features/providers/features"
}

az feature show --namespace "Microsoft.ContainerService" --name "KubeProxyConfigurationPreview"
{
"id": "/subscriptions/8dbd2563-77eb-41a1-917b-5a1344da9767/providers/Microsoft.Features/providers/Microsoft.ContainerService/features/KubeProxyConfigurationPreview",
"name": "Microsoft.ContainerService/KubeProxyConfigurationPreview",
"properties": {
"state": "Registered"
},
"type": "Microsoft.Features/providers/features"
}

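  • Once the state is Registered, refresh the registration of the Microsoft.ContainerService resource provider
az provider register --namespace Microsoft.ContainerService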
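The check-and-repeat step above can be scripted instead of rerun by hand. The following is a minimal sketch, not an official Azure workflow: wait_for_feature, its attempt count, and its delay are hypothetical names and defaults chosen for illustration. It relies only on az feature show with the global --query and -o tsv options to reduce the JSON output to the bare state string.

```shell
# Sketch: poll a preview feature flag until it reports "Registered".
# wait_for_feature, max_attempts and delay are hypothetical, not Azure CLI features.
wait_for_feature() {
  local feature="$1"
  local max_attempts="${2:-30}"
  local delay="${3:-20}"
  local attempt=1
  local state=""
  while [ "$attempt" -le "$max_attempts" ]; do
    # --query / -o tsv turn the JSON document into just the state value.
    state="$(az feature show \
      --namespace "Microsoft.ContainerService" \
      --name "$feature" \
      --query "properties.state" -o tsv)"
    if [ "$state" = "Registered" ]; then
      echo "Feature $feature is registered"
      return 0
    fi
    echo "Attempt $attempt/$max_attempts: state is '$state', retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "Feature $feature did not reach Registered in time" >&2
  return 1
}
```

A possible use is to chain it with the provider refresh: wait_for_feature KubeProxyConfigurationPreview && az provider register --namespace Microsoft.ContainerService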
  • Configure kube-proxy in a new or existing AKS cluster using the Azure CLI
    -The kube-proxy configuration is a cluster-wide setting. No action is needed to update your services.
    -To begin, create a JSON configuration file (named kube-proxy.json in the commands that follow) with the desired settings:
    -Note: the enabled field (true or false) controls whether the kube-proxy DaemonSet is deployed. It defaults to true.
{
  "enabled": false,
  "mode": "IPVS",
  "ipvsConfig": {
    "scheduler": "LeastConnection",
    "TCPTimeoutSeconds": 900,
    "TCPFINTimeoutSeconds": 120,
    "UDPTimeoutSeconds": 300
  }
}
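Because enabled defaults to true, a file that omits the field or leaves it at true would keep kube-proxy running. A small pre-flight check before handing the file to az can catch that. This is a sketch under stated assumptions: kube_proxy_disabled_in is a hypothetical helper name, and the grep pattern is a heuristic for this small file rather than a real JSON parser.

```shell
# Sketch: pre-flight check for the kube-proxy configuration file.
# kube_proxy_disabled_in is a hypothetical helper, not part of any CLI.
kube_proxy_disabled_in() {
  local file="$1"
  if [ ! -f "$file" ]; then
    echo "Config file $file not found" >&2
    return 2
  fi
  # Matches "enabled": false with any whitespace around the colon.
  if grep -Eq '"enabled"[[:space:]]*:[[:space:]]*false' "$file"; then
    echo "kube-proxy is disabled in $file"
    return 0
  fi
  echo "kube-proxy is NOT disabled in $file" >&2
  return 1
}
```

For example, kube_proxy_disabled_in kube-proxy.json returns 0 for the file shown above, 1 if enabled is true or missing, and 2 if the file does not exist.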

Let’s get going

Option 1: Create a new cluster and disable kube-proxy at cluster creation time.

  • Create a Resource Group
az group create -l eastus -n myResourceGroup
  • Create an AKS cluster with network-plugin set to none.
az aks create -l eastus -g myResourceGroup -n myAKSCluster --kube-proxy-config kube-proxy.json --network-plugin none

Set the Subscription

If you have multiple Azure subscriptions, choose the subscription you want to use.

  • Replace SubscriptionName with your subscription name.
  • You can also use your subscription ID instead of your subscription name.
az account set --subscription SubscriptionName

Set the Kubernetes Context

Log in to the Azure portal, browse to Kubernetes services, select the AKS cluster you created, and click Connect. The portal shows the command that fetches the cluster credentials and sets the Kubernetes context:

az aks get-credentials --resource-group <resource-group-name> --name <cluster-name>

Install Cilium

  • You can get the value for API_SERVER_IP by logging in to the Azure portal and navigating to "Home" > "Kubernetes services" > select the cluster > "API server address". Set API_SERVER_PORT to 443, the default port Azure uses to expose the Kubernetes API of AKS clusters.
  • The values for k8sServiceHost and k8sServicePort can also be read from the control plane URL printed by
kubectl cluster-info
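Rather than copying the values by hand, the API server URL that kubectl already knows can be split into host and port. The following is a minimal sketch: parse_api_server is a hypothetical helper name, it assumes a URL of the form https://host:port, and it does not handle bracketed IPv6 addresses. Despite the variable name, the extracted value is usually a hostname on AKS rather than an IP address.

```shell
# Sketch: split an API server URL such as https://host:443 into host and port.
# parse_api_server is a hypothetical helper; IPv6 literals are not handled.
parse_api_server() {
  local url="$1"
  local hostport
  hostport="${url#*://}"       # drop the scheme (https://)
  hostport="${hostport%%/*}"   # drop any trailing path
  API_SERVER_IP="${hostport%%:*}"
  case "$hostport" in
    *:*) API_SERVER_PORT="${hostport##*:}" ;;
    *)   API_SERVER_PORT=443 ;;  # no explicit port: assume the HTTPS default
  esac
  export API_SERVER_IP API_SERVER_PORT
}
```

One way to feed it is from the current kubeconfig context: parse_api_server "$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')"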
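  • Set up the Helm repository
helm repo add cilium https://helm.cilium.io/
  • Export the API server address and port, then install Cilium with kube-proxy replacement enabled
export API_SERVER_IP=<value obtained above>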
export API_SERVER_PORT=<value obtained above>

helm install cilium cilium/cilium --version 1.14.0 \
--namespace kube-system \
--set kubeProxyReplacement=true \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT} \
--set aksbyocni.enabled=true \
--set nodeinit.enabled=true
  • As the pod list shows, no kube-proxy pods are running and Cilium handles Service load balancing on its own.
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cilium-4ff9g 1/1 Running 0 13m
kube-system cilium-lzgv9 1/1 Running 0 13m
kube-system cilium-node-init-qf775 1/1 Running 0 13m
kube-system cilium-node-init-qltlq 1/1 Running 1 (12m ago) 13m
kube-system cilium-node-init-tvrrq 1/1 Running 0 13m
kube-system cilium-operator-6465f46b7d-9g5qd 1/1 Running 0 13m
kube-system cilium-operator-6465f46b7d-sptq6 1/1 Running 0 13m
kube-system cilium-qcvph 1/1 Running 0 13m
kube-system cloud-node-manager-4hj8w 1/1 Running 1 (12m ago) 44m
kube-system cloud-node-manager-d6v97 1/1 Running 0 44m
kube-system cloud-node-manager-x5jlk 1/1 Running 1 (15m ago) 44m
kube-system coredns-76b9877f49-lwr7r 1/1 Running 0 45m
kube-system coredns-76b9877f49-wfzl8 1/1 Running 0 13m
kube-system coredns-autoscaler-85f7d6b75d-97hkk 1/1 Running 0 45m
kube-system csi-azuredisk-node-dqhpz 3/3 Running 3 (15m ago) 44m
kube-system csi-azuredisk-node-dwx6z 3/3 Running 3 (12m ago) 44m
kube-system csi-azuredisk-node-vstnr 3/3 Running 0 44m
kube-system csi-azurefile-node-jk59s 3/3 Running 3 (12m ago) 44m
kube-system csi-azurefile-node-l6h68 3/3 Running 0 44m
kube-system csi-azurefile-node-tphpx 3/3 Running 3 (15m ago) 44m
kube-system konnectivity-agent-86df75879f-rkr84 1/1 Running 1 (15m ago) 45m
kube-system konnectivity-agent-86df75879f-zl8q4 1/1 Running 0 45m
kube-system metrics-server-5654598dc8-b9gpd 2/2 Running 0 10m
kube-system metrics-server-5654598dc8-gssrk 2/2 Running 0 10m

Option 2: Update an existing cluster that is running with kube-proxy

  • Start from an existing AKS cluster in BYOCNI mode where both kube-proxy and Cilium are running.
kube-system kube-proxy-ks5hb 1/1 Running 0
kube-system kube-proxy-nbchx 1/1 Running 0
kube-system kube-proxy-vlw6g 1/1 Running 0
kubectl get ds -A
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR
kube-system azure-ip-masq-agent 3 3 3 3 3 <none>
kube-system cilium 3 3 3 3 3 kubernetes.io/os=linux
kube-system cilium-node-init 3 3 3 3 3 kubernetes.io/os=linux
kube-system cloud-node-manager 3 3 3 3 3 <none>
kube-system cloud-node-manager-windows 0 0 0 0 0 <none>
kube-system csi-azuredisk-node 3 3 3 3 3 <none>
kube-system csi-azuredisk-node-win 0 0 0 0 0 <none>
kube-system csi-azurefile-node 3 3 3 3 3 <none>
kube-system csi-azurefile-node-win 0 0 0 0 0 <none>
kube-system kube-proxy 3 3 3 3
  • Update the AKS cluster with the kube-proxy configuration file
az aks update -g myResourceGroup -n myAKSCluster --kube-proxy-config kube-proxy.json
  • The kube-proxy DaemonSet is now completely removed from the cluster
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cilium-6xthm 1/1 Running 0 7m1s
kube-system cilium-9twhw 1/1 Running 0 7m1s
kube-system cilium-node-init-7jsl9 1/1 Running 0 7m2s
kube-system cilium-node-init-d6qjr 1/1 Running 0 7m2s
kube-system cilium-node-init-tqpsq 1/1 Running 0 7m2s
kube-system cilium-operator-fdc5f8984-2r2nl 1/1 Running 0 7m1s
kube-system cilium-operator-fdc5f8984-n94r8 1/1 Running 0 7m1s
kube-system cilium-qhsbf 1/1 Running 0 7m2s
kube-system cloud-node-manager-mn95x 1/1 Running 0 20m
kube-system cloud-node-manager-tddwv 1/1 Running 0 20m
kube-system cloud-node-manager-wbl9h 1/1 Running 0 20m
kube-system coredns-autoscaler-69b7556b86-rmm9s 1/1 Running 0 5m59s
kube-system coredns-fb6b9d95f-j9ct7 1/1 Running 0 5m56s
kube-system coredns-fb6b9d95f-p24rs 1/1 Running 0 5m52s
kube-system csi-azuredisk-node-47wjg 3/3 Running 0 20m
kube-system csi-azuredisk-node-4zzg9 3/3 Running 0 20m
kube-system csi-azuredisk-node-f2vpc 3/3 Running 0 20m
kube-system csi-azurefile-node-blgpf 3/3 Running 0 20m
kube-system csi-azurefile-node-fmttr 3/3 Running 0 20m
kube-system csi-azurefile-node-xjpvf 3/3 Running 0 20m
kube-system konnectivity-agent-68cbb96955-g8t4w 1/1 Running 0 20m
kube-system konnectivity-agent-68cbb96955-mrmr4 1/1 Running 0 20m
kube-system metrics-server-5dd7f7965f-d76c2 2/2 Running 0 5m47s
kube-system metrics-server-5dd7f7965f-jk7z5 2/2 Running 0 5m43s
kubectl get ds -A
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR
kube-system cilium 3 3 3 3 3 kubernetes.io/os=linux
kube-system cilium-node-init 3 3 3 3 3 kubernetes.io/os=linux
kube-system cloud-node-manager 3 3 3 3 3 <none>
kube-system cloud-node-manager-windows 0 0 0 0 0 <none>
kube-system csi-azuredisk-node 3 3 3 3 3 <none>
kube-system csi-azuredisk-node-win 0 0 0 0 0 <none>
kube-system csi-azurefile-node 3 3 3 3 3 <none>
kube-system csi-azurefile-node-win 0 0 0 0 0 <none>
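  • Upgrade Cilium to run with kube-proxy replacement (assuming that the AKS cluster was created with BYOCNI)
# --reuse-values keeps the settings from the existing release
# (assumption: Cilium was installed with the flags shown in Option 1, such as aksbyocni.enabled=true)
helm upgrade cilium cilium/cilium --version 1.14.0 \
--namespace kube-system \
--reuse-values \
--set kubeProxyReplacement=true \
--set k8sServiceHost=${API_SERVER_IP} \
--set k8sServicePort=${API_SERVER_PORT}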

How can we ensure that kube-proxy is not reinstalled after a Kubernetes version upgrade?

  • Optionally, you can validate that kube-proxy is not reinstalled as an add-on by a subsequent Kubernetes upgrade.
  • In the example below, the AKS cluster is upgraded from Kubernetes 1.27 to 1.28, and kube-proxy is not re-enabled as an add-on.
az aks upgrade --resource-group myResourceGroup --name myAKSCluster --kubernetes-version 1.28
Kube-Proxy is still disabled post upgrade
  • Validate that the Cilium agent is running in the desired mode
kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement

Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init), install-cni-binaries (init)
KubeProxyReplacement: True [eth0 10.224.0.4 fe80::6245:bdff:feda:430f (Direct Routing)]
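The grep above prints the status line but always leaves the interpretation to the reader. For use in a script, the check can be turned into an exit code. This is a minimal sketch that assumes the status line format shown in the output above; kpr_is_enabled is a hypothetical helper name that reads cilium status output on stdin.

```shell
# Sketch: succeed only if `cilium status` output on stdin reports
# KubeProxyReplacement as True. kpr_is_enabled is a hypothetical helper.
kpr_is_enabled() {
  grep -Eq '^KubeProxyReplacement:[[:space:]]+True([[:space:]]|$)'
}
```

It can be placed at the end of the same pipeline: kubectl -n kube-system exec ds/cilium -- cilium status | kpr_is_enabled && echo "kube-proxy replacement is active"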
  • Validate that kube-proxy is not present as a DaemonSet after the upgrade.
kubectl get ds -A

NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system cilium 3 3 3 3 3 kubernetes.io/os=linux 28m
kube-system cilium-node-init 3 3 3 3 3 kubernetes.io/os=linux 28m
kube-system cloud-node-manager 3 3 3 3 3 <none> 90m
kube-system cloud-node-manager-windows 0 0 0 0 0 <none> 90m
kube-system csi-azuredisk-node 3 3 3 3 3 <none> 90m
kube-system csi-azuredisk-node-win 0 0 0 0 0 <none> 90m
kube-system csi-azurefile-node 3 3 3 3 3 <none> 90m
kube-system csi-azurefile-node-win 0 0 0 0 0 <none> 90m

  • List the ConfigMaps as well; none of them belongs to kube-proxy.
kubectl get cm -A

NAMESPACE NAME DATA AGE
default kube-root-ca.crt 1 89m
kube-node-lease kube-root-ca.crt 1 89m
kube-public kube-root-ca.crt 1 89m
kube-system cilium-config 113 27m
kube-system coredns 1 89m
kube-system coredns-autoscaler 1 26m
kube-system coredns-custom 0 89m
kube-system extension-apiserver-authentication 6 89m
kube-system kube-apiserver-legacy-service-account-token-tracking 1 89m
kube-system kube-root-ca.crt 1 89m
kube-system overlay-upgrade-data 4 89m
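The two listings above can be condensed into a single scripted check that is easy to rerun after every cluster upgrade. The following is a sketch, not an AKS-provided tool: kube_proxy_absent is a hypothetical helper name, and it assumes that any leftover resource would be named exactly kube-proxy in the kube-system namespace.

```shell
# Sketch: fail if a DaemonSet or ConfigMap named kube-proxy exists in kube-system.
# kube_proxy_absent is a hypothetical helper, not part of kubectl or az.
kube_proxy_absent() {
  local resources
  local found
  # -o name prints one resource per line, e.g. daemonset.apps/cilium
  resources="$(kubectl get daemonsets,configmaps -n kube-system -o name)" || return 2
  found="$(printf '%s\n' "$resources" | grep -E '/kube-proxy$' || true)"
  if [ -n "$found" ]; then
    echo "kube-proxy resources still present:" >&2
    echo "$found" >&2
    return 1
  fi
  echo "No kube-proxy DaemonSet or ConfigMap found in kube-system"
}
```

The function returns 0 when nothing matches, 1 when a kube-proxy resource is found, and 2 when kubectl itself fails, so it does not report success on an unreachable cluster.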

Try out Cilium

  • Try out Cilium to get first-hand experience of how it solves real networking, security, and observability problems in your cloud-native or on-prem environments.

🌟 Conclusion 🌟

Hopefully, this post gave you a good overview of how to install Cilium on AKS in BYOCNI mode with no kube-proxy. Thank you for reading! 🙌🏻😁📃 See you in the next blog.

🚀 Feel free to connect with me or follow me on:

LinkedIn: linkedin.com/in/agamitgupta
