5-Step Approach: Projectsveltos for Kubernetes Cluster Autoscaler Deployment on Hybrid-Cloud
Introduction
In the previous post, we talked about Projectsveltos and how it can be installed and used to deploy Kubernetes add-ons to different environments, whether they run on-prem or in the cloud. The example provided was based on a Rancher Kubernetes Engine Government (RKE2) infrastructure in an on-prem environment.
Today, we will take a slightly different approach and work with a hybrid-cloud setup. cluster04 will act as our Sveltos management cluster, and the Kubernetes Cluster Autoscaler will be deployed to a Linode Kubernetes Engine (LKE) cluster with the assistance of Sveltos.
Diagram
Prerequisites
For this demonstration, I have already installed ArgoCD, deployed Sveltos to cluster04 and created an LKE cluster. For the first two, follow step 1 and step 2 from my previous post. If you want to learn how to create an LKE cluster, check out the link.
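If you prefer working against the Linode API instead of the Cloud Manager UI, the cluster can also be created with a single request. The sketch below is only an illustration; the region and Kubernetes version are assumptions, so adjust them to your needs.
$ curl -H "Authorization: Bearer $TOKEN" \
       -H "Content-Type: application/json" \
       -X POST -d '{
         "label": "linode-autoscaler",
         "region": "us-east",
         "k8s_version": "1.27",
         "node_pools": [
           {"type": "g6-standard-1", "count": 1},
           {"type": "g6-standard-2", "count": 1}
         ]
       }' \
       https://api.linode.com/v4/lke/clusters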
Note: While creating the LKE cluster, two node pools were created: g6-standard-1 and g6-standard-2.
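To confirm the node pools of an existing cluster, the Linode API can be queried directly ($TOKEN is your API token and $CLUSTER_ID is the LKE cluster ID):
$ curl -s -H "Authorization: Bearer $TOKEN" https://api.linode.com/v4/lke/clusters/$CLUSTER_ID/pools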
Lab Setup
+-------------------+---------------------+----------------------+
| Cluster Name      | Type                | Version              |
+-------------------+---------------------+----------------------+
| cluster04         | Management Cluster  | RKE2 v1.26.11+rke2r1 |
| linode-autoscaler | Managed k8s Cluster | k8s v1.27.8          |
+-------------------+---------------------+----------------------+
Step 1: Register the LKE Cluster with Sveltos
Once the LKE cluster is in a “Running” state, we will use the sveltosctl to register it. For the registration, we need three things: a service account, a kubeconfig associated with that account and a namespace. If you are unsure how to create a Service Account and an associated kubeconfig, there is a script publicly available to help you out.
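For reference, the manual steps behind such a script look roughly like the following sketch. The service account name, namespace and token duration are arbitrary examples, and the commands are run against the LKE cluster.
$ kubectl create serviceaccount sveltos -n default
$ kubectl create clusterrolebinding sveltos-admin --clusterrole=cluster-admin --serviceaccount=default:sveltos
$ kubectl create token sveltos -n default --duration=8760h
The returned token, together with the cluster endpoint and CA certificate, is what goes into the linode-autoscaler.yaml kubeconfig used during registration.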
Registration
$ sveltosctl register cluster --namespace=projectsveltos --cluster=linode-autoscaler --kubeconfig=linode-autoscaler.yaml
Verification
$ kubectl get sveltoscluster -n projectsveltos
NAME                READY   VERSION
cluster01           true    v1.26.6+rke2r1
linode-autoscaler   true    v1.27.8
Step 2: Cluster Labelling
To deploy and manage Kubernetes add-ons with the help of Sveltos, the concepts of the ClusterProfile and cluster labelling come into play. ClusterProfile is the CustomResourceDefinition used to instruct Sveltos which add-ons to deploy on a set of clusters.
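As a quick illustration of the idea, a minimal skeleton is shown below. The label in the selector is just a placeholder; the full profiles used in this post follow in Step 3 and Step 4.
apiVersion: config.projectsveltos.io/v1alpha1
kind: ClusterProfile
metadata:
  name: example-profile
spec:
  clusterSelector: env=example   # matches every registered cluster carrying this label
  helmCharts: []                 # Helm charts Sveltos should install on matching clusters
  policyRefs: []                 # ConfigMaps/Secrets in the management cluster holding raw YAML to deploy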
For this demonstration, we will set the unique label “env=autoscaler” as we want to differentiate this cluster from the other already existing cluster01.
$ kubectl get sveltoscluster -n projectsveltos --show-labels
NAME                READY   VERSION          LABELS
cluster01           true    v1.26.6+rke2r1   env=test,sveltos-agent=present
linode-autoscaler   true    v1.27.8          sveltos-agent=present
Add labels
$ kubectl label sveltosclusters linode-autoscaler env=autoscaler -n projectsveltos
Verification
$ kubectl get sveltoscluster -n projectsveltos --show-labels
NAME                READY   VERSION          LABELS
cluster01           true    v1.26.6+rke2r1   env=test,sveltos-agent=present
linode-autoscaler   true    v1.27.8          env=autoscaler,sveltos-agent=present
Step 3: Kubernetes Cluster Autoscaler ClusterProfile
The Cluster Autoscaler will be deployed with its Helm chart. Apart from that, we need a Kubernetes Secret to store the cloud-config for the Linode environment.
The whole deployment process is orchestrated and managed by Sveltos. The Secret containing the required information, alongside the Helm chart, will be deployed via a Sveltos ClusterProfile.
Note: The configuration is done on cluster04, as it is our Sveltos management cluster.
Secret
---
apiVersion: v1
kind: Secret
metadata:
  name: cluster-autoscaler-cloud-config
  namespace: autoscaler
type: Opaque
stringData:
  cloud-config: |-
    [global]
    linode-token={Your own Linode Token}
    lke-cluster-id=147978
    defaut-min-size-per-linode-type=1
    defaut-max-size-per-linode-type=2
    [nodegroup "g6-standard-1"]
    min-size=1
    max-size=2
    [nodegroup "g6-standard-2"]
    min-size=1
    max-size=2
- “linode-token”: Replace this with your own token. To generate a Linode token, check the link here.
- “lke-cluster-id”: You can get the ID of the LKE cluster with the following curl request: curl -H “Authorization: Bearer $TOKEN” https://api.linode.com/v4/lke/clusters (a filtered variant is shown after this list).
- “nodegroup”: As mentioned above, the existing LKE cluster has two node pools. The names of the node pools are used as the nodegroup names in the Secret above.
Note: The min and max numbers of the nodegroups can be customised based on your needs. As this is a demo environment, I wanted to keep the cost of the deployment as low as possible.
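If you only want the IDs and labels out of that API response, jq can trim it down (assuming jq is installed):
$ curl -s -H "Authorization: Bearer $TOKEN" https://api.linode.com/v4/lke/clusters | jq '.data[] | {id, label}'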
Create Secret (cluster04)
Before we can reference the Secret in a Sveltos ClusterProfile, we first need to save the manifest above as secret.yaml and create a Secret from it on the management cluster with the type “addons.projectsveltos.io/cluster-profile”.
$ kubectl create secret generic cluster-autoscaler-cloud-config --from-file=secret.yaml --type=addons.projectsveltos.io/cluster-profile
$ kubectl get secret
NAME                              TYPE                                        DATA   AGE
cluster-autoscaler-cloud-config   addons.projectsveltos.io/cluster-profile   1      2m25s
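If you want to see exactly what Sveltos will ship to the managed cluster, the embedded manifest can be read back out of the Secret (the data key is the file name used above, secret.yaml):
$ kubectl get secret cluster-autoscaler-cloud-config -o jsonpath='{.data.secret\.yaml}' | base64 -d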
Create ClusterProfile
---
apiVersion: config.projectsveltos.io/v1alpha1
kind: ClusterProfile
metadata:
  name: cluster-autoscaler
spec:
  clusterSelector: env=autoscaler
  syncMode: Continuous
  policyRefs:
  - deploymentType: Remote
    kind: Secret
    name: cluster-autoscaler-cloud-config
    namespace: default
  helmCharts:
  - chartName: autoscaler/cluster-autoscaler
    chartVersion: v9.34.1
    helmChartAction: Install
    releaseName: autoscaler-latest
    releaseNamespace: autoscaler
    repositoryName: autoscaler
    repositoryURL: https://kubernetes.github.io/autoscaler
    values: |
      autoDiscovery:
        clusterName: linode-autoscaler
      cloudProvider: linode
      extraVolumeSecrets:
        cluster-autoscaler-cloud-config:
          mountPath: /config
          name: cluster-autoscaler-cloud-config
      extraArgs:
        logtostderr: true
        stderrthreshold: info
        v: 2
        cloud-config: /config/cloud-config
      image:
        pullPolicy: IfNotPresent
        pullSecrets: []
        repository: registry.k8s.io/autoscaling/cluster-autoscaler
        tag: v1.28.2
- “policyRefs”: Creates the Secret named “cluster-autoscaler-cloud-config” in the “autoscaler” namespace of every cluster carrying the Sveltos label “env=autoscaler”. Note: the ‘namespace: default’ refers to the management cluster; it tells cluster04 where to look for the Secret, which in this case is the default namespace.
- “helmCharts”: Installs the latest Cluster Autoscaler Helm chart.
- “values”: Overrides the default Helm values with the ones required by the Linode cloud provider. More details can be found here.
Apply ClusterProfile (cluster04)
$ kubectl apply -f "cluster-autoscaler.yaml"
Verification
$ sveltosctl show addons
+----------------------------------+---------------+------------+---------------------------------+---------+-------------------------------+--------------------+
| CLUSTER                          | RESOURCE TYPE | NAMESPACE  | NAME                            | VERSION | TIME                          | CLUSTER PROFILES   |
+----------------------------------+---------------+------------+---------------------------------+---------+-------------------------------+--------------------+
| projectsveltos/linode-autoscaler | helm chart    | autoscaler | autoscaler-latest               | 9.34.1  | 2024-01-05 12:15:54 +0000 UTC | cluster-autoscaler |
| projectsveltos/linode-autoscaler | :Secret       | autoscaler | cluster-autoscaler-cloud-config | N/A     | 2024-01-05 12:16:03 +0000 UTC | cluster-autoscaler |
+----------------------------------+---------------+------------+---------------------------------+---------+-------------------------------+--------------------+
Verification - linode-autoscaler
$ kubectl get pods -n autoscaler
NAME                                                           READY   STATUS    RESTARTS   AGE
autoscaler-latest-linode-cluster-autoscaler-5fd8bfbb6f-r5zf6   1/1     Running   0          5m25s
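To double-check that the chart was installed with the custom values from the ClusterProfile, you can also query Helm directly on the managed cluster (assuming helm is installed and your kubeconfig points at the LKE cluster):
$ helm get values autoscaler-latest -n autoscaler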
Step 4: Cluster Multi Replica Deployment
We will use Sveltos to create a busybox deployment on the LKE cluster with 500 replicas.
Create Deployment and Configmap Resource (cluster04)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-workload
  namespace: default
  labels:
    app: busybox
spec:
  replicas: 500
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - name: busybox
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'echo Demo Workload ; sleep 600']
$ kubectl create configmap busybox --from-file=busybox.yaml
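As a quick sanity check, you can confirm the ConfigMap holds the manifest (the data key is the file name busybox.yaml):
$ kubectl get configmap busybox -o yaml | head -n 15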
Create ClusterProfile for busybox Deployment (cluster04)
---
apiVersion: config.projectsveltos.io/v1alpha1
kind: ClusterProfile
metadata:
  name: busybox
spec:
  clusterSelector: env=autoscaler
  policyRefs:
  - kind: ConfigMap
    name: busybox
    namespace: default
$ kubectl apply -f "clusterprofile-busybox-autoscaler.yaml"
$ sveltosctl show addons --clusterprofile=busybox
+----------------------------------+-----------------+-----------+------------------+---------+-------------------------------+------------------+
| CLUSTER                          | RESOURCE TYPE   | NAMESPACE | NAME             | VERSION | TIME                          | CLUSTER PROFILES |
+----------------------------------+-----------------+-----------+------------------+---------+-------------------------------+------------------+
| projectsveltos/linode-autoscaler | apps:Deployment | default   | busybox-workload | N/A     | 2024-01-05 12:25:10 +0000 UTC | busybox          |
+----------------------------------+-----------------+-----------+------------------+---------+-------------------------------+------------------+
As the cluster has only two small nodes, it cannot keep up with the load of 500 replicas. The autoscaler will kick in and add nodes from the pools, up to the defined maximums.
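While the scale-up is in progress, it can also be watched from the LKE side (a quick sketch, assuming your kubeconfig points at the LKE cluster): the node list grows and the number of pending busybox Pods shrinks.
$ kubectl get nodes -w
$ kubectl get pods -n default --field-selector=status.phase=Pending --no-headers | wc -l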
Verification
$ kubectl logs autoscaler-latest-linode-cluster-autoscaler-5fd8bfbb6f-r5zf6 -n autoscaler -f
I0103 17:39:49.638775 1 linode_manager.go:84] LKE node group after refresh:
I0103 17:39:49.638804 1 linode_manager.go:86] node group ID g6-standard-1 := min: 1, max: 2, LKEClusterID: 148988, poolOpts: {Count:1 Type:g6-standard-1 Disks:[]}, associated LKE pools: { poolID: 218632, count: 1, type: g6-standard-1, associated linodes: [ID: "218632-48499cbb0000", instanceID: 53677574] }
I0103 17:39:49.638826 1 linode_manager.go:86] node group ID g6-standard-2 := min: 1, max: 2, LKEClusterID: 148988, poolOpts: {Count:1 Type:g6-standard-2 Disks:[]}, associated LKE pools: { poolID: 218633, count: 1, type: g6-standard-2, associated linodes: [ID: "218633-4d4d521d0000", instanceID: 53677576] }
I0103 17:39:49.639930 1 klogx.go:87] Pod default/busybox-workload-5d94965f98-f558s is unschedulable
I0103 17:39:49.639939 1 klogx.go:87] Pod default/busybox-workload-5d94965f98-rfhqp is unschedulable
I0103 17:39:49.639943 1 klogx.go:87] Pod default/busybox-workload-5d94965f98-wsfcw is unschedulable
I0103 17:39:49.639946 1 klogx.go:87] Pod default/busybox-workload-5d94965f98-25s5c is unschedulable
I0103 17:39:49.639948 1 klogx.go:87] Pod default/busybox-workload-5d94965f98-r7fv4 is unschedulable
I0103 17:39:49.639951 1 klogx.go:87] Pod default/busybox-workload-5d94965f98-nmbxc is unschedulable
I0103 17:39:49.639954 1 klogx.go:87] Pod default/busybox-workload-5d94965f98-dbrh4 is unschedulable
E0103 17:39:49.640116 1 orchestrator.go:450] Couldn't get autoscaling options for ng: g6-standard-1
E0103 17:39:49.640137 1 orchestrator.go:450] Couldn't get autoscaling options for ng: g6-standard-2
I0103 17:39:49.640587 1 orchestrator.go:185] Best option to resize: g6-standard-1
I0103 17:39:49.641940 1 orchestrator.go:189] Estimated 1 nodes needed in g6-standard-1
I0103 17:39:49.641961 1 orchestrator.go:295] Final scale-up plan: [{g6-standard-1 1->2 (max: 2)}]
I0103 17:39:49.641972 1 executor.go:147] Scale-up: setting group g6-standard-1 size to 2
Step 5: Undeploy Busybox
Finally, once the tests are complete, it is time to remove the busybox deployment from the cluster and allow a good 10 minutes for the autoscaler to scale down the unneeded nodes. For the undeploy process, we only need to delete the busybox ClusterProfile.
Undeploy (cluster04)
$ kubectl delete -f "clusterprofile-busybox-autoscaler.yaml"
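Right after the delete, the busybox add-on should disappear from the Sveltos inventory; re-running the earlier command should return no entries:
$ sveltosctl show addons --clusterprofile=busybox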
Verification
$ kubectl logs autoscaler-latest-linode-cluster-autoscaler-5fd8bfbb6f-r5zf6 -n autoscaler -f
I0103 18:02:27.179597 1 linode_manager.go:84] LKE node group after refresh:
I0103 18:02:27.179813 1 linode_manager.go:86] node group ID g6-standard-1 := min: 1, max: 2, LKEClusterID: 148988, poolOpts: {Count:1 Type:g6-standard-1 Disks:[]}, associated LKE pools: { poolID: 218632, count: 1, type: g6-standard-1, associated linodes: [ID: "218632-48499cbb0000", instanceID: 53677574] }
I0103 18:02:27.179840 1 linode_manager.go:86] node group ID g6-standard-2 := min: 1, max: 2, LKEClusterID: 148988, poolOpts: {Count:1 Type:g6-standard-2 Disks:[]}, associated LKE pools: { poolID: 218633, count: 1, type: g6-standard-2, associated linodes: [ID: "218633-4d4d521d0000", instanceID: 53677576] }
I0103 18:02:27.181433 1 static_autoscaler.go:547] No unschedulable pods
I0103 18:02:27.181449 1 pre_filtering_processor.go:67] Skipping lke148988-218632-48499cbb0000 - node group min size reached (current: 1, min: 1)
I0103 18:02:27.181623 1 pre_filtering_processor.go:67] Skipping lke148988-218633-4d4d521d0000 - node group min size reached (current: 1, min: 1)
I0103 18:02:36.040964 1 node_instances_cache.go:156] Start refreshing cloud provider node instances cache
I0103 18:02:36.040995 1 node_instances_cache.go:168] Refresh cloud provider node instances cache finished, refresh took 12.52µs
I0103 18:02:38.019810 1 linode_manager.go:84] LKE node group after refresh:
I0103 18:02:38.019983 1 linode_manager.go:86] node group ID g6-standard-2 := min: 1, max: 2, LKEClusterID: 148988, poolOpts: {Count:1 Type:g6-standard-2 Disks:[]}, associated LKE pools: { poolID: 218633, count: 1, type: g6-standard-2, associated linodes: [ID: "218633-4d4d521d0000", instanceID: 53677576] }
I0103 18:02:38.020015 1 linode_manager.go:86] node group ID g6-standard-1 := min: 1, max: 2, LKEClusterID: 148988, poolOpts: {Count:1 Type:g6-standard-1 Disks:[]}, associated LKE pools: { poolID: 218632, count: 1, type: g6-standard-1, associated linodes: [ID: "218632-48499cbb0000", instanceID: 53677574] }
I0103 18:02:38.021154 1 static_autoscaler.go:547] No unschedulable pods
I0103 18:02:38.021181 1 pre_filtering_processor.go:67] Skipping lke148988-218632-48499cbb0000 - node group min size reached (current: 1, min: 1)
I0103 18:02:38.021187 1 pre_filtering_processor.go:67] Skipping lke148988-218633-4d4d521d0000 - node group min size reached (current: 1, min: 1)
As expected, Sveltos took care of the complete lifecycle of the different Kubernetes deployments in a simple and straightforward manner!
Resources
- Sveltos: https://projectsveltos.github.io/sveltos/
- Autoscaler: https://github.com/kubernetes/autoscaler
Contact
We are here to help! Whether you have questions, issues or need assistance, our Slack channel is the perfect place for you. Click here to join us.
👏 Support this project
Every contribution counts! If you enjoyed this article, check out the Projectsveltos GitHub repo. You can star 🌟 the project if you find it helpful.
The GitHub repo is a great resource for getting started with the project. It contains the code, documentation, and many more examples.
Thanks for reading!