How To Sync 10,000 Argo CD Applications in One Shot, By Yourself
By Jun Duan, Robert Filepp, Paolo Dettori, Andy Anderson
In a previous article we described the results of several Argo CD scalability experiments. In this article we describe the specific steps you can take to try those same experiments at home. Step right up and put yourself in the picture!
This blog is one of a series of posts from the KubeStellar community regarding challenges related to multi-cloud and edge. You can learn more about the challenges and read posts from other members of the KubeStellar community on edge and multi-cloud topics in a post by Andy Anderson entitled Navigating the Edge: Overcoming Multi-Cloud Challenges in Edge Computing.
Background
Argo CD is an open-source continuous delivery tool that automates application deployment into Kubernetes clusters. Argo CD continuously monitors specified Git repositories for changes to application workloads, and automatically deploys changed application workloads to targeted Kubernetes clusters.
Some of the variables that might be considered when thinking about the scalability of Argo CD include: the number of applications and repositories being monitored, the number of clusters being targeted for deployment, the size of the application workloads, the frequency of changes to applications, and the periodicity of application change monitoring (which we will call the “resync period” below).
There are several questions regarding Argo CD's scalability that come to mind: How many changes to a single application workload can Argo CD monitor and deploy to a limited number of clusters? How many changes to a set of application workloads? What happens if we change the sizes of those workloads, or the resync period? How many different Git repos and/or applications can Argo CD monitor? How many clusters can Argo CD deploy a single application to? How many different applications can Argo CD monitor and deploy to a large number of clusters within a given timeframe?
We’re going to describe how we conducted experiments to explore a few of these questions.
The Experiment Environment
Our experiment was originally designed to be run in the context of a project called "kyst", which uses two Custom Resource Definitions (CRDs), ConfigSpec and DeviceGroup. So we needed to write a wrapper to translate between standard Kubernetes API resources and the kyst CRDs, as well as to assign unique IDs to the Argo CD Application ConfigSpecs (there will eventually be 10,000 of these). The wrapper is currently packaged as a Docker image and deployed as a plugin to Argo CD. If you are uncomfortable loading an image into your environment, you could try changing the config.yaml to objectTemplatePath: "test-application-no-plugin.yaml". However, we have not tried this at scale and don't know how it will behave.
We used two GitHub repositories. One repository contains a specification that will cause custom plugin code to be deployed from docker.io/junatibm/wrap4kyst:latest to a container in the cluster that hosts the Argo CD server. The second GitHub repository hosts the application content to be delivered by Argo CD.
As shown below, we use AWS as the main infrastructure. We create two EC2 instances, then bootstrap a single-node cluster on each. One cluster hosts the Argo CD server, along with Prometheus (which we use to collect metrics) and Grafana (which we use to visualize them). The other cluster hosts the Application deployment targets.
We use Cluster Loader 2, running on our local machine, to load Application resources into the Argo CD server.
Prepare Infrastructure on AWS
We create the EC2 instances, then bootstrap the Kubernetes clusters using kubeadm manually.
It is not strictly necessary to do this manually, but if you choose to, here are a few tips in addition to the official documentation.
Create the two EC2 instances within the same subnet, so that we don’t have to further configure the communications between them.
Provision at least 32 GiB of storage for each EC2 instance.
We will access the clusters remotely (from outside AWS), e.g., from our laptops. To make that possible, we need to 1) assign public IPs to the EC2 instances, 2) pass the public IPs to kubeadm (specified as follows).
Note regarding console examples in this article: command lines are prefaced with a '$' prompt; lines without the '$' prefix are console messages.
Find the public IP, either from the AWS console or with a small trick:
$ curl ifconfig.me
Initialize the Kubernetes clusters with the apiserver-cert-extra-sans flag.
$ kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-cert-extra-sans=<the-public-ip>
Remove the 'control-plane' taint from the single node. Note the trailing '-'.
$ kubectl taint node <node-name> node-role.kubernetes.io/control-plane-
Install the local-path-provisioner, then annotate the 'standard' storage class to make it the default.
$ kubectl patch storageclass standard -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Install Argo CD
The installation of Argo CD is well documented here. In this section, we just follow the steps.
Install Argo CD into the 1st cluster.
$ kubectl create namespace argocd
$ kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
Download the Argo CD CLI.
$ curl -LO https://github.com/argoproj/argo-cd/releases/download/v2.6.7/argocd-linux-amd64
$ chmod +x argocd-linux-amd64
$ sudo mv argocd-linux-amd64 /usr/local/bin/argocd
Get the initial admin password.
$ argocd admin initial-password -n argocd
Find the ‘argocd-server’ service. Identify its cluster-IP and ports.
$ kubectl -n argocd get svc argocd-server
Log in to argocd-server with the CLI and the password.
$ argocd login 10.109.209.54:80
WARNING: server certificate had error: x509: cannot validate certificate for 10.109.209.54 because it doesn't contain any IP SANs. Proceed insecurely (y/n)? y
Username: admin
Password:
'admin:login' logged in successfully
Context '10.109.209.54:80' updated
Optionally, change the password.
$ argocd account update-password
*** Enter password of currently logged in user (admin):
*** Enter new password for user admin:
*** Confirm new password for user admin:
Password updated
Context '10.109.209.54:80' updated
Access Argo CD Remotely
We can access Argo CD via 1) the Argo CD UI, 2) the Argo CD CLI, 3) the Kubernetes API.
Change the argocd-server service type.
$ kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'
Now the argocd-server service has node ports.
$ kubectl -n argocd get svc argocd-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-server LoadBalancer 10.109.209.54 <pending> 80:30112/TCP,443:30085/TCP 28m
Optionally, overwrite the randomly assigned node ports. For example, use 30080.
$ kubectl -n argocd edit svc argocd-server
service/argocd-server edited
$ kubectl -n argocd get svc argocd-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-server LoadBalancer 10.109.209.54 <pending> 80:30080/TCP,443:30085/TCP 103m
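For reference, the edited http entry of the service's ports stanza ends up looking roughly like the sketch below; the port and targetPort values are the stock Argo CD install defaults and may differ in your environment.

```yaml
ports:
- name: http
  nodePort: 30080
  port: 80
  protocol: TCP
  targetPort: 8080
```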
In AWS console, edit the inbound rules for the corresponding security group to allow traffic to the node port.
Now Argo CD can be reached remotely using the 1st EC2 instance’s public IP and the node port. It is handy to add the public IP to /etc/hosts.
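For example, a line like the following in /etc/hosts (the IP is a made-up placeholder; substitute your 1st instance's actual public IP) lets us address the server simply as 'argocd':

```
203.0.113.10   argocd
```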
We can test the remote access via the Argo CD UI.
Download the Argo CD CLI again, this time to our laptop, from https://github.com/argoproj/argo-cd/releases. For a Mac user, the command line is similar to:
$ curl -Lo <somewhere-in-PATH>/argocd https://github.com/argoproj/argo-cd/releases/download/v2.6.7/argocd-darwin-amd64
Log in to argocd-server with the CLI again, from our laptop.
$ argocd login argocd:30080
WARNING: server certificate had error: x509: certificate signed by unknown authority. Proceed insecurely (y/n)? y
Username: admin
Password:
'admin:login' logged in successfully
Context 'argocd:30080' updated
In AWS console, edit the inbound rules for the corresponding security group to allow traffic to 6443.
Copy the kubeconfig file of the 1st cluster out of the 1st EC2 instance to our laptop.
$ scp argocd:~/.kube/config ~/.kube/config_argocd
config 100% 5637 102.4KB/s 00:00
Edit the kubeconfig file to make sure the server address is using the public IP of the 1st EC2 instance, then try to access the apiserver.
$ vi ~/.kube/config_argocd
$ export KUBECONFIG=~/.kube/config_argocd
$ kubectl get ns argocd
NAME STATUS AGE
argocd Active 153m
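The edit in ~/.kube/config_argocd amounts to pointing the cluster's server field at the public IP; the sketch below uses a placeholder for the address.

```yaml
clusters:
- cluster:
    certificate-authority-data: <unchanged>
    server: https://<public-ip-of-1st-ec2-instance>:6443
  name: kubernetes
```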
Add the 2nd Cluster as Applications’ Destination
Identify the kubeconfig file (of the 2nd cluster) on the 2nd EC2 instance. We need this file to add the 2nd Kubernetes cluster to Argo CD.
Log in to the 1st EC2 instance, and make the kubeconfig file (of the 2nd cluster) available there.
Add the 2nd cluster to Argo CD.
$ argocd cluster add kubernetes-admin@kubernetes --kubeconfig dest_kubeconfig.yaml
WARNING: This will create a service account `argocd-manager` on the cluster referenced by context `kubernetes-admin@kubernetes` with full cluster level privileges. Do you want to continue [y/N]? y
INFO[0003] ServiceAccount "argocd-manager" created in namespace "kube-system"
INFO[0003] ClusterRole "argocd-manager-role" created
INFO[0003] ClusterRoleBinding "argocd-manager-role-binding" created
INFO[0008] Created bearer token secret for ServiceAccount "argocd-manager"
Cluster 'https://172.31.30.2:6443' added
Optionally, check the added cluster from the Argo CD UI.
Use the Argo CD Plugin
This section uses code in https://github.com/edge-experiments/argocd-scalability.
Install CRDs into the 2nd cluster.
$ kubectl apply -f plugin-usage/crds/edge.kyst.kube_configspecs.yaml
customresourcedefinition.apiextensions.k8s.io/configspecs.edge.kyst.kube created
$ kubectl apply -f plugin-usage/crds/edge.kyst.kube_devicegroups.yaml
customresourcedefinition.apiextensions.k8s.io/devicegroups.edge.kyst.kube created
The Argo CD plugin is distributed as a prebuilt image from Jun Duan's Docker Hub account. We can check plugin-usage/argocd-repo-server-patch.yaml to identify the image.
image: docker.io/junatibm/wrap4kyst:latest
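The patch file itself is not reproduced in this article; a common shape for such a patch, shown here only as a hedged sketch (the volume name, binary path, and init-container name are our assumptions, not the repo's actual contents), uses an init container to copy the plugin binary into the repo-server:

```yaml
# Sketch only: make a plugin binary available inside argocd-repo-server
# via an init container and a shared emptyDir volume. All names below
# (custom-tools, /usr/local/bin/wrap4kyst) are illustrative assumptions.
spec:
  template:
    spec:
      volumes:
      - name: custom-tools
        emptyDir: {}
      initContainers:
      - name: install-wrap4kyst
        image: docker.io/junatibm/wrap4kyst:latest
        command: ["cp", "/usr/local/bin/wrap4kyst", "/custom-tools/wrap4kyst"]
        volumeMounts:
        - name: custom-tools
          mountPath: /custom-tools
      containers:
      - name: argocd-repo-server
        volumeMounts:
        - name: custom-tools
          mountPath: /usr/local/bin/wrap4kyst
          subPath: wrap4kyst
```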
Patch argocd-repo-server against the 1st cluster.
$ kubectl patch -n argocd deployment argocd-repo-server --patch-file=plugin-usage/argocd-repo-server-patch.yaml
deployment.apps/argocd-repo-server patched
Give argocd-repo-server some time to restart.
$ kubectl -n argocd get po
NAME READY STATUS RESTARTS AGE
argocd-application-controller-0 1/1 Running 0 5h12m
argocd-applicationset-controller-57db5f5c7d-sh69z 1/1 Running 0 5h12m
argocd-dex-server-c4b8545d-tlggh 1/1 Running 0 5h12m
argocd-notifications-controller-7cddc64d84-b25tf 1/1 Running 0 5h12m
argocd-redis-6b7c6f67db-knpdp 1/1 Running 0 5h12m
argocd-repo-server-5c6bcd6ff4-fxrxm 1/1 Running 0 6m18s
argocd-server-64957744c9-7wcvq 1/1 Running 0 5h12m
Register the ‘wrap4kyst’ Argo CD plugin.
$ kubectl patch -n argocd configmap argocd-cm --patch-file plugin-usage/argocd-cm-patch.yaml
configmap/argocd-cm patched
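For reference, a v1 (argocd-cm based) plugin registration generally looks like the following sketch; the command shown is our assumption, not necessarily what argocd-cm-patch.yaml actually contains:

```yaml
data:
  configManagementPlugins: |
    - name: wrap4kyst
      generate:
        command: ["wrap4kyst"]
```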
Test the plugin as follows.
Create an Argo CD application.
$ argocd app create first-app \
--config-management-plugin wrap4kyst \
--repo https://github.com/edge-experiments/gitops-source.git \
--path kubernetes/guestbook/deploy \
--dest-server https://172.31.30.2:6443 \
--dest-namespace default
application 'first-app' created
Sync the Argo CD application.
$ argocd app sync first-app
Name: argocd/first-app
Project: default
Server: https://172.31.30.2:6443
Namespace: default
URL: https://argocd:30080/applications/first-app
Repo: https://github.com/edge-experiments/gitops-source.git
Target:
Path: kubernetes/guestbook/deploy
SyncWindow: Sync Allowed
Sync Policy: <none>
Sync Status: Synced to (31df591)
Health Status: Healthy
Operation: Sync
Sync Revision: 31df591a1ab16ff9cf218ab480d051aee0e8abb5
Phase: Succeeded
Start: 2023-03-29 16:30:36 -0400 EDT
Finished: 2023-03-29 16:30:36 -0400 EDT
Duration: 0s
Message: successfully synced (all tasks run)
GROUP KIND NAMESPACE NAME STATUS HEALTH HOOK MESSAGE
edge.kyst.kube ConfigSpec default guestbook Synced configspec.edge.kyst.kube/guestbook created
edge.kyst.kube DeviceGroup default guestbook1 Synced devicegroup.edge.kyst.kube/guestbook1 created
Delete the Argo CD application.
$ argocd app delete first-app
Are you sure you want to delete 'first-app' and all its resources? [y/n] y
application 'first-app' deleted
To be ready for the next step (using Cluster Loader 2), we need to change 'server' in clusterloader-manifests/test-application-use-plugin.yaml. For example:
server: https://172.31.30.2:6443 # change this for each run
Use Cluster Loader 2
This section uses code in https://github.com/kubernetes/perf-tests.
The perf-tests repository contains multiple sub projects. We are going to use Cluster Loader 2.
$ cd clusterloader2/
Use a specific commit.
$ git checkout 71bfedf3b50f770f64453f537e91eb41452358dd
Previous HEAD position was 8a0c339a Merge pull request #2101 from jprzychodzen/perfdash-adc
HEAD is now at 71bfedf3 Merge pull request #2032 from wojtek-t/unify_pod_sizes_3
Insert one line at line #180, into pkg/test/simple_test_executor.go.
nsList = []string{"argocd"}
On the 2nd cluster, create a namespace for Argo CD Applications’ workload.
$ kubectl create ns argocd-scalability
Run Cluster Loader 2.
$ go run cmd/clusterloader.go \
--testconfig <path-to-the-argocd-scalability-repo>/clusterloader-manifests/config.yaml \
--provider local \
--kubeconfig ~/.kube/config_argocd \
--v=2 \
--enable-exec-service=false \
--delete-automanaged-namespaces=false
This creates 2k Argo CD Applications.
Setup Monitoring
This section uses code in https://github.com/prometheus-operator/kube-prometheus.
Follow the Quickstart.
$ kubectl apply --server-side -f manifests/setup
$ kubectl wait \
--for condition=Established \
--all CustomResourceDefinition \
--namespace=monitoring
$ kubectl apply -f manifests/
Edit two locations in the ‘grafana’ service.
$ kubectl -n monitoring edit svc grafana
First, change its type to node port.
type: NodePort
Second, add a node port, say 30000, for http.
ports:
- name: http
nodePort: 30000
port: 3000
protocol: TCP
targetPort: http
In AWS console, edit the inbound rules for the corresponding security group to allow traffic to 30000.
Now we can access Grafana UI from port 30000.
Use admin/admin to log in, then change the admin password.
Use Argo CD Dashboard in Grafana
This section uses code in https://github.com/edge-experiments/argocd-scalability.
Setup RBAC.
$ kubectl apply -f monitoring/rbac.yaml
Create service monitors for Argo CD.
$ kubectl apply -f monitoring/service-monitors.yaml
Import Argo CD Dashboard from monitoring/grafana-dashboards/ArgoCD-1659038021207.json.
Check out the panels after a few minutes.
Trigger a One-shot Sync for all Applications
This section uses code in https://github.com/edge-experiments/gitops-source. As the name suggests, this repository holds the content to be delivered by Argo CD. We need to add new commits to the repository to trigger the sync. Therefore, it is strongly recommended to fork the repository and use the fork instead, unless you have the authors' mobile numbers and can call them to promptly merge your PR.
Remember to change the repoURL in clusterloader-manifests/test-application-use-plugin.yaml after the fork.
Push a commit to the main branch of the fork. The commit does not need to introduce changes to the to-be-delivered manifests. Even an empty commit can do the job.
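For example, the following sketch demonstrates the empty-commit trick in a throwaway local repository; in practice, you would run the commit in a clone of your fork and follow it with git push origin main.

```shell
# Demonstrate the empty-commit trick in a scratch repo. In a clone of the
# fork, the same commit plus 'git push origin main' triggers the sync.
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial"
# An empty commit still advances HEAD, which is all Argo CD needs to notice:
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "trigger one-shot sync"
echo "commit count: $(git rev-list --count HEAD)"   # prints "commit count: 2"
```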
Check the dynamics of the sync status from the Argo CD UI.
Observe the Grafana dashboard.
It takes roughly 7 minutes to sync 2k Apps in one shot.
Create More Applications in Batches
This section uses code in https://github.com/edge-experiments/argocd-scalability and https://github.com/kubernetes/perf-tests.
In https://github.com/edge-experiments/argocd-scalability, there are tunable knobs (marked by a 'change this XXX for each run' comment) in two files.
Tune knobs in clusterloader-manifests/config.yaml.
Change the number of Applications in one batch.
replicasPerNamespace: 2000 # change this number for each run
Edit ‘basename’ for Applications.
- basename: application-batch-1 # change this basename for each run
Tune knobs in clusterloader-manifests/test-application-use-plugin.yaml.
Edit ‘batch’, which facilitates the management of Applications by label.
batch: "1" # change this for each run
Change the size of workload of an Application, by specifying which registered Argo CD plugin to use.
name: wrap4kyst-scalability-heavily-loaded # change this for each run
In https://github.com/kubernetes/perf-tests, use Cluster Loader 2 in the same way.
$ cd clusterloader2/
Run Cluster Loader 2, with the same command.
$ go run cmd/clusterloader.go \
--testconfig <path-to-the-argocd-scalability-repo>/clusterloader-manifests/config.yaml \
--provider local \
--kubeconfig ~/.kube/config_argocd \
--v=2 \
--enable-exec-service=false \
--delete-automanaged-namespaces=false
Change the Resync Period
This section uses code in https://github.com/edge-experiments/argocd-scalability.
The "resync period" is what we call the periodicity of application change monitoring: how frequently Argo CD checks the Git repo for changes to the applications. This period is controlled by Argo CD's timeout.reconciliation configuration parameter.
Set the resync period to 6 minutes.
$ kubectl patch -n argocd configmap argocd-cm --patch-file plugin-usage/argocd-cm-patch.yaml
configmap/argocd-cm patched
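The patch file's relevant content is not shown in this article, but the parameter lives in the argocd-cm ConfigMap, so a patch that sets a 6-minute resync period would contain something like:

```yaml
data:
  timeout.reconciliation: 360s
```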
The controller needs a restart to pick up the new resync period. Restart the applicationset controller by scaling its replicas to 0 and then back to 1:
$ kubectl -n argocd scale --replicas=0 deploy argocd-applicationset-controller
deployment.apps/argocd-applicationset-controller scaled
$ kubectl -n argocd scale --replicas=1 deploy argocd-applicationset-controller
deployment.apps/argocd-applicationset-controller scaled
Sync 10,000 Argo CD Applications
It took about 40 minutes to sync 10k Applications.
Closing Remarks
This post was intended to show you the details of how you might recreate the experiments with scaling Argo CD described in our previous article. The experiments were motivated by our work on application lifecycle management of edge computing. Once again, this blog is part of a series of posts from the KubeStellar community regarding challenges related to multi-cloud and edge. You can learn more about the challenges and read posts from other members of the KubeStellar community on edge and multi-cloud topics in a post by Andy Anderson entitled Navigating the Edge: Overcoming Multi-Cloud Challenges in Edge Computing.
Thanks for reading!
FYI
Andy Anderson and Jun Duan will be speaking on this general subject at GitOpsCon on May 9, 2023!