How To Sync 10,000 Argo CD Applications in One Shot, By Yourself

Robert Filepp
12 min read · May 5, 2023


Step right up and put yourself in the picture! On Your Mark, Get Set, Go!

By Jun Duan, Robert Filepp, Paolo Dettori, Andy Anderson

In a previous article we described the results of several Argo CD scalability experiments. In this article we describe the specific steps you can take to try those same experiments at home. Step right up and put yourself in the picture!

This blog is one of a series of posts from the KubeStellar community regarding challenges related to multi-cloud and edge. You can learn more about the challenges and read posts from other members of the KubeStellar community on edge and multi-cloud topics in a post by Andy Anderson entitled Navigating the Edge: Overcoming Multi-Cloud Challenges in Edge Computing

Background

Argo CD is an open-source continuous delivery tool that automates application deployment into Kubernetes clusters. Argo CD continuously monitors specified Git repositories for changes to application workloads, and automatically deploys changed application workloads to targeted Kubernetes clusters.

Some of the variables that might be considered when thinking about the scalability of Argo CD include: the number of applications and repositories being monitored, the number of clusters being targeted for deployment, the size of the application workloads, the frequency of changes to applications, and the periodicity of application change monitoring (which we will call the “resync period” below).

There are several questions regarding Argo CD's scalability that come to mind. How many changes to a single application workload can Argo CD monitor and deploy to a limited number of clusters? How many changes to a set of application workloads can it monitor and deploy to a limited number of clusters? What happens if we change the sizes of those workloads, or the resync period? How many different Git repos and/or applications can Argo CD monitor? How many clusters can Argo CD deploy a single application to? And how many different applications can Argo CD monitor and deploy to a large number of clusters within a given timeframe?

We’re going to describe how we conducted experiments to explore a few of these questions.

The Experiment Environment

Our experiment was originally designed to be run in the context of a project called "kyst", which uses two Custom Resource Definitions (CRDs): ConfigSpec and DeviceGroup. So we needed to write a wrapper to translate between standard Kubernetes API resources and the kyst CRDs, as well as to assign unique IDs to the Argo CD Application ConfigSpecs (there will eventually be 10,000 of these). The wrapper is currently packaged as a Docker image and deployed as a plugin to Argo CD. If you are uncomfortable loading an image into your environment, you could try changing config.yaml to objectTemplatePath: "test-application-no-plugin.yaml". However, we have not tried this at scale and don't know how it will behave.

We used two GitHub repositories. One repository contains a specification that will cause custom plugin code to be deployed from docker.io/junatibm/wrap4kyst:latest to a container in the cluster that hosts the Argo CD server. The second GitHub repository hosts the application content to be delivered by Argo CD.

As shown below, we use AWS as the main infrastructure. We create two EC2 instances, then bootstrap a single-node cluster on each. One cluster hosts the Argo CD server, along with Prometheus (used to collect metrics) and Grafana (used to visualize those metrics). The other cluster hosts the Application deployment targets.

We use ClusterLoader 2, running on our local machine, to load Application resources into the Argo CD server.

Environment Architecture (clusters managed by Kubernetes)

Prepare Infrastructure on AWS

We create the EC2 instances, then bootstrap the Kubernetes clusters using kubeadm manually.

It is certainly not necessary to do this manually, but if you choose to, here are a few tips in addition to the official documentation.

Create the two EC2 instances within the same subnet, so that we don’t have to further configure the communications between them.

Provision at least 32 GiB of storage for each EC2 instance.

We will access the clusters remotely (from outside AWS), e.g., from our laptops. To make that possible, we need to 1) assign public IPs to the EC2 instances, 2) pass the public IPs to kubeadm (specified as follows).

Note regarding console examples in this article: command lines are prefixed with a '$' prompt; lines without the '$' prefix are console output.

Find the public IP, either from the AWS console or with a small trick.

$ curl ifconfig.me

Initialize the Kubernetes clusters with the apiserver-cert-extra-sans flag.

$ kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-cert-extra-sans=<the-public-ip>
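The 10.244.0.0/16 pod CIDR matches flannel's default. The article does not name the CNI we used, but assuming flannel, it can be installed from the upstream manifest (pinning a release is safer in practice):

$ kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml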

Remove the 'control-plane' taint from the single node. Note the trailing '-'.

$ kubectl taint node <node-name> node-role.kubernetes.io/control-plane-

Use the local-path-provisioner, then annotate the standard storage class.

$ kubectl patch storageclass standard -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
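If the local-path-provisioner is not already present in your cluster, one way to install it (assuming the upstream manifest path; pinning a release tag is safer) is:

$ kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml

Note that the storage class it creates is named local-path by default, so adjust the class name in the patch above if yours differs.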

Install Argo CD

The installation of Argo CD is well documented in the official Argo CD getting-started guide. In this section, we just follow the steps.

Install Argo CD into the 1st cluster.

$ kubectl create namespace argocd
$ kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

Download the Argo CD CLI.

$ curl -LO https://github.com/argoproj/argo-cd/releases/download/v2.6.7/argocd-linux-amd64
$ sudo mv argocd-linux-amd64 /usr/local/bin/argocd
$ chmod +x /usr/local/bin/argocd

Get the initial admin password.

$ argocd admin initial-password -n argocd

Find the ‘argocd-server’ service. Identify its cluster-IP and ports.

$ kubectl -n argocd get svc argocd-server

Log in to argocd-server with the CLI and the password.

$ argocd login 10.109.209.54:80
WARNING: server certificate had error: x509: cannot validate certificate for 10.109.209.54 because it doesn't contain any IP SANs. Proceed insecurely (y/n)? y
Username: admin
Password:
'admin:login' logged in successfully
Context '10.109.209.54:80' updated

Optionally, change the password.

$ argocd account update-password
*** Enter password of currently logged in user (admin):
*** Enter new password for user admin:
*** Confirm new password for user admin:
Password updated
Context '10.109.209.54:80' updated

Access Argo CD Remotely

We can access Argo CD via 1) the Argo CD UI, 2) the Argo CD CLI, 3) the Kubernetes API.

Change the argocd-server service type.

$ kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'

Now the argocd-server service has node ports.

$ kubectl -n argocd get svc argocd-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-server LoadBalancer 10.109.209.54 <pending> 80:30112/TCP,443:30085/TCP 28m

Optionally, overwrite the randomly assigned node ports. For example, use 30080.

$ kubectl -n argocd edit svc argocd-server
service/argocd-server edited
$ kubectl -n argocd get svc argocd-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-server LoadBalancer 10.109.209.54 <pending> 80:30080/TCP,443:30085/TCP 103m

In AWS console, edit the inbound rules for the corresponding security group to allow traffic to the node port.
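If you prefer the AWS CLI over the console, the equivalent rule (with placeholder values, assuming the node port chosen above) looks roughly like:

$ aws ec2 authorize-security-group-ingress \
    --group-id <security-group-id> \
    --protocol tcp --port 30080 \
    --cidr <your-ip>/32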

Now Argo CD can be reached remotely using the 1st EC2 instance’s public IP and the node port. It is handy to add the public IP to /etc/hosts.
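For example (the IP is a placeholder; the hostname 'argocd' is the one used by the login and scp commands below):

$ echo '<ec2-1-public-ip> argocd' | sudo tee -a /etc/hosts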

We can test the remote access via the Argo CD UI.

Download the Argo CD CLI again, this time to our laptop, from https://github.com/argoproj/argo-cd/releases. For a Mac user, the command is similar to:

$ curl -Lo <somewhere-in-PATH>/argocd https://github.com/argoproj/argo-cd/releases/download/v2.6.7/argocd-darwin-amd64

Log in to argocd-server with the CLI again, from our laptop.

$ argocd login argocd:30080
WARNING: server certificate had error: x509: certificate signed by unknown authority. Proceed insecurely (y/n)? y
Username: admin
Password:
'admin:login' logged in successfully
Context 'argocd:30080' updated

In AWS console, edit the inbound rules for the corresponding security group to allow traffic to 6443.

Copy the kubeconfig file of the 1st cluster out of the 1st EC2 instance to our laptop.

$ scp argocd:~/.kube/config ~/.kube/config_argocd
config 100% 5637 102.4KB/s 00:00

Edit the kubeconfig file to make sure the server address is using the public IP of the 1st EC2 instance, then try to access the apiserver.
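Typically the only line that needs changing is the cluster's server address, e.g. (placeholder IP):

server: https://<ec2-1-public-ip>:6443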

$ vi ~/.kube/config_argocd
$ export KUBECONFIG=~/.kube/config_argocd
$ kubectl get ns argocd
NAME STATUS AGE
argocd Active 153m

Add the 2nd Cluster as Applications’ Destination

Identify the kubeconfig file (of the 2nd cluster) on the 2nd EC2 instance. We need this file to add the 2nd Kubernetes cluster to Argo CD.

Log in to the 1st EC2 instance, and make the kubeconfig file (of the 2nd cluster) available there.
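One way to do this, assuming SSH access from the 1st instance to the 2nd over the shared subnet (the user name and private IP are placeholders; the file name matches the one used in the next command):

$ scp <user>@<ec2-2-private-ip>:~/.kube/config dest_kubeconfig.yaml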

Add the 2nd cluster to Argo CD.

$ argocd cluster add kubernetes-admin@kubernetes --kubeconfig dest_kubeconfig.yaml
WARNING: This will create a service account `argocd-manager` on the cluster referenced by context `kubernetes-admin@kubernetes` with full cluster level privileges. Do you want to continue [y/N]? y
INFO[0003] ServiceAccount "argocd-manager" created in namespace "kube-system"
INFO[0003] ClusterRole "argocd-manager-role" created
INFO[0003] ClusterRoleBinding "argocd-manager-role-binding" created
INFO[0008] Created bearer token secret for ServiceAccount "argocd-manager"
Cluster 'https://172.31.30.2:6443' added

Optionally, check the added cluster from the Argo CD UI.

Use the Argo CD Plugin

This section uses code in https://github.com/edge-experiments/argocd-scalability.

Install CRDs into the 2nd cluster.

$ kubectl apply -f plugin-usage/crds/edge.kyst.kube_configspecs.yaml 
customresourcedefinition.apiextensions.k8s.io/configspecs.edge.kyst.kube created
$ kubectl apply -f plugin-usage/crds/edge.kyst.kube_devicegroups.yaml
customresourcedefinition.apiextensions.k8s.io/devicegroups.edge.kyst.kube created

The Argo CD plugin is available as a pre-built image on Jun Duan's Docker Hub account. We can check plugin-usage/argocd-repo-server-patch.yaml to identify the image.

image: docker.io/junatibm/wrap4kyst:latest

Patch argocd-repo-server against the 1st cluster.

$ kubectl patch -n argocd deployment argocd-repo-server --patch-file=plugin-usage/argocd-repo-server-patch.yaml
deployment.apps/argocd-repo-server patched

Give argocd-repo-server some time to restart.

$ kubectl -n argocd get po
NAME READY STATUS RESTARTS AGE
argocd-application-controller-0 1/1 Running 0 5h12m
argocd-applicationset-controller-57db5f5c7d-sh69z 1/1 Running 0 5h12m
argocd-dex-server-c4b8545d-tlggh 1/1 Running 0 5h12m
argocd-notifications-controller-7cddc64d84-b25tf 1/1 Running 0 5h12m
argocd-redis-6b7c6f67db-knpdp 1/1 Running 0 5h12m
argocd-repo-server-5c6bcd6ff4-fxrxm 1/1 Running 0 6m18s
argocd-server-64957744c9-7wcvq 1/1 Running 0 5h12m

Register the ‘wrap4kyst’ Argo CD plugin.

$ kubectl patch -n argocd configmap argocd-cm --patch-file plugin-usage/argocd-cm-patch.yaml
configmap/argocd-cm patched

Test the plugin as follows.

Create an Argo CD application.

$ argocd app create first-app \
--config-management-plugin wrap4kyst \
--repo https://github.com/edge-experiments/gitops-source.git \
--path kubernetes/guestbook/deploy \
--dest-server https://172.31.30.2:6443 \
--dest-namespace default
application 'first-app' created

Sync the Argo CD application.

$ argocd app sync first-app

Name: argocd/first-app
Project: default
Server: https://172.31.30.2:6443
Namespace: default
URL: https://argocd:30080/applications/first-app
Repo: https://github.com/edge-experiments/gitops-source.git
Target:
Path: kubernetes/guestbook/deploy
SyncWindow: Sync Allowed
Sync Policy: <none>
Sync Status: Synced to (31df591)
Health Status: Healthy

Operation: Sync
Sync Revision: 31df591a1ab16ff9cf218ab480d051aee0e8abb5
Phase: Succeeded
Start: 2023-03-29 16:30:36 -0400 EDT
Finished: 2023-03-29 16:30:36 -0400 EDT
Duration: 0s
Message: successfully synced (all tasks run)

GROUP KIND NAMESPACE NAME STATUS HEALTH HOOK MESSAGE
edge.kyst.kube ConfigSpec default guestbook Synced configspec.edge.kyst.kube/guestbook created
edge.kyst.kube DeviceGroup default guestbook1 Synced devicegroup.edge.kyst.kube/guestbook1 created

Delete the Argo CD application.

$ argocd app delete first-app
Are you sure you want to delete 'first-app' and all its resources? [y/n] y
application 'first-app' deleted

To be ready for the next step (using Cluster Loader 2), we need to change 'server' in clusterloader-manifests/test-application-use-plugin.yaml. For example:

server: https://172.31.30.2:6443 # change this for each run

Use Cluster Loader 2

This section uses code in https://github.com/kubernetes/perf-tests.

The perf-tests repository contains multiple subprojects. We are going to use Cluster Loader 2.

$ cd clusterloader2/

Use a specific commit.

$ git checkout 71bfedf3b50f770f64453f537e91eb41452358dd
Previous HEAD position was 8a0c339a Merge pull request #2101 from jprzychodzen/perfdash-adc
HEAD is now at 71bfedf3 Merge pull request #2032 from wojtek-t/unify_pod_sizes_3

Insert the following line at line #180 of pkg/test/simple_test_executor.go.

nsList = []string{"argocd"}

On the 2nd cluster, create a namespace for Argo CD Applications’ workload.

$ kubectl create ns argocd-scalability

Run Cluster Loader 2.

$ go run cmd/clusterloader.go \
--testconfig <path-to-the-argocd-scalability-repo>/clusterloader-manifests/config.yaml \
--provider local \
--kubeconfig ~/.kube/config_argocd \
--v=2 \
--enable-exec-service=false \
--delete-automanaged-namespaces=false

This creates 2k Argo CD Applications.
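To sanity-check the count (optional; the Application resources live in the argocd namespace):

$ kubectl -n argocd get applications.argoproj.io --no-headers | wc -l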

Set Up Monitoring

This section uses code in https://github.com/prometheus-operator/kube-prometheus.

Follow the Quickstart.

$ kubectl apply --server-side -f manifests/setup
$ kubectl wait \
--for condition=Established \
--all CustomResourceDefinition \
--namespace=monitoring
$ kubectl apply -f manifests/

Edit two locations in the ‘grafana’ service.

$ kubectl -n monitoring edit svc grafana

First, change its type to node port.

type: NodePort

Second, add a node port, say 30000, for http.

ports:
- name: http
  nodePort: 30000
  port: 3000
  protocol: TCP
  targetPort: http

In AWS console, edit the inbound rules for the corresponding security group to allow traffic to 30000.

Now we can access the Grafana UI via the EC2 instance's public IP on port 30000.

Log in with admin/admin, then change the admin password.

Use Argo CD Dashboard in Grafana

This section uses code in https://github.com/edge-experiments/argocd-scalability.

Setup RBAC.

$ kubectl apply -f monitoring/rbac.yaml

Create service monitors for Argo CD.

$ kubectl apply -f monitoring/service-monitors.yaml

Import Argo CD Dashboard from monitoring/grafana-dashboards/ArgoCD-1659038021207.json.

Check out the panels after a few minutes.

Trigger a One-shot Sync for all Applications

This section uses code in https://github.com/edge-experiments/gitops-source. As the name suggests, this repository holds the content to be delivered by Argo CD. We need to add new commits to the repository to trigger the sync. Therefore, it is strongly recommended to fork the repository and use the fork instead, unless you have the authors' mobile numbers and can call them to promptly merge your PR.

Remember to change the repoURL in clusterloader-manifests/test-application-use-plugin.yaml after the fork.
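For example (placeholder account name):

repoURL: https://github.com/<your-account>/gitops-source.git # change this to your fork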

Push a commit to the main branch of the fork. The commit does not need to introduce changes to the to-be-delivered manifests. Even an empty commit can do the job.
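For example:

$ git commit --allow-empty -m "trigger one-shot sync"
$ git push origin main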

Check the dynamics of the sync status from the Argo CD UI.

Observe the Grafana dashboard.

It takes roughly 7 minutes to sync 2k Apps in one shot.

Create More Applications in Batches

This section uses code in https://github.com/edge-experiments/argocd-scalability and https://github.com/kubernetes/perf-tests.

In https://github.com/edge-experiments/argocd-scalability, there are tunable knobs (marked by ‘change this XXX for each run’ comment) in two files.

Tune knobs in clusterloader-manifests/config.yaml.

Change the number of Applications in one batch.

replicasPerNamespace: 2000 # change this number for each run

Edit ‘basename’ for Applications.

- basename: application-batch-1 # changes this basename for each run

Tune knobs in clusterloader-manifests/test-application-use-plugin.yaml.

Edit ‘batch’, which facilitates the management of Applications by label.

batch: “1” # changes this for each run

Change the size of an Application's workload, by specifying which registered Argo CD plugin to use.

name: wrap4kyst-scalability-heavily-loaded # changes this for each run

In https://github.com/kubernetes/perf-tests, use Cluster Loader 2 in the same way.

$ cd clusterloader2/

Run Cluster Loader 2, with the same command.

$ go run cmd/clusterloader.go \
--testconfig <path-to-the-argocd-scalability-repo>/clusterloader-manifests/config.yaml \
--provider local \
--kubeconfig ~/.kube/config_argocd \
--v=2 \
--enable-exec-service=false \
--delete-automanaged-namespaces=false

Change the Resync Period

This section uses code in https://github.com/edge-experiments/argocd-scalability.

The "resync period" is what we call the periodicity of application change monitoring. It is how frequently Argo CD checks the Git repo for changes to the applications. This period is controlled by Argo CD's timeout.reconciliation configuration parameter.
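For reference, this parameter lives in the argocd-cm ConfigMap. An equivalent inline patch (a sketch, assuming the repo's patch file sets this same key; 6 minutes expressed as 360s) would look like:

$ kubectl patch -n argocd configmap argocd-cm --type merge -p '{"data": {"timeout.reconciliation": "360s"}}'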

Set the resync period to 6 minutes.

$ kubectl patch -n argocd configmap argocd-cm --patch-file plugin-usage/argocd-cm-patch.yaml

configmap/argocd-cm patched

The application controller needs a restart to pick up the new resync period.

Restart the applicationset controller by scaling replicas to 0 and then back to 1:

$ kubectl -n argocd scale --replicas=0 deploy argocd-applicationset-controller

deployment.apps/argocd-applicationset-controller scaled
$ kubectl -n argocd scale --replicas=1 deploy argocd-applicationset-controller

deployment.apps/argocd-applicationset-controller scaled
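Note that in a default install the application controller itself runs as a StatefulSet (argocd-application-controller-0 in the pod listing above); if that is the component that needs to pick up the change in your setup, it can be restarted with a rollout restart:

$ kubectl -n argocd rollout restart statefulset argocd-application-controller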

Sync 10,000 Argo CD Applications

It took about 40 minutes to sync 10k Applications.

Closing Remarks

This post was intended to show you the details of how you might recreate the experiments with scaling Argo CD described in our previous article. The experiments were motivated by our work on application lifecycle management of edge computing. Once again, this blog is part of a series of posts from the KubeStellar community regarding challenges related to multi-cloud and edge. You can learn more about the challenges and read posts from other members of the KubeStellar community on edge and multi-cloud topics in a post by Andy Anderson entitled Navigating the Edge: Overcoming Multi-Cloud Challenges in Edge Computing.

Thanks for reading!

FYI

Andy Anderson and Jun Duan will be speaking on this general subject at GitOpsCon on May 9, 2023!

