Syncing Objects from one Kubernetes cluster to another Kubernetes cluster

Takumi Yanagawa
11 min read · May 1, 2023


About the KubeStellar Syncer developed in the KubeStellar project

By Takumi Yanagawa, Yuji Watanabe

Introduction

Edge computing is becoming an increasingly popular way for organizations to deploy and manage their IT infrastructure. As the number of edge gateways and edge compute nodes grows, managing them becomes more complex due to the following restrictions unique to the edge (see the blog post by Andy Anderson):

  1. Disconnected Operations
  2. Constrained Edge Clusters
  3. Scalability with Large Numbers of Small Edge Clusters
  4. Interoperability
  5. Complexity of Deployments
  6. Consistent Deployment Interfaces

In the KubeStellar project, we are tackling the challenge of efficiently managing a large number of edge gateways and edge compute nodes while addressing these restrictions. In particular, we focus on edge clusters that run Kubernetes, and we call this kind of edge an “edge location” in this blog.

KubeStellar project

The KubeStellar project consists of Edge-MultiCluster (Edge-MC), the platform code and module sets that support managing a large number of edge locations. In Edge-MC, we have adopted Kubernetes-based technologies because the Kubernetes API machinery is a good foundation for building reliable distributed systems. To manage the edge locations, we use Kubernetes-like Control Plane (KCP), a pure Kubernetes control plane that has no container runtime or deployment controller.

KCP can host multiple Kubernetes control planes, and each control plane is called a “workspace”. First, we would like to show a rough schematic diagram in Figure 1; the full picture comes in a later section. Users define workloads, each of which is a collection of Kubernetes API objects, and create additional API objects that describe which workloads go where. There is a workspace for each edge cluster, and the workloads are projected into the relevant workspaces. The workload in a workspace is synced to the associated edge location by the KubeStellar Syncer. The KubeStellar Syncer also syncs Kubernetes objects generated in the edge location back to the workspace. The Edge-MC service provides a mechanism for users to collect these synced-back objects.

Figure 1. Schematic diagram of Edge-MC. A workload is a set of Kubernetes objects such as deployments and configmaps. A result is a set of Kubernetes objects created in the edge location.

KubeStellar Syncer

To sync Kubernetes objects between a workspace and an edge location, we developed a new type of syncing agent, the “KubeStellar Syncer”. Since the edge location is a Kubernetes cluster and the workspace presents the same interface as one, the KubeStellar Syncer can be generalized: it is not specific to KCP and can be used to sync Kubernetes objects between any two Kubernetes clusters.

Figure 2 shows the operations required of the KubeStellar Syncer: downsync workloads such as deployments and configmaps from the workspace to the downstream edge location, upsync objects generated in the edge location to the workspace, and update workload statuses in the workspace.

Figure 2. Required operations in KubeStellar Syncer

How to use KubeStellar Syncer

Let’s briefly introduce the KubeStellar Syncer we developed. It runs as a regular Kubernetes Deployment in the edge location. The minimum required parameters are the connection information (kubeconfig) for both Kubernetes clusters. We developed a command line tool, available at https://github.com/yana1205/kcp, that generates the manifest YAML to bootstrap the KubeStellar Syncer.

The command line tool creates the identity and authorization in the workspace you want to sync with, and produces YAML manifests to install the KubeStellar Syncer on the downstream Kubernetes cluster. The tool runs with a kubeconfig that can manipulate the workspace, as follows.

KUBECONFIG=workspace.kubeconfig kubectl kcp workload edge-sync cluster1 --syncer-image quay.io/kcpedge/syncer:dev-2023-04-18 -o edge-syncer.yaml

KubeStellar Syncer is installed by applying the generated manifests (edge-syncer.yaml) to the downstream Kubernetes cluster.

KUBECONFIG=edge-location.kubeconfig kubectl apply -f edge-syncer.yaml

The KubeStellar Syncer watches the SyncerConfig object in the workspace, which describes which resources are downsynced and which resources are upsynced. Here is a sample SyncerConfig:

apiVersion: edge.kcp.io/v1alpha1
kind: SyncerConfig
metadata:
  name: syncer-config
spec:
  namespaceScope:
    namespaces:
    - ns1
    resources:
    - apiVersion: v1
      group: ""
      resource: configmaps
    - apiVersion: v1
      group: apps
      resource: deployments
  clusterScope:
  - apiVersion: v1
    group: apiextensions.k8s.io
    resource: customresourcedefinitions
    objects:
    - samples.my.domain
  upsync:
  - apiGroup: my.domain
    resources:
    - samples
    namespaces:
    - ns1

KubeStellar Syncer interprets it as follows:

  • Downsync ConfigMaps and Deployments in namespace ns1, and the CRD named samples.my.domain, from upstream to downstream
  • Upsync Samples (samples.my.domain) in namespace ns1 from downstream to upstream

Once this SyncerConfig is deployed to the upstream cluster, these resource objects will be synced.
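For instance, assuming the SyncerConfig above is saved to a file named syncer-config.yaml (the file name here is just an example), it can be applied to the upstream workspace with the same kubeconfig used in the next step:

KUBECONFIG=upstream.kubeconfig kubectl apply -f syncer-config.yaml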

For example, let’s try to deploy this workload manifest to the upstream cluster.

KUBECONFIG=upstream.kubeconfig kubectl apply -f - << EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ns1
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-configmap
  namespace: ns1
data:
  id: abc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-deploy
  namespace: ns1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
      - name: busybox
        image: busybox
        args: ["tail", "-f", "/dev/null"]
      terminationGracePeriodSeconds: 3
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: samples.my.domain
spec:
  group: my.domain
  names:
    kind: Sample
    listKind: SampleList
    plural: samples
    singular: sample
  scope: Namespaced
  versions:
  - name: v1alpha1
    schema:
      openAPIV3Schema:
        properties:
          apiVersion:
            type: string
          kind:
            type: string
          metadata:
            type: object
          spec:
            properties:
              foo:
                type: string
            type: object
        type: object
    served: true
    storage: true
---
EOF

Let’s check that they have been downsynced to the downstream cluster.

$ KUBECONFIG=downstream.kubeconfig kubectl get cm,deploy,crd,sample -n ns1

NAME                         DATA   AGE
configmap/kube-root-ca.crt   1      1s
configmap/sample-configmap   1      1s

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/sample-deploy   0/1     1            0           1s

NAME                                                              CREATED AT
customresourcedefinition.apiextensions.k8s.io/samples.my.domain   2023-04-21T06:55:22Z

Next, let’s try out the upsync feature by deploying a Sample CR to the downstream cluster.

KUBECONFIG=downstream.kubeconfig kubectl apply -f - << EOL
apiVersion: my.domain/v1alpha1
kind: Sample
metadata:
  name: sample
  namespace: ns1
spec:
  foo: bar
EOL

The sample CR will be upsynced to upstream.

$ KUBECONFIG=upstream.kubeconfig kubectl get sample -n ns1
NAME                      AGE
sample.my.domain/sample   50s

KubeStellar Syncer in Edge-MC

So far we have talked about one-to-one sync. This can be extended to one-to-many sync by the Edge-MC platform, an infrastructure for managing a large number of edge locations. Using the Edge-MC platform, you can easily distribute workloads to multiple clusters and collect the results from them.

In Edge-MC, an EdgePlacement object is used instead of the SyncerConfig created in the previous section. An EdgePlacement describes which Kubernetes resources are downsynced and upsynced, and which edge locations they should go to. Once a user deploys workloads together with an EdgePlacement object to a workspace, Edge-MC automatically distributes the workloads to multiple edge locations.
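As a rough, minimal sketch of what such an object looks like (the field names follow the full example shown later in this post; the object name and label values here are hypothetical):

apiVersion: edge.kcp.io/v1alpha1
kind: EdgePlacement
metadata:
  name: sample-placement
spec:
  locationSelectors:             # which edge locations receive the workload
  - matchLabels: {"env": "prod"}
  namespaceSelector:             # which namespaces are downsynced
    matchLabels: {"name": "ns1"}
  upsync:                        # which objects are synced back from the edge locations
  - apiGroup: my.domain
    resources:
    - samples
    namespaces:
    - ns1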

We’ll briefly introduce what Edge-MC does behind the scenes. Figure 3 is a detailed diagram of how Edge-MC distributes workloads to each workspace with the KubeStellar Syncer. In Edge-MC, there are two major kinds of workspaces: the workload management workspace and the mailbox workspace. Users deploy workloads and EdgePlacement objects to a workload management workspace. Edge-MC maintains SyncerConfig objects derived from the EdgePlacement objects and the workload objects, and finally distributes the selected workloads, together with the SyncerConfig, to the workspace that the KubeStellar Syncer keeps synced with an edge location. That workspace is the mailbox workspace.

Figure 3. Detailed diagram of how Edge-MC distributes workloads to each workspace

Edge-MC use case: Automating compliance checks on multiple clusters

Leveraging Edge-MC, you can automate compliance checks across multiple clusters. In the Kubernetes world, there are tools like Kyverno, OPA Gatekeeper, and Compliance Operator. The basic usage of these tools is to install them with Helm and deploy their own Kubernetes resource objects, called policies, in the cluster. The tool then scans and continually audits the Kubernetes cluster and generates reports. Isn’t Edge-MC a good fit for this use case? Here is a short introduction.

Now let’s assume that we have two edge locations, both of which have the KubeStellar Syncer installed and are associated with mailbox workspaces, following the Edge-MC tutorial. In the shell commands in all the following steps, it is assumed that kcp is running and $KUBECONFIG is set to the .kcp/admin.kubeconfig that kcp produces, except where it is explicitly noted that a physical cluster is being accessed. We want to install Kyverno on both edge locations and deploy Kyverno policies to both. Figure 4 shows this setup.

Figure 4. Enabling Kyverno and deploying Kyverno policies on multiple edge locations with Edge-MC

First, we deploy the Kyverno install manifests and Kyverno policies as a workload to the workload management workspace. The Kyverno install manifests are deployed with Helm, so we use it as follows.
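If the Kyverno chart repository has not been added to your Helm client yet, you may first need to register it (this is the standard public Kyverno chart repository; adjust if you mirror charts elsewhere):

helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update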

$ kubectl ws root:wmw
Current workspace is "root:wmw".
$ helm install kyverno --set replicaCount=1 --namespace kyverno --create-namespace kyverno/kyverno
NAME: kyverno
LAST DEPLOYED: Fri Apr 21 17:11:12 2023
NAMESPACE: kyverno
STATUS: deployed
REVISION: 1
NOTES:
Chart version: 2.6.5
Kyverno version: v1.8.5

You can check that the Kubernetes objects have been deployed to the workspace.

$ kubectl get ns,deploy,crd -n kyverno
NAME                STATUS   AGE
namespace/default   Active   55s
namespace/kyverno   Active   28s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kyverno   0/1     0            0           28s

NAME                                                                                 CREATED AT
customresourcedefinition.apiextensions.k8s.io/admissionreports.kyverno.io            2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/backgroundscanreports.kyverno.io       2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/clusteradmissionreports.kyverno.io     2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/clusterbackgroundscanreports.kyverno.io   2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/clusterpolicies.kyverno.io             2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/clusterpolicyreports.wgpolicyk8s.io    2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/generaterequests.kyverno.io            2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/policies.kyverno.io                    2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/policyreports.wgpolicyk8s.io           2023-04-24T09:19:07Z
customresourcedefinition.apiextensions.k8s.io/updaterequests.kyverno.io              2023-04-24T09:19:07Z

Second, we have to create an EdgePlacement so that these resources get delivered to the edge locations. All the resources deployed by Helm can be obtained with the “--dry-run” option of the Helm command. We use a Go program to extract the resource lists and create an EdgePlacement from the “--dry-run” output, as follows.

git clone -b helmconverter https://github.com/yana1205/edge-mc.git
cd edge-mc
helm template kyverno --set replicaCount=1 --namespace kyverno --create-namespace kyverno/kyverno --dry-run --debug > /tmp/kyverno-helm-install.yaml
go run cmd/syncer/helm/converter.go --path-to-helm-template /tmp/kyverno-helm-install.yaml
apiVersion: edge.kcp.io/v1alpha1
kind: EdgePlacement
metadata:
  name: edge-placement
spec:
  locationSelectors:
  - matchLabels: {"env":"prod"}
  namespaceSelector:
    matchLabels: {"name":"kyverno"}
  nonNamespacedObjects:
  - apiGroup: apiextensions.k8s.io
    resources:
    - customresourcedefinitions
    resourceNames:
    - admissionreports.kyverno.io
    - backgroundscanreports.kyverno.io
    - clusteradmissionreports.kyverno.io
    - clusterbackgroundscanreports.kyverno.io
    - clusterpolicies.kyverno.io
    - clusterpolicyreports.wgpolicyk8s.io
    - generaterequests.kyverno.io
    - policies.kyverno.io
    - policyreports.wgpolicyk8s.io
    - updaterequests.kyverno.io
  - apiGroup: rbac.authorization.k8s.io
    resources:
    - clusterroles
    resourceNames:
    - kyverno:admin-policies
    - kyverno:admin-policyreport
    - kyverno:admin-reports
    - kyverno:admin-generaterequest
    - kyverno:admin-updaterequest
    - kyverno
    - kyverno:userinfo
    - kyverno:policies
    - kyverno:view
    - kyverno:generate
    - kyverno:events
    - kyverno:webhook
  - apiGroup: rbac.authorization.k8s.io
    resources:
    - clusterrolebindings
    resourceNames:
    - kyverno
  - apiGroup: kyverno.io
    resources:
    - clusterpolicies
    resourceNames:
    - "*"
  - apiGroup: apis.kcp.io
    resources:
    - apibindings
    resourceNames:
    - "bind-kube"
  upsync:
  - apiGroup: wgpolicyk8s.io
    resources:
    - policyreports
    - clusterpolicyreports
    namespaces:
    - "*"
    names:
    - "*"

This covers all the resources for Kyverno. The detailed spec is described in the EdgePlacement schema.

Once it is deployed to the workload management workspace, the workloads start to be distributed.
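For example, the converter output can be saved to a file and applied while the current workspace is root:wmw (the output path here is just an example):

go run cmd/syncer/helm/converter.go --path-to-helm-template /tmp/kyverno-helm-install.yaml > /tmp/edge-placement.yaml
kubectl apply -f /tmp/edge-placement.yaml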

Let’s check whether Kyverno is installed in each edge location.

$ KUBECONFIG=~/.kube/config kubectl get --context kind-florin pod -n kyverno
NAME                       READY   STATUS    RESTARTS   AGE
kyverno-7c444878f7-zlzdp   1/1     Running   0          93s
$ KUBECONFIG=~/.kube/config kubectl get --context kind-guilder pod -n kyverno
NAME                       READY   STATUS    RESTARTS   AGE
kyverno-7c444878f7-rlsp8   1/1     Running   0          2m18s

Now Kyverno is enabled in each edge location. Let’s deploy a Kyverno policy to the workload management workspace so that it is applied to each edge location.

kubectl apply -f - << EOL
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: sample-cluster-policy
spec:
  background: true
  validationFailureAction: enforce
  rules:
  - name: sample-cluster-check-for-labels-in-configmap
    match:
      any:
      - resources:
          kinds:
          - ConfigMap
    validate:
      message: "label 'app.kubernetes.io/name' is required"
      pattern:
        metadata:
          labels:
            app.kubernetes.io/name: "?*"
EOL

Let’s check if the policy is deployed to edge locations.

$ KUBECONFIG=~/.kube/config kubectl get --context kind-florin clusterpolicies 
NAME                    BACKGROUND   VALIDATE ACTION   READY
sample-cluster-policy   true         enforce           true
$ KUBECONFIG=~/.kube/config kubectl get --context kind-guilder clusterpolicies
NAME                    BACKGROUND   VALIDATE ACTION   READY
sample-cluster-policy   true         enforce           true

It’s done. Let’s see the reports generated by Kyverno too.

$ KUBECONFIG=~/.kube/config kubectl get --context kind-florin policyreports -A
NAMESPACE                         NAME                         PASS   FAIL   WARN   ERROR   SKIP   AGE
default                           cpol-sample-cluster-policy   0      1      0      0       0      3m16s
kcp-edge-syncer-florin-1zcapm34   cpol-sample-cluster-policy   0      1      0      0       0      3m16s
kube-node-lease                   cpol-sample-cluster-policy   0      1      0      0       0      3m16s
kube-public                       cpol-sample-cluster-policy   0      2      0      0       0      3m16s
kube-system                       cpol-sample-cluster-policy   0      6      0      0       0      3m16s
kyverno                           cpol-sample-cluster-policy   2      1      0      0       0      3m16s
local-path-storage                cpol-sample-cluster-policy   0      2      0      0       0      3m16s
$ KUBECONFIG=~/.kube/config kubectl get --context kind-guilder policyreports -A
NAMESPACE                          NAME                         PASS   FAIL   WARN   ERROR   SKIP   AGE
default                            cpol-sample-cluster-policy   0      1      0      0       0      3m36s
kcp-edge-syncer-guilder-1yenui3z   cpol-sample-cluster-policy   0      1      0      0       0      3m36s
kube-node-lease                    cpol-sample-cluster-policy   0      1      0      0       0      3m36s
kube-public                        cpol-sample-cluster-policy   0      2      0      0       0      3m36s
kube-system                        cpol-sample-cluster-policy   0      6      0      0       0      3m36s
kyverno                            cpol-sample-cluster-policy   2      1      0      0       0      3m36s
local-path-storage                 cpol-sample-cluster-policy   0      2      0      0       0      3m36s

Let’s check if these reports are upsynced to mailbox workspaces. This needs several steps.

1. Find the mailbox workspaces

$ kubectl ws root:espw
Current workspace is "root:espw".
$ kubectl get Workspace -o "custom-columns=NAME:.metadata.name,SYNCTARGET:.metadata.annotations['edge\.kcp\.io/sync-target-name']"
NAME                                           SYNCTARGET
root-mb-89b4f0d7-b6fc-4ff1-89c2-2ee59d790df6   florin
root-mb-9ad9315f-4975-4f80-a261-19b49acef2e0   guilder

2. Go to either workspace

$ kubectl ws root-mb-89b4f0d7-b6fc-4ff1-89c2-2ee59d790df6
Current workspace is "root:espw:root-mb-89b4f0d7-b6fc-4ff1-89c2-2ee59d790df6" (type root:universal).

3. View policy reports

$ kubectl get policyreports -A
NAMESPACE                         NAME                         PASS   FAIL   WARN   ERROR   SKIP   AGE
default                           cpol-sample-cluster-policy   0      1      0      0       0      5m10s
kcp-edge-syncer-florin-1zcapm34   cpol-sample-cluster-policy   0      1      0      0       0      5m10s
kube-node-lease                   cpol-sample-cluster-policy   0      1      0      0       0      5m10s
kube-public                       cpol-sample-cluster-policy   0      2      0      0       0      5m10s
kube-system                       cpol-sample-cluster-policy   0      6      0      0       0      5m10s
kyverno                           cpol-sample-cluster-policy   2      1      0      0       0      5m10s
local-path-storage                cpol-sample-cluster-policy   0      2      0      0       0      5m10s

4. They are there. Let’s check the other mailbox too.

$ kubectl ws root:espw
Current workspace is "root:espw".
$ kubectl ws root-mb-9ad9315f-4975-4f80-a261-19b49acef2e0
Current workspace is "root:espw:root-mb-9ad9315f-4975-4f80-a261-19b49acef2e0" (type root:universal).
$ kubectl get policyreports -A
NAMESPACE                          NAME                         PASS   FAIL   WARN   ERROR   SKIP   AGE
default                            cpol-sample-cluster-policy   0      1      0      0       0      5m30s
kcp-edge-syncer-guilder-1yenui3z   cpol-sample-cluster-policy   0      1      0      0       0      5m30s
kube-node-lease                    cpol-sample-cluster-policy   0      1      0      0       0      5m30s
kube-public                        cpol-sample-cluster-policy   0      2      0      0       0      5m30s
kube-system                        cpol-sample-cluster-policy   0      6      0      0       0      5m30s
kyverno                            cpol-sample-cluster-policy   2      1      0      0       0      5m30s
local-path-storage                 cpol-sample-cluster-policy   0      2      0      0       0      5m30s

Perfect!

This is one of the Compliance-To-Policy project activities, in which the processes above are automated by the C2P operator. If you are interested, feel free to contact me or check out Compliance-To-Policy. We have also prepared a video of an end-to-end demonstration, and I highly encourage you to check it out!

Continuous compliance checking by Kyverno seamlessly spanning multi-clusters with KubeStellar

Conclusion

In this blog post, we discussed what the KubeStellar Syncer is and how it is used behind the scenes of the Edge-MC platform to manage a large number of edge locations.

The KubeStellar Syncer is under active development, and we will continue to share our activities and results in future blog posts.

This blog is part of a series of posts from the KubeStellar community regarding challenges related to multi-cloud and edge. You can learn more about the challenges and read posts from other members of the KubeStellar community on edge and multi-cloud topics: Navigating the Edge: Overcoming Multi-Cloud Challenges in Edge Computing with KubeStellar (by Andy Anderson); Seven Ways to Stub Your Toes on The Edge. (by Mike Spreitzer); Toward Building a Kubernetes Control Plane for the Edge (by Paolo Dettori). You can also join our community by attending our bi-weekly meetings.

Takumi Yanagawa

Software Developer at IBM Research - Tokyo. Views are my own.