Running your management plane with OKE, Verrazzano and Cluster API — Part 1

Ali Mukadam · Published in Verrazzano · 12 min read · Jul 26, 2023

Well, Verrazzano 1.6 is out and with it comes a number of exciting new features:

In this article, I want to talk about CAPI. If you’re new to CAPI on OCI, I suggest you take a quick detour:

As I’ve not written about CAPI in the context of Verrazzano previously, let’s remedy that immediately.

Verrazzano profiles

Before we start, I want to clarify a few terms and how things interact with each other. Let's start with Verrazzano profiles; there are four of them (a minimal example of selecting one follows the list):

  • dev: sets up all the things you need to be productive when developing your cloud native application. The dev profile is configured by default to use fewer resources and is easy to set up and tear down. That said, you can turn off the components you don't need and turn on those you do. The benefit is that it's not too costly, fast to set up and fast to destroy.
  • prod: the production profile has sane defaults and includes all the components typically expected in a production environment, e.g. Prometheus, OpenSearch, Grafana etc. Think of the prod profile as a standalone, completely isolated cluster which requires its own CI/CD, observability etc. You would deploy each cluster with the prod profile if, for compliance or regulatory purposes, you require completely isolated teams managing the different clusters.
  • prod: the other way to use the prod profile is when you'll be running multiple Kubernetes clusters; in that case the prod profile acts more like a management plane. Its main purpose is to run all the operational components. This doesn't necessarily preclude you from running your workloads here too, but you need to make a conscious decision about it. In a multi-cluster context, we call this an Admin cluster: it maintains a registry of clusters, helps you simplify your operations, provides a global view of all your connected clusters and allows the connected clusters to function as part of a global mesh. This makes it possible to deploy more complex topologies to meet performance, compliance and other security requirements.
Admin cluster
  • managed-cluster: deploying the prod profile with all the components, even if you can turn off those you don't need, feels heavy. In its stead, I present to you the managed-cluster profile, which focuses on running the application workload and offloads most of the operational components to the Admin cluster with which it registers itself. This allows your workload clusters to function either as isolated or connected clusters, improves the resilience of your application infrastructure and reduces the time needed to upgrade the workload clusters, since there are fewer components to upgrade.
Verrazzano managed cluster
  • none: runs almost nothing; you pick and choose the components you want to run. It's suitable for edge environments where compute, storage and memory are at a premium, or when you have existing components in a cluster that you had previously installed yourself, e.g. many customers are already using cert-manager or Keycloak and don't want to overwrite their existing instances. In other words, you know what you are doing.
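
To make the profile selection concrete, here is a minimal sketch of a Verrazzano custom resource that installs the managed-cluster profile and switches one component off. The component names and defaults vary by Verrazzano version, so treat the values below as assumptions to adapt rather than a definitive configuration:

apiVersion: install.verrazzano.io/v1beta1
kind: Verrazzano
metadata:
  name: managed
  namespace: default
spec:
  # one of: dev, prod, managed-cluster, none
  profile: managed-cluster
  components:
    # example of selectively disabling a component (hypothetical choice)
    console:
      enabled: false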

The CAPI components will run in the management plane. Effectively, the Admin cluster is equivalent to a CAPI management cluster and a managed cluster is equivalent to a workload cluster. As I write this, it occurs to me that it would be nice if we could align the terminologies, but that's a discussion for another day. So, for now, I'll use the respective terms interchangeably.

Provisioning an Admin cluster

We’ll use the Terraform module for Verrazzano to provision an OKE cluster which you’ll use as an Admin Verrazzano instance. I should mention we’ve improved the Terraform module quite significantly:

  • it uses the new 5.x branch of the OKE module. v5 of the OKE module has not been released yet, but I thought it would be a good exercise to start using 5.x to understand the issues, especially documentation ones, that users might face before the release.
  • documentation now uses GitHub Pages and is accessible at https://oracle-terraform-modules.github.io/terraform-oci-verrazzano/
  • editable commands before copying. When you follow the guide, say for a multi-cluster setup, you can change the default region (phoenix) to your own, e.g. london, on the docs site itself before copying to the clipboard and pasting.
  • selective enablement of components, which allows the installation to proceed much faster
  • support for new components e.g. Thanos, CAPI etc.
  • better error handling
  • connectivity of clusters in either a star or mesh architecture, depending on your workload
  • experimental cross-cluster Istio configuration

We have a lot of other things planned but if you have ideas or requirements, we are all ears.

Navigate to the single production cluster page and follow the instructions to create an OKE cluster first. Edit the terraform.tfvars to ensure that both Argo CD and Cluster API are enabled:

argocd      = true
cluster_api = true

When Verrazzano is ready, retrieve the “verrazzano” user password:

bash ~/bin/vz_access.sh

You can use the printed password to log in via SSO.

Configuring Cluster API to provision OKE

Before we can provision an OKE cluster via CAPI, we need to configure authentication so that CAPI is allowed to do so. First, generate the base64-encoded values so you can put them in a secret:

export OCI_TENANCY_ID=<insert-tenancy-id-here>
export OCI_USER_ID=<insert-user-ocid-here>
export OCI_REGION=<insert-region-key-here>
export OCI_CREDENTIALS_FINGERPRINT=<insert-fingerprint-here>

export OCI_TENANCY_ID_B64="$(echo -n "$OCI_TENANCY_ID" | base64 | tr -d '\n')"
export OCI_USER_ID_B64="$(echo -n "$OCI_USER_ID" | base64 | tr -d '\n')"
export OCI_REGION_B64="$(echo -n "$OCI_REGION" | base64 | tr -d '\n')"
export OCI_CREDENTIALS_FINGERPRINT_B64="$(echo -n "$OCI_CREDENTIALS_FINGERPRINT" | base64 | tr -d '\n')"
export OCI_CREDENTIALS_KEY_B64=$(base64 < <insert-path-to-api-private-key-file-here> | tr -d '\n')

We’ll first define the secret in capi-secret.yaml to hold these values:

apiVersion: v1
kind: Secret
metadata:
  name: capi-oke-credentials
  namespace: verrazzano-capi
type: Opaque
data:
  tenancy: ${OCI_TENANCY_ID_B64}
  user: ${OCI_USER_ID_B64}
  region: ${OCI_REGION_B64}
  key: ${OCI_CREDENTIALS_KEY_B64}
  fingerprint: ${OCI_CREDENTIALS_FINGERPRINT_B64}

We can then create the secret without hard coding the values:

envsubst < capi-secret.yaml | kubectl apply -f -
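
If you want to confirm the secret landed where CAPI expects it, a quick check (assuming your kubectl context points at the Admin cluster) looks like this:

kubectl get secret capi-oke-credentials -n verrazzano-capi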

Next, define an OCIClusterIdentity custom resource in capi-identity.yaml; it uses the credentials you just created:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: OCIClusterIdentity
metadata:
  name: oke-cluster-identity
  namespace: verrazzano-capi
spec:
  type: UserPrincipal
  principalSecret:
    name: capi-oke-credentials
    namespace: verrazzano-capi
  allowedNamespaces: {}

Since you have already created the secret, you can now create the Cluster Identity:

kubectl apply -f capi-identity.yaml
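
As before, a quick check that the identity exists (the resource name below matches the manifest above):

kubectl get ociclusteridentity oke-cluster-identity -n verrazzano-capi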

We are now ready to ask CAPI to create a workload cluster with OKE.

Creating a workload OKE cluster using Argo CD and Cluster API

The diagram below illustrates the sequence of steps that results in the creation of an OKE cluster.

Creating a cluster via Argo CD

First, let's define our cluster (a sketch of the manifests follows the notes below). Notice the following:

  • the cluster CR references the OCIClusterIdentity you created in the previous section.
  • you must provide a region identifier and a compartment id where you want the cluster to be created.
  • a default networking configuration is built in but, in case you have constraints in your environment, e.g. you are required to adhere to specific CIDR ranges, you can override it too.
  • you can include DRG (dynamic routing gateway) creation if you need to reach your OKE cluster via VPN from on-premises or a hub VCN, or to set up a multi-cluster configuration (as we'll do in the next post)
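
The full manifest isn't reproduced here, but the sketch below shows its general shape, assuming a cluster named vz-oke in the verrazzano-capi namespace (matching the objects used later in this article) and the ${OCI_COMPARTMENT_ID} and ${OCI_CLUSTER_REGION} placeholders we substitute with envsubst further down. Treat it as a starting point to adapt, not a definitive template:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: vz-oke
  namespace: verrazzano-capi
spec:
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: OCIManagedCluster
    name: vz-oke
    namespace: verrazzano-capi
  controlPlaneRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: OCIManagedControlPlane
    name: vz-oke
    namespace: verrazzano-capi
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: OCIManagedCluster
metadata:
  name: vz-oke
  namespace: verrazzano-capi
spec:
  compartmentId: "${OCI_COMPARTMENT_ID}"
  region: "${OCI_CLUSTER_REGION}"
  # the OCIClusterIdentity created in the previous section
  identityRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
    kind: OCIClusterIdentity
    name: oke-cluster-identity
    namespace: verrazzano-capi
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: OCIManagedControlPlane
metadata:
  name: vz-oke
  namespace: verrazzano-capi
spec:
  version: v1.25.4
  clusterPodNetworkOptions:
    - cniType: "FLANNEL_OVERLAY"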

We can now also define the node pools that will run our applications:
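
The node pool definitions pair a CAPI MachinePool with a CAPOCI OCIManagedMachinePool. The full objects appear again in the scaling section below, so here is just a compact sketch of the initial pool; the names, shape and sizes are assumptions consistent with the rest of this article:

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: mp1
  namespace: verrazzano-capi
spec:
  clusterName: vz-oke
  replicas: 3
  template:
    spec:
      bootstrap:
        dataSecretName: ""
      clusterName: vz-oke
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: OCIManagedMachinePool
        name: mp-1
      version: v1.25.4
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: OCIManagedMachinePool
metadata:
  name: mp-1
  namespace: verrazzano-capi
spec:
  version: v1.25.4
  nodeShape: VM.Standard.E4.Flex
  nodeShapeConfig:
    ocpus: "4"
    memoryInGBs: "64"
  nodeSourceViaImage:
    bootVolumeSizeInGBs: 150
  sshPublicKey: ""
  nodePoolNodeConfig:
    nodePoolPodNetworkOptionDetails:
      cniType: "FLANNEL_OVERLAY"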

Once defined, check the code into a GitHub repo.

export OCI_COMPARTMENT_ID="ocid1....."
export OCI_CLUSTER_REGION="ap-melbourne-1"
envsubst < cluster.yaml > workloadcluster.yaml

git push <repo>

Next, we'll delegate the task of creating the OKE cluster to Argo CD. Obtain the URL of your Argo CD instance from Verrazzano:

vz status

Verrazzano Status
  Name: admin
  Namespace: default
  Profile: prod
  Version: 1.6.1
  State: Ready
  Available Components: 16/16
  Access Endpoints:
    argoCDUrl: https://argocd.tr-admin.192.9.181.82.nip.io
    keyCloakUrl: https://keycloak.tr-admin.192.9.181.82.nip.io
    rancherUrl: https://rancher.tr-admin.192.9.181.82.nip.io

Obtain the verrazzano user password. On the operator host, we’ve added a utility script to retrieve this for you:

vz_access.sh
verrazzano user password: abc123def456

On Argo’s login page, click on login via Keycloak. You’ll be redirected to the SSO page.

Login using the verrazzano user and the retrieved password from above.

If your repo is private, navigate to Settings > Repositories, click on “CONNECT REPO” and follow the instructions. It's a good idea to create a GitHub token that you can use to connect Argo CD with your repo. After you connect, your repo connection status should be “Successful”. Point your application at your GitHub repo, configure the right branch and path, if any, and you should get an Argo CD application like the following:
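
If you prefer to define the application declaratively instead of through the UI, a sketch like the one below does the same thing. The repo URL, path and Argo CD namespace (argocd) are assumptions to replace with your own values:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vz-oke-cluster
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/<your-org>/<your-repo>.git  # assumption
    targetRevision: main
    path: clusters/vz-oke                                   # assumption
  destination:
    # the CAPI objects are applied to the Admin cluster itself
    server: https://kubernetes.default.svc
    namespace: verrazzano-capi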

Click on Sync to start the creation of the OKE cluster:

We can verify the OKE cluster is being created in OCI Console:

Finally, the cluster will be ready with the node pools too:
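
Once the cluster and node pools are ready, you can also pull the workload cluster's kubeconfig from the secret Cluster API creates on the Admin cluster. The <cluster-name>-kubeconfig naming is the standard CAPI convention; vz-oke is the cluster name assumed throughout this article:

kubectl get secret vz-oke-kubeconfig -n verrazzano-capi \
  -o jsonpath='{.data.value}' | base64 -d > vz-oke.kubeconfig

kubectl --kubeconfig vz-oke.kubeconfig get nodes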

Scaling the OKE workload cluster using Argo CD and Cluster API

Let’s say we want to scale the OKE workload cluster. There are 2 ways we can do that:

  1. by increasing the number of worker nodes in the existing node pool
  2. by adding an additional node pool.

Is it too much to ask for both? (Tony Stark dixit)

Scaling the OKE workload cluster via CAPI

To increase the size of the existing node pool, edit the node pool definition and increase the value of replicas:

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: mp1
  namespace: verrazzano-capi
spec:
  clusterName: vz-oke
  replicas: 5
  template:
    spec:
      bootstrap:
        dataSecretName: ""
      clusterName: vz-oke
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: OCIManagedMachinePool
        name: mp-1
      version: v1.25.4
---

To add a new node pool, make a copy of the MachinePool and OCIManagedMachinePool objects and edit them to change their names and, if you need to, their shapes:

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: mp2
  namespace: verrazzano-capi
spec:
  clusterName: vz-oke
  replicas: 3
  template:
    spec:
      bootstrap:
        dataSecretName: ""
      clusterName: vz-oke
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: OCIManagedMachinePool
        name: mp-2
      version: v1.25.4
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: OCIManagedMachinePool
metadata:
  name: mp-2
  namespace: verrazzano-capi
spec:
  nodePoolNodeConfig:
    nodePoolPodNetworkOptionDetails:
      cniType: "FLANNEL_OVERLAY"
  nodeMetadata:
    user_data: IyEvYmluL2Jhc2gKY3VybCAtLWZhaWwgLUggIkF1dGhvcml6YXRpb246IEJlYXJlciBPcmFjbGUiIC1MMCBodHRwOi8vMTY5LjI1NC4xNjkuMjU0L29wYy92Mi9pbnN0YW5jZS9tZXRhZGF0YS9va2VfaW5pdF9zY3JpcHQgfCBiYXNlNjQgLS1kZWNvZGUgPi92YXIvcnVuL29rZS1pbml0LnNoCnByb3ZpZGVyX2lkPSQoY3VybCAtLWZhaWwgLUggIkF1dGhvcml6YXRpb246IEJlYXJlciBPcmFjbGUiIC1MMCBodHRwOi8vMTY5LjI1NC4xNjkuMjU0L29wYy92Mi9pbnN0YW5jZS9pZCkKYmFzaCAvdmFyL3J1bi9va2UtaW5pdC5zaCAtLWt1YmVsZXQtZXh0cmEtYXJncyAiLS1wcm92aWRlci1pZD1vY2k6Ly8kcHJvdmlkZXJfaWQiCg==
  nodeShape: VM.Standard.E4.Flex
  nodeShapeConfig:
    ocpus: "4"
    memoryInGBs: "64"
  nodeSourceViaImage:
    bootVolumeSizeInGBs: 150
  sshPublicKey: ""
  version: v1.25.4
---

Once done, check the code into GitHub and trigger a sync in Argo CD.
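
You can watch the scaling progress from the Admin cluster; the MachinePool is the CAPI resource we edited, so its replica and phase columns reflect the change:

kubectl get machinepools -n verrazzano-capi -w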

Upgrading the OKE cluster control plane using Argo CD and Cluster API

Edit the OCIManagedControlPlane object, change the version to a more recent Kubernetes version available in OKE, check it into GitHub and let Argo CD and CAPI do the job for you:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: OCIManagedControlPlane
metadata:
  name: vz-oke
  namespace: verrazzano-capi
spec:
  version: v1.26.2
  clusterPodNetworkOptions:
    - cniType: "FLANNEL_OVERLAY"

The new version you specify must be one that is available in OKE.
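
If you're not sure which versions OKE currently offers, the OCI CLI can list them for you (assuming the CLI is configured for your tenancy); the kubernetes-versions field of the output is the list you want:

oci ce cluster-options get --cluster-option-id all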

Upgrading the OKE cluster worker nodes

At this point, you might be asking why not upgrade both the control plane and the workers at the same time. That's because it's not OKE's upgrade procedure: OKE upgrades the control plane first, followed by the worker nodes. This allows your applications to continue running smoothly while the control plane is being upgraded. The other reason is that it gives you a choice of upgrade methods:

  • in-place upgrade: keep the existing worker nodes but upgrade them to match the version of the OKE control plane after the latter has been upgraded.
  • out-of-place upgrade: involves provisioning new worker nodes first and waiting for them to be ready. Worker nodes provisioned after the control plane upgrade automatically use its new version. You can then cordon the worker nodes that are still on the previous version, drain their pods and finally delete the older worker nodes.

To upgrade the Kubernetes version of the worker nodes using the in-place method, change the Kubernetes version in the MachinePool and OCIManagedMachinePool objects:

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: mp1
  namespace: verrazzano-capi
spec:
  .
  .
        name: mp-1
      version: v1.26.2
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: OCIManagedMachinePool
metadata:
  name: mp-1
  namespace: verrazzano-capi
spec:
  .
  .
  .
  version: v1.26.2
---

The new version you specify must be one that is available in OKE. As before, check the code into GitHub, sync and let Argo CD and Cluster API handle the upgrade for you.

There are benefits and drawbacks to both and, as with a lot of things in life, you cannot have your cake and eat it too. So, let's take a look at them and when you should consider each.

In-place vs out-of-place upgrades

I use the words “benefits” and “drawbacks” here more for language purposes than to describe inherent weaknesses in the respective methods themselves. Strictly speaking, I don’t think of them as benefits and drawbacks but more like tradeoffs that you need to make depending on your situation.

In-place upgrade is a good approach to consider if you are very, very conscious about costs, even at the expense of other criteria. It saves you the cost and trouble of duplicating, however temporarily, your worker nodes or part of them. It can also be a consideration if you're running a large OKE cluster whose size will not let you duplicate the worker nodes for the purpose of an upgrade because of service limits (if you have that kind of issue, come and talk to us and we would love to help).

In other situations, you might still be running a cluster whose size allows you to double the number of worker nodes, but the shapes of those nodes are relatively more expensive than standard VMs, e.g. large bare metal shapes, GPUs etc., and doubling these, even for a short period, is prohibitive. In such situations, it works to your advantage that you can upgrade your cluster using the in-place method.

However, there is always the risk that you decided to upgrade at full moon, or on Friday the 13th after breaking a mirror the morning of upgrade day, and the Kubernetes upgrade on some nodes fails. While this does not affect the OKE cluster itself, it can have an impact on your application.

In contrast, the out-of-place method is considerably less risky:

  1. Provision new node pools and wait for them to be ready
  2. Cordon the older pools and drain the pods from them (see the command sketch after this list)
  3. Let Kubernetes re-schedule the pods on newer pools
  4. Delete the old node pools
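
On the workload cluster, steps 2 to 4 boil down to standard kubectl operations. A sketch, assuming the old pool's nodes carry a label such as pool=mp-1 (the label is an assumption; you could equally select on the node pool id label OKE applies to its nodes):

# list the nodes that are still on the old node pool
kubectl get nodes -l pool=mp-1

# cordon and drain each of them; kubectl drain cordons the node first,
# then evicts the pods so Kubernetes reschedules them on the new pool
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# once everything runs on the new pool, remove the old MachinePool and
# OCIManagedMachinePool objects from Git and sync in Argo CD to delete the pool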

You would typically wait until all your new node pools are ready before starting to drain the pods. However, there are variations to this that you can consider, though they require a bit of planning and discipline.

I've written about using specialized node pools before: what you could do is run different parts of your application on different node pools by using labels and nodeSelectors. Doing so allows you to move your pods to newer nodes in a phased approach. This has the further benefit of keeping costs down while ensuring the infrastructure upgrade is as minimally disruptive as possible.
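
As a reminder of what that looks like, a deployment pinned to a specific pool might use something like the following; the pool label and its value are assumptions that depend on how you label your node pools:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders            # hypothetical application component
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
    spec:
      # schedule this part of the application only on the newer node pool
      nodeSelector:
        pool: mp-2        # assumption: a label applied to the new pool's nodes
      containers:
        - name: orders
          image: ghcr.io/example/orders:1.0   # placeholder image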

Besides not being vulnerable to failed Kubernetes upgrades on the worker nodes, you would also consider the out-of-place method if dependencies exist in your application and these are reflected in the order in which you have to deploy your pods. In this way, you can plan your upgrade and pod draining to happen in a deterministic manner before ultimately retiring the nodes on the older version.

Summary

Verrazzano 1.6 comes with a number of new features, among them better integration with OKE using Cluster API. Combining Cluster API and Argo CD allows us to use GitOps principles to manage OKE clusters: we can create, scale, upgrade and ultimately retire them.

Before concluding this article on how to use Verrazzano, Argo CD and Cluster API to provision OKE clusters, I would like to thank my colleagues Shyam Radhakrishnan and Abhishek Mitra for their considerable help while writing it. Stay tuned for the second part.
