The Cloud and Kube are strong with OCI

Ali Mukadam
Published in Oracle Developers
Jun 24, 2022 · 7 min read

Oracle’s Q4 financial results are out and cloud revenues are up. A whole set of new products has recently been released, notably the smaller OCI Dedicated Regions at a lower price point. Yeah, we’ve been busy. If you missed the announcement, you can watch it on demand.

Amid all this excitement, it’s easy to lose track of the recent cloud native developments in OCI. Fear not, I’ve got you covered.

Support for using Oracle Linux 8 for worker nodes

We kicked off at the beginning of the year by adding the ability to use Oracle Linux 8 when creating node pools for clusters running Kubernetes 1.20 and above. Oracle Linux 8 supports FIPS (Federal Information Processing Standards), a set of standards and guidelines for federal computer systems. Generally though, Oracle Linux 8 is leaner and meaner, which means it boots a tad faster and therefore your node pools become available more quickly.

Support for custom cloud-init scripts for worker nodes

We then added the ability to customize cloud-init scripts in your node pools. Until recently, you could use cloud-init with normal compute instances, but you couldn’t customize the underlying cloud-init scripts for the worker nodes. They were there, but we didn’t expose them to you. Now we do, and you can customize the node pools to suit your needs.

You can use the cloud-init scripts to configure additional kubelet arguments, configure your SELinux policy, or install your organization’s mandated antivirus and other security tools. Importantly, you can also have node pool-specific scripts. In the terraform-oci-oke module, we include a default script common to all node pools. It lets you set your preferred time zone and, if you set a larger boot volume in your node pool parameters, it will automatically configure that too. If you’ve got a large node pool, sshing to every node and running oci-growfs to expand the boot volume to its configured size was not fun. Instead, we use cloud-init to do the hard work for you.

You can also override the default common script:

cloudinit_nodepool_common = "/tmp/commoncloudinit.sh"

Or set a specific script per nodepool:

cloudinit_nodepool = {
  np1 = "/tmp/np1cloudinit.sh"
  np3 = "/tmp/np3cloudinit.sh"
}
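
To give you an idea of what such a script could contain, here is a minimal sketch written as a #cloud-config document. The time zone, and the assumption that oci-growfs lives at its usual /usr/libexec location on Oracle Linux, are illustrative only; the actual default script shipped with the module also takes care of the OKE bootstrap steps, which are omitted here.

#cloud-config
# illustrative only: set a preferred time zone on every worker node
timezone: Australia/Melbourne
runcmd:
  # expand the filesystem to the full size of the configured boot volume
  - /usr/libexec/oci-growfs -y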

Storage enhancements

We then followed up with a couple of storage enhancements for OKE:

  • Adding encryption of worker nodes’ boot volumes and Kubernetes Persistent Volumes (PVs) backed by the OCI Block Storage service. This allows customers to meet their compliance and security standards by using encryption with a key stored in OCI Vault.
  • Adding official support for Persistent Volumes (PVs) backed by the OCI File Storage service (FSS). This has been a long time coming. We’ve had the blog post by Prasad that’s been serving us for quite a while, but it required the manual creation of a Storage Class for FSS. The new mechanism uses CSI, so defining a new storage class is not required; all you need to do is specify the driver. Additionally, you can also use it to encrypt the data in transit and at rest with FSS.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fss-pv
spec:
  capacity:
    storage: 50Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: fss.csi.oraclecloud.com
    volumeHandle: ocid1.filesystem.oc1.iad.aaaa______j2xw:10.0.0.6:/FileSystem1
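
To consume this file system from your pods, you then bind a PVC to the PV above. Here’s a minimal sketch, assuming the fss-pv volume has already been created; the empty storageClassName and the volumeName are what tie the claim to that specific volume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fss-pvc
spec:
  accessModes:
    - ReadWriteMany
  # empty storage class so the claim binds to the pre-created PV instead of provisioning a new one
  storageClassName: ""
  resources:
    requests:
      storage: 50Gi
  # bind explicitly to the PV defined above
  volumeName: fss-pv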

Tagging of OKE related resources

When you use OKE, there is a large number of related OCI resources that you use and create: compute instances, load balancers, block volumes, node pools, etc. It’s very hard to keep track of all of this, never mind understanding how much it’s costing you. We’ve now added tagging support to OKE, which means you can tag the resources used and created by your clusters. You can then use these tags in OCI Usage Reports, Cost Analysis, and Budgets.

Support for Capacity Reservations

We’ve also added Capacity Reservations support to OKE. This allows you to reserve VM or bare metal capacity to ensure that these resources are available when you need them.

Capacity reservations can be added on a per-node pool basis, so you can be selective about the shapes you reserve, especially if you are running mixed workloads in your cluster.

Security features to secure your workloads

My colleague Greg also published 9 security features you can readily implement to improve the security posture of your OKE workloads.

Support for Expansion of Block Storage-backed PVs

You can now also expand the size of PVs backed by the OCI Block Storage service. Best of all, you can do so without downtime. As an example, suppose you created a PVC requesting 100Gi:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: oci-bv
  resources:
    requests:
      storage: 100Gi
  volumeName: pvc-bv1

And now suppose you want to double its size. You can just change the manifest and apply it again:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: oci-bv
  resources:
    requests:
      storage: 200Gi
  volumeName: pvc-bv1

Support for Network Load Balancer

OCI introduced the flexible Network Load Balancer (NLB) last year, a non-proxy load balancer that performs pass-through load balancing of Layer 3 and Layer 4 (TCP, UDP, ICMP) traffic. With NLB support, you can expose services on OKE that require UDP, such as VoIP, real-time video streaming, online gaming and IoT, among others.
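
As an illustration, here’s a minimal sketch of a Service manifest for a hypothetical UDP game server. The oci.oraclecloud.com/load-balancer-type annotation is what asks OKE to provision a Network Load Balancer instead of the default OCI Load Balancer; the selector and port are assumptions you would adapt to your own deployment:

apiVersion: v1
kind: Service
metadata:
  name: game-server
  annotations:
    # ask OKE for a Network Load Balancer instead of the default OCI Load Balancer
    oci.oraclecloud.com/load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  # hypothetical UDP workload; adjust the selector and port to your deployment
  selector:
    app: game-server
  ports:
    - name: udp-game
      protocol: UDP
      port: 7777
      targetPort: 7777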

Cluster API for OCI a.k.a CAPOCI

We also open sourced CAPOCI, a Cluster API implementation for OCI. Cluster API is a Kubernetes sub-project that provides APIs and tooling to simplify the provisioning, upgrading and management of multiple Kubernetes clusters using different infrastructure providers. By using a common API, Cluster API makes the experience of managing the lifecycle of Kubernetes clusters consistent across many infrastructure providers.

Cluster API is really cool. Until now, if you wanted to create a new self-managed Kubernetes cluster, you would have to use Terraform, the shiny new Pulumi OCI provider, or some bash scripting. With Cluster API, you run CAPOCI in a management cluster, which can be any conformant Kubernetes cluster. Using the management cluster, you can then provision a self-managed Kubernetes cluster without having to manually create compute instances, write scripts, etc. You can run the management cluster on OKE to create self-managed clusters or, if you have more complex scenarios e.g. hybrid/multi-cloud, create Kubernetes clusters on other infrastructure providers as well. The project is currently being developed and maintained on GitHub/oracle but we hope to transfer it to the kubernetes-sigs organization soon.
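
To give a flavour of the declarative experience, here is a heavily trimmed sketch of the kind of manifests you would apply to the management cluster. The cluster name and compartment OCID are placeholders, and a real workload cluster also needs control plane and machine template resources, which are omitted here:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: demo-cluster
spec:
  # the infrastructure provider that will create the OCI resources
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: OCICluster
    name: demo-cluster
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: OCICluster
metadata:
  name: demo-cluster
spec:
  # placeholder compartment OCID: substitute your own
  compartmentId: ocid1.compartment.oc1..aaaaexample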

Secure Deployments to private OKE Clusters

As most of you now know, OKE clusters can be deployed as either public or private. A public cluster is one whose API server is publicly reachable, whereas a private cluster is one whose API server is only reachable within the VCN or from networks the VCN has been extended to through VPN, FastConnect or hub-and-spoke models.

Until recently, if you used the DevOps service, you needed to use public clusters, but that is a thing of the past. Now, you can use DevOps with private clusters too. This means that your code, build and deployment never have to traverse the public internet, enhancing the security of your deployment process.

OCI Service Mesh


Service meshes are having something of a Cambrian explosion. The CNCF alone hosts 5 of them, and that’s before Istio has been accepted. Recently, we also announced OCI Service Mesh. It supports any application running on OKE. I haven’t taken it for a spin yet, so I won’t elaborate any further on it.

Load Balancer Node Selectors for OKE

This was released just a couple of days ago, and my colleague Ajay does a much better job of explaining it and its benefits.

Optimized Worker images

This is also a very recent release. When you provision a node pool, you select an Oracle Linux image based on your desired CPU architecture (x86_64, aarch64, x86_64/CUDA), or a custom image. Under the hood, the Kubernetes packages would be installed from scratch, and only once this process completed would the worker nodes join the cluster. Unfortunately, this also makes provisioning new node pools take annoyingly long.

With optimized worker images, the Kubernetes packages are pre-installed. This means the worker nodes can skip the Kubernetes installation and join the cluster as soon as they have finished booting and the Kubernetes processes have started. This leads to considerably improved node pool provisioning times. From the 4.2.4 release, the terraform-oci-oke module already supports this feature.

Block volume performance levels

You can now also control the performance level of block volumes. By default, they are configured at the balanced level (10 VPUs). With this feature, you can configure PVCs that run at a higher performance level. First, define a Storage Class; you’ll notice it uses the CSI volume plugin:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: oci-high
provisioner: blockvolume.csi.oraclecloud.com
parameters:
  vpusPerGB: "20"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

Notice that in the parameters section we specify a higher number of VPUs (Volume Performance Units). You can specify VPUs up to the Higher Performance level, i.e. 20.

Once the new Storage Class is created, you can create a new PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: oci-pvc-high
spec:
  storageClassName: oci-high
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

VPUs are a way to improve your block volume performance by increasing IOPS/GB. You can also choose to purchase fewer VPUs, which reduces performance but also reduces your cost. Thus, it’s a trade-off. By adding the ability to configure VPUs, you now have the option of consciously making this choice yourself, depending on your application’s needs and budget.

Summary

These are the features we have made publicly available so far. There are a lot more exciting things coming, and we’ll be sharing them with you as soon as they are available.

Meanwhile, join us on our public Slack!
