From dumpster fire to sparkling clean: SaaS with Kubernetes Operators and garbage collection

Alexander Held
Mercedes-Benz Tech Innovation
7 min read · Sep 5, 2024
Signed-off-by: Alexander Held

Since early 2019, we at Mercedes-Benz Tech Innovation, together with Mercedes-Benz, have offered “Runtime Extensions”: managed services for their fully managed Kubernetes platform. Currently, over 500 active projects consume Runtime Extensions, including services like Grafana, Prometheus, OpenSearch, MongoDB, and PostgreSQL.

As Mercedes-Benz Tech Innovation gained more traction in the company, and with it more customers, we changed our architecture because we needed to scale across multiple Kubernetes clusters in different zones.

Let’s be real for a moment. Not everything was great when we started our journey without knowing how to think about and design platforms in a Kubernetes-idiomatic way. Before we refactored our platform, we had a single CustomResourceDefinition (CRD) that contained all services per project, resulting in significant lag and an overall sluggish experience for our customers.

In this blog post we will show you how an operator-native architecture reduced the complexity of our code base, helped us write sparkling clean code, and made it easier for us to support this platform 24/7.

What is an Operator and what does it do?

In a nutshell: the Kubernetes controller pattern is a loop in which the desired state (defined as CustomResources (CRs)) is observed, the difference to the current state of the environment is calculated, and the changes necessary to resolve that difference are applied.
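To make that loop a bit more concrete, here is a minimal sketch of a reconcile function written with controller-runtime. The postgresv1 import and the buildDesiredStatefulSet helper are illustrative assumptions, not our actual code.

package controllers

import (
    "context"

    appsv1 "k8s.io/api/apps/v1"
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"

    postgresv1 "example.com/postgres-operator/api/v1" // hypothetical generated API types
)

type PostgresClusterReconciler struct {
    client.Client
}

func (r *PostgresClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // 1. Observe the desired state: fetch the CustomResource that triggered this loop.
    var cluster postgresv1.PostgresCluster
    if err := r.Get(ctx, req.NamespacedName, &cluster); err != nil {
        // The CR is gone; owned resources are cleaned up by garbage collection.
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // 2. Calculate the difference to the current environment.
    desired := buildDesiredStatefulSet(&cluster) // hypothetical helper: spec -> StatefulSet
    var current appsv1.StatefulSet
    err := r.Get(ctx, client.ObjectKeyFromObject(desired), &current)

    // 3. Apply the changes needed to resolve the difference.
    switch {
    case apierrors.IsNotFound(err):
        return ctrl.Result{}, r.Create(ctx, desired)
    case err != nil:
        return ctrl.Result{}, err
    default:
        // Sketch only: real code would patch just the mutable fields.
        current.Spec = desired.Spec
        return ctrl.Result{}, r.Update(ctx, &current)
    }
}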

A general overview of our approach

In this article we use the “PostgreSQL-Operator” as an example, highlighting two use cases:

  1. Configuring and deploying the actual database
  2. Creating backups

The PostgreSQL-Operator consists of various controllers, and each one is responsible for exactly one thing. We have a separate controller for each CRD, sometimes even multiple. This makes adding features pretty easy, because you often don’t need to modify existing code.
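To give an idea of what that looks like in code, here is a hedged sketch of how several single-purpose controllers can be registered with one controller-runtime manager. The reconciler names mirror our CRDs but are assumptions for illustration, and their SetupWithManager methods are assumed to follow the usual kubebuilder convention.

// Each reconciler watches exactly one CRD.
func (r *PostgresClusterReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&postgresv1.PostgresCluster{}).
        Complete(r)
}

func main() {
    // Scheme registration for the postgresv1 types is omitted for brevity.
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
    if err != nil {
        panic(err)
    }

    // One operator binary, several small controllers: adding a feature usually
    // means adding a new reconciler instead of touching the existing ones.
    if err := (&PostgresClusterReconciler{Client: mgr.GetClient()}).SetupWithManager(mgr); err != nil {
        panic(err)
    }
    if err := (&PostgresBackupRequestReconciler{Client: mgr.GetClient()}).SetupWithManager(mgr); err != nil {
        panic(err)
    }
    if err := (&PostgresBackupScheduleReconciler{Client: mgr.GetClient()}).SetupWithManager(mgr); err != nil {
        panic(err)
    }

    if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
        panic(err)
    }
}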

Before we can delve into the nitty-gritty details, we need to have an overview of how our system is designed.

We have a “workload cluster” where we, as a SaaS provider, run the databases and other resource-hungry services for our customers. Our customers can bring their own Kubernetes clusters and link them to their project; we then create tunnels inside their clusters. They can access our databases either from inside their own cluster or through an ingress into our cluster.

Let’s look at the two use cases from above:

  1. Configuring and deploying the actual database
apiVersion: postgres.mercedes-benz.com/v1
kind: PostgresCluster
metadata:
  name: elephant
  namespace: customer-project-name
  uid: 1234
spec:
  displayName: my-elephant-cluster
  memorySize: 2
  postgresDiskSize: 1
  walArchiving: true

We create a PostgresCluster resource, which sits at the top of our CRD hierarchy. Its sheer existence tells us that a customer has ordered a PostgreSQL database. The customer might configure things like disk space, memory, or PostgreSQL-specific parameters, and we make sure that there are StatefulSets, ConfigMaps, … that conform to those specs.

apiVersion: postgres.mercedes-benz.com/v1
kind: PostgresRemoteCluster
metadata:
  name: customer-cluster
  namespace: customer-project-name
  ownerReferences:
  - apiVersion: postgres.mercedes-benz.com/v1
    kind: PostgresCluster
    name: elephant
    uid: 1234
spec:
  cluster:
    name: elephant
  remoteCluster:
    name: customer-cluster

When a customer connects their cluster to their project on our website, we create a PostgresRemoteCluster in the workload cluster. The PostgreSQL-Operator observes its creation and detects the link to the “elephant” PostgresCluster. It realizes that there should be a tunnel in the customer cluster, and since one does not exist yet, it creates it.
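As a rough sketch of that “create it if it is missing” step: the RemoteClient pointing at the customer cluster and the buildTunnelDeployment helper are assumptions for illustration, not our actual implementation.

func (r *PostgresRemoteClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var remote postgresv1.PostgresRemoteCluster
    if err := r.Get(ctx, req.NamespacedName, &remote); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // Resolve the PostgresCluster this remote cluster is linked to ("elephant").
    var cluster postgresv1.PostgresCluster
    key := client.ObjectKey{Namespace: remote.Namespace, Name: remote.Spec.Cluster.Name}
    if err := r.Get(ctx, key, &cluster); err != nil {
        return ctrl.Result{}, err
    }

    // Does the tunnel already exist in the customer cluster?
    // r.RemoteClient is an assumed client.Client configured for that cluster.
    tunnel := buildTunnelDeployment(&cluster, &remote) // hypothetical helper
    var existing appsv1.Deployment
    err := r.RemoteClient.Get(ctx, client.ObjectKeyFromObject(tunnel), &existing)
    if apierrors.IsNotFound(err) {
        // It does not exist yet, so we create it.
        return ctrl.Result{}, r.RemoteClient.Create(ctx, tunnel)
    }
    return ctrl.Result{}, err
}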

2. Creating backups

We fully embraced translating business requirements into CRDs.

Have a look at our backup process:

  • Customers can request us to back up their data
  • Our systems need to schedule backups to ensure that we have a backup every day
  • Customers can restore backups that have been created

Okay okay. There is a lot happening in the diagram above. Let’s break it down.

apiVersion: postgres.mercedes-benz.com/v1
kind: PostgresBackupRequest
metadata:
  name: my-backup
  ownerReferences:
  - apiVersion: postgres.mercedes-benz.com/v1
    kind: PostgresCluster
    name: elephant
    uid: 1234
spec:
  cluster:
    name: elephant

Our customers can press a button on our website to let our API create a PostgresBackupRequest. It contains a reference to the PostgresCluster that should be backed up.

apiVersion: postgres.mercedes-benz.com/v1
kind: PostgresBackup
metadata:
  name: my-backup
  ownerReferences:
  - apiVersion: postgres.mercedes-benz.com/v1
    kind: PostgresCluster
    name: elephant
    uid: 1234
spec:
  cluster:
    name: elephant
  encryption: dare
  passphrase: super_secret_p4ssphr4s3=
  s3:
    bucket: my-bucket
    path: path/to/backup.tar.gzip.dare
  size: 50Gi

The operator now deploys a job that connects to the elephant database, creates a backup, and saves it into an S3 bucket. While doing so, it periodically reports progress in the PostgresBackupRequest/status subresource. After the backup is completed, the job creates a PostgresBackup containing metadata and information about the backup’s location on S3.
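What that progress reporting could look like, as a rough sketch: the Phase and Progress fields are assumptions about the status schema, not our actual API.

// Periodically called while the backup job is running.
func reportProgress(ctx context.Context, c client.Client, key client.ObjectKey, percent int) error {
    var request postgresv1.PostgresBackupRequest
    if err := c.Get(ctx, key, &request); err != nil {
        return err
    }
    request.Status.Phase = "Running"    // assumed status field
    request.Status.Progress = percent   // assumed status field
    // Written via the status subresource, so spec and status stay cleanly separated.
    return c.Status().Update(ctx, &request)
}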

apiVersion: postgres.mercedes-benz.com/v1
kind: PostgresBackupSchedule
metadata:
  name: default
  ownerReferences:
  - apiVersion: postgres.mercedes-benz.com/v1
    kind: PostgresCluster
    name: elephant
    uid: 1234
spec:
  interval: 24
  cluster:
    name: elephant

We leverage this mechanism for system backups too! We map the business requirement to periodically back up the customers’ data by creating a PostgresBackupSchedule. Whenever it gets reconciled, it calculates when to next create a PostgresBackupRequest, which starts the whole process described above.
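Here is a hedged sketch of that calculation inside the schedule controller. The LastScheduled status field, the spec field names, and the naming of the created request are assumptions for illustration; metav1 is "k8s.io/apimachinery/pkg/apis/meta/v1".

func (r *PostgresBackupScheduleReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var schedule postgresv1.PostgresBackupSchedule
    if err := r.Get(ctx, req.NamespacedName, &schedule); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    interval := time.Duration(schedule.Spec.Interval) * time.Hour
    next := schedule.Status.LastScheduled.Add(interval) // assumed status field
    if wait := time.Until(next); wait > 0 {
        // Not due yet: ask to be reconciled again when the next backup should be created.
        return ctrl.Result{RequeueAfter: wait}, nil
    }

    // Due: create a PostgresBackupRequest, which kicks off the backup job described above.
    request := &postgresv1.PostgresBackupRequest{
        ObjectMeta: metav1.ObjectMeta{
            Name:      fmt.Sprintf("%s-%d", schedule.Spec.Cluster.Name, time.Now().Unix()),
            Namespace: schedule.Namespace,
        },
        Spec: postgresv1.PostgresBackupRequestSpec{
            Cluster: postgresv1.ClusterRef{Name: schedule.Spec.Cluster.Name},
        },
    }
    if err := r.Create(ctx, request); err != nil {
        return ctrl.Result{}, err
    }

    schedule.Status.LastScheduled = metav1.Now()
    if err := r.Status().Update(ctx, &schedule); err != nil {
        return ctrl.Result{}, err
    }
    return ctrl.Result{RequeueAfter: interval}, nil
}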

The PostgresBackup CR is used to display backups on our website, to restore databases using the information stored in its spec, and eventually to free up S3 storage.

But there is more! Let’s have a look at my favorite pattern:

Owner References

Maybe you have noticed the following part of the metadata in every one of our CRs. It is part of the standard Kubernetes metadata, and you can specify OwnerReferences there.

ownerReferences:
- apiVersion: postgres.mercedes-benz.com/v1
  kind: PostgresCluster
  name: elephant
  uid: 1234

In Kubernetes, OwnerReferences are used to establish a relationship between resources, where one resource “owns” another resource. This relationship is used to manage the lifecycle of the resources, such that when the owner resource is deleted, any owned resources are deleted as well.

You can see that all resources scoped to the PostgresCluster “elephant” (PostgresBackupRequest, PostgresBackup, PostgresBackupSchedule, PostgresRemoteCluster, …) have an OwnerReference to “elephant”. When the customer deletes the PostgresCluster “elephant”, we ensure on the Kubernetes level that all resources owned by “elephant” will eventually be removed.

Everything gets cleaned up nice and tidy. That’s the magic of Kubernetes.
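In code, setting that owner reference is typically a one-liner before creating the owned resource. A minimal sketch, assuming controller-runtime’s controllerutil package ("sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"), a Scheme field on the reconciler, and illustrative spec field names:

// SetControllerReference fills in apiVersion, kind, name and uid of the owner,
// exactly like the ownerReferences block shown above.
func (r *PostgresClusterReconciler) createBackupSchedule(ctx context.Context, cluster *postgresv1.PostgresCluster) error {
    schedule := &postgresv1.PostgresBackupSchedule{
        ObjectMeta: metav1.ObjectMeta{
            Name:      "default",
            Namespace: cluster.Namespace,
        },
        Spec: postgresv1.PostgresBackupScheduleSpec{ // assumed spec fields
            Interval: 24,
            Cluster:  postgresv1.ClusterRef{Name: cluster.Name},
        },
    }
    // When "elephant" is deleted, the garbage collector removes this schedule as well.
    if err := controllerutil.SetControllerReference(cluster, schedule, r.Scheme); err != nil {
        return err
    }
    return r.Create(ctx, schedule)
}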

Finalizers

You might wonder what happens to the backups on S3. They certainly have no OwnerReferences!

Nice catch! That’s why we additionally add Finalizers to control the deletion of resources.

apiVersion: postgres.mercedes-benz.com/v1
kind: PostgresBackup
metadata:
  deletionTimestamp: "2023-10-24T13:33:39Z"
  finalizers:
  - postgresbackup.postgres.mercedes-benz.com/s3
  name: my-backup
  ownerReferences:
  - apiVersion: postgres.mercedes-benz.com/v1
    kind: PostgresCluster
    name: elephant
    uid: 1234
spec:
  cluster:
    name: elephant
  s3:
    bucket: my-bucket
    path: path/to/backup.tar.gzip.dare

In the example of our PostgresBackup resource, we ensure that a finalizer (basically a marker in the metadata) is always present. As soon as the resource gets deleted, Kubernetes sets the deletionTimestamp and we execute the action associated with the finalizer.

In our case, postgresbackup.postgres.mercedes-benz.com/s3 triggers the deletion of the data on S3. Once the action is complete, we remove the finalizer and start a new reconcile loop, repeating until no finalizers are left.

Because Kubernetes blocks the final deletion of the resource as long as a finalizer is present, we can easily monitor whether there were any issues, since we annotate the PostgresBackup with Events.
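Put together, the finalizer handling in the PostgresBackup controller could look roughly like this. The deleteBackupFromS3 method is a placeholder for the actual cleanup code, and the helpers come from controller-runtime’s controllerutil package.

const s3Finalizer = "postgresbackup.postgres.mercedes-benz.com/s3"

func (r *PostgresBackupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var backup postgresv1.PostgresBackup
    if err := r.Get(ctx, req.NamespacedName, &backup); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    if backup.GetDeletionTimestamp().IsZero() {
        // Not being deleted: make sure the finalizer is always present.
        if !controllerutil.ContainsFinalizer(&backup, s3Finalizer) {
            controllerutil.AddFinalizer(&backup, s3Finalizer)
            return ctrl.Result{}, r.Update(ctx, &backup)
        }
        return ctrl.Result{}, nil
    }

    // deletionTimestamp is set: run the cleanup associated with the finalizer.
    if controllerutil.ContainsFinalizer(&backup, s3Finalizer) {
        if err := r.deleteBackupFromS3(ctx, &backup); err != nil {
            return ctrl.Result{}, err // retried on the next loop, visible via Events
        }
        controllerutil.RemoveFinalizer(&backup, s3Finalizer)
        if err := r.Update(ctx, &backup); err != nil {
            return ctrl.Result{}, err
        }
    }
    // Once no finalizers are left, Kubernetes deletes the resource for good.
    return ctrl.Result{}, nil
}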

Events

Ever wondered why there is a section called Events in the output of kubectl describe that is not part of the actual resource?

Name:         my-backup
API Version:  postgres.mercedes-benz.com/v1
Kind:         PostgresBackup
Metadata:
  Finalizers:
    postgresbackup.rx.mercedes-benz.com/s3
  Owner References:
    API Version:  postgres.mercedes-benz.com/v1
    Kind:         PostgresCluster
    Name:         elephant
    UID:          1234
Spec:
  Cluster:
    Name:        my-backup
  Display Name:  system-backup
  Encryption:    dare
  Passphrase:    super_secret_p4ssphr4s3=
  Run Duration:  1s
  s3:
    Bucket:    my-bucket
    Endpoint:  s3-endpoint.example.com
    Path:      path/to/backup.tar.gzip.dare
  Triggered By:  system
Status:
  Wal Cleanup Done:  true
Events:
  Type     Reason     Age                 From            Message
  ----     ------     ----                ----            -------
  Normal   Finalizer  3m31s (x5 over 3m)  PostgresBackup  Deleting backup my-backup from S3
  Warning  Error      3m31s (x5 over 3m)  PostgresBackup  The specified bucket does not exist

Let’s look at our example PostgresBackup. At the bottom of the kubectl describe output, we can see human-readable messages about what happened when. Events are Kubernetes-native resources that are basically annotations attached to an involved object.

apiVersion: v1
count: 5
involvedObject:
  apiVersion: postgres.rx.mercedes-benz.com/v1alpha1
  kind: PostgresBackup
  name: my-backup
kind: Event
message: The specified bucket does not exist
reason: Error
reportingComponent: PostgresOperator
type: Warning

They get displayed when you use kubectl describe or tools like k9s. This helps a lot when you need to troubleshoot something that is not working as expected. We found various bugs, deadlocks, and timing issues just by looking at the sequence of Events.
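Such Events are emitted by the operator itself through an EventRecorder. A minimal sketch, assuming a Recorder field of type record.EventRecorder (from "k8s.io/client-go/tools/record") obtained via mgr.GetEventRecorderFor("PostgresOperator"), and corev1 being "k8s.io/api/core/v1":

func (r *PostgresBackupReconciler) deleteFromS3WithEvents(ctx context.Context, backup *postgresv1.PostgresBackup) error {
    r.Recorder.Eventf(backup, corev1.EventTypeNormal, "Finalizer",
        "Deleting backup %s from S3", backup.Name)

    if err := r.deleteBackupFromS3(ctx, backup); err != nil {
        // Shows up as the Warning line in the kubectl describe output above.
        r.Recorder.Event(backup, corev1.EventTypeWarning, "Error", err.Error())
        return err
    }
    return nil
}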

Summary

  • Operators observe Kubernetes resources, compare the desired state with the actual state and apply a delta resolving those differences
  • Operators help automate things in a Kubernetes context
  • Operators can consist of multiple controllers which do one thing each
  • Business workflows and semantics can be expressed through CustomResourceDefinitions, almost like Object-Oriented Programming
  • Kubernetes handles garbage collection through OwnerReferences
  • The deletion of resources can be fine-tuned by leveraging Finalizers
  • Events can help developers and Kubernetes users to better understand what’s happening and empower them to fix bugs and write better code
