On using the Kubernetes Resource Model for Declarative Configuration

Brian Grant
Apr 13, 2024


Kubernetes is far from perfect (we had to build it really fast!), but one of the ancillary design decisions in Kubernetes that worked out very well was making the at-rest serialization format the same as the API wire format. I touched on this in my reflections on declarative configuration, but wanted to provide more background on how and why this is so useful, and also touch upon some of the consequences that have been debated.

I coined the term Kubernetes Resource Model (KRM) about 6 years ago. Resources, as in the REST API meaning, are the entities that Kubernetes manages, such as Pods. These resources have a strong degree of consistency with respect to their names, REST paths, operations, metadata, structure, access control, admission control, validation, and so on.

Kubernetes supports standard management operations (Create, Read, Update, Delete, List, Watch) and metadata (e.g., names, labels, annotations) on its resources. This enables (mostly) general-purpose read-modify-write operations on groups of resources, which kubectl leveraged early on to perform bulk operations on heterogeneous sets of resources.
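As a minimal sketch of what that uniformity buys you, here is a generic bulk operation in Python. The resources below are invented examples, not real manifests; the point is that one function can label a Pod and a Service alike, with no per-type code:

```python
# Sketch: because every resource exposes metadata in the same shape,
# a single generic function can operate on a heterogeneous set.
def add_label(resources, key, value):
    """Attach a label to each resource, whatever its kind."""
    for r in resources:
        r.setdefault("metadata", {}).setdefault("labels", {})[key] = value
    return resources

resources = [
    {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": "web-0"}},
    {"apiVersion": "v1", "kind": "Service", "metadata": {"name": "web"}},
]
add_label(resources, "app", "web")
```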

Consistent definition of resource metadata, such as kind, API version, name, and namespace, enables generic client code to operate on resources in a standard way without compile-time knowledge of the individual resource definitions. For instance, REST API URLs can be constructed in a standard way from this information, and/or one could look up the appropriate OpenAPI specification for the resource type.
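A rough illustration of that URL construction in Python. This is a simplification: real clients use API discovery to map a kind to its plural resource name and to learn whether the type is namespaced, whereas this sketch takes the plural name as an argument:

```python
def resource_url(api_version, resource, name=None, namespace=None):
    """Build a Kubernetes-style REST path from resource metadata.

    Illustrative only: 'resource' is the plural resource name
    (e.g. "pods"), which real clients derive via API discovery.
    """
    # The core ("legacy") group is served under /api; named groups
    # such as apps/v1 are served under /apis.
    prefix = "/api/" + api_version if "/" not in api_version else "/apis/" + api_version
    parts = [prefix]
    if namespace:
        parts += ["namespaces", namespace]
    parts.append(resource)
    if name:
        parts.append(name)
    return "/".join(parts)
```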

Consequently, no fat client libraries or plugins along the lines of Terraform providers are necessary (now that we have server-side apply). It’s possible to achieve similar consistency in other API suites, but it’s expensive to retrofit. We made a sweeping overhaul of the Kubernetes API prior to 1.0 in order to achieve it.

These properties enable integration of N resource types in a client library or tool with O(1) work rather than O(N) work. That greatly reduces the amount of effort required to create tools. This is important in an ecosystem with tens of thousands of unique resource types thanks to CRDs.

But it’s not just API calls that are made simpler. Kubernetes resources are self-describing at rest. Early on, users programming against the Kubernetes REST API may have wondered why some information had to be specified seemingly redundantly in both the REST URL and in the resource body. Well, this is the reason.
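To make "self-describing at rest" concrete, here is a hypothetical serialized resource (not a real manifest from any cluster): because apiVersion and kind travel with the data, a tool reading it from disk or a database can identify the type and schema version with no out-of-band information:

```python
import json

# The type information is embedded in the serialized form itself.
doc = json.loads("""
{
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {"name": "web", "namespace": "prod"},
  "spec": {"replicas": 2}
}
""")

def identify(resource):
    """Generic type dispatch using only the embedded metadata."""
    meta = resource["metadata"]
    return resource["apiVersion"], resource["kind"], meta.get("namespace"), meta["name"]
```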

Resources being self-describing at rest enables resource data to be serialized and persisted on disk, in source control, in images, or in a database, and then used for a wide variety of purposes, including those where the type of resource and version of its schema must be known in order to understand the data, such as:

  • analysis, auditing, cost estimation
  • state-based constraint validation
  • restoring or recreating resources in export/import, backup/restore, save for later, cloning, or testing scenarios
  • using the serialized resource representation as a template, as with Helm

That last case, of course, was one of the main scenarios we intended to address:

“From the configuration source, we advocate the generation of the set of objects you wish to be instantiated. The resulting objects should be represented using a simple data format syntax, such as YAML or JSON. … Once the literal objects have been generated, it should be possible to perform a number of management operations on these objects in the system, such as to create them, update them, delete them, or delete them and then recreate them (which may be necessary if the objects are not 100% updatable). This will be achieved by communicating with the system’s RESTful APIs. In particular, objects will be created and/or updated via a reconciliation process. In order to do this in an extensible fashion, we will impose some compliance requirements upon APIs that can be targeted by this library/tool.”

One change we made in the API overhaul to make this simpler was to introduce the spec and status fields. That split made it easier for Kubernetes controllers and declarative clients to avoid conflicting at a coarse level, but other server-side changes, such as defaulted field values, mutations by admission controllers, allocated service clusterIPs, and values set by horizontal and vertical pod autoscaling, still needed to be merged with declarative intent. For that, we first developed strategic merge patch for kubectl apply and later kustomize, and then, much later, server-side apply. The ability to merge multiple sources of truth has facilitated greater interoperability of automation components. This is more painful in tools and APIs lacking a general model for it.
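To make the merging problem concrete, here is a deliberately simplified overlay merge in Python. This is not strategic merge patch or server-side apply, which also track field ownership and list merge strategies; it only shows the essence, that declared fields can be overlaid on live state while server-set fields survive untouched:

```python
def overlay(live, desired):
    """Recursively overlay declared fields onto live state.

    Simplified sketch handling only scalar and map fields. Fields
    the user never declared (status, an allocated clusterIP) pass
    through unchanged, which is the heart of non-conflicting merge.
    """
    merged = dict(live)
    for key, value in desired.items():
        if isinstance(value, dict) and isinstance(live.get(key), dict):
            merged[key] = overlay(live[key], value)
        else:
            merged[key] = value
    return merged

# Invented example: the server allocated clusterIP and wrote status;
# the user only declares the ports they care about.
live = {
    "spec": {"clusterIP": "10.0.0.5", "ports": [{"port": 80}]},
    "status": {"loadBalancer": {}},
}
desired = {"spec": {"ports": [{"port": 8080}]}}
result = overlay(live, desired)
```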

The most glaring omission from the Kubernetes Resource Model was the lack of a universal status property. It’s easy to apply resources of a hundred different types, but not easy to determine whether all of those resources are working as intended. Crossplane, ArgoCD, and other systems built on Kubernetes developed their own solutions to this problem, but I still think that standardizing it in the Kubernetes API and CRD ecosystem would be valuable, even if it just became a convention-over-configuration-style default.
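The closest thing to a convention today is status.conditions. A hedged sketch of a readiness check that assumes the common Ready-condition shape, which, as noted, not every resource type actually follows:

```python
def is_ready(resource):
    """Report readiness via the common status.conditions convention.

    Assumes a condition like {"type": "Ready", "status": "True"}.
    Tools built on Kubernetes carry per-type health rules precisely
    because this convention is not universal.
    """
    conditions = resource.get("status", {}).get("conditions", [])
    return any(c.get("type") == "Ready" and c.get("status") == "True"
               for c in conditions)
```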

Now, regarding consequences…

As with the broader Kubernetes / CNCF ecosystem, the configuration tool ecosystem for Kubernetes is large and fragmented. Making it easier to build such tools resulted in people building many dozens, possibly hundreds of tools. Helm and Kustomize are still the most popular tools based on the data I’ve seen, but there is a large array of others, including general-purpose Infrastructure as Code tools like Terraform and Pulumi, other general-purpose language-based tools like cdk8s, configuration languages like Jsonnet and CUE, simple text-based tools such as envsubst, sed, and Jinja, and many more.

While this means that there isn’t just One Language for configuring Kubernetes, users were able to get unblocked quickly before Helm and Kustomize were built, and today they can choose whatever representation they prefer, whether that be YAML or TypeScript. Moreover, KRM as a serialization format provides a higher-level foundation for the next generation of innovative configuration tools, whether Infrastructure from Code, diagram-based, function-based, LLM-based, or something else. I certainly look forward to a better solution than what we’ve been able to come up with so far.

If you found this interesting, you may be interested in other posts in my Infrastructure as Code and Declarative Configuration series.


Brian Grant

Original lead architect of Kubernetes and its declarative model. Former Uber Tech Lead of Google Cloud's API standards, SDK, CLI, and Infrastructure as Code.