At one point recently, Kubernetes was the largest open source Go codebase in existence. It is still massive, and that’s a good thing because Kubernetes is the future of cloud computing.
I’ve probably read about 30% of the core codebase. I’ve also extensively used the k8s.io/kubernetes package as a Go SDK for talking to the Kubernetes API.
I have enough experience in that particular area to narrow down this experience report to a single type: runtime.Object.
The Kubernetes “Type System”
runtime.Object permeates Kubernetes and related codebases. Just a few examples from that same package:
- NewEncodable: takes a runtime.Object and returns a new runtime.Object that can later be encoded with the right encoder
- ObjectConvertor: converts one runtime.Object into another
- ObjectCreator: creates any runtime.Object that is registered with the internal object registry (discussed below)
- Given bytes and a Decoder, decodes the bytes into the given runtime.Object
runtime.Object is the cornerstone of the Kubernetes type system. And when I say “type system,” I really mean that the codebase has a sort of internal “type registry.”
The Kubernetes Type Registry
If you have an implementation of a runtime.Object and you want to do something with it, you have to register it first.
Kubernetes has an internal “type registry,” and it uses the registry to identify the concrete types that it gets passed. Only then will it operate on the type.
If you squint, you can see a faint glimmer of generics here!
Unfortunately, the type registration mechanism isn’t obvious. Generally, it’s done with side-effecting imports: an install package “registers” a new group, version, and kind (GVK) with one of the Kubernetes internal type registries. A GVK is a unique identifier for a Kubernetes type.
If you try to decode, convert, or something-else a runtime.Object, you’ll get an error if you didn’t install its GVK already.
In short, Kubernetes has built a type naming & identification scheme and a type registry to store each GVK. It took me a long time to identify (pun intended!) this whole thing in the codebase; after I did, I was amazed and impressed.
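To make the idea concrete, here is a minimal sketch of a GVK-keyed type registry. It is not the Kubernetes implementation: GVK, Object, Pod, Registry, and the package-level global variable are all hypothetical stand-ins, and the init function only imitates what a side-effecting install import might do.

```go
package main

import (
	"fmt"
	"reflect"
)

// GVK is a simplified, hypothetical stand-in for a group/version/kind identifier.
type GVK struct {
	Group, Version, Kind string
}

// Object is a hypothetical stand-in for runtime.Object.
type Object interface {
	GetKind() string
}

// Pod is a toy type used only for this illustration.
type Pod struct{ Name string }

func (p *Pod) GetKind() string { return "Pod" }

// Registry maps each GVK to the concrete Go type registered for it.
type Registry struct {
	types map[GVK]reflect.Type
}

func NewRegistry() *Registry {
	return &Registry{types: map[GVK]reflect.Type{}}
}

// Register records the struct type behind obj under gvk.
// obj must be a pointer to a struct, because Elem is called on its type.
func (r *Registry) Register(gvk GVK, obj Object) {
	r.types[gvk] = reflect.TypeOf(obj).Elem()
}

// New creates a fresh Object for gvk, or fails at runtime if nothing was registered.
func (r *Registry) New(gvk GVK) (Object, error) {
	t, ok := r.types[gvk]
	if !ok {
		return nil, fmt.Errorf("no kind is registered for %v", gvk)
	}
	return reflect.New(t).Interface().(Object), nil
}

// global plays the role of a package-level registry.
var global = NewRegistry()

// init runs as a side effect of the package being imported, which is how an
// install-style package could register its types without being called directly.
func init() {
	global.Register(GVK{"", "v1", "Pod"}, &Pod{})
}

func main() {
	obj, err := global.New(GVK{"", "v1", "Pod"})
	if err != nil {
		fmt.Println("unexpected:", err)
		return
	}
	fmt.Println("created:", obj.GetKind())

	// Nothing registered this GVK, so the lookup only fails at runtime.
	_, err = global.New(GVK{"apps", "v1", "Deployment"})
	fmt.Println("error:", err)
}
```

Note that the compiler is happy with both lookups. Only the registry knows which one will fail, and it only knows at runtime.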
The core team replaced a missing compile-time language feature (generics) with their home-built runtime system. And given the tools at their disposal, they did a pretty good job.
Why This Isn’t Great
There are a few basic issues with this approach:
- As a user of these packages, you have to remember to put an anonymous import somewhere in your package if you want to do anything with a runtime.Object
- It’s unclear what package you need to anonymous-import into your package to get your runtime.Object registered
- It’s unclear which (if any) type registries are concurrency-safe
Additionally, since there’s a runtime-defined type registry, anyone writing Kubernetes core code needs to add manual type assertions and registry lookups anywhere that deals with runtime.Object.
That implementation leads to more tests in the core, and so forth.
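As a rough illustration of those manual type assertions, here is a small sketch. Object, Pod, Service, ConfigMap, and describe are hypothetical names invented for this example, not Kubernetes APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// Object is a hypothetical stand-in for runtime.Object.
type Object interface{ Kind() string }

// Toy types used only for this illustration.
type Pod struct{ Name string }
type Service struct{ Port int }
type ConfigMap struct{}

func (*Pod) Kind() string       { return "Pod" }
func (*Service) Kind() string   { return "Service" }
func (*ConfigMap) Kind() string { return "ConfigMap" }

// describe receives only the interface, so it has to recover the concrete
// type with a runtime type switch and handle the "unknown type" case itself.
func describe(obj Object) (string, error) {
	switch o := obj.(type) {
	case *Pod:
		return "pod " + o.Name, nil
	case *Service:
		return fmt.Sprintf("service on port %d", o.Port), nil
	default:
		return "", errors.New("unsupported kind: " + obj.Kind())
	}
}

func main() {
	fmt.Println(describe(&Pod{Name: "web"}))
	fmt.Println(describe(&Service{Port: 80}))
	fmt.Println(describe(&ConfigMap{})) // compiles fine, fails at runtime
}
```

Every function written this way needs its own default branch, and each of those branches needs a test.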
When This Has Gone Wrong
You could replace this global registry with generics. I’ll resist the urge to invent yet another generics syntax for Go, but trust me, it’d be beautiful! Needless to say, if we didn’t have this runtime “installation” thing, we’d end up with less core code and less confusion for callers.
I have personally spent hours debugging applications that use the Kubernetes Go client SDK, particularly apps that interact with dynamic Kubernetes types like Custom Resource Definitions and generated Aggregated API types.
Generally, these applications crash with error messages like “no kind is registered for the type …” (see the source for notRegisteredErr for details).
While the error messages could be improved, the root cause is that the type system is built at runtime, so type checks — aside from the obvious interface implementation checks — must be done at runtime.
But What About Generated Code?
The question that transcends time!
I think generated code is super valuable in boilerplate situations like RPC clients and server stubs.
This isn’t a boilerplate situation. The code we’re talking about is at the core of Kubernetes.
But generated code could technically fix the issue, so let’s discuss that for a second. We could generate a concrete struct for each and every Kubernetes type.
Looking at the Kubernetes resource reference for v1.9, that means structs for at least 30 Kubernetes resources (by the way, there are plenty more; I just counted the stable v1 ones). It also would make sense to remove a lot of the generalized functions and interfaces that do copies, conversions, etc.
And instead of that generalized code, we’d have up to 30 × 29 = 870 conversion functions (one for each ordered pair of types), 30 copy functions, and so on.
Way more code and less generalization. Sigh…
The resulting codebase would be compile-time type safe, but way bigger, and mostly generated. It would also be much harder to be productive in that kind of codebase.
In my opinion, generics would really make this situation better. In most places where runtime.Object is used (particularly where it’s returned), type information is lost, and Kubernetes has an elaborate system to make up for it.
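For illustration only, here is a sketch of the difference using type parameters as they exist in Go 1.18 and later. This is not a Kubernetes API: Pod, decodeAny, and Decode are hypothetical names, and encoding/json stands in for a real codec.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Pod is a toy type used only for this illustration.
type Pod struct {
	Name string `json:"name"`
}

// decodeAny mirrors the interface-returning style: the caller gets back an
// untyped value and has to assert the concrete type at runtime.
func decodeAny(data []byte, into any) (any, error) {
	if err := json.Unmarshal(data, into); err != nil {
		return nil, err
	}
	return into, nil
}

// Decode keeps the concrete type in its signature, so the compiler knows
// exactly what comes back and no assertion or registry lookup is needed.
func Decode[T any](data []byte) (*T, error) {
	out := new(T)
	if err := json.Unmarshal(data, out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	data := []byte(`{"name":"web"}`)

	// Interface style: the type information is lost and recovered by assertion.
	v, err := decodeAny(data, &Pod{})
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if pod, ok := v.(*Pod); ok {
		fmt.Println("asserted:", pod.Name)
	}

	// Generic style: the result is statically a *Pod.
	pod, err := Decode[Pod](data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("typed:", pod.Name)
}
```

In the second call, asking for the wrong type is a compile error rather than a crash in production.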
Type registries are novel, and kind of amazing to me, but they shouldn’t have to exist here.
I’m so looking forward to the day that they don’t have to!