Essential Kubernetes Resources
The resources you should be using
Kubernetes can seem daunting. With a fresh cluster and limited experience, you face a salvo of advice, best practice, “must-haves,” guides, and documentation. Before you know it, you’re drowning and you’re grasping for anything.
Well, fear not. I’ve been building applications in Kubernetes for a little while, and I’ve assembled a list of resources to use. These resources will help you comply with security, resilience, and availability best practices. If you deploy something to your cluster, you should at least consider using these resources. I’ve taken them step-by-step, so you can see why each resource is added and the value you stand to gain. I’ve also sprinkled some of my own experience on top. Let’s get to it!
Deployment
The Deployment resource is your basic building block. It’s right at the foundation of your application. It’s so foundational, people might be wondering how they could deploy an application without it. Well, run this command locally and check what resources are created:
kubectl run nginx --image=nginx --restart=Never
You’ll find that only a Pod exists. People have done this before as part of their CI/CD. It seems silly, but as a first pass, this doesn’t seem insane. Get something running in the simplest way possible.
Why a deployment?
With a deployment, you can declaratively state how many instances of your pod you would like, define rollout strategies, gain self-healing behaviour, and much more. This gives you a scalable platform to deploy your application to.
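As a rough illustration, a minimal Deployment manifest might look like the sketch below. The name, the `app: nginx` label, the replica count, and the rollout numbers are all assumptions picked for the example, not values from a real cluster.

```yaml
# Minimal Deployment sketch; name, labels, and numbers are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  replicas: 3                # how many instances of the pod you would like
  strategy:
    type: RollingUpdate      # replace pods gradually during a rollout
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  selector:
    matchLabels:
      app: nginx             # must match the pod template labels below
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
```

If a pod from this Deployment dies, the controller creates a replacement to get back to three replicas, which is where the self-healing comes from.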
When shouldn’t you use a deployment?
The main time to not use a Deployment is when you’re deploying something that doesn’t need to run all the time, like a Job. In which case, your pod will be controlled by a Job resource. I’ve yet to see a use case where you require only a lone Pod resource, and I’d be suspicious of anyone who says they do!
Make use of affinity!
There are lots of types of affinity, and the Kubernetes documentation covers them all. A little gem that I regularly use is podAntiAffinity. This tells Kubernetes that no two of your pods should be on the same node. In the event of a node failure, you won’t lose all of your pods at once.
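You can make this a should rather than a must (preferredDuringSchedulingIgnoredDuringExecution rather than requiredDuringSchedulingIgnoredDuringExecution). This means your pods aren’t guaranteed to be on separate nodes, but it also means that autoscaling will probably be faster, since there’s less chance of waiting for a new node to spin up.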
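A sketch of the "should" flavour, assuming the pods carry a hypothetical `app: nginx` label. This is a fragment that would sit under `spec.template.spec` in a Deployment, not a complete manifest.

```yaml
# Fragment of a pod template (spec.template.spec); the label is illustrative.
affinity:
  podAntiAffinity:
    # "should": the scheduler tries to spread pods but may still co-locate them.
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: nginx
          # One pod per node: the topology domain is the node's hostname.
          topologyKey: kubernetes.io/hostname
    # "must" alternative: requiredDuringSchedulingIgnoredDuringExecution takes
    # a list of terms (labelSelector + topologyKey) with no weight.
```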
Be careful with resource scaling
When you declare your Deployment, you can define the resources it uses. If you don’t define any, your pod runs with a BestEffort quality of service. Kubernetes has no idea how much power it needs, so it just puts it anywhere. This is bad, bad news. BestEffort pods can starve neighbouring containers, and they are first in line for eviction when a node runs short of resources.
To make your cluster as predictable as possible, I always advocate for a Guaranteed quality of service, which you get by setting CPU and memory requests equal to their limits on every container. This means that Kubernetes knows the exact amount of resources your pod will need. Sure, your pods won’t always use that space, but they also won’t grow into space that another pod needs.
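For illustration, a container spec along these lines should land in the Guaranteed class, because requests and limits are identical for both CPU and memory. The actual numbers are placeholders, so measure your own workload before copying them.

```yaml
# Fragment of a pod template (spec.template.spec); the values are placeholders.
containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: 250m
        memory: 128Mi
      limits:
        cpu: 250m        # equal to the request
        memory: 128Mi    # equal to the request
```

Running `kubectl get pod <pod name> -o jsonpath='{.status.qosClass}'` shows which class a running pod was actually given.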
Pod Disruption Budget (PDB)
So, you’ve got a deployment across a few nodes and you’re happy with that. What if someone starts performing node maintenance? If your team is competent, they’ll be making use of the kubectl drain command. A typical command looks like this:
kubectl drain <node name> --ignore-daemonsets --force --delete-emptydir-data
This will evict everything that’s currently running on the node. The problem? If you’ve got a pod running on that node, you’ve just lost an instance of your app. If that’s an HTTP API, you’ve potentially impacted response times and high availability.
Why a PDB?
A PDB will ensure that a minimum number of your pods are running at any given time. It will actually block a kubectl drain until your new pod has spun up. This helps to create not only zero-downtime node maintenance, but also zero-impact node maintenance.
A typical PDB states how much disruption your application can tolerate. For the new nginx pod, we’d state that it can handle a disruption of one pod. If someone gets a little liberal with the kubectl drain command, the PDB will block it until your new pod has recovered elsewhere.
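A minimal sketch of such a PDB, assuming the pods are labelled `app: nginx` (a hypothetical label) and that a disruption of one pod maps to `maxUnavailable: 1`:

```yaml
# PodDisruptionBudget sketch; the name and label are illustrative.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx
spec:
  maxUnavailable: 1      # at most one matching pod may be evicted at a time
  selector:
    matchLabels:
      app: nginx
```

The alternative is `minAvailable`, which expresses the same budget from the other direction. Which one reads better tends to depend on whether your replica count changes often.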
Think before making your PDB bulletproof!
It might be tempting to demand that you tolerate absolutely no interruptions. The upshot of this is going to be more of a headache for engineers who are performing node maintenance. You don’t want to make a headache for the people who keep your application safe. Designing your application to be fault-tolerant is a basic standard of modern software engineering — you should allow for a little disruption.
Horizontal Pod Autoscaler (HPA)
So, node maintenance is super quick. Your application is self-healing and grouped sensibly inside of a deployment. Pack up and go home? Hell no! Let’s take a look at this chart of network traffic that I’ve stolen from the internet:
[Figure: chart of network traffic over time, showing sudden spikes]
How are you going to handle these sudden increases in traffic? How are you going to survive your marketing department and their new stunt: “Our products are 10% of their normal price, only for the next hour”…?! Great.
Why an HPA?
An HPA will enable your application to scale out on CPU, memory, or any custom metric you like. Much like an AutoScalingGroup in AWS, it will automatically spin up new pods, up to a preset maximum. This will enable you to quickly handle spikes in network traffic without overloading and ultimately destroying your running pods.
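As a sketch, an HPA that scales on CPU could look like the following. It assumes a Deployment named `nginx` whose containers declare CPU requests, plus a metrics source such as metrics-server in the cluster. The replica bounds and the 70% target are arbitrary starting points.

```yaml
# HorizontalPodAutoscaler sketch; names and thresholds are illustrative.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 3
  maxReplicas: 10            # the preset maximum
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percentage of the pods' CPU request
```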
But be careful how you scale
Goldratt’s Theory of Constraints tells us that our system can only move as fast as its greatest constraint. If your application is backed by a database that’s running at 99%, more pods aren’t going to fix a thing. Most likely, they’re going to cause even more outages.
And make sure you experiment a little
Doubtless, you’ll find your initial attempts at autoscaling a little slow. It’ll miss the spike by a few seconds. Then you’ll make it trigger-happy and every request will spin up new pods. It takes some work to get this right.
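One knob worth experimenting with is the `behavior` section of an `autoscaling/v2` HPA. The fragment below is only a sketch of the idea (scale up quickly, scale down cautiously), and the numbers are guesses rather than recommendations.

```yaml
# Fragment of a HorizontalPodAutoscaler (autoscaling/v2); values are guesses.
spec:
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0     # react to spikes immediately
      policies:
        - type: Percent
          value: 100                    # at most double the pod count...
          periodSeconds: 15             # ...every 15 seconds
    scaleDown:
      stabilizationWindowSeconds: 300   # wait five minutes before shrinking
```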
Role-Based Access Control (RBAC) and Service Account (SA)
Your pod is looking pretty sweet these days! It wasn’t long ago that a process failure would have killed your application outright. Now it can withstand most of what the world throws at it. What’s next?
Well, let’s imagine that someone gains unauthorised access to your container. You’ve opened up a port, or you’ve gotten drunk and pushed out the SSH password credentials onto Twitter in a senseless act of corporate violence. Either way, as it stands, what can a hacker do?
Well, whatever the hell they like! If they can run some commands, they can wreak all sorts of havoc within your environment. Delete pods, stateful sets, read secrets — the world is their oyster.
Why RBAC & SA?
A Service Account provides some internal permissions for your application. If your application doesn’t need to do anything to the cluster, it’s not going to need any permissions, is it?
Kubernetes has you covered for this. Provided you’ve got RBAC set up inside your cluster, deploying your application will automatically give your pod the default SA for that namespace, which has next to no permissions. This means you don’t need to manually declare one unless your pod needs to do something with the Kubernetes API.
InfoSec experts love to talk about the Principle of Least Privilege. It’s an easy one to describe: give something only the permissions it needs. Nothing more, nothing less. Common sense, right? Unfortunately not. The “just in case” mindset is common here. It doesn’t need to edit pods right now, but it might in the future, right?
The answer is simple. If it doesn’t need it, don’t give it. If it needs it, introduce it. Make a deal with yourself right now. From now on, thou shalt only give permissions to those that need it…desperately. Needs it like Microsoft needs an acquisition and Apple needs to continue to exploit the labour of third world workers to further increase their already grotesque profit margins.
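If a pod really does need the Kubernetes API, least privilege might look something like this sketch. It assumes a hypothetical application that only needs to read pods in the `default` namespace. Every name here is made up for illustration.

```yaml
# Least-privilege sketch: a dedicated SA that can only read pods.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
  - apiGroups: [""]              # "" is the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: nginx-pod-reader
  namespace: default
subjects:
  - kind: ServiceAccount
    name: nginx
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader
```

The pod opts in with `serviceAccountName: nginx` in its spec. For pods that never talk to the API, setting `automountServiceAccountToken: false` keeps the token out of the container altogether.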
Network Policy
Your pod has limited permissions, self-heals, autoscales, is highly available, and it runs like a boss. What’s left!? Well, once again, let’s assume you’ve been on the whiskey again. You let out some SSH credentials, and before you know it, your container has been compromised. Seriously, lay off the sauce!
It’s OK, your attacker can’t do much from your pod. But…what if this malicious user can get onto another pod? I wonder what ServiceAccount the other pod has?!
Why a NetworkPolicy?
A NetworkPolicy will block traffic in and out of your pod. It behaves a little like a firewall rule, or the bouncer at a nightclub. If you’re not on the list, you ain’t coming in. Except, to keep the analogy consistent, the bouncer isn’t letting anyone out of the club either…which is a little creepy. So, more like a prison guard.
A NetworkPolicy is going to make sure that only intentional traffic is possible. This creates a bit of an overhead for you or your engineers, but it’s non-negotiable. You can’t have pods reaching out into the universe whenever they feel like it. If they can do it, an attacker can do it. It might not be convenient, but damn it’s useful!
But it’ll block all traffic… including DNS
If you read this and decide to go all Fort Knox on your pod, guess again. If you create a policy that denies all ingress and egress for your pod, you’re going to run into some headaches.
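For reference, a Fort Knox policy of this kind could be sketched as below. The policy name and the `app: nginx` label are assumptions. Listing both policy types with no rules underneath means nothing is allowed in or out of the selected pods.

```yaml
# Deny-all sketch for one application's pods; name and label are illustrative.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nginx-deny-all
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
    - Ingress      # no ingress rules listed, so all inbound traffic is denied
    - Egress       # no egress rules listed, so all outbound traffic is denied
```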
It turns out that your pod needs to be able to communicate with your DNS server. DNS is a remarkably useful feature, and, while I like security, I like DNS more. So, how do you allow only DNS?
- protocol: UDP
  port: 53
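That UDP rule is only one piece of a policy. A fuller sketch, assuming your cluster DNS listens on port 53 and your pods carry a hypothetical `app: nginx` label, might be:

```yaml
# DNS-only egress sketch; name and label are illustrative.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: nginx-allow-dns-only
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: nginx
  policyTypes:
    - Ingress
    - Egress
  egress:
    # No "to" block, so port 53 is allowed to any destination.
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP      # DNS falls back to TCP for large responses
          port: 53
```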
This will allow your pod to run, but it won’t be able to talk to a single external service. If this is all good, then you’re safe.
And think about your default behaviours too
If you want to make sure that no one can deploy an open pod by default, you can set up a NetworkPolicy with an empty pod selector that’ll clear up any stragglers. A policy like that only applies to the namespace it lives in (the default namespace in our case), so be aware of that. This will ensure that anyone who hasn’t set up their NetworkPolicy resources will need to or they won’t get far. Security by blackmail: powerful stuff!
You’ve got a pod that self-heals, survives node maintenance with minimal disruption, autoscales, has minimal permissions, and has locked down network access. Imagine the amount of work it would have taken to implement this in a more classical architecture. A few YAML files, a few bad jokes (sorry about those), and you’re away.