Over-Provisioned and Over-Permissioned Containers & Kubernetes
Written By: Kendall Miller
If you live in operations, you know every single decision is one in a seemingly endless series of trade-offs. The easiest-to-use systems are completely accessible to anyone from anywhere, but they are also the least secure. Turn the dials up to 11 for one thing and you've turned them down to 1 for something else; there is always a trade-off.
If there were an obvious solution for every single decision, experts would get paid a lot less to help people navigate these trade-offs. Usually, however, there isn’t an obvious “one size fits all” because the needs of every organization are unique to that specific place, time, and use case.
We often bias toward overly liberal settings, but getting things right is worth the effort. That holds true throughout the IT world, and it is a real problem with containers and Kubernetes. These environments are dynamic and complex: what's been provisioned or permissioned changes constantly, and configurations can quietly waste money or introduce risk.
I have a friend who works for a company with a hard line of "if you build it, you own it." This means: don't come up with an idea for a service or feature you're going to build if you don't own it all the way through to production, including the pager and after-hours response to any issues it has. When I asked whether that policy had ever kept people from developing something great because they didn't want to manage it after hours, I was assured that it probably had.
If I were a developer with a great idea that's costly to maintain, I wouldn't want to own a thing that's going to be emotionally expensive to manage. So I'd either not build it, or, if I did, I'd over-provision the h*ck out of it. I'd allocate way more resources, way more instances, memory, compute, and so on, and make sure the service has so much capacity for scale and traffic that it's never going to go down.
As the owner, it's my job to make sure it's up and running. And while I have to ask permission to expense a $50 book that will help me do my job better, I don't have to ask anyone's permission to spin up $200,000 worth of additional instances in our production infrastructure (if your organization is like so many out there). So I'm going to do what makes me most comfortable and ensures the thing I've built stays up at all costs (literally).
This is, quite literally, the definition of over-provisioned. And am I ever going to audit my logs and usage reports to see whether I could cut the spend in half and still keep a comfortable buffer? Probably not, because I don't have the tools to help me do so, and I'm not incentivized to do it anyway.
This is what leads to over-provisioning being the default in so many places.
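As a sketch of what this looks like in a Kubernetes pod spec (the numbers here are hypothetical, not from any real workload), compare a "stays up at all costs" container with one sized from observed usage:

```yaml
# Over-provisioned (hypothetical numbers): the scheduler reserves
# everything in "requests" whether the container uses it or not.
resources:
  requests:
    cpu: "4"        # actual usage peaks around 250m
    memory: 8Gi     # actual usage peaks around 600Mi
  limits:
    cpu: "8"
    memory: 16Gi
---
# Right-sized: requests near observed peak usage, limits leaving
# a comfortable buffer for bursts.
resources:
  requests:
    cpu: 250m
    memory: 768Mi
  limits:
    cpu: "1"
    memory: 1Gi
```

The first spec reserves roughly an order of magnitude more than the workload needs, and on most clusters nobody ever circles back to shrink it.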
The problem with being too liberal in our defaults doesn't fall just on the provisioning side; it can also fall on the permissioning side.
Interestingly, I see this happen much less when people are the ones being given permissions (people tend to be overly restricted within organizations, thanks to old HR scars carried by someone on the operations/IT team), and much more when computers are involved. For whatever reason, we've made it super common to give a senior leader in our organization almost zero permissions for accessing things (often they don't need that access, don't get me wrong), but we give our applications, containers, instances, and networks all the access we can.
It's easy to think, "it shouldn't be difficult to access that, so let's make it open." This leads to one of the most common situations we see in service configurations: containers running as root in a cluster at an astonishing rate. The problem is, if a bad actor finds a single part of your infrastructure that is over-permissioned, they can get just about anywhere else. You may have your database perfectly locked down, but if a service with access to that database has permissions that let anyone in, your database is as good as compromised.
Over-permissioned computers are a super common single point of failure leading to security incidents.
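A minimal sketch of what locking a container down looks like, using standard Kubernetes `securityContext` fields (the pod and image names are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app            # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/app:1.0   # hypothetical image
      securityContext:
        runAsNonRoot: true               # refuse to start if the image runs as root
        allowPrivilegeEscalation: false  # block setuid-style privilege gains
        readOnlyRootFilesystem: true     # container can't rewrite its own filesystem
        capabilities:
          drop: ["ALL"]                  # drop every Linux capability by default
```

Settings like these turn "all the access we can" into "only the access this workload actually needs," so a compromised container is a dead end rather than a springboard.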
Too much of a good thing
To get anything done, you need the resources and the permissions to do it. But as with anything, the right tool goes a long way. Think of digging the trench system in a yard to install a sprinkler system: it's way easier to dig a trench with a shovel than with your hands, and better still with a machine custom-built for trench digging. On the other hand, it's probably not a great idea to hand everyone you know a backhoe and trust them to take it where it needs to go and dig the right-size trenches for a sprinkler system.
There is a right size, and a right leniency in permissioning.
Finding the right balance
Finding that balance can be hard without the right software to show you what you’re doing wrong.
Fairwinds Insights makes it easy to know where you're over-provisioned and over-permissioned. It also collects data over time, so you can see how you're doing today compared to last month. Get your settings just right, and avoid spending a fortune, either today on over-provisioning or in the future on a massive security incident caused by over-permissioning.