When I started down the path of writing this post about Kubernetes at Earnest, I quickly realized that it was a rather large topic. There are a lot of variables when maintaining any stable infrastructure stack. For Kubernetes we could be talking about any number of things:
- Cluster life cycle management on AWS
- Authentication and authorization
- High availability¹
- Load balancing
- Service discovery
- Monitoring, logging, and alerting¹
For this edition, I decided to focus on one of the most basic operations in k8s: deploying a container.
Like most companies making a foray into microservices, Earnest started with a monolithic stack on Amazon Web Services (AWS). A few of us were at KubeCon 2015, and when the question came up of who was operating Kubernetes (k8s) in production, only a handful of companies were doing so at the time. The Earnest Infrastructure team has been running Kubernetes successfully² in “production” since then. We went from two services in 2015 to the more than 80 jobs and services that we currently run per environment. Along the way, we had a few incidents and a lot of fun learning about managing highly available services on Kubernetes.
Circa k8s v1.1, we decided to deploy containers as a ReplicationController (rc). So that applications could take advantage of environment-specific configuration, we provided an environment variable called APP_ENV. We run our clusters across three different availability zones on AWS, and we wanted high availability during k8s rolling-update deployments and outages, so we decided to run two pods per microservice. We wanted our deployments to be deterministic, so we made a conscious decision to never use the latest container version tag. Here is an example of what our initial microservice rc looked like.³ (Many of these examples have been simplified for brevity.)
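A minimal sketch of a ReplicationController matching that description could look like the following. The service name, labels, image registry, version tag, port, and APP_ENV value are all hypothetical placeholders, not Earnest's actual configuration.

```yaml
# Sketch only: every name, the image, the port, and the APP_ENV value are hypothetical.
apiVersion: v1
kind: ReplicationController
metadata:
  name: example-service
spec:
  replicas: 2                    # two pods per microservice for availability
  selector:
    app: example-service
  template:
    metadata:
      labels:
        app: example-service     # must match the selector above
    spec:
      containers:
        - name: example-service
          # Pinned version tag rather than "latest", so deploys are deterministic
          image: registry.example.com/example-service:1.4.2
          ports:
            - containerPort: 8080
          env:
            - name: APP_ENV      # lets the app pick environment-specific config
              value: staging
```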
The k8s controller manager is responsible for keeping an eye on rcs to ensure that services are operating as expected, and if they’re not at a desired state, to then take appropriate actions to get the service to such a state. For this to happen, the controller manager needs to know what makes a service “live”. How do we know, or rather, how will the k8s controller manager know if the rc for the application we just deployed is working as expected?
Enter the k8s livenessProbe, defined under spec.template.spec.containers, which checks to make sure that the port we said the pod should be listening on is accepting connections.
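As a rough illustration, a port-level check like this can be expressed with a tcpSocket probe. This fragment assumes the tcpSocket handler was the one used, and the container name, image, and port are hypothetical.

```yaml
# Fragment of spec.template.spec; name, image, and port are hypothetical.
containers:
  - name: example-service
    image: registry.example.com/example-service:1.4.2
    ports:
      - containerPort: 8080
    livenessProbe:
      tcpSocket:
        port: 8080   # the probe passes if a TCP connection to this port can be opened
```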
But how do you know that the service is doing what it’s supposed to do just because it’s listening on a port?
We realized this pretty quickly as well and added an /available endpoint to all our services and service templates. This is a standard endpoint that we expect every Earnest service to provide, responding with HTTP status 200 to all GET requests. With that in place, our livenessProbe changed from a port check to an HTTP GET against /available.
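A sketch of what an HTTP-based livenessProbe against that endpoint might look like is shown here. The port number is a hypothetical placeholder. Kubernetes treats any response code from 200 up to, but not including, 400 as a success for httpGet probes.

```yaml
# Fragment of a container spec; the port is hypothetical.
livenessProbe:
  httpGet:
    path: /available   # the standard endpoint described above
    port: 8080
```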
Resource Requests and Resource Limits
As the number of microservices we deployed increased, we began to experience noisy neighbor problems, where poorly performing services sometimes caused other pods on the same k8s node to be evicted due to resource constraints. We run our k8s workloads on similar EC2 instances to make it easier for us to take advantage of AWS reserved instances. So we also started thinking about instance density, both to optimize for costs and to make it easier for the scheduler to make predictable scheduling and eviction decisions.
We looked at generic usage patterns across all our services and came up with a set of defaults⁴. We applied k8s resource limits at spec.template.spec.containers across all our services based on these defaults. Resource limits are set at the container level rather than the pod level, so a pod with multiple containers carries a separate set of limits for each container.
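To make the per-container point concrete, here is a sketch of a pod spec with two containers, each carrying its own resources block. The container names, images, and all CPU and memory values are hypothetical placeholders and not the defaults Earnest settled on.

```yaml
# Fragment of spec.template.spec; names, images, and all values are hypothetical.
containers:
  - name: example-service
    image: registry.example.com/example-service:1.4.2
    resources:
      requests:          # what the scheduler reserves on a node for this container
        cpu: 250m
        memory: 256Mi
      limits:            # hard ceiling enforced at runtime
        cpu: 500m
        memory: 512Mi
  - name: example-sidecar
    image: registry.example.com/example-sidecar:0.3.0
    resources:           # each container declares its own requests and limits
      requests:
        cpu: 50m
        memory: 64Mi
      limits:
        cpu: 100m
        memory: 128Mi
```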
With the advent of the k8s Deployment controller, we realized that we could use several of the features and abstractions it provides. The ability to record deployment history, query rollout status, ensure service availability during a deploy, and control the rollout strategy are all crucial to us and our CI/CD system. We also introduced some health check defaults: initialDelaySeconds adds a small delay before the first livenessProbe check, and we pair it with timeoutSeconds to follow the idea of checking frequently and failing quickly. Our k8s Deployment service file combined all of these previous and new assumptions.
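Pulling the pieces described so far into one place, a Deployment along these lines might look like the sketch below. The maxUnavailable value of 0 comes from this post. The names, image, port, APP_ENV value, resource numbers, probe timings, periodSeconds, and maxSurge are assumptions chosen for illustration.

```yaml
# Sketch only: names, image, port, resource values, and probe timings are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0        # never drop below the desired replica count during a deploy
      maxSurge: 1              # assumed; must be greater than 0 when maxUnavailable is 0
  template:
    metadata:
      labels:
        app: example-service
    spec:
      containers:
        - name: example-service
          image: registry.example.com/example-service:1.4.2
          ports:
            - containerPort: 8080
          env:
            - name: APP_ENV
              value: staging
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
          livenessProbe:
            httpGet:
              path: /available
              port: 8080
            initialDelaySeconds: 15   # small delay before the first check
            periodSeconds: 5          # assumed: check frequently
            timeoutSeconds: 1         # fail quickly
```

With a Deployment in place, rollout state and history can be queried with kubectl rollout status deployment/example-service and kubectl rollout history deployment/example-service.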
This k8s deployment template covered most of our use cases. We added more and more services, and eventually saw a pattern emerge with our JVM-based (Java / Scala) services. The JVM is CPU-intensive during startup. With the CPU resource limits we had in place, some critical endpoints were slow to respond to the first few requests even after service startup was complete. We also began experiencing service timeouts when pods started receiving traffic before application startup was complete.
We took advantage of the k8s readinessProbe feature to solve these problems. We introduced a /ready endpoint that is in the critical execution path for all our applications. Combined with our previously declared maxUnavailable value of 0, this ensures that k8s pods are not put into service until the boot process is complete and they are able to function as intended.
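A readinessProbe for such an endpoint could be sketched as follows. The port and timing values are hypothetical. Until this probe succeeds, the pod is not added to the Service endpoints, so it receives no traffic through the Service.

```yaml
# Fragment of a container spec; port and timings are hypothetical.
readinessProbe:
  httpGet:
    path: /ready             # exercises the application's critical execution path
    port: 8080
  initialDelaySeconds: 15    # give the JVM time to start before the first check
  timeoutSeconds: 1
```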
With the plethora of features that k8s currently offers, it may feel overwhelming when you’re first starting to deploy a microservice. Earnest started with a minimal set of features. As our needs and complexity grew, we augmented that feature set to solve real problems we encountered. The practice of implementing a feature, validating it in a production environment, and gaining insight into its real-world behavior before iterating has worked out well for us so far. There are a lot of additional k8s features we are exploring and integrating into our stack, including pod lifecycle hooks, vertical and horizontal pod autoscaling, taints and tolerations, and pod disruption budgets. We hope to share those in future posts.
¹ Applies to a Kubernetes cluster and the microservices that run on it.
² Which enterprising Infrastructure team never had any incidents, amirite?
³ Examples provided are based on the latest k8s API version; feel free to ignore the apiVersion.
⁴ We ran into some interesting problems with JVM-based services written in Java/Scala, where some of these principles don’t apply.