The 12-factor application methodology provides well-defined guidelines for developing microservices and is a commonly used pattern for running, scaling, and deploying applications. On the IBM Cloud Private platform, we follow the same 12-factor guidelines for developing containerized applications. See Kubernetes & 12-factor apps to learn how we apply and group the 12-factor practices that are supported by the Kubernetes model of container orchestration.
As we reflected on the principles of developing containerized microservices running in Kubernetes, we found that while the 12-factor application guidelines are spot-on, the following 7 factors are equally essential for a production environment: Observable, Schedulable, Upgradable, Least privilege, Auditable, Securable, Measurable.
Let’s discuss why these additional factors are needed and what each one means, starting with Factor XIII:
Factor XIII: Observable
Apps should provide visibility about current health and metrics.
Distributed systems can be a challenge to manage because many microservices, each a moving part, must work together for the application to function. If one microservice fails, the system needs to detect the failure and recover automatically. Kubernetes comes to the rescue with capabilities such as readiness and liveness probes.
Kubernetes uses readiness probes to determine whether the application is ready to accept traffic. If a readiness probe starts to fail, Kubernetes stops sending traffic to the pod until the probe returns a success status again.
For example, suppose you have an application composed of three microservices: front end, business logic, and databases. For this application, your front end should have a readiness probe that checks whether the business logic and databases are ready before it accepts traffic. You want to check the readiness of all your dependencies in your readiness probe.
As the following animated image shows, no request is sent to the application instance until the readiness probe returns success:
You can use an HTTP, command, or TCP probe, and you can control the probe configuration: how often the probe runs, what the success and failure thresholds are, and how long to wait for a response. One setting is especially important when you use liveness probes: initialDelaySeconds. Set it so that the probe doesn’t start until the app has had time to start. If the delay is too short, the liveness probe fails before the app is up, and Kubernetes restarts the application constantly. See the following YAML snippet:
readinessProbe:
  # an HTTP probe
  httpGet:
    path: /readiness
    port: 8080
  initialDelaySeconds: 20
  periodSeconds: 5
Kubernetes uses liveness probes to check whether your application is alive or dead. If your application is alive, Kubernetes leaves it alone. If your application is dead, Kubernetes kills the container and starts a new one to replace it. This validates the need for microservices to be stateless and disposable (Factor IX: Disposability). See the following animated image, where Kubernetes restarts the application once the liveness probe fails:
A great benefit to using these probes is that you can deploy your applications, in any order, without worrying about dependencies.
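To make the probe settings discussed above concrete, here is a minimal sketch of a pod that defines both a readiness and a liveness probe. The pod name, image, and the /healthz path are hypothetical and not taken from the original article; only the readiness values mirror the earlier snippet.

```yaml
# Illustrative sketch only: name, image, and /healthz are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: frontend                        # hypothetical name
spec:
  containers:
    - name: frontend
      image: example.com/frontend:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      readinessProbe:                   # gates traffic to the pod
        httpGet:
          path: /readiness
          port: 8080
        initialDelaySeconds: 20
        periodSeconds: 5
      livenessProbe:                    # triggers a container restart on failure
        httpGet:
          path: /healthz                # assumed health endpoint
          port: 8080
        initialDelaySeconds: 30         # wait for startup before the first check
        periodSeconds: 10               # probe every 10 seconds
        timeoutSeconds: 2               # how long to wait for a response
        failureThreshold: 3             # restart after 3 consecutive failures
```

Keeping the liveness delay at least as long as the app's typical startup time is what avoids the restart loop described earlier.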
However, we found that probes alone are not enough for a production environment. Applications usually have application-specific metrics that need to be monitored, for example, transactions per second. Customers set up thresholds and alerts for these application-specific metrics. IBM Cloud Private fills this gap with a secured monitoring stack composed of Prometheus and Grafana, enabled with a role-based access control model. See IBM Cloud Private cluster monitoring for more information.
Prometheus scrapes targets from the metrics endpoint. Your application needs to define the metrics endpoint by using the following annotation:
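As a sketch of what such an annotation can look like, the following Service uses the widely used prometheus.io/* annotation convention. This is an assumption: the annotations work only when the Prometheus scrape configuration contains matching relabeling rules, and the exact keys honored by IBM Cloud Private may differ. The Service name, port, and path are hypothetical.

```yaml
# Illustrative sketch only: assumes the common prometheus.io/* convention.
apiVersion: v1
kind: Service
metadata:
  name: my-app                      # hypothetical name
  annotations:
    prometheus.io/scrape: "true"    # opt this endpoint in to scraping
    prometheus.io/port: "8080"      # assumed port that serves metrics
    prometheus.io/path: "/metrics"  # assumed metrics path
spec:
  selector:
    app: my-app
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080
```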
Prometheus then discovers the endpoint automatically and scrapes metrics from it, as shown in the following animated image:
Factor XIV: Schedulable
Applications should provide guidance on expected resource constraints.
Suppose management picks your team to experiment with a project on Kubernetes. Your team works hard setting up the environment, and you end up with an application that runs with exemplary response time and performance. Another team then follows your lead, creates their own application, and hosts it in the same environment. When the second application goes live, the original application starts experiencing performance degradation. When you start to troubleshoot, the first place to look is the compute resources (CPU and memory) assigned to your containers. It’s very likely that your containers are starving for compute resources. That leads to the question: How can you ensure compute resources for your applications?
Kubernetes has a great capability that allows you to set requests and limits for containers. Requests are guaranteed: if a container requests a resource, Kubernetes schedules it only on a node that can provide that resource. Limits, on the other hand, ensure that a container never goes above a certain value.
See the following YAML snippet for setting compute resources:
resources:
  requests:
    memory: "64Mi"
    cpu: "150m"
  limits:
    memory: "64Mi"
    cpu: "200m"
Another effective capability for administrators in a production environment is setting quotas for namespaces. If a quota is set for compute resources, Kubernetes does not provision containers in that namespace that do not have requests and limits set. As the following image shows, a resource quota is set for namespaces:
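In manifest form, a namespace quota could look like the following sketch. The namespace, the quota name, and all values are hypothetical and chosen only for illustration.

```yaml
# Illustrative sketch only: namespace, name, and values are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota        # hypothetical name
  namespace: team-a         # hypothetical namespace
spec:
  hard:
    requests.cpu: "2"       # sum of CPU requests across all pods
    requests.memory: 4Gi    # sum of memory requests across all pods
    limits.cpu: "4"         # sum of CPU limits across all pods
    limits.memory: 8Gi      # sum of memory limits across all pods
    pods: "20"              # maximum number of pods in the namespace
```

With a quota like this in place, a pod that omits CPU or memory requests and limits is rejected at admission unless the namespace also defines defaults, for example through a LimitRange.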
Factor XV: Upgradable
Apps must upgrade data formats from previous generations.
Security or feature patches are often needed for applications running in production. It is important for production applications to upgrade without service disruption. Kubernetes provides rolling updates for applications to upgrade with no service outage. With rolling updates, you can update one pod at a time without taking down the entire service. See the following animated image of a second version of an application, which can be rolled out with no downtime:
See the following YAML snippet:
minReadySeconds: 5
strategy:
  # indicate which strategy you want for rolling update
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 1
Pay attention to maxUnavailable and maxSurge when enabling the rolling update strategy.
- maxUnavailable is an optional field that specifies the maximum number of pods that can be unavailable during the update process. Though it’s optional, you want to set the value to ensure service availability.
- maxSurge is another optional but critical field that tells Kubernetes the maximum number of pods that can be created over the desired number of pods.
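To see how the two fields interact, consider the following sketch of a complete Deployment. The name, image, and replica count are hypothetical. With 4 desired replicas, maxSurge: 1 allows at most 5 pods to exist during the rollout, and maxUnavailable: 1 keeps at least 3 of them available.

```yaml
# Illustrative sketch only: name, image, and replica count are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # hypothetical name
spec:
  replicas: 4                   # desired number of pods
  minReadySeconds: 5            # a new pod must stay ready this long to count as available
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1               # at most 4 + 1 = 5 pods during the rollout
      maxUnavailable: 1         # at least 4 - 1 = 3 pods stay available
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:2.0   # hypothetical new version
```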
Factor XVI: Least Privilege
Containers should be running with the least privilege.
Not to sound pessimistic, but you should think of every permission you allow in your container as a potential attack vector, as seen in the next image. For example, if your container is running as root, then anyone with access to your container can inject a malicious process into it. Kubernetes provides Pod Security Policies (PSPs) that you can use to restrict access to your filesystem, host ports, Linux capabilities, and more. IBM Cloud Private provides a set of out-of-the-box PSPs that can be associated when provisioning containers in a namespace. See more details at Using namespaces with Pod Security Policies.
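On the workload side, least privilege is expressed through the pod and container securityContext. The following sketch shows the kind of settings a restrictive policy typically requires. It is not one of the IBM Cloud Private policies; the name, image, and user ID are hypothetical.

```yaml
# Illustrative sketch only: name, image, and user ID are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: least-privilege-demo            # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true                  # refuse to start containers that run as root
    runAsUser: 1000                     # assumed non-root user ID
  containers:
    - name: app
      image: example.com/my-app:1.0     # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false # block setuid-style privilege gains
        readOnlyRootFilesystem: true    # the container cannot write to its root filesystem
        capabilities:
          drop:
            - ALL                       # drop all Linux capabilities
```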
Factor XVII: Auditable
Know what, when, who and where for all critical operations.
Auditability is critical for any action performed on the Kubernetes cluster or in the application. For example, if your application handles credit card transactions, you need to enable auditing to keep an audit trail of each transaction. IBM Cloud Private leverages the cloud-agnostic industry standard format, Cloud Auditing Data Federation (CADF). See more details at Audit logging in IBM Cloud Private.
A CADF event captures the following information:
- initiator_id: ID of the user that performed the operation
- target_uri: CADF specific target URI, (for example: data/security/project)
- action: The action being performed, typically: operation : resource_type
Factor XVIII: Securable (Identity, Network, Scope, Certificates)
Protect the app and its resources from outsiders.
This factor deserves its own article. Suffice it to say that applications need end-to-end security when running in production. IBM Cloud Private addresses the following (and more) security requirements for a production environment:
- Authentication: confirmation of identity
- Authorization: validation on what authenticated users can access
- Certificate management: management of digital certificates, including creation, storage, and renewal
- Data protection: security measures for data in transit and at rest
- Network security and isolation: prevent unauthorized users and processes from accessing the network
- Vulnerability Advisor: identify any security vulnerabilities in the images
- Mutation Advisor: identify any mutation in containers
You can learn more from the IBM Cloud Private Security guide.
Specifically, let’s talk about the certificate manager. The IBM Cloud Private certificate manager service is based on the open source cert-manager project from Jetstack. It is used to issue and manage certificates for services that run on IBM Cloud Private. It supports both self-signed and public certificates and is fully integrated with kubectl and role-based access control.
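Because certificates are managed as Kubernetes resources, requesting one comes down to applying a manifest with kubectl. The following sketch assumes the upstream cert-manager cert-manager.io/v1 API; the API group and version shipped with a given IBM Cloud Private release may differ. The names, namespace, DNS name, and issuer are all hypothetical.

```yaml
# Illustrative sketch only: assumes the upstream cert-manager v1 API.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-app-tls            # hypothetical name
  namespace: team-a           # hypothetical namespace
spec:
  secretName: my-app-tls      # Secret where the key pair is stored
  dnsNames:
    - my-app.team-a.svc       # assumed in-cluster DNS name
  issuerRef:
    name: my-ca-issuer        # hypothetical Issuer in the same namespace
    kind: Issuer
```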
Factor XIX: Measurable
Application usage should be measurable for quota or chargebacks.
At the end of the day, central IT has to handle the cost, as seen in the following image. The compute resources allocated to run the containers should be measurable, and the organizations using the cluster should be accountable for them. Make sure you follow Factor XIV: Schedulable. IBM Cloud Private provides metering, which collects the allocated compute resources for each container and aggregates them at the namespace scope for showback and chargeback.
I hope you found the topic interesting, checked off the factors you already use, and added any that you have not used.
Kubecon 2019 Shanghai talk: