Running a modern infrastructure on Kubernetes

Pradeep Chhetri
Feb 9, 2018 · 5 min read

At StashAway, we have been running Docker containers from the very first day. Initially, we were using Rancher as the container orchestrator, but as we grew, we decided to switch to Kubernetes (k8s) — mainly because of its rapidly growing ecosystem and wide adoption.

This post describes how we use k8s and its tooling stack to run our application in a production-grade environment.

Google Trends graph showing how interest in various container schedulers has changed over time.

All applications, whether stateless or stateful, need an environment with these fundamental necessities built in:

Service Discovery

If, for example, you are running Cassandra inside a container exposed through a k8s service named cassandra, the service IP address is available both as the environment variable CASSANDRA_SERVICE_HOST and through the domain name cassandra.default.svc.cluster.local.

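DNS-based service discovery is the more popular of the two, but it needs special care because some DNS client libraries cache lookups for a long time. For example, depending on its configuration, the JVM can cache resolved domain names forever; the networkaddress.cache.ttl security property controls this.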
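As a rough illustration of the two lookup paths, here is a minimal Python sketch. It assumes a service named cassandra in the default namespace; the helper names service_address and resolve_fresh are made up for this example. Kubernetes injects variables of the form NAME_SERVICE_HOST and NAME_SERVICE_PORT only for services that already existed when the pod started, which is one reason to keep the DNS name as a fallback.

```python
import os
import socket


def service_address(name, default_port, namespace="default", environ=None):
    """Return (host, port) for a k8s service, preferring injected env vars."""
    environ = os.environ if environ is None else environ
    # Kubernetes upper-cases the service name and turns dashes into underscores.
    prefix = name.upper().replace("-", "_")
    host = environ.get(f"{prefix}_SERVICE_HOST")
    port = environ.get(f"{prefix}_SERVICE_PORT")
    if host and port:
        return host, int(port)
    # Fall back to the cluster DNS name of the service.
    return f"{name}.{namespace}.svc.cluster.local", default_port


def resolve_fresh(hostname, port):
    """Resolve the name at call time instead of holding on to an old answer."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

Calling resolve_fresh right before each (re)connect, rather than once at startup, is one simple way to avoid the stale-address problem described above, at the cost of an extra lookup.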

Service Addressing

In the k8s world, this can be achieved easily via annotations using ExternalDNS. This incubator project takes care of registering a new (sub-)domain as soon as a new k8s service or ingress is created. It tracks the records it manages through an extra TXT record stored alongside the primary A record, which prevents accidental overwriting of existing records.
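To make the annotation approach concrete, here is a small sketch that builds a Service manifest carrying the ExternalDNS hostname annotation. The service name web and the domain web.example.com are placeholders, and the helper service_with_hostname is invented for this example; kubectl accepts the resulting JSON just like YAML.

```python
import json


def service_with_hostname(name, hostname, port=80):
    """Build a Service manifest that asks ExternalDNS to publish `hostname`."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "annotations": {
                # Annotation that ExternalDNS watches for on services.
                "external-dns.alpha.kubernetes.io/hostname": hostname,
            },
        },
        "spec": {
            "type": "LoadBalancer",
            "selector": {"app": name},  # assumes pods are labelled app=<name>
            "ports": [{"port": port, "targetPort": port}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(service_with_hostname("web", "web.example.com"), indent=2))
```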

Routing

Current Ingress controller implementations include Nginx Ingress (Nginx-based), Voyager (HAProxy-based) and Contour (Envoy-based). The first is the most mature and is the one we use (along with ELB) for all our traffic routing, but it provides only HTTP-based routing. For TCP-based routing, you’ll need to use Voyager. Contour is very interesting since it comes with all the benefits of Envoy, a service proxy designed specifically for modern cloud-native applications. Envoy has first-class support for gRPC and provides features like circuit breaking, which are not available in standard load balancers.
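For readers who have not seen one, an HTTP routing rule handed to an Ingress controller looks roughly like the manifest built below. This is a sketch under assumptions: it uses the networking.k8s.io/v1 API found in current clusters (clusters from the time of writing used extensions/v1beta1, which has a different backend layout), and the host, service name and helper http_ingress are placeholders.

```python
import json


def http_ingress(name, host, service, port, path="/"):
    """Build an Ingress that routes HTTP traffic for `host` to one service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # Assumes an ingress class called "nginx" exists in the cluster.
            "ingressClassName": "nginx",
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": path,
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(http_ingress("web", "web.example.com", "web", 80), indent=2))
```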

Monitoring

Prometheus is the leading open-source choice for monitoring your Kubernetes apps and cluster. It has built-in discovery for these k8s objects. Since monitoring without alerting is of little use, Alertmanager fills the gap with integrations such as Slack notifications.
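Once Prometheus has discovered a target, it scrapes a plain-text metrics endpoint from it. The sketch below renders a counter in that text exposition format using only the standard library; in a real application you would normally use an official client library instead. The metric name and the helper render_counter are illustrative, and label values are not escaped here.

```python
def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        if labels:
            # Sort labels so the output is stable between scrapes.
            pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_counter(
        "http_requests_total",
        "Total HTTP requests.",
        [({"method": "GET", "code": "200"}, 3)],
    ), end="")
```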

Most people use Prometheus along with Heapster, which can be integrated with many open-source monitoring solutions like InfluxDB and Riemann. Those who want fine-grained container-level metrics can add cAdvisor to their monitoring stack, too.

Logging

Fluent Bit is a lightweight alternative to Fluentd and a fully Docker- and Kubernetes-aware agent that can push these logs directly to Elasticsearch. It automatically adds Kubernetes labels and annotations to each log line. You can also integrate it with Slack to send notifications in case of any error or exception.
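The enrichment step can be pictured with the following sketch. It is not Fluent Bit's implementation, only an illustration of the idea: the key names under kubernetes are modelled on what its Kubernetes filter emits and should be treated as assumptions, and the pod metadata is made up.

```python
import json


def enrich(record, pod):
    """Return a copy of a log record with pod metadata attached."""
    enriched = dict(record)
    enriched["kubernetes"] = {
        "pod_name": pod["name"],
        "namespace_name": pod["namespace"],
        "labels": dict(pod.get("labels", {})),
        "annotations": dict(pod.get("annotations", {})),
    }
    return enriched


if __name__ == "__main__":
    pod = {"name": "api-5d9c7", "namespace": "default", "labels": {"app": "api"}}
    print(json.dumps(enrich({"log": "request failed", "stream": "stderr"}, pod)))
```

Having labels on every log line is what makes it possible to filter in Elasticsearch by application or team rather than by container ID.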

Deploying

Since Helm doesn’t provide a neat way to store secrets, we use Ansible Vault as their source of truth. We trigger the helm command line via Ansible using the ansible-helm module.
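The ansible-helm module itself is not shown here. As a rough idea of the kind of command such a wrapper ends up running, the sketch below assembles a helm upgrade --install invocation from values that are assumed to have been decrypted already (for example by Ansible Vault). The release name, chart path and helper helm_upgrade_command are placeholders.

```python
def helm_upgrade_command(release, chart, namespace, values):
    """Build the argv list for an idempotent Helm install-or-upgrade."""
    cmd = ["helm", "upgrade", "--install", release, chart, "--namespace", namespace]
    for key in sorted(values):
        # Note: values containing commas would need escaping for --set.
        cmd += ["--set", f"{key}={values[key]}"]
    return cmd


if __name__ == "__main__":
    secrets = {"db.password": "s3cret"}  # stand-in for decrypted vault data
    print(helm_upgrade_command("api", "./charts/api", "default", secrets))
```

The list can be handed to subprocess.run(cmd, check=True). Keep in mind that values passed with --set are visible in the process list, so a temporary values file is the safer choice for real secrets.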

One of the pain points of Helm is that someone needs to write these charts, which first requires understanding each of the YAML fields. Ksonnet aims to remove this pain point by dynamically generating Helm charts on demand.

SSL Certificates

Provisioning, installing, and updating these certificates can become cumbersome if it is not automated properly.

Automatic provisioning of Let’s Encrypt certificates for k8s ingresses can be done via kube-cert-manager. We chose this over kube-lego since it supports Let’s Encrypt DNS-based validation challenges, so it can issue certificates for applications hosted in a private network. It also takes care of renewing these certificates.
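DNS-based validation works for private applications because the proof of ownership lives in a public TXT record instead of on the application itself. The sketch below computes that record as the ACME specification (RFC 8555) describes the dns-01 challenge; it is not taken from kube-cert-manager, and the token and account thumbprint are dummy values.

```python
import base64
import hashlib


def dns01_record(domain, token, account_thumbprint):
    """Return (record_name, txt_value) for an ACME dns-01 challenge."""
    # Key authorization = challenge token + "." + account key thumbprint.
    key_authorization = f"{token}.{account_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # The TXT value is the unpadded base64url encoding of the SHA-256 digest.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"_acme-challenge.{domain}", value


if __name__ == "__main__":
    name, value = dns01_record("internal.example.com", "dummy-token", "dummy-thumbprint")
    print(name, "TXT", value)
```

The certificate authority only needs to query public DNS for this record, so the application never has to be reachable from the internet.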

The folks at Jetstack are developing another tool named cert-manager, which is pretty interesting since it will soon be able to use HashiCorp Vault as a certificate authority.

Conclusion

We are constantly on the lookout for great tech talent to join our team — visit our website to learn more and feel free to reach out to us!

StashAway Engineering

StashAway’s Engineering Blog