Taming Kubernetes Deployments with Lemmy

As we worked on proving the worth of building and deploying applications in Docker, with the end goal of running on Kubernetes, we needed to give developers the ability to configure and deploy their applications into various environments. Originally, the SRE team used Ansible to deploy new container builds to hosts while we proved their feasibility and worked with the platform team to streamline the build process.

Once we found a good pattern for building containers and the platform team was comfortable with it, we needed a better solution than Ansible for deploying new builds and, as a bonus, a way to handle autoscaling of instances. Since Ansible is itself written in Python, we decided to build a Flask-based web application that could be triggered from either a browser or a webhook to run a playbook that deploys the application. (N.B. Yes, the webhook method is a home-grown version of Ansible Tower.)
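A minimal sketch of that Flask wrapper, assuming a hypothetical `/deploy/<project>` route, payload shape, and playbook layout (none of which are described in the post); shelling out to `ansible-playbook` stands in here for Lemmy's use of the Ansible Python libraries:

```python
import subprocess

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/deploy/<project>", methods=["POST"])
def deploy(project):
    # The environment and build revision arrive from the browser form or webhook.
    payload = request.get_json(force=True)
    env = payload["environment"]
    sha = payload["sha"]
    # Run the project's playbook, passing the build to deploy as extra vars.
    # (Playbook path and variable names are assumptions for illustration.)
    result = subprocess.run(
        ["ansible-playbook", f"playbooks/{project}.yml",
         "--extra-vars", f"environment={env} build_sha={sha}"],
        capture_output=True, text=True,
    )
    status = 200 if result.returncode == 0 else 500
    return jsonify({"rc": result.returncode, "log": result.stdout}), status
```

The same route serves both entry points: a browser form submission and a webhook POST from another system.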

The Tool:

The design of Lemmy was laid out over a couple of hours. A project would be tied to a playbook, which could contain one or more containers for deployment. Within the project definition, additional per-environment configuration could be provided and applied to the application configuration files handled by templates within the roles. A developer would use the web interface to trigger the deployment of a specific build to an environment, while a freshly booted instance would invoke a webhook, with parameters based on instance tags, to trigger a deployment to itself.

At a high level the workflow at this point looked like:

  1. The build system creates containers, pushes them to the repository, then triggers a webhook to Lemmy with the project, short SHA, and branch.
  2. Lemmy stores the build information in Google Datastore and presents a web interface to the developer listing recent builds and available environments.
  3. The developer clicks a button to deploy a revision of the code to an environment.
  4. The project's Ansible playbook is run using the Ansible Python libraries, and hosts with matching tags receive the new container.
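Steps 1 and 2 can be sketched as follows. The field names are assumptions, and a plain in-memory structure stands in for Google Datastore:

```python
import time
from collections import defaultdict

# In Lemmy the builds live in Google Datastore; an in-memory stand-in
# illustrates the shape of the record the build webhook might create.
builds = defaultdict(list)

def record_build(project: str, sha: str, branch: str) -> dict:
    """Called by the build-system webhook after a container is pushed."""
    build = {"project": project, "sha": sha, "branch": branch,
             "created": time.time()}
    builds[project].insert(0, build)  # newest first, as listed in the web UI
    return build

def recent_builds(project: str, limit: int = 10) -> list:
    """What the web interface lists for the developer to choose from."""
    return builds[project][:limit]
```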

If an instance booted or autoscaled up for that project:

  1. The instance pings the webhook on Lemmy with its instance-id, project, and environment.
  2. The project's Ansible playbook is run, deploying the latest build of the application to that instance.
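The boot-time webhook call in step 1 might be assembled like this. The URL path and tag names are hypothetical, since the post says only that the parameters are derived from instance tags; on boot, the instance would then POST the resulting URL and parameters to Lemmy:

```python
def bootstrap_request(base_url: str, instance_id: str, tags: dict) -> tuple:
    """Build the webhook call a freshly booted instance makes to Lemmy.

    The /bootstrap path and the 'project'/'environment' tag names are
    assumptions for illustration.
    """
    params = {
        "instance_id": instance_id,
        "project": tags["project"],
        "environment": tags["environment"],
    }
    return f"{base_url}/bootstrap", params
```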

At the same time that Lemmy was being built, introduced to the platform team, and put to task, a parallel track proceeded: standing up a Kubernetes cluster for applications to be deployed to.

Deploying to Kubernetes:

As the creation of the Go services and their containerization progressed, Kubernetes clusters were being built and destroyed to test different setups for our use cases and to work out best practices. The SRE team identified logging, labeling, monitoring, and naming things (alas, no cache invalidation) as the primary functions we would be very opinionated about. That opinionation carried over into the deployment tool's design, in what was exposed to the developer versus what was abstracted away from them and controlled by SRE.

Already having projects in Lemmy, we added a configuration flag to specify whether a project was an Ansible or Kubernetes service, and mapped a Project to a Deploy, which would define a Replication Controller and a Horizontal Pod Autoscaler. This is where the naming opinionation came into play. Given the name of the Project (e.g., hello-world, user-service), we would also give any Service, ConfigMap, or Secret that same name. In our base Docker image, the entrypoint script would then check for mounted directories, as defined in the Replication Controller, containing secrets to inject into the environment via a directory named after the project (e.g., /secrets/hello-world, /secrets/user-service). Developers could now also reference any configuration files from a known directory (e.g., /config/hello-world/app-config.json) or reference other in-cluster applications by Project name (e.g., http://hello-world, http://user-service). At the Service level, SRE set the standards: port 80 for the gRPC gateway, port 8080 for the REST gateway, and port 9090 for the debug/healthcheck/expvars/Prometheus metrics port. This was the base of our configuration.
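The entrypoint's secret-injection step could look roughly like this sketch. The convention of naming each environment variable after its file is an assumption; the post says only that secrets are injected into the environment from the project-named directory:

```python
import os

def inject_secrets(project: str, env: dict, root: str = "/secrets") -> dict:
    """Sketch of the base image's entrypoint behavior: if /secrets/<project>
    is mounted, read each file and export its contents as an environment
    variable named after the file (naming scheme assumed for illustration)."""
    secret_dir = os.path.join(root, project)
    if os.path.isdir(secret_dir):
        for name in os.listdir(secret_dir):
            path = os.path.join(secret_dir, name)
            if os.path.isfile(path):
                with open(path) as f:
                    env[name.upper().replace("-", "_")] = f.read().strip()
    return env
```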

Individual environment configs were implemented as top-level JSON configuration items, each mapping to the Kubernetes namespace the service was to be deployed to. These usually consisted of environment variables the service would use as command-line parameters at startup. If needed, the number of running replicas and the Kubernetes resource request and limit numbers could also be adjusted here. Upon a deployment, this JSON is passed to a Jinja2 template to produce more JSON.

The Jinja2 template that the environment configuration JSON is passed to contains common information shared amongst all environments, such as the container or containers to be run in the Pod, liveness and readiness probes, common environment variables, expvar data to be scraped by Datadog, and annotations to be added for services like Prometheus to scrape.

With the final JSON configuration generated, it is applied to two additional Jinja2 templates that generate the Replication Controller and Horizontal Pod Autoscaler JSON, which is sent to the Kubernetes cluster via pykube to deploy the service.
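The template pipeline can be sketched with a pared-down Replication Controller template. The registry hostname and exact field set are assumptions; the real templates also carry the probes, environment variables, and monitoring annotations, and the resulting object would be handed to pykube rather than just parsed:

```python
import json

from jinja2 import Template

# Stand-in for the Replication Controller template (heavily trimmed).
RC_TEMPLATE = Template("""
{
  "kind": "ReplicationController",
  "apiVersion": "v1",
  "metadata": {"name": "{{ project }}", "namespace": "{{ namespace }}"},
  "spec": {
    "replicas": {{ replicas | default(2) }},
    "selector": {"app": "{{ project }}"},
    "template": {
      "metadata": {"labels": {"app": "{{ project }}"}},
      "spec": {"containers": [{"name": "{{ project }}",
                               "image": "registry.example.com/{{ project }}:{{ sha }}"}]}
    }
  }
}
""")

# The merged environment configuration after the first templating pass.
env_config = {"project": "hello-world", "namespace": "staging",
              "replicas": 3, "sha": "abc1234"}

rc = json.loads(RC_TEMPLATE.render(**env_config))
# Lemmy would then send this object to the cluster via pykube to create
# or replace the Replication Controller.
```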

After the deploy action, the developer lands on a status page that shows details about the deployment, the pod or pods belonging to it, and the state they are in (e.g., pending, running, or crash-loop backoff). The status page also displays any previous deploys; once the new version is running and deemed acceptable, the previous version can be zeroed out or deleted, at which point the new one responds to all requests.
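The status page's summary over the pod states reported by the cluster might look like this sketch (the state strings and the notion of "healthy" are assumptions; pykube would supply the actual pod data):

```python
from collections import Counter

def deployment_status(pod_states: list) -> dict:
    """Summarize pod states the way the status page might: a count per
    state, plus a simple healthy flag when every pod is Running."""
    counts = Counter(pod_states)
    healthy = bool(pod_states) and counts.get("Running", 0) == len(pod_states)
    return {"counts": dict(counts), "healthy": healthy}
```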

Once the deployment is completed and running, the opinionated aspects of the tool take over. Monitoring, logging, and metrics all flow through the appropriate systems tagged with project name, environment information, and other standard data so it is easy for the development and SRE teams to find information and troubleshoot if necessary.

Overall, this deployment conduit has worked well between the platform development team and the serving infrastructure stood up by the SRE team. We learned the importance of having a strongly opinionated configuration and template structure in the tool, and an opinionated naming scheme for the platform overall. With the template and configuration structure, the development team knew exactly how to define the metrics to be collected and where to find the configuration information injected into the container at runtime. The SRE team knew where to extract those metrics from and in what format to send them off to monitoring and display systems. Naming services in an opinionated way, specifically, allowed easy access to metrics and log entries in the logging infrastructure for debugging purposes. In the future we plan to support federated Kubernetes clusters, StatefulSets, and enhanced canary testing.