Building for Scale: Choosing the right init system for Docker containers

Learning Fun
Atom Platform
May 15, 2020

At Kaplan, we are working on a multi-tenant platform for Learning & Assessment called Atom. Atom supports a variety of business use cases, both for Kaplan and for external customers. Building for scale is a crucial part of our strategy, and we are always looking for ways to improve performance. Our focus on scaling has led us to a number of changes, big and small, that have benefited the platform. In this blog post, I focus on choosing the right init system for our Docker containers, which turned out to be a relatively low-cost, high-impact change that significantly helped our scalability.

Business Context

The Atom platform is designed to serve users with a variety of needs, ranging from students to content creators and educators. At the heart of the platform, we have our Assessment engine, which serves as the system of truth for student performance. The assessment engine is responsible for assessing learners across a range of learning objectives in order to improve student outcomes and overall performance. Architecturally, the responsibility of our Assessment engine is to run psychometrically driven business logic through specialized APIs that use a wide variety of AWS services, such as Elasticsearch, Kinesis Data Streams, RDS, ElastiCache, and S3. We deploy these microservices as a group of dockerized applications written in Node.js with Restify, a popular framework for building Node applications.

Assessment Engine

Init Process inside Docker Containers

At Kaplan, Docker containers are integral to our SDLC. We use them to develop applications, run unit tests, and deploy to production environments. A single process per container is our recommended design pattern for Docker applications. It has worked out really well: containers are easy to scale horizontally, promote reuse, and are easy to troubleshoot. However, our Node applications often use external npm packages that spawn child processes, which makes it crucial to have a proper init process.

On a Unix-based system, init is the first process started during bootup, and every other process is directly or indirectly a child of it. When a child process finishes executing, it briefly becomes defunct, i.e. a zombie process: it keeps an entry in the process table until its exit status is read by its parent. Once that status is read via wait(), the zombie is removed from the process table and is said to be reaped. A child process whose parent has died becomes an orphaned process, and orphaned processes are adopted by init. In the absence of a proper init system to reap them, these orphaned processes linger as zombies, waste server resources, and degrade system performance.
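To make this concrete, here is a minimal sketch (plain shell on Linux, reading /proc) that manufactures a short-lived zombie: the subshell backgrounds a child and then execs into `sleep`, so the child's exit status is never read until the parent itself exits.

```shell
# The inner sh backgrounds a short-lived child, then execs into `sleep 2`.
# The child exits after 0.2s, but its new "parent" (sleep) never calls
# wait(), so the child stays in the process table as a zombie.
sh -c 'sleep 0.2 & exec sleep 2' &

sleep 1   # give the child time to exit and become defunct

# Count processes whose state field in /proc/<pid>/stat is "Z" (zombie)
zombies=0
for f in /proc/[0-9]*/stat; do
  state=$(awk '{print $3}' "$f" 2>/dev/null)
  [ "$state" = "Z" ] && zombies=$((zombies + 1))
done
echo "zombie processes found: $zombies"
```

Inside a container whose PID 1 does not reap, every orphan that dies accumulates in the process table exactly like this until the container stops.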

Assessment Engine and Init process

Our Assessment engine containers run the Node application under the supervisord process manager. Supervisord is a client/server system for monitoring and controlling a number of processes on Unix-like operating systems, but it is not meant to be a substitute for an init process. I wanted to understand how the lack of a proper init process was affecting our Assessment engine's performance.
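Supervisord is driven by an INI-style configuration file. A minimal sketch of the kind of program section involved (the program name and paths here are hypothetical, not our actual config) looks roughly like:

```ini
; Hypothetical supervisord.conf fragment for a Node API container
[supervisord]
nodaemon=true            ; keep supervisord in the foreground as PID 1

[program:assessment-api]
command=node /app/server.js
autorestart=true
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
```

With `nodaemon=true`, supervisord stays in the foreground and ends up as the container's PID 1, a role it was never designed to fill: it restarts the programs it starts, but it does not take on init duties for the rest of the process tree.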

As a baseline, we ran load tests using the transactions shown in the table below. With the goal of reducing response times and increasing throughput, I looked at Kibana logs, New Relic, and other AWS monitoring tools to see what was going on under the hood. We noticed several errors, such as 504 Gateway Timeout and 500 Internal Server Error responses, which caused failures and also contributed to increased response times.

For this baseline test, we simulated 1,300 concurrent users taking assessments on our platform for a one-hour period. The table below shows the major transactions sent to the assessment engine during that hour, along with the 90th percentile of response time, i.e. the time within which 90 percent of requests finished when all requests are sorted in ascending order of response time. There were a couple more transactions, but they were omitted from the comparison because their request counts were small.
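As a quick illustration of the 90th-percentile (p90) metric, here is a small shell sketch using the nearest-rank method on hypothetical sample values (the real numbers came from our load-test reports):

```shell
# Hypothetical response times in milliseconds
times="120 85 200 95 110 300 150 90 130 105"

# Nearest-rank p90: sort ascending, take the ceil(0.9 * N)-th value,
# i.e. the value below which 90% of the samples fall.
p90=$(printf '%s\n' $times | sort -n \
      | awk '{a[NR]=$1} END{i=int(0.9*NR+0.999999); print a[i]}')
echo "p90=${p90}ms"
```

For these ten samples, nine of the ten values are at or below the reported p90, which is exactly what "90 percent of requests were finished within this time" means.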

Transactions with Supervisord

I got even more curious about the supervisord process. Reading further, I found that a number of other process managers and init systems are available: monit, runit, s6, tini, and dumb-init. This article was very helpful in understanding the comparison between the different process managers.

Dumb-init caught my attention for the following key reasons:

  • Lightweight and fast init system written in C
  • Container optimized
  • Runs as PID 1 and immediately spawns the container's command as a child process
  • Handles and forwards signals promptly as they are received

After more research, I decided to do a proof of concept (POC) with dumb-init.

Dumb-init POC

I created a custom Docker base image for our applications from node:12-alpine and installed dumb-init by downloading its binary directly (one of the recommended installation methods mentioned in its git repo). The Dockerfile example with dumb-init can be downloaded from my GitHub gist. We updated all the APIs in the Assessment engine to use this custom base image and deployed them to our load test environment. After a couple of rounds of load testing with different numbers of concurrent users (ranging from 1,300 to 3,000), we saw that the results were amazing: CPU and memory utilization were lower than the previous baseline, and response times were better too.
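The gist contains the exact file; a sketch along the same lines (the dumb-init version, URL, and application command below are placeholder assumptions, not the gist's contents) might look like:

```dockerfile
# Sketch: a dumb-init base image built on node:12-alpine.
# The pinned release below is an assumption; pin and verify your own.
FROM node:12-alpine

RUN wget -O /usr/local/bin/dumb-init \
      https://github.com/Yelp/dumb-init/releases/download/v1.2.2/dumb-init_1.2.2_x86_64 \
 && chmod +x /usr/local/bin/dumb-init

# dumb-init runs as PID 1, execs the Node app as its child, forwards
# signals to it, and reaps any orphaned processes along the way.
ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]
CMD ["node", "server.js"]
```

Because dumb-init is a small static binary, it works on Alpine's musl-based images without extra dependencies, which keeps the base image lean.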

The table below shows the results for the same transactions after switching to dumb-init. This time we simulated 2,400 concurrent users taking assessments on our platform for a one-hour period.

Transactions with Dumb-init

It was very exciting to see that not only did our platform support a significantly higher number of requests, but response times also dropped significantly, by 35%-62%.

After completing the POC and seeing good results, I presented this to the Architecture Review Board (ARB) and quickly got the buy-in to move forward.

This was a great example of how a simple configuration or a utility switch (like the init process in this case) can result in significant performance improvements in a distributed system using containers. Good things can come in small packages!

On a personal note, I have learned a lot at Kaplan and I am thankful to my seniors and colleagues for their ongoing support and help in building a modern and scalable platform. Stay tuned for some more interesting blogs from me and my colleagues.
