GKE startup time & the container problem
Last time, I covered the various ways to profile and get visibility into Google Container Engine (GKE) startup time for a given pod. This was extremely helpful, considering that Kubernetes is complex enough during its boot phases to hide performance issues in weird places.
As it happens, this was exactly where Bacon, Live! was running into a problem: even with the awesome power of Kubernetes, the containers in their pod still needed to pull down some data, and it turns out that their initialization dance was causing heartburn. (Get it? ’Cause it’s a food application? …)
Depending on containers
Bacon, Live! had ported their code over from GCE to GKE, and as such, all of their containers initialized in the parallel phase of the pod boot process. (FWIW: they did have a single init container doing some “sorry, can’t tell you about that” work that took about 2–3 seconds, but I’ll get to that in a minute.) Applying the profiling process from the last article, we ended up with the following timings for their containers:
Three of the containers were taking a significant amount of time to boot up. The most important were the public and admin endpoints, which were used directly during the health check phase.
Looking at the public endpoint, I saw what looked to be the culprit:
The initialization code for the public endpoint was making a quick call over to the stats container, to make sure it was up and running before proceeding. Ah, now we see the problem.
That urlfetch command is a blocking call. Since the parallel containers are initialized in (pretty much) random order, the endpoints were getting initialized first, and then blocking on their dependency on the stats container. Obviously not ideal.
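To make the pattern concrete, here is a rough Python sketch of a blocking readiness check. This is not Bacon, Live!’s actual code: the function name, the `probe` callable, and the timings are all assumptions made up for illustration. The point is simply that the caller cannot finish its own startup until the dependency answers.

```python
import time
from typing import Callable


def wait_for_dependency(
    probe: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.25,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Block until probe() returns True; return the seconds spent waiting.

    Raises TimeoutError if the dependency is still down after timeout_s.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= timeout_s:
            raise TimeoutError("dependency not ready after %.1fs" % timeout_s)
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical stand-in for the stats container: ready on the 4th probe.
    attempts = {"n": 0}

    def fake_stats_probe() -> bool:
        attempts["n"] += 1
        return attempts["n"] >= 4

    waited = wait_for_dependency(fake_stats_probe, interval_s=0.01)
    print("blocked for %.3fs over %d probes" % (waited, attempts["n"]))
```

Every second spent inside that loop is a second the endpoint is not ready for its health check, and how long it lasts depends entirely on when the other container happened to start.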
Let’s use init containers!
Since it looked like we had a dependency on the stats container, it made sense to move that container into the init phase, so that the parallel containers wouldn’t get caught in this weird race condition and could execute properly. Here’s how the system responded: the init phase got 2–4 seconds longer, which left the entire system startup worse off in general:
(Remember, their existing “can’t tell you about that” init container was already doing about 2–2.5 seconds of work, so we saw roughly a doubling of the init phase.)
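Why does moving one container make everything slower? A back-of-the-envelope model helps. The sketch below assumes init containers run one after another and the remaining containers then start together, so the slowest of them sets the pace. It deliberately ignores the blocking wait between containers, and the durations are made-up numbers, not Bacon, Live!’s measurements.

```python
from typing import Sequence


def pod_startup_seconds(init: Sequence[float], parallel: Sequence[float]) -> float:
    """Simplified model: init containers run back to back, then the
    remaining containers start together and the slowest one sets the pace."""
    return sum(init) + max(parallel, default=0.0)


if __name__ == "__main__":
    # Made-up durations in seconds, for illustration only.
    secret_init, stats, public, admin = 2.5, 3.0, 4.0, 3.5

    # Stats runs alongside the endpoints in the parallel phase.
    before = pod_startup_seconds([secret_init], [stats, public, admin])

    # Stats moved into the init phase: its time is now added, not overlapped.
    after = pod_startup_seconds([secret_init, stats], [public, admin])

    print("stats in parallel phase: %.1fs" % before)
    print("stats in init phase:     %.1fs" % after)
```

In this model, work in the parallel phase overlaps with its neighbors, while work in the init phase is added straight onto the total. That is the trade you make when you promote a container to init.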
Never mind init containers!
This was obviously the wrong direction; it made the whole cold-boot time slower.
At this point, Bacon, Live! decided to go back through and make some small changes to their code: the stats module was there to help time startup-related tasks, but since they now had visibility into those things from outside their code (thanks to the profiling harness we’d set up), they were happy to remove the dependency from the endpoints.
The result is that we ended up dropping some signals on the ground, but with my new startup-testing code, that information wasn’t as important to keep around. Once this change was made, the stats container could be moved back to the parallel phase, improving overall system performance:
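Bacon, Live! removed the dependency outright. If you can’t go that far, a softer variant of the same idea is to make the stats call best-effort, so a missing stats container costs you a data point rather than your startup time. The sketch below is my own illustration, not their code; `report_best_effort` and the `send` callable are hypothetical names, and `send` is assumed to carry its own short timeout.

```python
import logging
from typing import Callable, Mapping

log = logging.getLogger("startup")


def report_best_effort(
    send: Callable[[Mapping[str, float]], None],
    timings: Mapping[str, float],
) -> bool:
    """Try to hand timings to the stats service; never fail startup over it.

    Returns True if the timings were sent, False if they were dropped.
    """
    try:
        send(timings)
        return True
    except Exception as exc:  # stats being down must not take the endpoint down
        log.warning("dropping startup timings: %s", exc)
        return False


if __name__ == "__main__":
    def stats_is_down(_timings: Mapping[str, float]) -> None:
        # Hypothetical sender that fails the way an unreachable container would.
        raise ConnectionError("stats container not up yet")

    sent = report_best_effort(stats_is_down, {"public_endpoint_boot_s": 1.2})
    print("sent" if sent else "dropped, carrying on with startup")
```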
Findings and takeaways
Here are a few things I took away from this:
Timing container startup is somewhat complicated, but important to do.
Getting this information in a sane way requires you to modify each individual container in very specific ways, using knowledge about the pod it’s running in. I’m sure there’s a dev-ops reason for why this is a good thing, but I don’t have visibility into that yet.
The init containers are critical path. Since init containers run one at a time, in order, the time spent executing one delays the startup of everything after it. A single bad actor can impose a significant burden on startup time. As such, there’s a real tradeoff to consider between how much work executes in the init phase and how much happens later on, in the parallel phase. (Note: Kelsey Hightower has a great article on navigating this complexity, if you’re interested.)
Avoid parallel init dependencies. Containers in the parallel phase are created in effectively random order, so avoid linear dependencies between them. If container A needs container B, and A gets initialized before B, then A ends up waiting a random amount of time. If possible, consider moving these dependencies to the init phase, or removing the dependency completely.