Whack a Pod Update
A few people have asked me about getting Whack-a-Pod to run on Minikube. To do that I needed to make a number of changes to various parts of the project. As it was, Whack-a-Pod assumed that you would have access to Container Engine, Container Builder, and Container Registry. I assumed that you need a lot of those features to run it and I know I can’t count on all of that in Minikube. So stuff had to be changed. But while I was changing it, I took the opportunity to make some more tweaks…
Change 1: Eliminate YAML Files
All of the configuration for Whack-a-Pod was stored in YAML files. This certainly makes kubectl easier to deal with, as you don’t have to learn all of its permutations. You just have to remember kubectl apply -f filename. This was great, but to deal with variable project settings, I was having users generate a number of YAML files dynamically during setup, instead of just passing the variables to kubectl commands directly.
Added to that, I run a few different clusters. Switching between clusters meant I had to recreate configs every time. And if I forgot to do it, I screwed up other clusters. Moving the configuration from the yaml files to the settings in kubectl commands called from a Makefile enabled me to take advantage of variables in the Makefile rather than rebuilding yaml files.
This change eliminated lots of Makefile code and YAML. It also meant fewer system changes leaking into the code base, which would then need to be updated into an image, then pushed to Container
Change 2: Switch from Service — Load Balancers to Ingress
One of the things I added to Whack-a-Pod in reaction to feedback received after running it at demo booths was a way to show off some of the plumbing internals to facilitate deeper conversations about Kubernetes. I made an advanced view where we showed off the underlying nodes, added the ability to kill and resurrect nodes (actually cordon and uncordon, since killing nodes would take a long time to rebuild relative to the game length), and included a way to show which of our pods was actually servicing the request. One thing that became evident immediately was that I did not understand how things were being routed. I expected the answering pod to bounce all over the place, but it didn’t — it stayed pretty consistently sticky to the last pod that had answered the request.
Kelsey Hightower was in the office one day and I asked him about it. As Kelsey does, he gave me a very well thought-out and extremely technical explanation of why it was happening and a prescription: switch to Ingress. I was able to hold on to this second part.
Switching to Ingress on Container Engine was actually pretty easy. This is one of those places where Container Engine shines: doing systems work for your cluster is just simple.
- Change the Services from type LoadBalancer to type NodePort.
- Request a global IP for the Ingress to use as its public IP.
- Create an Ingress resource on Kubernetes that points Services to routes.
I had a little difficulty getting my container ports, service ports, node ports, and Ingress ports, along with my container paths, to line up. This was mostly because I was using port 8080 on the containers and exposing port 80, while the docs used port 80 all the way through. Things were just similar enough to cause me issues. But after a little trial and error (and RTFM), I got it up and running. This had two big pluses:
- The behavior of the answer pod changed to work as I expected, having it bounce among all the pods.
This had one drawback: it added a YAML file to the setup, since an Ingress cannot be created via direct kubectl options, only via kubectl apply. That was a small price to pay for what it bought me.
Change 3: Make Containers Canonical
Change 4: Shrink Containers
In my first build of the various Whack-a-Pod apps I wrote the admin and color api components in PHP and the game/ui in HTML/JS/CSS. For my base container image I used Google Cloud Platform’s App Engine PHP Flexible Docker image. It’s pretty easy to get started with, has GCP’s SDK already on it, and I know and trust the author Takashi Matsuo. This allowed me to build everything super fast. The downside is that the images ended up being pretty big:
- game 181 MB
- api 171 MB
- admin 171 MB
- total 523 MB
Considering that PHP’s canonical Docker images clock in at 150–160 MB, 170–180 MB isn’t that bad. But it still meant that Whack-a-Pod took over half a gig to run. I figured I could do better. This wasn’t just for the sake of getting a smaller number. More importantly, smaller images mean faster deploys and probably an easier time running Whack-a-Pod on Minikube. (These Minikube machines might be getting set up at conferences, where Wi-Fi is frequently craptastic.)
First I took on the game component. The game is all HTML/JS/CSS. Since there’s not a lot to it, I figured just a plain NGINX server would handle it just fine and it did. This took the game container from 181 MB down to 49 MB. Not bad.
Next I took on the admin service. Both the admin and the api services are just APIs, so either or both could be rewritten in Go. Doing so would allow me to remove numerous dependencies. In fact, I should be able to run these in a minimal Docker container based on the scratch image: you just cross-compile a Go executable and add it to the image. Check out Nick Gauthier’s guide to Building Minimal Docker Containers. Rewriting the admin service in Go also meant that I could take advantage of client-go, the Go client for Kubernetes. I expected it to be super simple to write (which ended up being wrong).
I got it completely rewritten using the client-go for Kubernetes. This brought my image down to 4 MB. This version worked when I tested each operation in isolation. But when I went to run the game I started to get weird errors on the Ingress’ load balancer. Most of them translated to “the backend is timing out”. I fiddled around with client-go’s settings but when I started tweaking the QPS settings, I ran into errors in dependent libraries. It turns out the repeated polling I was doing was incompatible with the settings for throttling the client.
I got the impression that client-go was too smart for me. It was doing some sort of rate limiting, and the documentation wasn’t that discoverable, nor stackoverflowable (is that a word?). I thought long and hard about it and decided to bypass client-go and write my own client for the Kubernetes API, figuring that at least this way I would understand it completely. I just had to translate the PHP code I had written for the previous version into Go. This ended up being much easier and faster, and since it didn’t depend on client-go, it was also lighter at container-building time. My admin container went from 4 MB to 2 MB.
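To give a feel for what a hand-rolled client involves, here is a minimal sketch of listing pod names over the Kubernetes REST API using only the standard library. It is not the code from Whack-a-Pod: the function names (podsURL, parsePodNames, listPods) and the fake API server in main are made up for illustration. Inside a real pod you would typically point it at https://kubernetes.default.svc and read the bearer token from /var/run/secrets/kubernetes.io/serviceaccount/token, with the cluster CA configured on the HTTP client.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// podList mirrors only the fields this sketch reads from a Kubernetes PodList.
type podList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
	} `json:"items"`
}

// podsURL builds the core API path for listing pods in a namespace.
func podsURL(base, namespace string) string {
	return base + "/api/v1/namespaces/" + namespace + "/pods"
}

// parsePodNames extracts the pod names from a PodList JSON document.
func parsePodNames(body []byte) ([]string, error) {
	var list podList
	if err := json.Unmarshal(body, &list); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(list.Items))
	for _, item := range list.Items {
		names = append(names, item.Metadata.Name)
	}
	return names, nil
}

// listPods performs one authenticated GET and returns the pod names.
// There is no retry or rate limiting here; the caller decides how often to poll.
func listPods(c *http.Client, base, token, namespace string) ([]string, error) {
	req, err := http.NewRequest(http.MethodGet, podsURL(base, namespace), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := c.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parsePodNames(body)
}

func main() {
	// A stand-in for the API server so the sketch runs without a cluster.
	fake := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"items":[{"metadata":{"name":"api-1"}},{"metadata":{"name":"api-2"}}]}`)
	}))
	defer fake.Close()

	names, err := listPods(fake.Client(), fake.URL, "dummy-token", "default")
	fmt.Println(names, err)
}
```

The appeal of this approach is that every request is visible: nothing throttles or retries behind your back, so polling behavior is entirely up to the caller.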
Finally, I took on the API service, which was pretty simple. It had two endpoints: one just gave a random color, and the other spit out a JSON object with a random color and the hostname, which on Kubernetes ends up being the pod name. This one was super simple, and took the API container from 171 MB to 2 MB.
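A service like that fits in a few dozen lines of Go. The sketch below is an illustration rather than the actual Whack-a-Pod code: the palette, the endpoint paths (/color and /color-complete), and the JSON field names are all assumptions. To keep it runnable anywhere, main exercises the handlers through a throwaway test server; in a container you would call http.ListenAndServe(":8080", mux) instead.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"math/rand"
	"net/http"
	"net/http/httptest"
	"os"
)

// colors is a hypothetical palette; the real service may use different values.
var colors = []string{"#ff0000", "#00ff00", "#0000ff", "#ffff00"}

// colorResponse is the assumed JSON shape for the second endpoint.
type colorResponse struct {
	Color string `json:"color"`
	Name  string `json:"name"`
}

// randomColor picks one entry from the palette.
func randomColor() string {
	return colors[rand.Intn(len(colors))]
}

// describe pairs a random color with the given hostname.
func describe(hostname string) colorResponse {
	return colorResponse{Color: randomColor(), Name: hostname}
}

// newMux wires up the two endpoints; the paths are made up for this sketch.
func newMux(hostname string) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/color", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, randomColor())
	})
	mux.HandleFunc("/color-complete", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(describe(hostname))
	})
	return mux
}

func main() {
	// On Kubernetes the hostname reported by the OS is the pod name.
	host, err := os.Hostname()
	if err != nil {
		host = "unknown"
	}

	// Temporary local server, used here only to demonstrate both endpoints.
	srv := httptest.NewServer(newMux(host))
	defer srv.Close()

	for _, path := range []string{"/color", "/color-complete"} {
		resp, err := http.Get(srv.URL + path)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		fmt.Printf("%s -> %s\n", path, bytes.TrimSpace(body))
	}
}
```

Because the program needs nothing beyond the Go standard library, a statically linked build of it can run on the scratch image with no other files in the container.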
To sum up, by switching to the simplest images that could get the job done (base NGINX in one case, and a Go executable running on the scratch Docker image in the other two), I was able to achieve the following reduction in Docker image size:
- game 181 MB -> 49 MB
- api 171 MB -> 2 MB
- admin 171 MB -> 2 MB
- total 523 MB -> 53 MB
Change 5: Minikube and Xhyve
I was prepared to have to do a lot of work to get Whack-a-Pod to work on Minikube. As it turns out, it was actually really easy to do. I had to:
- Install Minikube
- Enable Ingress
- Create a copy of the Ingress yaml I already had:
  - Without GKE annotations
  - With a reference to the host “wap.io”
- Edit /etc/hosts to point wap.io at Minikube’s externally available IP address
I got it running, but I ran into a little snag. The performance was a bit slow. You could easily take down all of the pods in the cluster, and they would stay down, disrupting the service. I asked around and my coworker Ahmet Alp Balkan told me about xhyve. Xhyve is a hypervisor that makes it possible to run Linux on a Mac. It can be used with Minikube and it sped up pod deployment by 400%. With xhyve the performance was now fast enough to provide a reasonable game experience for Whack-a-Pod.
I had to alter the UI to handle a slight difference with Minikube. The full advanced UI allows you to kill a node. (Well not really, you can unschedule a node and then kill all the pods on it). But since Minikube is a one-node cluster, you probably shouldn’t kill it — as the rest of the app requires a node to run on — so I disabled this capability in the UI.
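On a cluster with more than one node, the “unschedule” half of that operation comes down to setting spec.unschedulable on the Node object, which is the same field kubectl cordon sets. Here is a rough sketch of building that request against the Kubernetes API with a JSON merge patch. The helper names, node name, and token are placeholders, and the request is only constructed, never sent.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// cordonPatch returns a JSON merge patch that sets spec.unschedulable.
// Pass true to cordon a node and false to uncordon it.
func cordonPatch(unschedulable bool) ([]byte, error) {
	patch := map[string]map[string]bool{
		"spec": {"unschedulable": unschedulable},
	}
	return json.Marshal(patch)
}

// newCordonRequest builds (but does not send) the PATCH request for a node.
func newCordonRequest(base, token, node string, unschedulable bool) (*http.Request, error) {
	body, err := cordonPatch(unschedulable)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPatch, base+"/api/v1/nodes/"+node, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/merge-patch+json")
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	// "node-1" and "dummy-token" are placeholders for this sketch.
	req, err := newCordonRequest("https://kubernetes.default.svc", "dummy-token", "node-1", true)
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```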
What started with a simple goal, running Whack-a-Pod on Minikube, ended with a little bit of yak shaving. But all of the changes seem worth it. I hope the updates make it easier for you to try Whack-a-Pod out for yourself. All of the changes are currently available at Whack a Pod on GitHub.