How to Inject Environment Variables into Kubernetes Pods after Deployment
An example using the native “kubectl set env” command to make Python’s buffering more responsive on a deployed service running on Kubernetes.
In this article we’ll cover how to inject environment variables into deployed Kubernetes applications using the `kubectl set env` command.
What happened?
I recently started experimenting with deploying Python web applications on Kubernetes. In one scenario, I had a Python-based (Flask) web application deployed on a pod that wasn’t behaving as I expected. It was supposed to be storing elements to a database, and it was also supposed to print to `stdout` during regular operations and to `stderr` on error events. However, when I inspected the pod with `kubectl logs -f [POD]`, I saw only a blank console.
I was able to open up Google Stackdriver and identify the error message:
```
  File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
It appeared that any time my service received a request, it would raise that exception, and Kubernetes would immediately restart the pod. (Note: it was a bug on my end due to improper JSON handling, but that’s unrelated to this article.)
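That error message is easy to reproduce locally: `json.loads` raises exactly this exception when handed something that isn’t JSON, such as an empty request body. A standalone sketch (not from the deployed service):

```python
import json

# The handler assumed every request body was valid JSON; feeding it
# a non-JSON payload (here, an empty string) reproduces the error
# that showed up in Stackdriver.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    print(exc)  # Expecting value: line 1 column 1 (char 0)
```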
Why didn’t I see it in the logs?
My application had log statements that would have provided additional information, but they weren’t being flushed before the restart because the environment variable `PYTHONUNBUFFERED` was left unset (the default). When `PYTHONUNBUFFERED` is set to a non-empty value, it forces the `stdout` and `stderr` streams to be unbuffered, which makes console output appear immediately. Some ML applications, such as Amazon SageMaker’s template, set this to `TRUE`. We intentionally left it unset so that the pods weren’t bogged down in constantly flushing `stdout` under heavy load (though this benefit may be debatable).
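The effect is easy to demonstrate locally. The sketch below (a standalone demo, not part of the deployed service) starts a child Python process with its `stdout` attached to a pipe, the same situation as a containerized app whose logs are collected, and times how long the first printed line takes to arrive with and without `PYTHONUNBUFFERED`:

```python
import os
import subprocess
import sys
import time

# Child process: prints one line, then stays alive for a while,
# mimicking a service that logs and then keeps serving.
CHILD = "import time; print('service started'); time.sleep(3)"

def seconds_until_first_line(extra_env):
    """Start the child with extra_env merged in, and time how long
    its first stdout line takes to reach us over the pipe."""
    env = {**os.environ, **extra_env}
    start = time.monotonic()
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, env=env)
    proc.stdout.readline()  # blocks until the child flushes
    elapsed = time.monotonic() - start
    proc.kill()
    proc.wait()
    return elapsed

# With PYTHONUNBUFFERED set, the line arrives almost immediately.
# With it unset (an empty value counts as unset), piped stdout is
# block-buffered, so nothing arrives until the child exits.
fast = seconds_until_first_line({"PYTHONUNBUFFERED": "1"})
slow = seconds_until_first_line({"PYTHONUNBUFFERED": ""})
print(f"unbuffered: {fast:.2f}s  buffered: {slow:.2f}s")
```

The same behavior can be had per-invocation with `python -u`, which is what `PYTHONUNBUFFERED` toggles.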
How to force a buffer flush in a running pod? — Invasive approach
I was interested in forcing the Python buffer to flush within the pod. This StackExchange answer provided an interesting method of using `gdb` to attach to the existing Python process on the pod and make modifications, such as forcing a buffer flush. However, I didn’t have access to `gdb` in my pod, nor did I have `sudo` access to install it while connected via `kubectl exec -it [PODNAME] /bin/bash` (rightfully so, for safety). I call this the “invasive” approach because it actively modifies execution.
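For completeness: the flush that `gdb` would force from the outside can also be done from inside the application, if (unlike here) you’re able to modify and redeploy the code. A minimal sketch of the code-side options; the messages are placeholders:

```python
import sys

# Option 1: flush a single print immediately.
print("handling request", flush=True)

# Option 2 (Python 3.7+): switch stdout to line buffering for the
# rest of the process, so every newline triggers a flush. Guarded
# because sys.stdout may have been replaced by something that
# doesn't support reconfigure().
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(line_buffering=True)
print("this line is flushed as soon as it is written")

# Option 3: flush explicitly after a batch of writes.
sys.stdout.write("final status\n")
sys.stdout.flush()
```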
How to temporarily force buffering via environment variable updates? — Less invasive, but still invasive
Instead of forcing a flush, perhaps we can set the `PYTHONUNBUFFERED=1` environment variable and restart the pod so that it picks up the change on reboot. That way I can see the log output temporarily.
Options
There are a few ways to do this:
1. Build a new Docker image. This requires a full redeploy into the cluster, which is unnecessary for this quick dev debugging.
2. Define the environment variable in the YAML configuration. This also requires a full redeploy, or at a minimum a `kubectl patch`.
3. Temporarily set the environment variable with `kubectl set env`.
Here we’ll walk through an example using option 3. Kubernetes has a built-in tool, `kubectl set env` (docs), which allows you to add or remove environment variables on:
- pods,
- replicationcontrollers,
- deployments,
- daemonsets,
- jobs, and
- replicasets.
The built-in help, `kubectl set env -h`, gives about ten different examples.
In my case, I changed the ReplicaSet. The `--list --all` flags were helpful for figuring out options. For example, `kubectl set env rs --list --all` listed all ReplicaSets along with any environment variables assigned to them.
Example temporary environment variable setting procedure
- Add the environment variable. I added it with `kubectl set env rs [REPLICASET_NAME] PYTHONUNBUFFERED=1`.
- Terminate the running pod(s), so that the next pod created pulls in the new environment variable.
- Wait for the new pod to start.
- View logs with `kubectl logs -f [POD]`. After refreshing, I was seeing logs, and identified that the service was dying on an attempt to execute `json.loads` on a non-JSON string. This helped me isolate the error based on the input to the pod’s application.
- Remove the environment variable. Once complete, I removed it with `kubectl set env rs [REPLICASET_NAME] PYTHONUNBUFFERED-`.
- Terminate the running pod(s) again.
- Confirm everything is healthy.
Again, I don’t advocate doing this all the time. It’s useful for quick debugging in isolated scenarios.
- Never do it on a production system.
- Never even do this on a dev environment without considering how it may impact your deployment workflow.
- Take advantage of other logs (in my case Stackdriver) to get more information before anything else.
But this was a fun little trick for temporarily tweaking a deployment to get more information from a misbehaving application.