How to Inject Environment Variables into Kubernetes Pods after Deployment

An example using the native “kubectl set env” command to make Python’s buffering more responsive on a deployed service running on Kubernetes.

Tom Szumowski
4 min read · Mar 11, 2019
Photo by Joshua Sortino on Unsplash

In this article we’ll cover how to inject environment variables into deployed Kubernetes applications using the kubectl set env command.

What happened?

I recently started experimenting with deploying Python web applications on Kubernetes. In one scenario, I had a Python-based (Flask) web application deployed on a pod that wasn’t behaving as I expected. It was supposed to be storing elements to a database. It was also supposed to print to stdout during regular operations and to stderr on error events. However, when I inspected the pod with kubectl logs -f [POD], it was just a blank console.

I was able to open up Google Stackdriver and identify the error message:

File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

It appeared that any time my service received a request, it raised that exception and Kubernetes immediately restarted the pod. (Note: this was a bug on my end due to improper JSON handling, but that’s unrelated to this article.)
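For readers who want to see where that message comes from, here is a minimal sketch that reproduces it with the standard json module. The parse_payload helper and the sample inputs are made up for illustration; they are not the code from my service.

```python
import json


def parse_payload(raw):
    """Return the decoded JSON value, or None if `raw` is not valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # msg, lineno, colno and pos describe where parsing gave up.
        print(f"bad payload: {exc.msg} (line {exc.lineno}, column {exc.colno}, char {exc.pos})")
        return None


if __name__ == "__main__":
    print(parse_payload('{"id": 1}'))  # valid JSON decodes to a dict
    print(parse_payload(""))           # an empty body triggers "Expecting value"
```

Catching the exception at the request boundary, as sketched here, keeps one bad request from taking the whole process down.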

Why didn’t I see it in the logs?

My application had log statements that would have provided additional information, but they weren’t being flushed before the restart because the PYTHONUNBUFFERED environment variable was left at its default (unset), which keeps output buffered. Setting PYTHONUNBUFFERED to a non-empty value forces the stdout and stderr streams to be unbuffered, so console output appears immediately. Some ML applications, such as Amazon SageMaker’s template, turn this on. We intentionally left it off so that the pods weren’t bogged down constantly flushing stdout under heavy load (though this benefit may be debatable).
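The effect is easy to reproduce outside Kubernetes. The sketch below is only an approximation of what happened in my pod: it starts a child Python process whose stdout is a pipe, lets it print one line, and then ends it with os._exit, which skips the usual flush at interpreter shutdown. The run_child helper and the CHILD snippet are illustrative names, not part of any real service.

```python
import os
import subprocess
import sys

# Child program: print one line, then exit without the normal interpreter
# shutdown. os._exit skips flushing, roughly imitating a process that is killed.
CHILD = "import os; print('about to crash'); os._exit(1)"


def run_child(unbuffered):
    """Run CHILD with stdout connected to a pipe and return what it wrote."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # start from the default (unset)
    if unbuffered:
        env["PYTHONUNBUFFERED"] = "1"  # any non-empty string enables it
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("buffered:  ", repr(run_child(False)))
    print("unbuffered:", repr(run_child(True)))
```

With the variable unset, the line stays in the child’s buffer and never reaches the pipe; with it set, the line is written as soon as print returns.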

How to force a buffer flush in a running pod? (Invasive approach)

I was interested in forcing the Python buffer to flush within the pod. This StackExchange answer provided an interesting method of using gdb to attach to the existing Python process on the pod and make modifications such as forcing a buffer flush. I didn’t have access to gdb in my pod, nor did I have sudo access to install it while running via kubectl exec -it [PODNAME] /bin/bash (rightfully so, for safety). I call this the “invasive” approach because it actively modifies execution.

How to temporarily force unbuffered output via environment variable updates? (Less invasive, but still invasive)

Instead of forcing a flush, perhaps we can set the PYTHONUNBUFFERED=1 environment variable and restart the pod so that it picks the variable up on startup. That way I can see the log output temporarily.

Options

There are a few ways to do this:

  1. Build a new Docker image. This requires a full redeploy into the cluster, which in this case is unnecessary for quick dev debugging.
  2. Define the environment variable in the YAML configuration. This also requires a full redeploy, or at a minimum a kubectl patch.
  3. Temporarily set the environment variable with kubectl set env.
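As a rough illustration of what the kubectl patch route in option 2 could look like, the sketch below builds a strategic-merge patch body that adds one environment variable to one container of a Deployment. This is not what I ran; the deployment name my-flask-app and the container name web are hypothetical, and the script only prints the command rather than executing it.

```python
import json
import shlex


def env_patch(container_name, name, value):
    """Build a strategic-merge patch that sets one env var on one container."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        # Containers are matched by name, so only this one changes.
                        {"name": container_name, "env": [{"name": name, "value": value}]}
                    ]
                }
            }
        }
    }


def patch_command(deployment, container_name, name, value):
    """Return the kubectl argument list for applying the patch."""
    patch = json.dumps(env_patch(container_name, name, value))
    return ["kubectl", "patch", "deployment", deployment, "--patch", patch]


if __name__ == "__main__":
    # Hypothetical names; substitute your own deployment and container.
    cmd = patch_command("my-flask-app", "web", "PYTHONUNBUFFERED", "1")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Because the patch changes the pod template, applying it to a Deployment would trigger a rollout, which is part of why it is heavier than the approach used in the rest of this article.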

Here we’ll walk through an example using #3. Kubernetes has a built-in command, kubectl set env (docs), which allows you to add or remove environment variables on:

  • pods,
  • replicationcontrollers,
  • deployments,
  • daemonsets,
  • jobs, and
  • replicasets.

The help command kubectl set env -h is helpful and gives about ten different examples.

In my case, I changed the ReplicaSet. The --list and --all flags were helpful for figuring out options. For example, kubectl set env rs --list --all listed all ReplicaSets and any environment variables that were assigned to them.

Example temporary environment variable setting procedure

  1. Add the environment variable. I added the environment variable with: kubectl set env rs [REPLICASET_NAME] PYTHONUNBUFFERED=1.
  2. Terminate the running pod(s). That way the next pod created would pull in that environment variable.
  3. Wait for pod to start.
  4. View logs with kubectl logs -f [POD]. After the restart, I was seeing logs, and identified that the application was dying due to an attempt to execute `json.loads` on a non-JSON string. This helped me isolate the error based on the input to the pod’s application.
  5. Remove the environment variable. Once complete, I removed the environment variable with: kubectl set env rs [REPLICASET_NAME] PYTHONUNBUFFERED- (the trailing dash is what removes it).
  6. Terminate the running pod(s) again.
  7. Confirm everything is healthy.
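The procedure above can be sketched as a small dry-run script that only prints the kubectl commands in order. The ReplicaSet and pod names are hypothetical, the rs/NAME form follows the examples in kubectl set env -h, and the kubectl get pods steps are my own addition for finding the replacement pod’s name.

```python
import shlex


def debug_plan(replicaset, pod, var="PYTHONUNBUFFERED", value="1"):
    """Return the kubectl commands for the temporary env-var procedure, in order."""
    target = f"rs/{replicaset}"
    return [
        ["kubectl", "set", "env", target, f"{var}={value}"],  # 1. add the variable
        ["kubectl", "set", "env", target, "--list"],          #    check it is there
        ["kubectl", "delete", "pod", pod],                    # 2. terminate the old pod
        ["kubectl", "get", "pods"],                           # 3. find the new pod's name
        # 4. kubectl logs -f <new pod> goes here; the name is only known at run time.
        ["kubectl", "set", "env", target, f"{var}-"],         # 5. trailing dash removes it
        ["kubectl", "get", "pods"],                           # 6. find and delete the pod again
    ]


if __name__ == "__main__":
    # Hypothetical names; nothing is executed, the commands are only printed.
    for cmd in debug_plan("my-flask-app-5d9c7b", "my-flask-app-5d9c7b-abcde"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing rather than running the commands is deliberate: each step is worth checking by hand before moving on to the next.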

Again, I don’t advocate doing this all the time. It’s useful for quick debugging in isolated scenarios.

  • Never do it on a production system.
  • Never even do this on a dev environment without considering how it may impact your deployment workflow.
  • Take advantage of other logs (in my case Stackdriver) to get more information before anything else.

But this was a fun little discovery to temporarily tweak a deployment to get more information from misbehaving applications.

Tom Szumowski

URBN Data Scientist, Machine Learning Enthusiast, Coffee Snob, Geocacher, & Engineer. Currently out exploring ML deployment best practices & data engineering.