Debugging running pods in GKE clusters
Kubernetes has supported ephemeral containers for some time. Starting with Kubernetes 1.18, ephemeral containers can be used to debug running pods, in addition to a large set of other troubleshooting methods. Although GKE already supports Kubernetes 1.18, the kubectl debug command is still unavailable there, mainly because the feature is still marked as alpha in the Kubernetes API. So what else can you do besides inspecting GKE application logs and traces?
It is possible to access a running pod’s containers from the hosting VM. In GKE, most clusters use Container-Optimized OS (COS) for their worker nodes. When you SSH into a node, you still lack root access as well as many of the utilities you would need for debugging. On COS you can use the COS toolbox to debug your running pods. The toolbox was originally created to debug node issues, but it can easily be used as a tool for debugging running pods. For example, if you need to capture the traffic coming from your pod, do the following:
- SSH into the node where the pod runs (use kubectl get po -o wide to see the node name).
- Run toolbox.
- Install and run tcpdump to capture all packets with source equal to the pod’s IP.
- Copy the dump from the node to your workstation.
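
The steps above can be sketched as a few shell helpers. This is only an illustration under stated assumptions: the function names, the capture file name pod.pcap, and the zone argument are made up for this example, and the /media/root path relies on toolbox exposing the node's root filesystem there, as the COS documentation describes. Depending on how tcpdump creates the file, you may need to adjust its permissions before copying it.

```shell
#!/usr/bin/env bash
# Sketch of the steps above. Function names, the capture path and the
# zone argument are illustrative assumptions, not part of any GKE tooling.

# Print the node that hosts the pod ($1 = pod name).
pod_node() {
  kubectl get pod "$1" -o jsonpath='{.spec.nodeName}'
}

# Print the pod's IP address ($1 = pod name).
pod_ip() {
  kubectl get pod "$1" -o jsonpath='{.status.podIP}'
}

# Print the commands to type inside toolbox on the node ($1 = pod IP).
# Assumption: toolbox exposes the node's root filesystem under /media/root,
# so the capture file ends up in /tmp on the node itself.
toolbox_commands() {
  echo "apt-get update && apt-get install -y tcpdump"
  echo "tcpdump -i any -w /media/root/tmp/pod.pcap src host $1"
}

# Copy the capture from the node to the current directory
# ($1 = node name, $2 = compute zone of the node).
fetch_capture() {
  gcloud compute scp --zone "$2" "$1:/tmp/pod.pcap" ./pod.pcap
}
```

After sourcing the file, pod_node with the pod name (say, a hypothetical my-pod) gives the node to SSH into, for example with gcloud compute ssh. Once toolbox has started on the node, toolbox_commands "$(pod_ip my-pod)" prints what to run there; stop the capture with Ctrl-C. fetch_capture then retrieves the file to your workstation, where it can be opened with Wireshark or tcpdump -r.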