Automated Troubleshooting of Kubernetes (K8s) Pod Issues

Automatically Collect K8s Pod Restart Reasons, Logs, and Events

A brief alert message with Show more

Background

Troubleshoot Pod Issues

$ kubectl get pod demoservice-56d5f9f7ff-slr7d
NAME                           READY   STATUS    RESTARTS   AGE
demoservice-56d5f9f7ff-slr7d   1/2     Running   2          164h13m57s
$ kubectl describe pod demoservice-56d5f9f7ff-slr7d
...
    Ready:          false
    Restart Count:  2
    State:          Running
      Started:      Wed, 10 Aug 2022 02:34:48 +0000
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Mon, 08 Aug 2022 07:28:33 +0000
      Finished:     Wed, 10 Aug 2022 02:34:46 +0000
    Limits:
      cpu:     1
      memory:  1Gi
    Requests:
      cpu:     20m
      memory:  500Mi
...
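
The fields read off the describe output by hand can also be pulled straight from the pod status. The following is a minimal sketch, not the author's tooling: the function names `last_termination` and `exit_code_to_signal` are hypothetical, and it assumes `kubectl` is already configured for the target cluster. The second helper relies on the shell convention that an exit code above 128 means the process was ended by signal (code - 128), so 137 corresponds to signal 9 (SIGKILL), which is consistent with the OOMKilled reason.

```shell
#!/usr/bin/env bash

# Print "<container> <reason> <exitCode>" for each container's last termination.
# Containers that never terminated produce empty reason and exit code fields.
last_termination() {
  local pod="$1" namespace="${2:-default}"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'
}

# Translate an exit code into the signal number that ended the process.
# Prints 0 when the code is 128 or lower, meaning the process exited on its own.
exit_code_to_signal() {
  local code="$1"
  if [ "$code" -gt 128 ]; then
    echo $(( code - 128 ))
  else
    echo 0
  fi
}
```

For the pod shown here, `last_termination demoservice-56d5f9f7ff-slr7d` would be the call to make, and `exit_code_to_signal 137` maps the reported exit code back to SIGKILL.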

Automation to the rescue

An expanded alert message showing full detail

Method #1: Writing a Collector with Bash Script and Kubectl
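
As an illustration of this approach, here is a minimal sketch of what a kubectl-based collector might look like. It is an assumption about the general shape of such a script rather than the author's implementation: the function name `collect_pod_diagnostics` and the default of 50 log lines are hypothetical, and the pod name and namespace are assumed to come from the alert that triggered the collection.

```shell
#!/usr/bin/env bash

# Gather restart reasons, previous-container logs, and events for one pod.
# Usage: collect_pod_diagnostics <pod> [namespace] [tail_lines]
collect_pod_diagnostics() {
  local pod="$1" namespace="${2:-default}" tail_lines="${3:-50}"
  local container

  echo "== Restart reasons for ${namespace}/${pod} =="
  # One line per container: name, restart count, last termination reason, exit code.
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.reason}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'

  echo "== Logs from previous container instances =="
  # --previous reads the log of the container instance that ran before the restart.
  for container in $(kubectl get pod "$pod" -n "$namespace" -o jsonpath='{.spec.containers[*].name}'); do
    echo "-- ${container} --"
    kubectl logs "$pod" -n "$namespace" -c "$container" --previous --tail="$tail_lines" 2>/dev/null \
      || echo "(no previous logs available)"
  done

  echo "== Events =="
  # Only events that reference this pod, oldest first.
  kubectl get events -n "$namespace" \
    --field-selector "involvedObject.name=${pod}" --sort-by=.lastTimestamp
}
```

Calling `collect_pod_diagnostics demoservice-56d5f9f7ff-slr7d default` prints the three sections in order, and the output could then be attached to the alert message. Keeping the logic in a single function makes it easy to run from a cron job or an alert webhook handler.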

Method #2: Writing a K8s Custom Controller Using client-go Library

Summary

References

Able Lv

Cloud Infrastructure Engineer @Airwallex: Kubernetes, DevOps, Terraform, Istio, Go, and Cloud-Native stuff