Airwallex Engineering

Insights, ideas & learnings from the Airwallex engineering team. We’re a determined team of engineers passionate about technology & innovation. We build global financial infrastructure to scale the digital economy. Join us at: https://www.airwallex.com/careers


Automated Troubleshooting of Kubernetes (K8s) Pods Issues

Able Lv
4 min read · Aug 13, 2022


Photo by Carles Rabada on Unsplash
A brief alert message with a "Show more" link

Background

Troubleshoot Pod Issues

$ kubectl get pod demoservice-56d5f9f7ff-slr7d
NAME                           READY   STATUS    RESTARTS   AGE
demoservice-56d5f9f7ff-slr7d   1/2     Running   2          164h13m57s

$ kubectl describe pod demoservice-56d5f9f7ff-slr7d
...
    Ready:          false
    Restart Count:  2
    State:          Running
      Started:      Wed, 10 Aug 2022 02:34:48 +0000
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Mon, 08 Aug 2022 07:28:33 +0000
      Finished:     Wed, 10 Aug 2022 02:34:46 +0000
    Limits:
      cpu:     1
      memory:  1Gi
    Requests:
      cpu:     20m
      memory:  500Mi
...
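
The manual check above (reading Last State, Reason, and Exit Code from the describe output) can also be done programmatically. The following is a minimal sketch, not the article's own collector: it assumes the pod has been dumped with `kubectl get pod <name> -o json` and reads only the `status.containerStatuses` fields. The `Finding` type, the `Diagnose` function, and the container name `demoservice` in the sample are hypothetical names chosen for illustration. It also decodes exit codes above 128 as 128 plus a signal number, so 137 maps to signal 9 (SIGKILL), which is what an OOM kill delivers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pod mirrors only the parts of the Pod JSON that this sketch reads.
type pod struct {
	Status struct {
		ContainerStatuses []containerStatus `json:"containerStatuses"`
	} `json:"status"`
}

type containerStatus struct {
	Name         string `json:"name"`
	RestartCount int    `json:"restartCount"`
	LastState    struct {
		// Terminated is nil when the container has never been terminated.
		Terminated *struct {
			Reason   string `json:"reason"`
			ExitCode int    `json:"exitCode"`
		} `json:"terminated"`
	} `json:"lastState"`
}

// Finding is a hypothetical summary of one container's last termination.
type Finding struct {
	Container string
	Reason    string
	ExitCode  int
	Signal    int // 0 when the exit code does not encode a signal
	Restarts  int
}

// Diagnose returns one Finding per container whose last state is Terminated.
func Diagnose(podJSON []byte) ([]Finding, error) {
	var p pod
	if err := json.Unmarshal(podJSON, &p); err != nil {
		return nil, fmt.Errorf("decode pod: %w", err)
	}
	var findings []Finding
	for _, cs := range p.Status.ContainerStatuses {
		t := cs.LastState.Terminated
		if t == nil {
			continue
		}
		f := Finding{
			Container: cs.Name,
			Reason:    t.Reason,
			ExitCode:  t.ExitCode,
			Restarts:  cs.RestartCount,
		}
		// By convention, exit codes above 128 mean "killed by signal (code - 128)".
		if t.ExitCode > 128 {
			f.Signal = t.ExitCode - 128
		}
		findings = append(findings, f)
	}
	return findings, nil
}

func main() {
	// Sample input mirroring the values shown above; the container name is assumed.
	sample := []byte(`{"status":{"containerStatuses":[{
		"name":"demoservice","restartCount":2,
		"lastState":{"terminated":{"reason":"OOMKilled","exitCode":137}}}]}}`)

	findings, err := Diagnose(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range findings {
		fmt.Printf("%s: last terminated with %s (exit code %d, signal %d), %d restarts\n",
			f.Container, f.Reason, f.ExitCode, f.Signal, f.Restarts)
	}
}
```

For the sample input this reports a single finding for `demoservice` with reason OOMKilled, exit code 137, and signal 9. Containers that have never been terminated are skipped, so a healthy sidecar in the same pod produces no finding.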

Automation to the rescue

An expanded alert message showing full detail

Method #1: Writing a Collector with Bash Script and Kubectl

Method #2: Writing a K8s Custom Controller Using client-go Library

Summary

References


Written by Able Lv

Cloud Infrastructure Engineer @Airwallex: Kubernetes, DevOps, SRE, Go, Terraform, Istio, and Cloud-Native stuff
