Health Checkups for your OpenShift Cluster

Pradipta Banerjee
Published in DevOps Learners
3 min read, Apr 14, 2022

If you are responsible for managing Red Hat OpenShift clusters, there is a nifty open source project, aptly named openshift-checks, that could be a good addition to your toolkit.

The openshift-checks project is a collection of health-check scripts for an OpenShift cluster.

Take a look at the project description and related details at the following link:

https://github.com/RHsyseng/openshift-checks

Running health checks

The easiest way is to run it as a container.

Remember to provide the path to the kubeconfig file. In the example below, the kubeconfig file is located at $HOME/kubeconfig.

$ alias openshift-checks="podman run -it --rm -v $HOME/kubeconfig:/kubeconfig:Z -e KUBECONFIG=/kubeconfig quay.io/rhsysdeseng/openshift-checks:latest"
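
For scripted use, a shell function may be more convenient than an alias, since bash does not expand aliases in non-interactive scripts by default. The following is a minimal sketch along those lines. The function name openshift_checks and the variables CONTAINER_RUNTIME, KUBECONFIG_PATH, and OPENSHIFT_CHECKS_IMAGE are hypothetical names chosen for illustration, and -it is dropped on the assumption that a script has no terminal attached.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: these names are illustrative, not part of openshift-checks.
CONTAINER_RUNTIME="${CONTAINER_RUNTIME:-podman}"
KUBECONFIG_PATH="${KUBECONFIG_PATH:-$HOME/kubeconfig}"
OPENSHIFT_CHECKS_IMAGE="${OPENSHIFT_CHECKS_IMAGE:-quay.io/rhsysdeseng/openshift-checks:latest}"

openshift_checks() {
  # Same mount and environment variable as the alias, minus -it.
  # Any arguments (for example -l) are passed through to the container.
  "$CONTAINER_RUNTIME" run --rm \
    -v "${KUBECONFIG_PATH}:/kubeconfig:Z" \
    -e KUBECONFIG=/kubeconfig \
    "$OPENSHIFT_CHECKS_IMAGE" "$@"
}
```

Called as openshift_checks -l, this should behave like the alias, with the kubeconfig path and image overridable through the environment.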

Take a look at the available checks.

$ openshift-checks -l
Available scripts:
[...snip...]
checks/nodes
checks/notrunningpods
checks/operators
checks/ovn-pods-memory-usage
checks/pdb
checks/port-thrasing
checks/restarts
[...snip...]

Executing the command without any options will run all the checks. It’s also possible to execute specific checks.

The following example runs a single info (command) check to display the cluster version.

$ openshift-checks -s info/00-clusterversion
Using system:admin context
Cluster version:
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-03-11-203311   True        False         12m     Cluster version is 4.10.0-0.nightly-2022-03-11-203311
No issues found
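
To run a handful of specific checks in one go, a small loop around the -s option may be enough. The sketch below is one possible approach, not a feature of the project: run_selected_checks and CHECKS_RUNNER are hypothetical names, and it assumes that a clean run prints the line "No issues found", as in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: CHECKS_RUNNER and run_selected_checks are illustrative names.
# CHECKS_RUNNER must be a real command or shell function, because bash does not
# expand aliases in non-interactive scripts by default.
CHECKS_RUNNER="${CHECKS_RUNNER:-openshift-checks}"

run_selected_checks() {
  local check output flagged=0
  for check in "$@"; do
    # -s runs a single script; keep going even if the runner exits non-zero.
    output="$("$CHECKS_RUNNER" -s "$check" 2>&1)" || true
    printf '%s\n' "$output"
    # Assumption: a clean run includes the line "No issues found".
    if ! printf '%s\n' "$output" | grep -q 'No issues found'; then
      printf 'NEEDS ATTENTION: %s\n' "$check" >&2
      flagged=$((flagged + 1))
    fi
  done
  # Succeed only when every selected check reported no issues.
  [ "$flagged" -eq 0 ]
}
```

For example, run_selected_checks checks/nodes checks/operators checks/pdb would run those three checks and return a non-zero status if any of them did not report a clean result.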

You can also extend it by adding your own specific health checks.

Adding new health checks

Adding a new health check consists of two primary steps and one optional step:

  1. Clone the source tree
  2. Add a script to the appropriate folder (info, checks, etc.)
  3. Build a new container image (optional)

Let’s see this in action by way of an example. We’ll add a simple info (command) check for the Red Hat OpenShift sandboxed containers operator.

Clone the source tree.

$ git clone https://github.com/RHsyseng/openshift-checks.git
$ cd openshift-checks

Add the health check script. The heredoc delimiter is quoted so that the shell writes ${...} and $(...) into the file literally instead of expanding them.

$ cat > info/06-sandboxed-containers <<'EOF'
#!/usr/bin/env bash

[ -z ${UTILSFILE} ] && source $(echo "$(dirname ${0})/../utils")

if oc auth can-i get pods -n openshift-sandboxed-containers-operator >/dev/null 2>&1; then
  msg "Sandboxed containers operator:\n$(oc get pods -n openshift-sandboxed-containers-operator)"
  exit ${OCINFO}
else
  msg "Couldn't get sandboxed containers operator details, check permissions"
  exit ${OCSKIP}
fi
exit ${OCUNKNOWN}
EOF
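
To get a feel for the script's control flow without a cluster, the same branching can be exercised locally with stand-ins. This is only a sketch: the msg function, the oc stub, the OC_STUB_ALLOWED switch, and the numeric values of OCINFO and OCSKIP are all assumptions made for illustration, since the real helper and exit codes come from the project's utils file.

```shell
#!/usr/bin/env bash
# Local dry run of the control flow. Everything here is a stand-in:
# the numeric exit codes below are assumed values, not taken from utils.
OCINFO=0
OCSKIP=2

# Stand-in for the utils msg helper; %b expands the \n in the message.
msg() { printf '%b\n' "$*"; }

# Stub for the oc CLI: OC_STUB_ALLOWED=yes simulates sufficient permissions.
oc() {
  case "$1 $2" in
    "auth can-i") [ "${OC_STUB_ALLOWED:-yes}" = "yes" ] ;;
    "get pods")   printf 'NAME READY STATUS\nexample-pod 1/1 Running\n' ;;
  esac
}

# Same branching as the info script, using return instead of exit.
sandboxed_info() {
  if oc auth can-i get pods -n openshift-sandboxed-containers-operator >/dev/null 2>&1; then
    msg "Sandboxed containers operator:\n$(oc get pods -n openshift-sandboxed-containers-operator)"
    return "${OCINFO}"
  else
    msg "Couldn't get sandboxed containers operator details, check permissions"
    return "${OCSKIP}"
  fi
}
```

Running sandboxed_info with OC_STUB_ALLOWED=no takes the permissions branch, which makes it easy to see both messages before trying the script against a real cluster.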

Build a new container image.

$ podman build -t bpradipt/openshift-checks .

Verify.

$ alias openshift-checks="podman run -it --rm -v $HOME/kubeconfig:/kubeconfig:Z -e KUBECONFIG=/kubeconfig bpradipt/openshift-checks:latest"
$ openshift-checks -l
[...snip...]
info/04-machineset
info/06-sandboxed-containers
[...snip...]
$ openshift-checks -s info/06-sandboxed-containers
Using system:admin context
Sandboxed containers operator:
NAME                                  READY   STATUS    RESTARTS   AGE
controller-manager-676fccfcff-j2trz   2/2     Running   0          14d
No issues found

There are numerous possibilities. You can use this to run a specific set of checks post-deployment to ensure the cluster is set up as per your requirements and ready to handle workloads, or you can run some periodic checks as needed.
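
For periodic runs, it may help to keep each run's output in a timestamped file so results can be compared over time. The sketch below shows one way to do that. The names log_check_run, CHECKS_RUNNER, and CHECKS_LOG_DIR are hypothetical, and CHECKS_RUNNER is assumed to be a command or shell function that wraps the container invocation.

```shell
#!/usr/bin/env bash
# Hypothetical names: CHECKS_RUNNER, CHECKS_LOG_DIR, and log_check_run are
# illustrative and not part of openshift-checks.
CHECKS_RUNNER="${CHECKS_RUNNER:-openshift-checks}"
CHECKS_LOG_DIR="${CHECKS_LOG_DIR:-$HOME/openshift-checks-logs}"

log_check_run() {
  local logfile status=0
  mkdir -p "$CHECKS_LOG_DIR" || return 1
  logfile="$CHECKS_LOG_DIR/run-$(date +%Y%m%dT%H%M%S).log"
  # Arguments are passed through unchanged; with none, all checks run.
  "$CHECKS_RUNNER" "$@" >"$logfile" 2>&1 || status=$?
  # Print the log path so a caller can inspect or archive it.
  printf '%s\n' "$logfile"
  return "$status"
}
```

A function like this could be called from a cron job or a systemd timer, for example as log_check_run -s info/00-clusterversion, with the log directory set to wherever your team keeps operational records.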
