Access PVC Data without the Pod: Troubleshooting Kubernetes

Richard Durso
Jul 22, 2022


Introduction

I recently had a situation where Prometheus was stuck in a crash loop and unable to start. The fix was to delete a file within the Persistent Volume Claim (PVC). That seemed simple enough; however, with the pod in a crash loop, the PVC was not mounted within the Prometheus container. How could I delete the file?

Photo by Ramón Salinero on Unsplash

The goal of this article is to explain how I resolved this issue and show you how to find and reach the contents of PVC volumes when the respective pod is unavailable.

The Issue

Sometimes interruptions between Prometheus and the Persistent Volume Claim (PVC) can lead to an empty or corrupt chunk file. This results in Prometheus being stuck in a crash loop. I didn’t know that yet; what I did know was that Prometheus was down and could not start:

$ kubectl get pods -n monitoring

NAME             READY   STATUS             RESTARTS         AGE
alertmanager-0   2/2     Running            0                6d20h
grafana          3/3     Running            0                6d20h
state-metrics    1/1     Running            0                6d20h
operator         1/1     Running            0                61m
node-exporter    1/1     Running            3 (6d20h ago)    34d
node-exporter    1/1     Running            2 (6d20h ago)    34d
node-exporter    1/1     Running            3 (6d19h ago)    34d
prometheus-0     1/2     CrashLoopBackOff   12 (3m58s ago)   40m

NOTE: The output above has been significantly modified to fit! I will use the simpler pod names below.

As can be seen above, the Prometheus pod is stuck in the CrashLoopBackOff state and has already tried to restart 12 times.

This article assumes Prometheus is installed in the monitoring namespace. While this is common, it’s not always the case, so you may have to adjust the steps here to match your setup. Also, this Kubernetes installation was using containerd; a Docker-based deployment could be different.

The next logical step is to review the logs for the reason why.

$ kubectl logs prometheus-0 -n monitoring

caller=main.go:874 level=info msg="Scrape manager stopped"
caller=main.go:841 level=info msg="Notify discovery manager stopped"
caller=notifier.go:599 level=info component=notifier msg="Stopping notification manager..."
caller=main.go:1103 level=info msg="Notifier manager stopped"
caller=main.go:1112 level=error err="opening storage failed: /prometheus/chunks_head/000760: invalid magic number 0"

The important part is the last line of the log. We know we have an error: Prometheus was unable to open its storage. The log provides the full path and filename of the file in question, and it doesn’t believe in magic today (the actual meaning of the magic number is not relevant to this article).
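
Because the pod keeps crash looping, the container you are reading logs from may have already been replaced and the output can look truncated. If that happens, kubectl can pull the log from the previous (crashed) container instead; these are standard kubectl flags, and you may need to add -c prometheus if kubectl asks you to pick a container:

# Show only the tail of the previous container's log:
$ kubectl logs prometheus-0 -n monitoring --previous --tail=20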

The Solution

According to the Prometheus GitHub project, the solution to this problem is to locate and delete the corrupt file. Prometheus will remain down until you do.

One could assume that there is a setting to automatically delete/rename a corrupt file to get Prometheus up and running quickly, but apparently this is not a feature yet.

It’s straightforward to get an interactive shell in the Prometheus pod to try to delete the file:

$ kubectl exec -it prometheus-0 -n monitoring -- /bin/sh

Normally this will place you in the /prometheus directory, which is the database storage (the PVC). However, with the pod stuck in the CrashLoopBackOff state, the PVC was not mounted and the shell dumped me in the / (root) directory of the pod. Without the PVC mounted there isn’t a way to delete the file.
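
A quick sanity check from that shell confirms the volume simply isn’t there (assuming the container image has basic shell utilities; the exact output will vary by image):

# With the PVC unmounted, /prometheus is missing or empty:
ls -la /prometheus

# And no prometheus volume shows up in the mount table:
grep prometheus /proc/mounts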

I really wish it had that auto-delete feature. After a bit more research and reading previous related issues, I found several methods for getting at the PVC contents.

The Workaround

Each method I reviewed had roughly the same goal, but the detailed steps seemed to depend on the type of PVC storage used and how it is mounted on the node. I’ve tried to reduce the number of steps and make them more generic. Be aware there is no one-size-fits-all solution for this.

Determine the Prometheus Pod Host

The first task is to determine which node in the cluster the Prometheus pod is running on. You could simply use kubectl get pods -n monitoring -o wide to determine this, but that output will not fit here. So instead, I’ll use custom-columns for some fun:

$ kubectl get pods -n monitoring \ 
-o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName

NAME                                                         NODE
alertmanager-kube-prometheus-stack-alertmanager-0            k3s02
kube-prometheus-stack-grafana-5c86b6f7cd-mtwlb               k3s02
kube-prometheus-stack-kube-state-metrics-54fd8f6c5b-ntj4v    k3s02
kube-prometheus-stack-operator-6bd56776b9-6bj4j              k3s01
kube-prometheus-stack-prometheus-node-exporter-4rk4j         k3s02
kube-prometheus-stack-prometheus-node-exporter-xvm92         k3s03
kube-prometheus-stack-prometheus-node-exporter-zvpfw         k3s01
prometheus-kube-prometheus-stack-prometheus-0                k3s03

As can be seen, the real names of the pods just barely fit! More importantly, we can see that Prometheus is running on the node named k3s03.
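
If you already know the pod name, a jsonpath query returns just the node name (using the real pod name from the listing above):

$ kubectl get pod prometheus-kube-prometheus-stack-prometheus-0 \
  -n monitoring -o jsonpath='{.spec.nodeName}'
k3s03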

Determine the Persistent Volume Mount Point

Most of the guides I found got lost in a bunch of kubectl get and kubectl describe commands trying to locate where the PVC might be. I think it’s really more about which keywords might appear in the generated name. I started with prometheus and pvc, and it worked on the first guess:

$ findmnt --raw --output=target | grep prometheus | grep pvc

/var/lib/kubelet/pods/3bcb4bbb-ed94-4858-92aa-734d285c1f10/volume-subpaths/pvc-69940283-6e0a-4ff3-b2ff-98d08931c3f0/prometheus/2

The Linux findmnt command can be used to search for mount points or display a tree structure of all mounted filesystems. It is the proper command to use instead of the mount | grep combination, as its output is easier to read. In this case either method works fine.
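
For reference, the mount | grep equivalent mentioned above would be:

$ mount | grep prometheus | grep pvc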

Not all storage provisioners actually use the letters “pvc” in generated names. If you find nothing, then just search for prometheus. This might return multiple matches, and you will need to determine which one is the database volume.

$ findmnt --raw --output=target | grep prometheus

If for some reason you still find nothing, then get the PVC name and search for that:

$ kubectl get pvc -n monitoring

NOTE: This often shows PVCs for Alertmanager, Grafana, and the Prometheus DB (database), which is the one we are interested in. Repeat the above search commands with the value shown in the VOLUME column.
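
To see the PVC-to-volume mapping directly, the same custom-columns trick works here as well; the volume name below is the one from my cluster, yours will differ:

$ kubectl get pvc -n monitoring \
  -o custom-columns=NAME:.metadata.name,VOLUME:.spec.volumeName

# Then, on the node (k3s03 in my case), search for that volume name:
$ findmnt --raw --output=target | grep pvc-69940283-6e0a-4ff3-b2ff-98d08931c3f0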

Become Root and Track Down the File

Now that we have an idea where the PVC is mounted on the node hosting the Prometheus pod, we can navigate into it as root.

# Become root user:
$ sudo -i
# This is just one long line for the PVC mount name:
cd /var/lib/kubelet/pods/3bcb4bbb-ed94-4858-92aa-734d285c1f10/volume-subpaths/pvc-69940283-6e0a-4ff3-b2ff-98d08931c3f0/prometheus/2

We are now at the equivalent of the /prometheus directory that is reachable inside the pod. Based on the error message we had, we need to look in the chunks_head directory.

cd chunks_head

# Find the bad file - "000760":
ls
000759  000760

There is the source of our trouble, a file named 000760. Let’s delete it.

# Delete the file:
rm 000760
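
If deleting the file outright makes you nervous, moving it out of the chunks_head directory has the same effect from Prometheus’s point of view. This is my own precaution rather than anything from the Prometheus issue thread:

# Move the corrupt chunk aside instead of deleting it (clean up later once healthy):
mv 000760 /root/000760.corrupt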

Simply type exit to quit from the sudo shell and get out of the PVC area.

Force a Restart of the Prometheus Pod

The easiest way to force a restart of a StatefulSet application is to scale the replicas to zero and then back to one.

$ kubectl scale --replicas=0 -n monitoring statefulset \
prometheus-kube-prometheus-stack-prometheus
statefulset.apps/prometheus-kube-prometheus-stack-prometheus scaled
$ kubectl scale --replicas=1 -n monitoring statefulset \
prometheus-kube-prometheus-stack-prometheus
statefulset.apps/prometheus-kube-prometheus-stack-prometheus scaled
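
On reasonably recent kubectl versions, a rollout restart accomplishes the same thing in a single command. This is an alternative to scaling, not what I used, and I have not tested how it interacts with the Operator-managed StatefulSet:

$ kubectl rollout restart -n monitoring statefulset \
  prometheus-kube-prometheus-stack-prometheus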

Conclusion

I monitored the Prometheus log after the pod restarted. Prometheus was able to complete its startup without issues and did not report any additional corrupt files.
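
Watching the pod come back up and then tailing its log is enough to confirm the fix (add -c prometheus if kubectl asks you to pick a container):

# Watch the pod return to Running:
$ kubectl get pods -n monitoring -w

# Then follow the Prometheus log:
$ kubectl logs -f prometheus-kube-prometheus-stack-prometheus-0 -n monitoring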

I hope you enjoyed this alternative way of accessing data stored on a Persistent Volume without using the pod. I thought I was going to have to delete the StatefulSet, attach the released PVC to another test pod just to mount the volume and delete one file, release the PVC again, and recreate the StatefulSet to get Prometheus back online. That would have been a lot of work. This alternative method was much simpler.

Typically the kubectl cp command is used to transfer files into and out of pods. Unfortunately, kubectl cp requires the tar utility to be available within the pod. I’m wondering if this method could be an alternative way to bulk load data into a pod… I’ll test that in the future.
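
For completeness, a typical kubectl cp invocation looks like the following; the file name is purely illustrative, and it only works when tar is present in the target container:

# Copy a file out of the pod (requires tar inside the container):
$ kubectl cp monitoring/prometheus-0:/prometheus/somefile ./somefile

# And copy it back in:
$ kubectl cp ./somefile monitoring/prometheus-0:/prometheus/somefile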

