Kind — fix missing Prometheus Operator targets

Charles-Edouard Brétéché
Feb 4, 2022 · 5 min read

In this story I’m going to talk about deploying Prometheus via prometheus-operator on a local Kubernetes cluster created with Kind.

Getting metrics for all Kubernetes components is not going to work out of the box though; we’ll need to adapt the Kind and prometheus-operator configuration a little bit.

In the next steps I’m going to cover the changes needed and why we need them, but let’s first deploy everything with default configuration to see what is failing.

Create a local cluster with Kind

Let’s start by deploying a simple cluster with 3 control-plane nodes and 3 worker nodes:

kind create cluster --image kindest/node:v1.23.1 --config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker
EOF

Please note that there is nothing special about this cluster; it will be created with default settings for all Kubernetes components.
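Once the cluster is up, a quick sanity check should list three control-plane and three worker nodes (the node names in the comment below are the defaults Kind generates for a cluster simply named kind):

# With the config above, Kind creates kind-control-plane, kind-control-plane2,
# kind-control-plane3, kind-worker, kind-worker2 and kind-worker3
kubectl get nodes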

Deploy kube-prometheus-stack

We can now deploy prometheus-operator using the kube-prometheus-stack Helm chart:

helm install --wait --timeout 15m \
--namespace monitoring --create-namespace \
--repo https://prometheus-community.github.io/helm-charts \
kube-prometheus-stack kube-prometheus-stack

This will install prometheus-operator and spin up a Prometheus instance (along with other components like Grafana, kube-state-metrics, node-exporter, … but they don’t matter for the scope of this article).

The freshly created instance will be configured to scrape a bunch of targets, corresponding to Kubernetes components (etcd, controller-manager, api-server, scheduler, kube-proxy, …).
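If you are curious where those targets come from, the operator materializes the chart’s scrape configs as ServiceMonitor resources; listing them (the names are generated from the release name, here kube-prometheus-stack) shows one per component:

# ServiceMonitors drive the Prometheus scrape configuration, including the ones
# for kube-etcd, kube-scheduler, kube-controller-manager and kube-proxy
kubectl get servicemonitors -n monitoring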

Observe missing targets

To connect to the running Prometheus instance we created in the previous step, we need to port-forward to the kube-prometheus-stack-prometheus service on port 9090:

kubectl port-forward -n monitoring \
svc/kube-prometheus-stack-prometheus 9090:9090

Opening a browser at http://localhost:9090/targets will list the registered targets and their health.
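The same information is available from the Prometheus targets API through the port-forward, if you prefer the command line (a rough sketch, assuming jq is installed on your machine):

# Print one line per target: job name, health and last scrape error
curl -s http://localhost:9090/api/v1/targets \
  | jq -r '.data.activeTargets[] | [.labels.job, .health, .lastError] | @tsv' \
  | sort -u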

Clearly, we have a problem with the following components:

  • kube-scheduler
  • kube-controller-manager
  • kube-proxy
  • etcd

For some reason Prometheus fails to connect to those targets. Next, we will exec into the control plane node and check the offending components’ configuration.

Look at the configuration on the control plane

We can connect to the first control plane node (Kind names it kind-control-plane; the others are kind-control-plane2 and kind-control-plane3) with:

docker exec -it kind-control-plane bash

Once on the node, let’s inspect the command line used to start the processes we’re interested in:

$ xargs -0 < /proc/$(pidof etcd)/cmdline
etcd --advertise-client-urls=https://172.21.0.7:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://172.21.0.7:2380 --initial-cluster=kind-control-plane=https://172.21.0.7:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://172.21.0.7:2379 --listen-metrics-urls=http://127.0.0.1:2381 --listen-peer-urls=https://172.21.0.7:2380 --name=kind-control-plane --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
$ xargs -0 < /proc/$(pidof kube-scheduler)/cmdline
kube-scheduler --authentication-kubeconfig=/etc/kubernetes/scheduler.conf --authorization-kubeconfig=/etc/kubernetes/scheduler.conf --bind-address=127.0.0.1 --kubeconfig=/etc/kubernetes/scheduler.conf --leader-elect=true
$ xargs -0 < /proc/$(pidof kube-controller-manager)/cmdline
kube-controller-manager --allocate-node-cidrs=true --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf --bind-address=127.0.0.1 --client-ca-file=/etc/kubernetes/pki/ca.crt --cluster-cidr=10.244.0.0/16 --cluster-name=kind --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt --cluster-signing-key-file=/etc/kubernetes/pki/ca.key --controllers=*,bootstrapsigner,tokencleaner --enable-hostpath-provisioner=true --kubeconfig=/etc/kubernetes/controller-manager.conf --leader-elect=true --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --root-ca-file=/etc/kubernetes/pki/ca.crt --service-account-private-key-file=/etc/kubernetes/pki/sa.key --service-cluster-ip-range=10.96.0.0/16 --use-service-account-credentials=true
$ xargs -0 < /proc/$(pidof kube-proxy)/cmdline
kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=kind-control-plane

From the command lines above, we can see that kube-controller-manager, kube-scheduler and etcd are listening on localhost only. This explains why Prometheus fails to connect to those targets.

The kube-proxy command line doesn’t show it explicitly (its settings live in the config file passed via --config), but it boils down to the same issue: kube-proxy binds its metrics endpoint to localhost (127.0.0.1:10249) by default.
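To double-check from inside the node, we can list the listening sockets for the default metrics/secure ports (this assumes ss from iproute2 is available in the Kind node image):

# 10257 = kube-controller-manager, 10259 = kube-scheduler,
# 10249 = kube-proxy metrics, 2381 = etcd metrics
# On a default Kind cluster, all of them show up bound to 127.0.0.1
ss -lntp | grep -E ':(10257|10259|10249|2381) '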

Configure Kind cluster components

Kind uses kubeadm to bootstrap a cluster.

Fortunately, we can provide custom ClusterConfiguration and KubeProxyConfiguration in our Kind cluster spec, with the kubeadmConfigPatches stanza.

Let’s recreate the cluster with kubeadm configuration patches so that our components bind to non-loopback addresses, and deploy the kube-prometheus-stack Helm chart with default values again:

kind delete cluster
kind create cluster --image kindest/node:v1.23.1 --config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |-
  kind: ClusterConfiguration
  # configure controller-manager bind address
  controllerManager:
    extraArgs:
      bind-address: 0.0.0.0
  # configure etcd metrics listen address
  etcd:
    local:
      extraArgs:
        listen-metrics-urls: http://0.0.0.0:2381
  # configure scheduler bind address
  scheduler:
    extraArgs:
      bind-address: 0.0.0.0
- |-
  kind: KubeProxyConfiguration
  # configure proxy metrics bind address
  metricsBindAddress: 0.0.0.0
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker
EOF
helm install --wait --timeout 15m \
--namespace monitoring --create-namespace \
--repo https://prometheus-community.github.io/helm-charts \
kube-prometheus-stack kube-prometheus-stack
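
Since the cluster was recreated from scratch, the port-forward from the previous step is gone; run it again before checking the targets page:

kubectl port-forward -n monitoring \
svc/kube-prometheus-stack-prometheus 9090:9090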

Now, opening http://localhost:9090/targets in a browser again, we can see that the etcd target is still not working.

At least we fixed the kube-scheduler, kube-controller-manager and kube-proxy targets.

Configure kube-prometheus-stack

Expanding the etcd target line shows details on the error:

The issue here is that Prometheus tries to fetch the metrics on port 2379 (etcd’s client port), while etcd serves its metrics on port 2381 as configured above.

In order to tell kube-prometheus-stack to use port 2381 we need to configure it in the Helm values:

helm upgrade --install --wait --timeout 15m \
--namespace monitoring --create-namespace \
--repo https://prometheus-community.github.io/helm-charts \
kube-prometheus-stack kube-prometheus-stack --values - <<EOF
kubeEtcd:
  service:
    targetPort: 2381
EOF
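
A quick way to sanity-check the change is to look at the Service the chart creates for etcd in kube-system (the Service name below assumes the Helm release is called kube-prometheus-stack, as in this article):

# The targetPort of the etcd Service should now be 2381
kubectl get svc -n kube-system kube-prometheus-stack-kube-etcd \
  -o jsonpath='{.spec.ports[0].targetPort}{"\n"}'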

Finally, browsing http://localhost:9090/targets should show all targets as OK 🎉
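
To confirm without eyeballing the UI, a query for down targets should come back empty (again a sketch relying on jq):

# "up == 0" matches every target Prometheus cannot scrape;
# an empty result means all targets are healthy
curl -s 'http://localhost:9090/api/v1/query?query=up==0' | jq '.data.result'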

In the end, with a little bit of configuration, we can have all Kubernetes component metrics available, allowing us to play with Prometheus in a local setup and test things before reaching production clusters.
