Prometheus on K8s

Comprehensive Beginner’s Guide to Kube-Prometheus in Kubernetes: Monitoring, Alerts, & Integration

Joud W. Awad
18 min read · Nov 14, 2023


In this article, we will explore how Prometheus functions, the steps involved in setting it up, and the process of monitoring your pods and services. Additionally, we will delve into configuring alert notifications to Slack using AlertManager.

Prometheus Architecture

Prometheus adopts a unique pull-based model, periodically scraping metrics from target systems. This approach enables monitoring across a diverse range of applications, services, and infrastructure components. Below is a diagram illustrating the interaction of Prometheus components.

Prometheus Architecture

Let us break down the components to understand how everything works together:

  1. Prometheus Server: This core component handles data collection, storage, querying, and metrics processing. It operates on a multi-dimensional data model, collecting metrics at specified intervals, evaluating rule expressions, and potentially triggering alerts.
  2. Data Storage: Utilizing a time-series database (TSDB), Prometheus stores metrics efficiently. This TSDB organizes data into blocks containing compressed time-series information, optimized for fast-read operations and historical data access. Its data retention system allows setting specific data retention time frames and expiration policies.
  3. Alertmanager: Managing alerts from the Prometheus server, Alertmanager handles routing, silencing, and aggregation of notifications, supporting various channels like email, Slack, PagerDuty, and OpsGenie.
  4. Client Libraries: These are used to instrument application code. The instrumented application exposes its metrics at an HTTP endpoint, which Prometheus then scrapes. Prometheus provides client libraries for several languages to instrument code and expose metrics.
  5. Push Gateway: To accommodate jobs unsuitable for scraping, like batch jobs, the Push Gateway serves as an intermediary. Batch/short-lived jobs push metrics to this gateway, from which Prometheus retrieves the data.
  6. Exporters: For services that don’t provide Prometheus metrics out of the box, exporters can be used. These are agents that translate metrics from third-party systems into Prometheus format. Common examples include the Node Exporter for hardware and OS metrics, and the MySQL Exporter for database metrics.
  7. Service Discovery: Prometheus dynamically discovers scrape targets through systems like Kubernetes, AWS EC2, etc. This feature is particularly useful in Kubernetes environments, enabling Prometheus to identify endpoints for metric scraping.
  8. PromQL (Prometheus Query Language): This powerful query language allows real-time selection and aggregation of time-series data. PromQL facilitates complex monitoring queries, such as calculating averages, quantiles, and predictions, and is used by third-party services like Grafana and the Prometheus UI for data visualization. A couple of illustrative queries follow this list.
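
To give a flavor of PromQL, here are two illustrative queries. The metric names are only examples: node_cpu_seconds_total comes from Node Exporter, and the latency histogram is a hypothetical metric, so treat these as sketches rather than queries guaranteed to exist in your setup.

# CPU usage per node over the last 5 minutes, derived from the idle counter
100 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100

# 95th percentile request latency from a (hypothetical) histogram metric
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))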

The vast architecture of Prometheus may seem daunting at first glance. Setting up and integrating all its components can be a time-consuming yet rewarding task. This leads us to the game-changing aspect of Kube-Prometheus, which we will explore next.

What is Kube-Prometheus?

Kube-Prometheus is an open-source project designed to simplify the deployment and management of Prometheus within a Kubernetes environment. It is a toolkit that seamlessly integrates Prometheus with Kubernetes, offering a comprehensive monitoring solution.

Key Components of Kube-Prometheus

  1. Prometheus Operator: Automates the management of Prometheus and Alertmanager instances, easing deployment and configuration in Kubernetes.
  2. Prometheus: Central to collecting and storing metrics, it scrapes time-series data based on defined metrics from Kubernetes services.
  3. Alertmanager: Manages and routes alerts from Prometheus, supporting complex rules and multiple notification channels.
  4. Node Exporter: Collects hardware and OS metrics from cluster nodes.
  5. Kube State Metrics: Generates metrics about the state of various Kubernetes objects.
  6. Grafana: Provides visualization capabilities, integrating with Prometheus for detailed monitoring dashboards.
  7. ServiceMonitor and PodMonitor: Custom Resource Definitions (CRDs) for monitoring service or pod groups.

Installation Using Helm

Assuming you have Minikube installed locally or a cloud-based Kubernetes cluster like EKS or GKE, we’ll use Helm for installation. Follow these steps:

  1. Add the Prometheus community repository:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

2. Create a monitoring namespace and install Kube-Prometheus:

kubectl create namespace monitoring
helm install kube-prometheus prometheus-community/kube-prometheus-stack -n monitoring

After waiting approximately 5–10 minutes, verify the installation:

kubectl get pods -n monitoring

You should see output similar to this:

If the output matches the expected results, congratulations, you’ve successfully installed Kube-Prometheus! 🙌
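
You can also confirm that the operator's Custom Resource Definitions (CRDs) were installed; the exact list may vary slightly between chart versions:

kubectl get crd | grep monitoring.coreos.com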

Now that Kube-Prometheus is installed, the next steps involve visualizing services, exposing them to localhost, troubleshooting any issues, and setting up Slack notifications with Alertmanager. Let’s dive into these aspects in the following sections.

Exposing the Prometheus UI

After installing Kube-Prometheus, various services are set up in your cluster. You can view these by running:

kubectl get svc -n monitoring

The result should look similar to this:

To access the Prometheus UI, we’ll expose it to localhost. Since it’s a ClusterIP-type service and not directly accessible via URL, we use port forwarding:

kubectl port-forward svc/kube-prometheus-kube-prome-prometheus -n monitoring 9090

Now, visit http://localhost:9090 to access the Prometheus UI.

Prometheus Graph Dashboard
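
Grafana can be exposed the same way if you want to explore the bundled dashboards. The service name below follows the chart's naming for our release name, and admin / prom-operator is the chart's documented default login; verify both against your own cluster before relying on them:

kubectl get svc -n monitoring   # double-check the Grafana service name first
kubectl port-forward svc/kube-prometheus-grafana -n monitoring 3000:80

Then open http://localhost:3000 and log in.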

Within the Prometheus UI, you'll find:

  • Graph Section: Allows running PromQL queries and interacting with stored data.
  • Alerts Section: Shows all system-detected alerts. Navigate to the Alerts tab to explore further.
Prometheus Alerts Dashboard

The exact list of alerts differs depending on whether you are running on Minikube, EKS, or another platform.
The main concept to understand for now is that alerts have three states:

  1. Inactive: This is the default state of an alert. It indicates that the alert condition is not met, and there is currently no action required. In this state, the system is considered to be functioning normally with respect to the conditions defined in this alert.
  2. Pending: When an alert condition is met, but for a duration less than the configured threshold, the alert moves to the ‘pending’ state. This state is essentially a waiting period, allowing for temporary spikes or brief issues to resolve without triggering an actual alert. If the condition persists beyond the defined threshold, the alert state changes to ‘firing’.
  3. Firing: This state indicates that the alert condition has been met for a duration longer than the configured threshold. In the ‘firing’ state, the alert is actively signaling an issue that needs attention. Prometheus will then send notifications to the Alertmanager, which in turn routes these notifications to the configured notification channels (like email, Slack, etc.).

You should see only one alert there: the Watchdog alert.
The Watchdog alert is a default alert that is always firing; Prometheus fires it so we can verify that the alerting pipeline is working end to end. It can be ignored, silenced, or removed as needed. We'll dive deeper into alert management and Slack notifications in upcoming sections.
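
You can also inspect alert states directly with PromQL, since Prometheus exposes its own alerts as the built-in ALERTS series:

# All alerts currently firing (right now this should only show Watchdog)
ALERTS{alertstate="firing"}

# Count of alerts grouped by name and state
count by (alertname, alertstate) (ALERTS)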

Notes For EKS, GKE, and Other Managed Control Plane Users:

Users on EKS, GKE, or similar services might encounter two additional firing alerts: KubeSchedulerDown and KubeControllerManagerDown. These are common because the control plane components in managed services aren't reachable by Prometheus. To address this:
1. Create a values.yaml file with the following content:

kubeScheduler:
  enabled: false
kubeControllerManager:
  enabled: false

2. Update your installation using:

helm upgrade -f values.yaml kube-prometheus prometheus-community/kube-prometheus-stack -n monitoring

Or incorporate these settings during initial installation:

helm install -f values.yaml kube-prometheus prometheus-community/kube-prometheus-stack -n monitoring

After applying these changes and waiting a few minutes, the additional alerts should resolve. 😸

Monitoring Pods and Services in Prometheus

So far we have set up the monitoring stack, viewed its components, and learned how alerts work.
Let us now dive deeper and create a pod that exposes a metrics endpoint, then configure Prometheus to monitor that pod (or its service).

To do that, I will create a simple Node.js application that uses the Prometheus client library and exposes metrics at the /metrics endpoint. Then I will tell Prometheus to scrape that pod's /metrics endpoint.

Creating our first application with Prometheus Metrics (Optional):

The following is the code for our Node.js application (this section is optional; I am just showing how to define custom metrics and expose them from your application):

const express = require("express");
const client = require("prom-client");

// Create an Express app
const app = express();
// Create a Registry to register the metrics
const register = new client.Registry();

// The metrics we would like to collect:
// 1. A gauge for the current number of active requests
const activeRequests = new client.Gauge({
  name: "active_requests",
  help: "Number of active requests",
  labelNames: ["method", "endpoint"],
});

// 2. A counter for the total number of requests
const totalRequests = new client.Counter({
  name: "app_total_requests",
  help: "Total number of requests",
  labelNames: ["method", "endpoint", "status"],
});

// Register the metrics
register.registerMetric(activeRequests);
register.registerMetric(totalRequests);

client.collectDefaultMetrics({ register });

// Add a middleware that increases activeRequests on request and decreases on response
app.use((req, res, next) => {
  // Increment with labels
  activeRequests.inc({
    method: req.method,
    endpoint: req.path,
  });

  totalRequests.inc({
    method: req.method,
    endpoint: req.path,
  });

  res.on("finish", () => {
    // Decrement with labels for active requests
    activeRequests.dec({
      method: req.method,
      endpoint: req.path,
    });
  });

  next();
});

// Define a route to expose the metrics
app.get("/metrics", async (req, res) => {
  // Allow Prometheus to scrape the metrics
  res.set("Content-Type", register.contentType);
  res.end(await register.metrics());
});

// Define a simple route to simulate app behavior
app.get("/", (req, res) => {
  res.send("Welcome to the Home page of package 1!");
});

// Start the Express server
// Honor the PORT env var when provided (the Deployment later sets PORT=3000); default to 3001 locally
const port = process.env.PORT || 3001;
app.listen(port, () => {
  console.log(`Server listening at http://localhost:${port}`);
});

You can either copy-paste this code and build your own Docker image, or use the image I provide later on to spin up your pod.

Let us break down the code in here:
const client = require("prom-client");
This line imports the Prometheus client library for Node.js applications. We then define the two custom metrics we want to use:

// 1. A gauge for the current number of active requests
const activeRequests = new client.Gauge({
  name: "active_requests",
  help: "Number of active requests",
  labelNames: ["method", "endpoint"],
});

// 2. A counter for the total number of requests
const totalRequests = new client.Counter({
  name: "app_total_requests",
  help: "Total number of requests",
  labelNames: ["method", "endpoint", "status"],
});

We define two types of custom metrics:
- Gauge: a metric whose value can go up and down; the current number of active requests on the site is a classic example.

- Counter: a metric whose value only ever increases; the total number of requests handled by the app is a natural fit.
After defining our custom metrics, we need to register them with the Prometheus client library, which is what we do here:

// Register the metrics
register.registerMetric(activeRequests);
register.registerMetric(totalRequests);
client.collectDefaultMetrics({ register });

In the next piece of code, we control the values of our custom metrics, incrementing and decrementing them as requests arrive and complete.

// Add a middleware that increases activeRequests on request and decreases on response
app.use((req, res, next) => {
  // Increment with labels
  activeRequests.inc({
    method: req.method,
    endpoint: req.path,
  });

  totalRequests.inc({
    method: req.method,
    endpoint: req.path,
  });

  res.on("finish", () => {
    // Decrement with labels for active requests
    activeRequests.dec({
      method: req.method,
      endpoint: req.path,
    });
  });

  next();
});

On every incoming request we increment both activeRequests and totalRequests, and when the response finishes we decrement activeRequests.
That is how we maintain our custom metrics. Finally, the most important part is this:

// Define a route to expose the metrics
app.get("/metrics", async (req, res) => {
  // Allow Prometheus to scrape the metrics
  res.set("Content-Type", register.contentType);
  res.end(await register.metrics());
});

This is the endpoint that Prometheus will scrape to collect metrics from the application. Let us start the application locally by running the following commands:

npm init -y
npm install express prom-client
node app.js # assuming you have named the above file app.js

This should start the server on port 3001, so if we visit http://localhost:3001/ we should see the following message:

Application Starting Interface

Simple but very effective, huh? Let us spice things up a little and visit http://localhost:3001/metrics

Metrics URL
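
The response uses the Prometheus text exposition format. A trimmed, illustrative excerpt (your exact values and the full set of default metrics will differ) looks roughly like this:

# HELP active_requests Number of active requests
# TYPE active_requests gauge
active_requests{method="GET",endpoint="/"} 0

# HELP app_total_requests Total number of requests
# TYPE app_total_requests counter
app_total_requests{method="GET",endpoint="/"} 3

# HELP process_cpu_user_seconds_total Total user CPU time spent in seconds.
# TYPE process_cpu_user_seconds_total counter
process_cpu_user_seconds_total 0.12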

Congratulations, you’ve now set up Prometheus metrics and custom metrics for your application!

Setting Up Monitoring for Pods

In this section, we’ll set up a pod and service in Kubernetes to make our application, which exposes Prometheus metrics, reachable within the cluster.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app-1
  template:
    metadata:
      labels:
        app: app-1
        monitoring: enabled # this is very important to have and will be discussed later
    spec:
      containers:
        - name: app-1
          image: thejoud1997/app-1:latest
          ports:
            - containerPort: 3000
              name: http-metrics
          env:
            - name: PORT
              value: "3000"
          resources:
            requests:
              memory: 128Mi
              cpu: 50m
            limits:
              memory: 128Mi
              cpu: 50m
---
apiVersion: v1
kind: Service
metadata:
  name: app-1-service
  labels:
    monitoring: enabled # this is very important to have and will be discussed later
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: http-metrics
      name: http-metrics
  selector:
    app: app-1

Key Points in the Configuration:

  • Deployment: Creates 2 replicas of the app-1 Docker image, which includes our custom metrics.
  • Service: Exposes the deployment through a Load Balancer. The monitoring: enabled label is crucial for Prometheus to identify and monitor this service.

Execute the following commands to apply the YAML configuration

kubectl apply -f app.yaml
kubectl get pods
kubectl get svc

You should see two running pods and a service. For Minikube users, you’ll need to run minikube tunnel to access the service via a local URL.
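
If you want to sanity-check the metrics endpoint before wiring up Prometheus, a quick port-forward against the service defined above also works:

kubectl port-forward svc/app-1-service 8080:80
curl http://localhost:8080/metrics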

Querying Custom Resources in Prometheus

Even with the setup complete, Prometheus might not immediately show data from our new deployment. This is because we still need to tell Prometheus which pods and services to monitor. So if you query one of our custom metrics (for example, app_total_requests) in the Prometheus UI and see no results, it's not necessarily an issue with the setup.

Let’s investigate how Prometheus discovers which pods and services to monitor. This involves understanding service discovery and configuration within Prometheus, which we’ll explore in the next section.

Introducing PodMonitor & ServiceMonitor

With Kube-Prometheus, additional Custom Resource Definitions (CRDs) are installed, namely PodMonitor and ServiceMonitor. These CRDs inform the Prometheus operator which pods or services to monitor by scraping metrics.

How PodMonitor & ServiceMonitor Work

These CRDs enable you to specify pods or services for monitoring based on certain labels within a defined namespace. Prometheus then picks up these configurations and begins monitoring.

Let’s create a new PodMonitor with the following YAML configuration:

---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: my-podmonitor
  namespace: default
spec:
  podMetricsEndpoints:
    - path: /metrics
      port: http-metrics
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      monitoring: enabled

Let us explain this one quickly. It defines a custom resource called my-podmonitor, which the Prometheus operator picks up, and it relies on three main selectors:

  1. namespaceSelector: Specifies the namespace from which to scrape pods.
  2. selector.matchLabels: Defines the labels required on a pod for it to be monitored.
  3. podMetricsEndpoints: Indicates the port and path where metrics are available.

By applying this configuration, we instruct Prometheus on where and how to scrape metrics from our pods. To do this, run the following command:

kubectl apply -f monitoring.yaml

Updating Helm Installation

To ensure Prometheus correctly reads the PodMonitor & ServiceMonitor, we need to update our Helm installation. This involves setting specific values to make our configuration universally applicable. Create a values.yaml file with the following content:

prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false

Then run the following command:

helm upgrade -f values.yaml kube-prometheus prometheus-community/kube-prometheus-stack -n monitoring

After applying this change, wait a few minutes for Prometheus to recognize the new settings.

Now, you should be able to run queries in the Prometheus UI and see the results from your monitored pods.

Prometheus PromQL Results
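
For example, here are a few queries you can try against the sample app's metrics (the metric names come from the Node.js code above):

# Total requests, summed across both replicas
sum(app_total_requests)

# Per-pod request rate over the last 5 minutes
sum by (pod) (rate(app_total_requests[5m]))

# Current number of in-flight requests
sum(active_requests)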

Perfect, we now have full monitoring of our pods. To monitor services, the process is similar: use the ServiceMonitor CRD, following the steps outlined above, to select services by label; a sketch follows below.
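
Here is a minimal ServiceMonitor sketch that targets the app-1-service we created earlier. The resource name is my own choice; the port name and label values mirror the Service definition above:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-servicemonitor
  namespace: default
spec:
  endpoints:
    - path: /metrics
      port: http-metrics
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      monitoring: enabled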

Defining Rules and Sending Alerts to Slack With Alert Manager

We have already seen how Prometheus works, how it defines alerts, and how it queries data. Now it is time to create our own rules and configure alerting to send notifications to third-party tools like Slack, email, and so on.

An alert in Prometheus is based on a PromQL query with defined conditions. When these conditions are met, the alert transitions through different states: inactive, pending, and then firing. At the firing stage, actions like sending alerts to Slack can be initiated.

Kube-Prometheus includes the PrometheusRule CRD for defining custom rules. Here's how to set up our first rule:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    release: kube-prometheus
  name: app-1-rule
  namespace: monitoring
spec:
  groups:
    - name: "app-1.rules"
      rules:
        - alert: PodDown
          for: 1m
          expr: sum(up{container="app-1"}) < 2
          labels:
            severity: critical
            app: app-1
          annotations:
            # these annotations are used in the alert that gets sent to Slack/email/etc.
            message: The deployment has fewer than 2 pods running.
            summary: the current deployment "app-1" needs to have at least 2 replicas running
            description: there seems to be a mismatch between the number of replicas required by this alert and the number of pod replicas currently running

Seems a little hard? Let us break it down:

  • Metadata: Custom metadata for the rule.
  • Groups: Organize rules into manageable groups.
  • Alert Name: Visible in the Prometheus UI.
  • For: Duration the alert stays in the pending state before firing.
  • Expr: PromQL expression to evaluate the alert condition.
  • Labels: Used mainly for routing and filtering alerts, so Alertmanager can decide which destination a notification should go to. For example:
    - if the alert contains a label with severity=critical, direct it to the DevOps team lead
    - if the alert contains a label with severity=warning, direct it to the senior DevOps member
  • Annotations: Metadata attached to the alert, such as messages and text to include in the notification itself; they can use templated values to reference the pod name, pod labels, and so on, as shown in the short example below.
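
A quick sketch of a templated annotation (the pod and namespace labels used here are standard labels Prometheus attaches to Kubernetes targets; adjust them to whatever labels your alert expression actually carries):

annotations:
  description: 'Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} has been down for more than 1 minute.'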

With all that in mind, let us create this rule and check it in our Prometheus UI. I have put this code in a file called rule.yaml:

kubectl apply -f rule.yaml

Wait a few minutes, then open the Prometheus UI at http://localhost:9090 and navigate to the Alerts tab.

Custom Rule in Prometheus

We did it!! We have set up our first rule in Prometheus. How great is that?
Let us test this rule by reducing the number of pods running in our deployment; just run the following command:

kubectl scale deployments.apps app-1 --replicas=1

This should trigger the alert, moving it to the pending state and then to firing.
After a few minutes, you should see your alert in the firing state!

Custom Rule Alerting

With the alert rule working, the next step is to configure notifications to Slack. This involves setting up Alertmanager to send alerts to a designated Slack channel. But first, let us restore the correct number of running pods in the deployment:

kubectl scale deployments.apps app-1 --replicas=2

Setting up Slack

To receive alerts on Slack, we first need to configure our Slack channel and install the webhook app integration.

Navigate to the bottom left of Slack and click on "+ Add Apps".
Search for "Incoming Webhooks" and add the app.

Click the Add button and you will be redirected to a new page. On that page, make sure to click the green "Add to Slack" button again.

Then you will have to select a channel to install this integration on; I have already created a channel named alerts.

Now click on the "Add Incoming WebHooks integration" button. Once you do, you are redirected to a new page where you can grab your Slack webhook URL.

Copy this link and save it somewhere safe so you do not lose it. If you go back to the Slack channel where you set up the webhook integration, you should also see a confirmation message.

This means you have successfully installed the app on this channel. We are now at the last step of the tutorial: sending notifications to this channel. It is quite straightforward, but let us break it down.

First, we need to understand that there are three ways to configure alert notifications with the Prometheus operator, all of which are covered in the documentation. In a nutshell, we can do one of the following:

  1. You can use a native Alertmanager configuration file stored in a Kubernetes secret.
  2. You can use spec.alertmanagerConfiguration to reference an AlertmanagerConfig object in the same namespace which defines the main Alertmanager configuration.
  3. You can define spec.alertmanagerConfigSelector and spec.alertmanagerConfigNamespaceSelector to tell the operator which AlertmanagerConfigs objects should be selected and merged with the main Alertmanager configuration.

I am going to use a hybrid solution: configuring the Helm installation and defining my own AlertmanagerConfig. So let us start.

Modifying Helm Installation

I will start by modifying my Helm installation to configure global options, setting the Slack URL there, along with some additional configuration for how AlertmanagerConfig resources are selected.

For that, my values.yaml used for the Helm customization looks like this:

prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false

alertmanager:
  config:
    global:
      slack_api_url: <SLACK_URL_YOU_HAVE_COPIED_FROM_BEFORE>
  alertmanagerSpec:
    alertmanagerConfigNamespaceSelector:
    alertmanagerConfigSelector:
    alertmanagerConfigMatcherStrategy:
      type: None

In a nutshell, we define a global Slack URL to use in our configuration so we do not need to specify it (or a secret) in each receiver. Moreover, we configure the AlertmanagerConfig selection, telling the operator to look at all namespaces and all resources of type AlertmanagerConfig.

Now let us apply the new changes to our Helm installation:

helm upgrade -f values.yaml kube-prometheus prometheus-community/kube-prometheus-stack -n monitoring

Once this is done, we are good to go with finally defining our own Alertmanager configuration.

When Kube-Prometheus is installed, it adds a CRD named AlertmanagerConfig that allows us to define our own receivers and routing without editing the Helm values or the secret file each time we want to add a new configuration; this is very powerful when working in big teams.
So here comes my Alertmanager configuration:

apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: global-alert-manager-configuration
  namespace: monitoring
  labels:
    release: kube-prometheus
spec:
  receivers:
    - name: slack-receiver
      slackConfigs:
        - channel: "#alerts" # your channel name that you have created
          sendResolved: true
          iconEmoji: ":bell:"
          text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}\nmessage: {{ .CommonAnnotations.message }}"
  route:
    matchers:
      # name of the label to match
      - name: app
        value: app-1
        matchType: "="
    groupBy: ["job", "severity"]
    groupWait: 30s
    receiver: slack-receiver
    groupInterval: 1m
    repeatInterval: 1m

Breaking Down the Configuration:

  • Receivers: Define how notifications are delivered. Each receiver can have its own way of getting notified: one receiver might want notifications via Slack, another by email, and others on both email and Slack.
  • Route: Specifies conditions for sending notifications to the receiver. Includes settings like groupWait, groupBy, and repeatInterval.
  • Matchers: Filter alerts based on the labels defined in the PrometheusRule. In this example, we match a single label: app must have a value equal to app-1, which is exactly what we set in the PrometheusRule, so this route will match our alert and notify the slack-receiver.
matchers:
  - name: app
    value: app-1
    matchType: "="
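
Apply the AlertmanagerConfig (I am assuming you saved it as alertmanager-config.yaml; the file name is up to you), then optionally port-forward the Alertmanager UI to confirm the route and receiver were merged into the running configuration. The service name below follows the same truncated naming pattern as the Prometheus service we forwarded earlier; double-check it with kubectl get svc -n monitoring:

kubectl apply -f alertmanager-config.yaml
kubectl port-forward svc/kube-prometheus-kube-prome-alertmanager -n monitoring 9093
# then open http://localhost:9093/#/status and review the loaded configuration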

Testing the alert notification setup

Once set up, you can test the alerting by triggering conditions that match our PrometheusRule. This should result in notifications being sent to your configured Slack channel.

So first, let us decrease the number of replicas to one so that the PrometheusRule is triggered, then wait a few minutes and check the Prometheus UI again:

kubectl scale deployments.apps app-1 --replicas=1
Prometheus Rule Alerting

This time, if we go to the Slack channel, we will see something new:

Slack Alert

Yay!! We finally got our alert, and it is working.

Conclusion

We have journeyed through an extensive and enlightening tutorial, aiming to demystify the workings of Prometheus in a Kubernetes environment. This article was designed to unravel the complexities and provide a clear understanding of both basic and advanced aspects of monitoring and alerting in Kubernetes clusters.

From setting up Prometheus and understanding its architecture to creating custom metrics, defining alert rules, and integrating notifications with Slack, we’ve covered a broad spectrum of functionalities. The goal was to equip you with the knowledge and skills necessary to confidently monitor and manage your Kubernetes clusters.

As you delve into the world of Kubernetes and Prometheus, remember that this is just the beginning. The landscape is vast, and there’s much more to explore and learn. Happy monitoring!

Follow Me For More Content

If you made it this far and you want to receive more content like this, make sure to follow me on Medium and on LinkedIn.
