Bozobooks.com: Fullstack k8s application blog series

Observability Metrics: Prometheus, Grafana, Alert Manager & Slack Notifications

Chapter 9: Set up metrics observability, configure Alertmanager to send alerts to our Slack channel when something goes wrong, and configure Grafana as our single pane of glass.

A B Vijay Kumar · Published in Cloud Native Daily · Nov 9, 2022


Observability is one of the key aspects of distributed application architectures, as it is important for us to get real-time alerts on any failures that may happen in the clusters. It consists of visibility and monitoring of the following:

  • Metrics (We will be covering metrics with Prometheus in this blog)
  • Logs (We will be covering Log aggregation with Loki in Chapter 10)
  • Traces (We will be covering traces with Zipkin in Chapter 11)

In Chapter 12, we will also explore OpenTelemetry, which is turning out to be the new standard for observability.

In this blog, we will set up the Prometheus stack to collect metrics from the nodes, PostgreSQL, Redis, and our Quarkus application. We will configure Alertmanager to send alerts to Slack and will build Grafana dashboards to track these metrics.

Let's get started with deploying the Prometheus stack in our cluster.

Install Prometheus Stack

We will use the Prometheus community helm charts to deploy the stack. Let's first add the helm repo and update it with the following commands:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

Before we deploy the Prometheus stack, let's first update our values file. (Please refer to this link for the complete values.yaml)

Alert Manager Configuration

We will be using the Alertmanager component of the Prometheus stack to trigger alerts and send them to our Slack channel for notifications. Let's first set up a Slack channel and create a webhook for alerts. We will be using the same Slack app that we created in Chapter 4 and will add another channel called "monitoring", where we will be posting the alerts.

  • Go to the Slack application that was created in Chapter 4
  • Configure a new incoming webhook by selecting the "Incoming Webhooks" option
  • Click on "Add New Webhook to Workspace" and select the "monitoring" channel. If the channel does not exist, please create one. The following is the screenshot, where you can copy the webhook.
Slack WebHook Configuration

You should get a message on the Slack channel that the webhook is integrated. The following screenshot shows the typical message.

Slack message

Now let's configure the alertmanager section of the values.yaml. The following screenshot shows the Slack channel configuration for alerts.

It is self-explanatory: we configure the slack_api_url with the webhook, create a receiver called slack-notifications, and route the alerts to it. We also provide the channel name and the template to use to publish the message.

I had to add a name: 'null' receiver to overcome a bug (or is it a feature :-D). If we don't add it, the Alertmanager configuration does not get updated when we install the helm chart. Hopefully this will be fixed soon.
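For reference, here is a minimal sketch of what this alertmanager section of values.yaml can look like. The webhook URL is a placeholder, and the Watchdog matcher mirrors the chart's default routing; adapt both to your setup.

alertmanager:
  config:
    global:
      slack_api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'   # placeholder: your webhook URL
    route:
      receiver: 'slack-notifications'
      routes:
        - receiver: 'null'               # the 'null' receiver mentioned above
          matchers:
            - alertname = "Watchdog"
    receivers:
      - name: 'null'
      - name: 'slack-notifications'
        slack_configs:
          - channel: '#monitoring'
            send_resolved: true
            title: '{{ template "slack.default.title" . }}'   # Alertmanager's default Slack templates
            text: '{{ template "slack.default.text" . }}'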

Let's now configure Grafana.

Grafana Configuration

As part of the Prometheus stack helm chart, we also get Grafana. We have to configure a volume for persistence; otherwise, every time we restart Grafana, we will lose all our dashboard configurations. The following is the screenshot of the configuration (a sketch follows after the list below).

  • Line 20: We are enabling Grafana so that the helm chart installs Grafana
  • Lines 22–29: We are setting up persistence by creating a persistent volume claim
  • Lines 31–39: We are enabling the default dashboards and data sources, which configures Prometheus as the default data source
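As a reference, the following is a minimal sketch of what this grafana section can look like; the size and access mode are assumptions, so adjust them to your cluster.

grafana:
  enabled: true                        # install Grafana as part of the stack
  persistence:
    enabled: true                      # keep dashboards across restarts
    type: pvc
    accessModes:
      - ReadWriteOnce
    size: 2Gi                          # assumption: pick a size that suits you
  sidecar:
    dashboards:
      enabled: true                    # load the bundled default dashboards
    datasources:
      enabled: true
      defaultDatasourceEnabled: true   # configure Prometheus as the default data source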

Let's now configure Prometheus.

Prometheus Configuration

Later in the blog, we will configure our microservices (Quarkus applications) to also provide metrics. These microservices will expose the metrics on the /q/metrics endpoint.

Let's configure these endpoints on Prometheus. The following screenshot shows two targets configured to scrape the application metrics.
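A minimal sketch of such scrape targets under prometheusSpec.additionalScrapeConfigs; the job names and service addresses (bookinfo and bookreview) are hypothetical, so replace them with your own.

prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: 'bookinfo'
        metrics_path: '/q/metrics'                                   # Quarkus Micrometer endpoint
        static_configs:
          - targets: ['bookinfo.bookinfo.svc.cluster.local:80']      # hypothetical service address
      - job_name: 'bookreview'
        metrics_path: '/q/metrics'
        static_configs:
          - targets: ['bookreview.bookinfo.svc.cluster.local:80']    # hypothetical service address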

Let's also define basic rules. The following screenshot shows a standard Prometheus rule to alert when an instance is down for more than a minute.
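A sketch of such a rule, added through the chart's additionalPrometheusRulesMap; the group name and severity label are assumptions.

additionalPrometheusRulesMap:
  custom-rules:
    groups:
      - name: instance-health
        rules:
          - alert: InstanceDown
            expr: up == 0                # the target failed its last scrape
            for: 1m                      # must stay down for more than a minute
            labels:
              severity: critical
            annotations:
              summary: "Instance {{ $labels.instance }} is down"
              description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."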

This will trigger an alert when an instance is down for more than a minute; Alertmanager will then pick up that alert and notify us over Slack.

Install Prometheus stack

Now that we have the complete values.yaml (you can find it on my GitHub here), let's install the helm chart with the following command:

helm upgrade -i  --create-namespace -n monitoring prometheus prometheus-community/kube-prometheus-stack --values ./values/kube-prometheus-stack-helm-values.yaml

This installs the complete stack, including the node exporter.

Bug: I found another bug while using Docker Desktop (it does not occur with Rancher Desktop): the node exporter does not start. To fix that, we have to execute the following command. Thanks to jamesbright, I found this workaround at https://github.com/prometheus-community/helm-charts/issues/467#issuecomment-957091174

kubectl patch ds -n monitoring prometheus-prometheus-node-exporter --type "json" -p '[{"op": "remove", "path": "/spec/template/spec/containers/0/volumeMounts/2/mountPropagation"}]'

Let's now install the Redis exporter and the PostgreSQL exporter.

Install Redis exporter

To collect metrics from Redis, we will have to deploy a Redis exporter. The following screenshot is the values.yaml for prometheus-community/prometheus-redis-exporter (a sketch follows after the list below).

  • Line #2: We are providing the URL to our Redis instance.
  • Line #5: We are defining the label release: prometheus. This is an important label, as the Prometheus instance picks up every exporter that carries this label to scrape and collect metrics.
  • Lines 7–14: We are defining the ServiceMonitor, once again with the label release: prometheus, so that Prometheus can collect the metrics.
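A minimal sketch of that values file, assuming a hypothetical Redis service called redis-master in the redis namespace:

redisAddress: redis://redis-master.redis.svc.cluster.local:6379   # assumption: your Redis service URL
serviceMonitor:
  enabled: true
  labels:
    release: prometheus          # so that Prometheus picks up this ServiceMonitor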

Let's now install the Redis exporter with the following command:

helm upgrade -i  --create-namespace -n monitoring redis-exporter prometheus-community/prometheus-redis-exporter --values ./values/redis-exporter-helm-values.yaml

Install Postgres exporter

To collect metrics from PostgreSQL, we will have to deploy a Postgres exporter. The following screenshot is the values.yaml for prometheus-community/prometheus-postgres-exporter (a sketch follows below).
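A minimal sketch of that values file, assuming a hypothetical PostgreSQL service and a Kubernetes secret that holds the password; all names here are placeholders.

config:
  datasource:
    host: postgres.postgres.svc.cluster.local    # assumption: your PostgreSQL service
    user: bookinfo                               # placeholder user
    passwordSecret:                              # read the password from a Kubernetes secret
      name: postgres-credentials
      key: password
    port: "5432"
    database: bookinfodb                         # placeholder database
    sslmode: disable
serviceMonitor:
  enabled: true
  labels:
    release: prometheus                          # so that Prometheus picks up this ServiceMonitor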

We can install the exporter with the following helm command.

Best practice: the database password can be stored in a secret or injected as a secret from Vault. The values.yaml allows us to reference a Kubernetes secret key instead of providing the password directly.

helm upgrade -i  --create-namespace -n monitoring postgres-exporter prometheus-community/prometheus-postgres-exporter --values ./values/postgres-exporter-helm-values.yaml

Configure ServiceMonitor for Nginx

We will also configure our Nginx server to provide metrics to Prometheus; this will give us the HTTP metrics. We will update the ingress-nginx release of the ingress-nginx/ingress-nginx helm chart to also expose a ServiceMonitor by making the following updates to its values.yaml.

The configuration is self-explanatory; the key thing to remember is to provide the label release: prometheus.
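Here is a sketch of the relevant section of the ingress-nginx values, based on the chart's controller.metrics settings:

controller:
  metrics:
    enabled: true                 # expose the controller metrics endpoint
    serviceMonitor:
      enabled: true               # create a ServiceMonitor for Prometheus
      additionalLabels:
        release: prometheus       # so that Prometheus discovers it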

Let's upgrade the ingress deployment for the change to take effect:

helm upgrade -i  --create-namespace -n ingress-nginx ingress-nginx ingress-nginx/ingress-nginx --values ./values/ingress-nginx-values.yaml

We should see Nginx reflected in the Prometheus targets, as shown below.

Now that we have all the service monitors configured, let's modify our microservices to expose the metrics.

Configure Quarkus application for monitoring

To let our Quarkus application expose the metrics on /q/metrics, we will use the Quarkus Micrometer extension. To get it working, we have to update the pom.xml with the following dependency.
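If you are adding it by hand, the Micrometer extension with the Prometheus registry looks roughly like this (verify the artifact against the Quarkus version you are using):

<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-micrometer-registry-prometheus</artifactId>
</dependency>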

We then have to update our application.properties with the following property to enable the Micrometer JSON export:

quarkus.micrometer.export.json.enabled=true

We can now import and use Micrometer classes to publish custom metrics. The following code shows how to import and inject the Micrometer MeterRegistry. Please refer to the Micrometer documentation for more details.

// import Micrometer classes
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Tags;

// Service class code: inject the registry
@Inject
MeterRegistry registry;

// Inside a method, we can define custom metrics with the following code
registry.counter("searchByKeyword() Counter", Tags.of("keyword", query)).increment();

The following screenshot shows the updated BookInfoService code

Once we build and deploy this code, we should be able to see the metrics at the /q/metrics path. The following is the screenshot of bookinfo after making the changes.

Now we can configure dashboards in Grafana by accessing the Grafana service with a port-forward, as sketched below. The following are some screenshots of the dashboards.
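A typical port-forward looks like this; the service name prometheus-grafana follows from the release name we used above, and the local port is arbitrary.

kubectl port-forward -n monitoring svc/prometheus-grafana 3000:80

Grafana should then be reachable at http://localhost:3000.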

Alert Manager Dashboard
Compute Resources Dashboard
Pod Dashboard

The following is the screenshot of alerts that are notified on Slack.

Alert notifications on Slack

That's all for now. In the next chapters, we will look at configuring Loki for logs and Zipkin for traces, which will give us complete observability of our cluster.

I hope this was useful. Please leave your feedback and comments.

Take care…have fun ;-)

